Skip Navigation LinksHome > Categories > Code from a Category

Extracting Data from a Source System to History Tables



User Name: codelecturer
Name: Mike Chauhan
Contact Me: www.datawebcoder.com/ContactUs.aspx
Home Page: www.datawebcoder.com
6 years of total IT experience including programming, application development, System Maintenance . 3 years of Experienced and expertise in .Net Framework VB, Visual Basic.Net, ASP.Net 2.0, 3.5 and AD... [More]
Viewed Times: 927
Add Date: 12/09/2012
In this article we will learn about how to extracting data from a source system to history tables.
It is important that the method for extracting data be done in a way that does not hinder performance of the source system.

In this example, the goal is to extract data from a source system, into another database (or server) all while requiring as little resources as possible. This is why I choose to pull from a source system in two separate stages.

First Stage – Staging Import

The first step is to do a very simple select statement into a staging table. This first select statement may do some ETL β€” mostly in regards to lookups that are needed from the source system. There could be multiple select statements pulling data into multiple staging tables. I prefer to pull tables from the source to staging in a one to one relationship. So for every table we need, we also have a corresponding staging table. See the diagram below:



Source to History database diagram The reason for pulling one to one is simple. First of all, the query is a very simple select. Second, it makes troubleshooting very simple. After importing into staging, the next step is to move the records to the history table(s).

Second Stage – Historical Import

In the historical import, we compare what we have in our history table with what is in staging. Each record in staging is joined with the corresponding current record in the history table on the primary key(s). The checksum of each of the columns is then compared to see if an update has happened to the row. If a row has been updated from the source system, we will need to record this in the history table. In order to do this, we close out the previous record by way of effective dating.

Limiting the Pulls

During the production hours, it’s not a good idea to pull every record from the source system in order to compare to the destination. Therefore, we implement a method of running called an incremental refresh. The incremental refresh simply selects all the records from the table who have a change date greater than the time of the last successful pull. We also employ a full refresh, which can be run at the end of the night, which will allow for a complete comparison of source to destination.

Code


Happy Programming!!!

Post a Comment

Name: (Optional)
Email: (Optional, you can get an email if somebody replys your comments)*
Email me if somebody respons my comment below:
Details**:
Enter Text
as Below:
(case insensitive, if hard to read, click the "get a new one" button)
 
    
* Your email address will not be shared with any third parties for any reason.
** Maximum 1000 charactors.