Tuesday, December 22, 2015

Raw File Source

The Raw File source reads raw data from a file.

Since the data is native to the source and requires no translation and almost no parsing. This means that the Raw File source can read data more quickly than other sources like the Flat File and the OLE DB.

The Raw File source is used to retrieve raw data that was previously written by the Raw File destination.


We configure the Raw File by specifying the name of the file that the source reads. This source does not use a connection manager.

 

This source has one output. It does not support an error output.

 

Custom properties

All properties are read/write.

  • AccessMode: Integer (enumeration) - The mode used to access the raw data. The possible values are File name (0) and File name from variable (1). The default value is File name (0).

  • FileName: String - The path and file name of the source file.

 

The following example uses RAW FILE source. In this example, like earlier we will copy data Raw file containing data from Employee table into Employee_new table. Before moving ahead with example, refer these earlier articles:


 Let’s first TRUNCATE or DELETE data from Employee_new table.

 

Open SSISStudy / SSISStudy2012 project created earlier. Open Source Explorer and right click on SSIS Packages and select New SSIS Package and rename it to RawFileSourceExample.

 

Before we use Raw File Source, we need to create Raw File using Raw File Destination. For this, drag and drop Data Flow Task on Control Flow area. Open Data Flow tab, drag and drop OLE DB Source on the Data Flow area. Configure to read data from HumanResources.Employee table. Drag and drop Raw File Destination on the Data Flow area.

  Connect OLE DB Source to Raw File Destination.

  Right click on Raw File Destination and select Edit to open Raw File Destination Editor. Set Access mode as File name, click on Browse button and select the location where the raw file needs to be created and provide file name and under Write option, keep the default Create Always.

Select Columns and check mark all the columns required as Raw File columns.

Click OK. Execute the package to create raw file RawFileSource. Check if raw file RawFileSource with data is created at the specified location.

 

Now returning back, drag and drop another Data Flow Task. Connect the output of the first Data Flow to this new Data Flow. For convenience and understanding, we have renamed the first Data Flow as To Raw File and current Data Flow as From Raw File.

  Open From Raw File and drag and drop Raw File Source on Data Flow area.

  Right click and select Edit to open Raw File Source Editor. Under Access mode, select File name to specify the file directly or File name from variable to read file path saved in variable. For this example, we have selected File name and selected the file created earlier using Browse button.

Select Columns and check mark all the columns required to be copied to destination table Employee_new.

Click OK.

 

Drag and drop OLE DB Destination on the Data Flow area and connect Raw File Source output to OLE DB Destination. Configure OLE DB Destination to copy data to Employee_new table.

Execute the package and it should run provided the file is available at mentioned location.

 

What if the file is deleted by someone who used our system after we went away for some much needed break? Let’s try that by deleting the RawFileSource file and then execute the package. We should run into error. On closer look, we should find the location and reason for the error.

We have highlighted the location “Error at From Raw File [Raw File Source[1]]” and reason “the file is not found.”

   But our setup first creates the file and then we read from the file. So there should not be any error, right? The Data Flow Compiler does not think that way. It sees that before package execution, the file does not exist and hence throws the error. To overcome this, we need to set the DelayValidation property of Data Flow(From Raw File) to True. This signals the compiler that we are sure to create the file before it is read and the compiler does not need to signal this as validation error.

Right click Data Flow(From Raw File) and select Properties. Set DelayValidation to True and then execute the package.

 

Click for NEXT article in the sequence.



Please feel free to correct me by commenting your suggestions and feedback.

No comments:

Post a Comment