Wednesday, December 16, 2015

Flat File Source

The Flat File source reads data from a text file. The text file can be in delimited, fixed width, or mixed format.
  • Delimited format uses column and row delimiters to define columns and rows.
  • Fixed width format uses width to define columns and rows. This format also includes a character for padding fields to their maximum width.
  • Ragged right or mixed format uses width to define all columns, except for the last column, which is delimited by the row delimiter.

The Flat File connection manager must be configured to interpret zero-length strings as nulls when using delimited format.



The output columns include the FastParse property. FastParse indicates whether the column uses the quicker, but locale-insensitive, fast parsing routines that Integration Services provides or the locale-sensitive standard parsing routines. By implementing fast parse, a package forfeits its ability to interpret date, time, and numeric data in locale-specific formats, but the package enhances its performance. For example, fast parse supports only the most commonly used date format representations such as YYYYMMDD and YYYY-MM-DD, does not perform locale-specific parsing, does not recognize special characters in currency data, and cannot convert hexadecimal or scientific representation of integers.

If the data flow in the package requires locale-sensitive parsing, standard parse is recommended instead of fast parse.

 

Output columns also include the UseBinaryFormat property. We use this property to implement support for binary data, such as data with the packed decimal format, in files. By default UseBinaryFormat is set to false. If we want to use a binary format, set UseBinaryFormat to true and the data type on the output column to DT_BYTES. When we do this, the Flat File source skips the data conversion and passes the data to the output column as is. We can then use a transformation such as the Derived Column or Data Conversion to cast the DT_BYTES data to a different data type, or we can write custom script in a Script transformation to interpret the data.



This source has one output and one error output.


Custom properties

FileNameColumnName: The name of an output column that contains the file name. Can be set by using the Advanced Editor.
RetainNulls: A value that specifies whether to retain Null values from the source file as Null values. The default value of this property is False.
FastParse: A value that indicates whether the column uses the quicker, but locale-insensitive, fast parsing routines that DTS provides or the locale-sensitive standard parsing routines. Can be set by using the Advanced Editor.


The following example uses Flat File Source. In this example, we will copy data from Flat File Source containing data from Employee table into Employee_new table. Before moving ahead with example, refer these earlier articles:

  • Create a new SSIS project using BIDS / SSDT

 

Let’s first TRUNCATE or DELETE data from Employee_new table. We want to copy data from Employee table into new flat file (.txt or Text file). This can be done by using Import and Export feature in SSMS. We can create Fixed width, Delimited and Ragged right flat file.
For our demonstration, we have copied data from Employee table to a new fixed width flat file, new comma delimited flat file and new ragged right flat file and renamed the files as FixedWidthTextSource, DelimitedTextSource and RaggedRightTextSource respectively.

   

Open SSISStudy / SSISStudy2012 project created earlier. Open Source Explorer and right click on SSIS Packages and select New SSIS Package and rename it to FlatFileSourceExample.


 

Drag and drop Data Flow Task on Control Flow area. Open Data Flow tab. Drag and drop Flat File Source on Data Flow area.

Rename the Flat File Source to Fixed Width.
Right click Fixed Width and select Edit to open Flat File Source Editor.
Click on New to open Flat File Connection Manager Editor window. This is used to create new Flat File Source Connection manager.
Click on Browse to open File Dialog. Browse to the Fixed Width flat file created earlier. Select and click Open.
We will not change the Locale or Code Page. Under Format, select Fixed Width. Based on the data in your Fixed Width flat file, if the file includes header columns provide the appropriate values in Header row delimiter, Header rows to skip and check mark Column names in the first data row.
Select Advanced. Here we need to specify the columns that would appear as output from source. Basically, we will help the Flat File Source to interpret the data in flat file. In this example, we have a total of 14 columns from Employee table in flat file. Click on New to create 14 columns.
Type FixedWidth under Connect manager name. Rename the columns. 


 

By default the columns will be of type string [DT_STR] with default length. We can use Suggest Types to suggest possible column types. But do not completely rely on it. Compare against the schema of the Employee table to get the idea about column width and type. Change InputColumnWidth and DataType properties for each Column accordingly. Also OutputColumnWidth if enabled, make sure the value is equal or more than InputColumnWidth value.  
Click on Preview and make sure the data from flat file appears in correct expected columns. If any column data appears split across columns, change column width in Advanced. Click OK.
Flat File Source output columns will be by default of type string [DT_STR]. To avoid any conversion or truncation errors, right click Flat File Source and select Show Advanced Editor. Select Input and Output Properties tab. Under Inputs and outputs, expand Flat File Source Output, Output Columns and update DataType property for each output column. The conversion should be to corresponding destination column type. For example, EmployeeID column in destination table is of type Int. So we have converted the column datatype to four-byte signed integer (DT_I4). Click OK.
Drag and drop OLE DB Destination. Connect Flat File Source output to OLE DB Destination. Configure OLE DB Destination to map columns from source to Employee_new table.
Execute the package. We should see data copied from Fixed Width Flat File Source to Employee_New table. 


Now let’s see an example of Comma Delimited Flat File Source. Truncate table Employee_New. Drag and drop Flat File Source. Right click and edit. Select New to create flat file connection manager. Browse to open Comma delimited file created earlier. Under Format, select Delimited. If any text qualifier like “” used in the source file, specify the same under Text qualifier. Specify, Header row delimiter and Header rows to skip according to the source file content. Check mark, Column names in the first data row if column names appear as first row in the source file.
Since we have check marked against Column names in the first data row, Column names from source file appear under Columns and Advanced with default DataType and OutputColumnWidth values.
Either change column type and width under DataType and OutputColumnWidth or do so under Advanced Editor for Flat File Source.
Drag and drop OLE DB Destination. Connect Source to destination and execute the package.
We should see data from Comma-Delimited Flat File copied to Employee_New table.



Now let’s see an example of Ragged Right flat file source. Truncate table Employee_New. Drag and drop Flat File Source. Right click and edit. Select New to create flat file connection manager. Browse to open Ragged Right file created earlier. Under Format, select Ragged right. Specify, Header row delimiter and Header rows to skip according to the source file content. Check mark, Column names in the first data row if column names appear as first row in the source file.
Repeat the steps specified under Fixed Width flat file source explanation.

 

Click for NEXT article in the sequence.

 

Please feel free to correct me by commenting your suggestions and feedback.

No comments:

Post a Comment