Sunday, November 29, 2015
No of Data Flow Task executing simultaneously based on the number of Processor and Parallelization setting
As mentioned earlier, we are experimenting on 4 processors system. If we set MaxConcurrentExecutables to -1, we will be using all the 4 processors in this system.
If we set MaxConcurrentExecutables to value greater than 4, Control Flow Engine will try to meet the request for concurrent executions but this again depends on the processor availability and system load. Truncated table dbo.Employee before executing the package with the changes mentioned below.
If we set MaxConcurrentExecutables to value less than 4, we will be using that number of processors simultaneously. Truncated table dbo.Employee before executing the package with the changes mentioned below.
Thus we can conclude that the number of processors and setting of MaxConcurrentExecutables plays a role in Control Flow Tasks parallelization.
Saturday, November 28, 2015
T-SQL 2012 Interview questions
Q: Given two tables T1 and T2 with same structure. If T1 has 20 records and T2 has 35 records, then the Full Join should result in how many records?
A: 50 records.
Q: What is difference between Equi-Join and NonEqui-Join?
A: If all columns in the ON clause are matched on equality ( using operator = ) the join is called "Equi-Join". If one or more columns are using an inequality comparison (using operators like <>, >, >=, <, <=, != ) the join is called "Nonequi-Join".
Q: Given table Country with one column with values India, Australia and England in specified order. Write a T-SQL query to display knockout matches between teams in the specified order:
India vs Australia
India vs England
England vs Australia
Q: Given table T1 with ID column and row values 3, 10, 19, 20. Find missing values starting with the minimum and maximum from table.
Please leave your suggestions and feedback by commenting
Develop a SSIS package to fetch only Excel files from given folder and read through multiple sheets in each file.
The requirement is to fetch only *.xls/*.xlsx files from given folder. Each file is expected to have random number of sheets but with same metadata i.e. type and number of columns. Read through each of such Worksheet in each file and copy data to common table.
This can be achieved using ForEach Loop Container, OLE DB Source(To read data from Excel 2007 and onwards)/Excel Source(To read data from Excel 97-2003) and any compatible destination in SSIS.
This example explains steps to read from *.xlsx file with some notes to use *.xls file.
Drag and drop another Foreach Loop Container into existing Foreach Loop Container and rename each container according to its purpose.
In the Connection Managers area, right click and select New ADO.NET Connection...
Select New button under Configure ADO.NET Connection Manager. Under Provider, select Microsoft Office 12.0 Access Database Engine OLE DB Provider and click OK.
Under Connection Manager window, select All. Under Extended Properties, type:
EXCEL 12.0; HDR=YES; - For Excel 2007 and onwards
EXCEL 8.0; HDR=YES; - For Excel 97-2003
Click OK.
Click OK on Configure ADO.NET Connection Manager window.
Right click on created ADO.NET Connection Manager and select Properties.
Selecting Expressions property, click ellipsis button to open Property Expressions Editor window.
Select ConnectionString under Property and click on ellipsis button under Expressions corresponding to ConnectionString property row.
Opens Expression Builder window. Type following under Expression:
"Data
Source=" + REPLACE(@[User::ExcelFilePath], "\\\\",
"\\") + ";Provider=Microsoft.ACE.OLEDB.12.0;Extended
Properties=" + "\""+"EXCEL 12.0;HDR=YES;"+
"\"" + ";"
This expression sets the Data Source to the excel file path from variable ExcelFilePath replacing double backslash with single backslash. Further it sets the Provider and Extended Properties.
Evaluate Expression and click OK.
Click OK. Note the ADO.NET Connection Manager icon changes to reflect use of expression(fx).
Edit nested Foreach Loop Container. Selecting Foreach
ADO.NET Schema Rowset Enumerator, select ADO.NET Connection Manager created.
Under Schema, select Tables.
Just to note, click on Set Restrictions. We see that Table_Name is listed 3rd and for this example, it provides us with SheetNames of the excel file.
Do not check mark on any options here and click Cancel.
Select Variable Mappings.
Create new variable SheetName.
Under Index corresponding to this variable, specify 2 to get TABLE_NAME(WorkSheets for this example).
Click OK.
Add breakpoints on internal Foreach Loop Container for each
iteration. Execute the package. Add Watch on SheetName variable to see the different sheets in excel file.
We can use OLE DB Source/Excel Source to read from these excel sheets based on Excel file version.
If you see package execution failure with following message in Progress tab:
Error: The GetEnumerator method of the ForEach Enumerator has failed with error 0x80131509 "(null)". This occurs when the ForEach Enumerator cannot enumerate.
Rectify this by setting Run64BitRuntime to False under SSIS Project Properties -> Configuration Properties -> Debugging.
Please feel to correct me by commenting your feedback and suggestions.
Subscribe to:
Posts (Atom)