Metadata based SSIS using Biml – Part 2

Metadata based SSIS using Biml – Part 2
Diesen Artikel finden Sie hier auch in deutscher Sprache.

So you went through part 1 and thought it was nice, but not overly exciting? Let’s try again!

The main weaknesses of our original solution were:

– It only supported one datasource
– It did not allow any filtering on columns, so effectively it was always getting the full table

To adress these 2 issues, we will need to modify our metdata layer. Let’s do that in a new database called „MyBiml_Destination“.

Let’s create our tables and populate them again with some sample data:

USE [MyBiml_Destination]
GO
CREATE TABLE [dbo].[MyBimlMeta_Connections](
	[ConnectionName] [nvarchar](50) NULL,
	[ServerName] [nvarchar](50) NULL,
	[DatabaseName] [nvarchar](50) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[MyBimlMeta_Tables](
	[ConnectionName] [nvarchar](50) NULL,
	[SchemaName] [nvarchar](50) NULL,
	[TableName] [nvarchar](50) NULL,
	[Columns] [nvarchar](max) NULL,
	[Disabled] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
INSERT INTO [MyBimlMeta_Connections] values ('AW','localhost','AdventureWorks2014')
INSERT INTO [MyBimlMeta_Tables] values ('AW','Person','Person','*',0)
INSERT INTO [MyBimlMeta_Tables] values ('AW','Person','PersonPhone','PhoneNumber',0) 
 

This will load all columns from Person.Person as well as the PhoneNumber column from Person.PersonPhone.

Also, our codefile will need some minor changes:

Imports Varigence.Biml.CoreLowerer.SchemaManagement
Imports Varigence.Biml.Extensions
Imports Varigence.Languages.Biml
Imports Varigence.Languages.Biml.Connection
Imports System.Data
Imports System.Collections.Generic

Public Class BF
	Public Shared TargetDB As String = "MyBiml_Destination"
	Public Shared TargetConnString As String = "Provider=SQLNCLI11;Server=localhost;Initial Catalog="+ TargetDB + ";Integrated Security=SSPI;" 
	Public Shared columns As String
	Public Shared tblname As String
	Public Shared sql As String
	Public Shared ImportColumns As New List(Of String) 
	Public Shared Function GetTableID (Conn As String, Schema As String, Tbl As String) As String
		Dim tblname As String
		If Schema.toupper = "DBO" Then
			tblname =  (conn+"_" + Tbl).toupper
		Else
			tblname =  (conn+"_" +Schema+"_" +tbl).toupper
		End If
		Return tblname
	End Function
	
	Public Shared Function GetNonEmptyList (Conn As AstDbConnectionNode, SQL As String) As List(Of String)
		Dim tmplist As New List(Of String)		
		Dim DT As DataTable  = ExternalDataAccess.GetDataTable(Conn.ConnectionString,  SQL)
		For Each dr As datarow In dt.rows
			tmplist.add(dr.item(0).ToString())
		Next			
		If tmplist.count = 0 Then tmplist.add ("NONEMPTYSPACER")
		Return tmplist		
	End Function
	Public Shared Function DefaultImportOptions
        Return ImportOptions.ExcludeIdentity Or ImportOptions.ExcludePrimaryKey  Or ImportOptions.ExcludeUniqueKey _ 
        Or ImportOptions.ExcludeColumnDefault Or ImportOptions.ExcludeIndex Or _ 
        ImportOptions.ExcludeCheckConstraint Or ImportOptions.ExcludeForeignKey
    End Function  
End Class

Now let’s look at the necessary changes to our Biml-files:


01_Environment.biml

Instead of using one static destination, we will loop through our connections table and flag all entries there with an annotation IsSource=True.
This will give us the flexibility to maintain as many connections through our metadata-layer as we need.

 <#@ template language="VB" #>
<#@ import namespace="System.Data" #>
<#@ code file="../code/BimlFunctions.vb" #>
<#@ import namespace="BF" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="Target" ConnectionString="<#= TargetConnString #>" >
            <Annotations>
                <Annotation Tag="DBString"><#= TargetDB #>.dbo.</Annotation>            
            </Annotations>
        </OleDbConnection>
         <# for each row as datarow in ExternalDataAccess.GetDataTable(TargetConnString, "SELECT  ConnectionName, ServerName, DatabaseName  FROM [MyBimlMeta_Connections] where connectionname <> 'target' order by connectionname").rows #>
        <OleDbConnection Name="<#= row.item(0) #>" ConnectionString="Provider=SQLNCLI11;Server=<#= row.item(1) #>;Initial Catalog=<#= row.item(2) #>;Integrated Security=SSPI;">
            <Annotations>
                <Annotation Tag="IsSource">True</Annotation>
            </Annotations>
        </OleDbConnection>
        <#next  #>
    </Connections>
    <Databases>
        <Database Name="<#= TargetDB #>" ConnectionName="Target" />
    </Databases>
    <Schemas>
        <Schema Name="dbo" DatabaseName="<#= TargetDB #>" />
    </Schemas>
</Biml>

02_BuildMeta.biml

This file is the one with the most important changes. We will now:

– Loop through all our connections that are flagged as a source
– Within each connection, get the database schema, filtered on the tables defined in MyBimlMeta_Tables (for that connection)
– Query the source database for each table, using the defined columns from MyBimlMeta_Tables
– Add a table-tag for each table and then a column-tag for each column, that came back during that process

This will result in a filtered view containing only tables and columns that are specified in our metdata – unless the columns are defined as „*“ in which case we will still get all columns.

<#@ template language="VB" tier="2" #>
<#@ import namespace="System.Data" #>
<#@ code file="../code/BimlFunctions.vb" #>
<#@ import namespace="BF" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Tables>
		<# for each SrcConn as AstDbConnectionNode in RootNode.Connections.where(function(c) c.getTag("IsSource") = "True")
		   sql = "SELECT  [TableName]  FROM [MyBimlMeta_Tables] where ConnectionName = '" + srcconn.name +  "' and isnull(disabled,0) = 0"
		   dim importResult as Importresults = srcconn.GetDatabaseSchema(nothing,GetNonEmptyList(rootnode.connections("Target"),sql),DefaultImportOptions) 
		   for each table as asttablenode  in importResult.TableNodes 
		   tblname = GetTableID(srcconn.name,table.Schema.Name,table.Name) #>
		<Table Name="<#= tblname#>" SchemaName="<#= rootnode.schemas(0).Scopedname #>">
			<Columns>
				<# sql = "SELECT Columns FROM [MyBimlMeta_Tables] where ConnectionName = '" + srcconn.name +  "' and tablename = '" + table.name + "' and schemaname = '" + table.Schema.Name + "' and isnull(disabled,0) = 0"  
				   columns = ExternalDataAccess.GetDataTable(TargetConnString, sql).rows(0).item(0)
				   ImportColumns.clear				   
				   for each dc as datacolumn in ExternalDataAccess.GetDataTable(srcconn.connectionstring,"SELECT TOP 0 " + columns + " from " + table.SchemaQualifiedName).Columns
				   	importcolumns.add ( dc.Caption.ToUpper  )
				   next 				   
				   for each column as AstTableColumnBaseNode in table.Columns.where(function(c) importcolumns.IndexOf(c.name.toupper) <> -1) #>
				<#=column.GetBiml()#>
				<#  next #>
			</Columns>
			<Annotations>
				<Annotation AnnotationType="Tag" Tag="SourceSchemaQualifiedName">
					<#=table.SchemaQualifiedName#>
				</Annotation>
				<Annotation AnnotationType="Tag" Tag="Connection">
					<#=srcconn.name#>
				</Annotation>
			</Annotations>
		</Table>
		<# next 
        next#>
	</Tables>
</Biml> 

03_Create_Staging.biml

This file will not see any changes at all compared to our simple approach – except for the package subpath.

<#@ template tier="2" language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="01_CreateStaging" ConstraintMode="Parallel"  ProtectionLevel="EncryptSensitiveWithUserKey" PackageSubpath="MyBiml">
            <Tasks>
                <# for each table as asttablenode in RootNode.Tables  #>
                <ExecuteSQL Name="Create <#=table.Name#>" ConnectionName="Target">
                    <DirectInput>
                        <#=table.GetDropAndCreateDdl()#>
					</DirectInput>
                </ExecuteSQL>
                <# next #>   
            </Tasks>
        </Package>
    </Packages>
</Biml>

04_Populate_Staging.biml

This file will remain almost unchanged as well. The only major modification is, that we will point each dataflow to the corresponding Connection, instead of a static link to „Source“.

<#@ template tier="4" language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="02_Populate Tables" ConstraintMode="Parallel"  ProtectionLevel="EncryptSensitiveWithUserKey" PackageSubpath="MyBiml">
            <Tasks>
                <# for each table as asttablenode in rootnode.tables #>
                <Container Name="Transfer <#=table.Name#>" ConstraintMode="Linear">
                    <Tasks>
                        <ExecuteSQL Name="Truncate <#=table.Name#>" ConnectionName="Target">
                            <DirectInput>truncate table <#=table.name#></DirectInput>
                        </ExecuteSQL>
                        <Dataflow Name="Copy <#=table.Name#>">
                            <Transformations>
                                <OleDbSource Name="Retrieve Rows" ConnectionName="<#=table.GetTag("Connection").trim#>">
                                    <DirectInput>SELECT <#=table.GetColumnList()#> FROM <#=table.GetTag("SourceSchemaQualifiedName")#></DirectInput>
                                </OleDbSource>
                                <OleDbDestination Name="Load Rows" ConnectionName="Target">
                                    <TableOutput TableName="<#=table.ScopedName#>" />
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                    </Tasks>
                </Container>
                <# next #>
            </Tasks>
        </Package>
    </Packages>
</Biml>   

Compared to the basic approach in part 1, this took very little extra effort and yet it made our solution way more flexible.

And again, this is just the beginning – you could enhance that solution by adding:

CallBimlScript or Include Files for easier readability
incremental table creation
house keeping
index management
– meta data for sub-packages instead of loading all tables at once AND SO MUCH MORE! 🙂

Any questions or comments? We’d love to hear from you at biml@solisyon.de!

Further Information on Biml in English can be found on our English Biml Page.

Happy Biml’ing!

Facebook