Metadata based SSIS using Biml – Part 1

Metadata based SSIS using Biml – Part 1
Diesen Artikel finden Sie hier auch in deutscher Sprache.

One of the no brainers (among many others) to use Biml for, is the automation of simple data loads. To take full advantage of that functionality, you will need a solution to manage your metadata in order to “tell Biml” which data to retrieve.

In this first part, we will take a very simple approach by just defining one static data source and only having table names as our meta data. This may feel a bit like “all or nothing” – which it is – but it will give you a good first impression on how to talk to the Biml engine.

In a follow-up post, we will look into enhancing that solution by also filtering on columns as well as by defining multiple data sources which will basically allow you to automate any kind of “just copy my data from here to there” task.

But let’s get not ahead of ourselves and keep it simple for now…

As a baseline, we will need a source database (in this example, we will use AdventureWorks2014) as well as a target.

Our target will be a new database called “MySimpleBiml_Destination” containing one metadata table, which can be created (and populated with some sample data) using the following T-SQL:


CREATE TABLE [dbo].[MyBimlMeta_Tables](
	[TableName] [nvarchar](50) NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[MyBimlMeta_Tables]
SELECT NAME FROM AdventureWorks2014.dbo.sysobjects where name like 'Sales%' and type = 'U'
 

Once you went through the sample code, you can just play with the contents of that table – based on the script above, it will just load all tables from AdventureWorks whose name start with “Sales”.

For reasons of readability and also reusability, we will spread our solution across 4 logical Biml-Files as well as 1 code file.
The code file holds some global definitions and functions to make it a little bit easier for us – as this is very simple VB-code, I will not elaborate it any further. Still, if you have any comments or questions, please feel free to reach out.

This is our code file:

Imports Varigence.Biml.CoreLowerer.SchemaManagement
Imports Varigence.Biml.Extensions
Imports Varigence.Languages.Biml
Imports Varigence.Languages.Biml.Connection
Imports System.Data
Imports System.Collections.Generic

Public Class BF
	Public Shared TargetDB As String = "MySimpleBiml_Destination"
	Public Shared Function GetNonEmptyList (Conn As AstDbConnectionNode, SQL As String) As List(Of String)
		Dim tmplist As New List(Of String)
		If SQL.Contains(" ") = 0 Then sql = "select * from " + sql
		Dim DT As DataTable  = ExternalDataAccess.GetDataTable(Conn.ConnectionString, SQL)
		For Each dr As datarow In dt.rows
			tmplist.add(dr.item(0).ToString())
		Next			
		If tmplist.count = 0 Then tmplist.add ("NONEMPTYFILLER")
		Return tmplist		
	End Function
	Public Shared Function DefaultImportOptions
		Return ImportOptions.ExcludeIdentity Or ImportOptions.ExcludePrimaryKey  Or ImportOptions.ExcludeUniqueKey _ 
		Or ImportOptions.ExcludeColumnDefault Or ImportOptions.ExcludeIndex Or _ 
		ImportOptions.ExcludeCheckConstraint Or ImportOptions.ExcludeForeignKey
	End Function 	
End Class

Way more important than the code file are our Biml-files which are:

01_Environment.biml

In this file, we will just define our connections – to AdventureWorks as well as our staging database.

<#@ template language="VB" tier="1" #>
<#@ code file="../code/BimlFunctions.vb" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection Name="Target" ConnectionString="Provider=SQLNCLI11;Server=.;Initial Catalog=<#= BF.TargetDB #>;Integrated Security=SSPI;" />
		<OleDbConnection Name="Source" ConnectionString="Provider=SQLNCLI11;Server=.;Initial Catalog=AdventureWorks2014;Integrated Security=SSPI;"/>
	</Connections>
	<Databases>
		<Database Name="<#= BF.TargetDB #>" ConnectionName="Target" />
	</Databases>
	<Schemas>
		<Schema Name="dbo" DatabaseName="<#= BF.TargetDB #>" />
	</Schemas>
</Biml> 


02_BuildMeta.biml

This file will build up our table-metadata. To do so, we will loop through our MyBimlMeta_Tables-Table, get the column specifications for the affected tables from AdventureWorks and create a Table-Tag within our Biml code for it. As we will get all columns from those tables, we can simply use the GetDatabaseSchema-Function, filter it on the tables we want to use, then loop through its result and return its contents using the GetBiml-Function (GetDatabaseSchema accepts 3 parameters which can all or in part be NULL: a list of Schemas to filter on, a list of tables to filter on as well as ImportOptions to exclude indices etc. – we will pass a list of tables as well as a set of appropriate ImportOptions).

In addition, we will make use of an annotation – in case you haven’t used them, think of them as an extended property on a Biml-object – to store the SELECT-statement to be used against AdventureWorks to eventually retrieve the data and populate our staging tables.

<#@ template language="VB" tier="2" #>
<#@ code file="../code/BimlFunctions.vb" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Tables>
		<#  dim AW as astdbconnectionnode =  RootNode.Connections("Source")
			dim importResult as Importresults = AW.GetDatabaseSchema(nothing,BF.GetNonEmptyList(RootNode.Connections("Target"),"MyBimlMeta_Tables"), BF.DefaultImportOptions) 
			for each table as asttablenode in importResult.TableNodes #>
		<Table Name="AW_<#=table.Schema.Name#>_<#=table.Name#>" SchemaName="<#= BF.TargetDB #>.dbo">
			<Columns>
				<#= table.columns.getbiml #>
			</Columns>
			 <Annotations>
				<Annotation AnnotationType="Tag" Tag="SQLSelect">
					SELECT <#=table.GetColumnList()#> FROM <#=table.SchemaQualifiedName #>
				</Annotation>
			</Annotations>
		</Table>
		<#  next #>
	</Tables>
</Biml> 

03_Create_Staging.biml
Once we have defined our metadata, we need to create the corresponding tables within the staging database. To do so, we create an SSIS package, loop through the Biml-Metadata (which sits in Rootnode.Tables) and create an execute SQL Task for each table (which will be represented as an AstTableNode) in it which will make use of the GetDropAndCreateDDL-Function – this will take part of pretty much everything we need for this step.

<#@ template tier="3" language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Packages>
		<Package Name="01_CreateStaging" ConstraintMode="Parallel" PackageSubpath="MySimpleBiml"  ProtectionLevel="DontSaveSensitive">
			<Tasks>
				<# for each table as asttablenode in RootNode.Tables  #>
				<ExecuteSQL Name="Create <#=table.Name#>" ConnectionName="Target">
					<DirectInput>
						<#=table.GetDropAndCreateDdl()#>
					</DirectInput>
				</ExecuteSQL>
				<# next #>
			</Tasks>
		</Package>
	</Packages>
</Biml> 

Before proceeding to the next step, we will execute this package to make sure, the tables we want to populate our data to actually exist.

04_Populate_Staging.biml
Now that we have our target tables set up, we will create another SSIS package and loop through our metadata again, but instead of creating tables, we will populate them.
To accomplish this, we will create a sequence container for each table and create a truncate as well as a dataflow task within each container.

<#@ template tier="4" language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Packages>
		<Package Name="02_Populate Tables" ConstraintMode="Parallel"  ProtectionLevel="DontSaveSensitive" PackageSubpath="MySimpleBiml">
			<Tasks>
			<# for each table as asttablenode in RootNode.Tables  #>
				<Container Name="Transfer <#=table.Schema.Name#>_<#=table.Name#>" ConstraintMode="Linear">
					<Tasks>
						<ExecuteSQL Name="Truncate" ConnectionName="Target">
							<DirectInput>truncate table <#=table.ScopedName#></DirectInput>
						</ExecuteSQL>
						<Dataflow Name="Copy <#=table.Schema.Name#>_<#=table.Name#>">
							<Transformations>
								<OleDbSource Name="Get Rows" ConnectionName="Source">
									<DirectInput><#=table.GetTag("SQLSelect")#></DirectInput>
								</OleDbSource>
								<OleDbDestination Name="Set Rows" ConnectionName="Target">
									<TableOutput TableName="<#=table.ScopedName#>" />
								</OleDbDestination>
							</Transformations>
						</Dataflow>
					</Tasks>
				</Container>
			<# next #>
			</Tasks>
		</Package> 
	</Packages>
</Biml> 

Once you build and run that SSIS package, you will have all your target tables populated with the corresponding data from AdventureWorks.
Will this solve all your real-life problems and tasks regarding data loads? Certainly not – but I hope, you got a good impression, how simple it actually is to feed metadata into Biml and build a solution on top of it.

You liked this post? Then keep going with Part 2 right away!

Any questions or comments? We’d love to hear from you at biml@solisyon.de!

Further Information on Biml in English can be found on our English Biml Page.

Happy Biml’ing!

Facebook