Loading tables based on foreign key topology using Biml – Part 2: Parallel loads

Loading tables based on foreign key topology using Biml – Part 2: Parallel loads
Diesen Artikel finden Sie hier auch in deutscher Sprache.

This post is the second part of "Loading tables based on foreign key topology using Biml – Part 1: Linear loads" and should be regarded as an add-on. Therefore, we recommend to read part 1 first, if you haven’t done so yet.

After the first excitment about how easy it actually was to take care of that topology, you might ask yourself: Why does it have to run linear? That takes way too long. And you’re right – and it doesn’t have to.

All we need to do is:

– Create a list of all the tables that we’ve already loaded (which will be empty at that point)
– Identify all tables that do not reference any other tables
– Load these tables, each followed by all tables that only reference this single table – recursively and add them to list of loaded tables
– Once that is done, load all tables that are referencing multiple tables where all required tables have been loaded before – and again, add them to the list
– Repeat this until no table is left to load (or for a maximum of 10 times in this example)
– If, for whichever reason, any tables are left, load them sequentially using the TopoSort function:

<#@ template tier="4" language="VB"#> 
<#@ code file="../Code/TopologySort.vb" #>
<#@ import namespace="TopologySort" #>
 <Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="03_Load_Parallel" ConstraintMode="Linear" PackageSubpath="<#= OutputPath #>">
            <Tasks>
                <ExecutePackage Name="Recreate Tables">
                    <ExternalFile ExternalFilePath="<#= OutputPath #>\01_Create.dtsx"/>
                </ExecutePackage>
                <Container Name="Load tables without reference and immediate descendants" ConstraintMode="Parallel">
                    <Tasks>
                        <# for each t as asttablenode in rootnode.tables.where(function(r) r.columns.OfType(Of AstTableColumnTableReferenceNode)().Count+r.columns.OfType(Of AstMultipleColumnTableReferenceNode)().Count = 0) #>
                        <Container Name="Load <#= t.name #>" ConstraintMode="Linear">
                            <Tasks>
                                <#= CallBimlScript("05_Recursion.biml",t,LoadedTables) #>
                            </Tasks>
                        </Container>
                        <# next #>
                    </Tasks>
                </Container>
                <# do while rootnode.tables.where(function(e) not loadedtables.contains(e.name) and not e.columns.OfType(Of AstMultipleColumnTableReferenceNode).Where(function(c) c.foreigntable.name <> e.name and not loadedtables.contains(c.foreigntable.name) ).count > 0 and not e.columns.OfType(Of AstTableColumnTableReferenceNode).Where(function(m) m.foreigntable.name <> e.name and not loadedtables.contains(m.foreigntable.name) ).count > 0).Any and level < 10
				Level += 1
				loadabletables = rootnode.tables.where(function(e) not loadedtables.contains(e.name) and not e.columns.OfType(Of AstMultipleColumnTableReferenceNode).Where(function(c) c.foreigntable.name <> e.name and not loadedtables.contains(c.foreigntable.name) ).count > 0 and not e.columns.OfType(Of AstTableColumnTableReferenceNode).Where(function(m) m.foreigntable.name <> e.name and not loadedtables.contains(m.foreigntable.name) ).count > 0).ToList #>
                <Container Name="Load possible tables - Level <#= Level #>" ConstraintMode="Parallel">
                    <Tasks>
                        <# for each tbl as asttablenode in loadabletables #>
                        <#= CallBimlScript("06_Dataflow.biml",tbl,loadedtables) #>
                        <# next #>
                    </Tasks>
                </Container>
                <# loop 
				 if  loadedtables.count < rootnode.tables.count then #>
                <Container Name="Load leftover tables" ConstraintMode="Linear">
                    <Tasks>
                        <# for each t as asttablenode in rootnode.tables.TopoSort.where(function(r) not loadedtables.contains(r.name)) #>
                        <#= CallBimlScript("06_Dataflow.biml",t,LoadedTables) #>
                        <# next #>
                    </Tasks>
                </Container>
                <# end if #>
            </Tasks>
        </Package>
    </Packages>
</Biml> 

The first step, which loads all tables which do not reference other tables as well as their descendants is actually not necessary – you could remove it and still, the logic would pick up all tables (just creating a couple more steps in the second loop). The reason I built that into this solution is to show you another pretty cool thing: You can actually use CallBimlScript within a called Biml script. You can even call the same script again – which will result in the desired effect of recursion:

<#@ template language="VB" designerbimlpath="Biml/Packages/Package/Tasks" #>
<#@ property name="tbl" type="AstTableNode" #> 
<#@ property name="LoadedTables" type="List (of String)" #> 
<#= CallBimlScript("06_Dataflow.biml",tbl,LoadedTables) #>
<# for each t as asttablenode in rootnode.tables.where(function(r) r.columns.OfType(Of AstTableColumnTableReferenceNode)().Count+r.columns.OfType(Of AstMultipleColumnTableReferenceNode)().Count  =1).where(function(e) e.columns.OfType(Of AstTableColumnTableReferenceNode)().first.foreigntable.name  = tbl.name )  #>
    <#= CallBimlScript("05_Recursion.biml",t,LoadedTables) #>
<#next #>

Depending on your machine and data, this load should be much faster that the sequential load from the first example – still ensuring our data’s integrity!

Any questions or comments? We’d love to hear from you at biml@solisyon.de!

Further Information on Biml in English can be found on our English Biml Page.

Happy Biml’ing!

Facebook