For Each Loops with Biml over Flat Files

For Each Loops with Biml over Flat Files

Diesen Artikel finden Sie hier auch in deutscher Sprache.

If you’ve read our post on working with flat files, you were probably wondering: But what if I have multiple files using the same format? This is a very typical use case when working with flat files so let’s take a look at how this can be done.

As always, we need some preparation.

First, download our sample data and extract it to C:\Flatfiles\ForEachLoop.

Then, create a target table to load that data into (we will be using the MyBiml_Flatfiles database again in our sample code):

CREATE TABLE [dbo].[FF_Inventory](
	[Warehouse] [nvarchar](50) NULL,
	[Item] [nvarchar](50) NULL,
	[Inventory] [bigint] NULL,
	[Filename] [nvarchar](35) NULL
) ON [PRIMARY] 

Then, as the last step of preparing for the actual fun, set up your Biml environment by defining the flat file format as well as the flat file connection and the link to your target database:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<FileFormats>
        <FlatFileFormat Name="MyFlatFile" CodePage="1252" ColumnNamesInFirstDataRow="true" IsUnicode="false">
    		<Columns>
        		<Column Name="Warehouse" DataType="String" Length="50" Delimiter="Semicolon" />
        		<Column Name="Item" DataType="String" Length="50" Delimiter="Semicolon" />
        		<Column Name="Inventory" DataType="Int64" Delimiter="CRLF" />
    		</Columns>
		</FlatFileFormat>	
	</FileFormats>
    <Connections>
        <Connection Name="Target" ConnectionString="Data Source=.;Initial Catalog=MyBiml_Flatfiles;Provider=SQLNCLI11.1;Integrated Security=SSPI;" />
		<FlatFileConnection Name="MyFlatFile" FilePath="empty" FileFormat="MyFlatFile" />
    </Connections>
</Biml>

Now that we are prepared, what is it, that we want to achieve?

We want to loop through a directory and import every CSV file in there into our FF_Inventory table. In addition to the data in the file, we also want to save the file name.

To get that done in SSIS, we need:

– A variable to store the current file name and will be used as the connection string for our flat file connection
– A for each loop that will loop through our directory, pass each file name in there to our variable and then run a data flow task
– A derived column in our data flow task that will store our variable so we can import it into our SQL table
– Since we want this to be "clean", we will also run a truncate before the for each loop

So let’s look at the code for that:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package ProtectionLevel="EncryptSensitiveWithUserKey" Name="For_Each_Loop" ConstraintMode="Linear">
            <Variables>
                <Variable DataType="String" Name="MyFileName"/>
            </Variables>
            <Connections>
                <Connection ConnectionName="MyFlatFile">
                    <Expressions>
                        <Expression ExternalProperty="ConnectionString">@[User::MyFileName]</Expression>
                    </Expressions>
                </Connection>
            </Connections>
            <Tasks>
                <ExecuteSQL Name="Truncate" ConnectionName="Target">
                    <DirectInput>truncate table ff_inventory</DirectInput>
                </ExecuteSQL>
                <ForEachFileLoop Name="Loop through CSV" Folder="C:\Flatfiles\ForEachLoop" FileSpecification="*.csv">
                    <VariableMappings>
                        <VariableMapping Name="0" VariableName="User.MyFileName" />
                    </VariableMappings>
                    <Tasks>
                        <Dataflow Name="Load all files in directory">
                            <Transformations>
                                <FlatFileSource Name="SRC" ConnectionName="MyFlatFile" />
                                <DerivedColumns Name="Filename">
                                    <InputPath OutputPathName="SRC.Output" />
                                    <Columns>
                                        <Column Name="Filename" DataType="String" Length="35">@[User::MyFileName]</Column>
                                    </Columns>
                                </DerivedColumns>
                                <OleDbDestination Name="Target" ConnectionName="Target">
                                    <InputPath OutputPathName="Filename.Output" />
                                    <ExternalTableOutput Table="[dbo].[FF_Inventory]" />
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                    </Tasks>
                </ForEachFileLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

As you can see, we:

– define the variable
– assign that variable to the flat file connection
– run our truncate
– then define our for each loop with a directory to loop through, a variable mapping and the data flow task including the derived column

Of course, ideally you should put that into some code that accepts parameters and use CallBimlScript to make the whole exercise re-usable!

Any questions or comments? We’d love to hear from you at biml@solisyon.de!

Further Information on Biml in English can be found on our English Biml Page.

Happy Biml’ing!

Facebook