How to import text files with the same name and schema but different directories into a database?
I need to import multiple txt files with the same names and the same schema into the same table in a SQL Server 2008 database. The problem is that they are all in different directories:
TEST
    201304
        sample1.txt
        sample2.txt
    201305
        sample1.txt
        sample2.txt
    201306
        sample1.txt
        sample2.txt
Is there any way in SSIS that I can set this up?
Yes. You will want to use a Foreach Loop Container with the Foreach File Enumerator and then check the Traverse subfolders option.
Edit
Apparently my answer wasn't cromulent enough, so please accept this working code which illustrates what my brief original answer stated.
Source data
I created 3 folders as described above to contain files sample1.txt and sample2.txt:
C:\>MKDIR SSISDATA\SO\TEST\201304
C:\>MKDIR SSISDATA\SO\TEST\201305
C:\>MKDIR SSISDATA\SO\TEST\201306
The contents of the file are below. In each copy of the file in each folder, the ID value is incremented and the text value is altered to prove that the package has picked up a new file.
ID,value
1,ABC
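If you would rather not create the six files by hand, a small script can build the test tree. This is only a sketch in Python, not part of the SSIS solution: the function name create_sample_tree and the text values after the first file (VAL2, VAL3, ...) are made up for illustration, and it assumes the first file carries the 1,ABC row shown above.

```python
import tempfile
from pathlib import Path

# Folder and file names come from the question; the row values past "1,ABC"
# are hypothetical and only serve to make each file distinguishable.
FOLDERS = ["201304", "201305", "201306"]
FILES = ["sample1.txt", "sample2.txt"]


def create_sample_tree(root):
    """Create TEST/<yyyymm>/sampleN.txt under root and return the paths written."""
    written = []
    row_id = 1
    for folder in FOLDERS:
        for name in FILES:
            path = Path(root) / "TEST" / folder / name
            path.parent.mkdir(parents=True, exist_ok=True)
            # Assumption: the first file holds the sample row 1,ABC; every
            # later file gets the next ID and a different text value.
            value = "ABC" if row_id == 1 else f"VAL{row_id}"
            # Write bytes so the line endings are CRLF on every platform,
            # matching RowDelimiter="CRLF" and CodePage 1252 in the BIML.
            path.write_bytes(f"ID,value\r\n{row_id},{value}\r\n".encode("cp1252"))
            written.append(path)
            row_id += 1
    return written


if __name__ == "__main__":
    # Demo in a throwaway directory; pass a real root folder to keep the files.
    with tempfile.TemporaryDirectory() as tmp:
        for p in create_sample_tree(tmp):
            print(p.relative_to(tmp), p.read_bytes())
```

Calling create_sample_tree(r"C:\SSISDATA\SO") would place the files under the TEST folder used in the rest of this answer.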
Package generation
This part assumes you have BIDS Helper installed. It is not required for the solution; it simply provides a common framework that future readers can use to reproduce it.
I created a BIML file with the following content. Even though the package includes a create-table step, I needed to run that statement on the target server before generating the package.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<!-- Create a basic flat file source definition -->
<FileFormats>
<FlatFileFormat
Name="FFFSrc"
CodePage="1252"
RowDelimiter="CRLF"
IsUnicode="false"
FlatFileType="Delimited"
ColumnNamesInFirstDataRow="true"
>
<Columns>
<Column
Name="ID"
DataType="Int32"
Delimiter=","
ColumnType="Delimited"
/>
<Column
Name="value"
DataType="AnsiString"
Delimiter="CRLF"
InputLength="20"
MaximumWidth="20"
Length="20"
CodePage="1252"
ColumnType="Delimited"
/>
</Columns>
</FlatFileFormat>
</FileFormats>
<!-- Create a connection that uses the flat file format defined above-->
<Connections>
<FlatFileConnection
Name="FFSrc"
FileFormat="FFFSrc"
FilePath="C:\ssisdata\so\TEST\201306\sample1.txt"
DelayValidation="true"
/>
<OleDbConnection
Name="tempdb"
ConnectionString="Data Source=localhost\dev2012;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
/>
</Connections>
<!-- Create a package to illustrate how to apply an expression on the Connection Manager -->
<Packages>
<Package
Name="so_19957451"
ConstraintMode="Linear"
>
<Connections>
<Connection ConnectionName="tempdb"/>
<Connection ConnectionName="FFSrc">
<Expressions>
<!-- Assign a variable to the ConnectionString property.
The syntax for this is ConnectionManagerName.Property -->
<Expression PropertyName="FFSrc.ConnectionString">@[User::CurrentFileName]</Expression>
</Expressions>
</Connection>
</Connections>
<!-- Create a single variable that points to the current file -->
<Variables>
<Variable Name="CurrentFileName" DataType="String">C:\ssisdata\so\TEST\201306\sample1.txt</Variable>
<Variable Name="FileMask" DataType="String">*.txt</Variable>
<Variable Name="SourceFolder" DataType="String">C:\ssisdata\so\TEST</Variable>
<Variable Name="RowCountInput" DataType="Int32">0</Variable>
<Variable Name="TargetTable" DataType="String">[dbo].[so_19957451]</Variable>
</Variables>
<!-- Add a foreach file enumerator. Use the above -->
<Tasks>
<ExecuteSQL
Name="SQL Create Table"
ConnectionName="tempdb">
<DirectInput>
IF NOT EXISTS (SELECT * FROM sys.tables T WHERE T.name = 'so_19957451' and T.schema_id = schema_id('dbo'))
BEGIN
CREATE TABLE dbo.so_19957451(ID int NOT NULL, value varchar(20) NOT NULL);
END
</DirectInput>
</ExecuteSQL>
<ForEachFileLoop
Name="FELC Consume files"
FileSpecification="*.txt"
ProcessSubfolders="true"
RetrieveFileNameFormat="FullyQualified"
Folder="C:\"
ConstraintMode="Linear"
>
<!-- Define the expressions to make the input folder and the file mask
driven by variable values -->
<Expressions>
<Expression PropertyName="Directory">@[User::SourceFolder]</Expression>
<Expression PropertyName="FileSpec">@[User::FileMask]</Expression>
</Expressions>
<VariableMappings>
<!-- Notice that we use the convention of User.Variable name here -->
<VariableMapping
Name="0"
VariableName="User.CurrentFileName"
/>
</VariableMappings>
<Tasks>
<Dataflow Name="DFT Import file" DelayValidation="true">
<Transformations>
<FlatFileSource Name="FFS Sample" ConnectionName="FFSrc"/>
<RowCount Name="RC Source" VariableName="User.RowCountInput"/>
<OleDbDestination
Name="OLE_DST"
ConnectionName="tempdb">
<TableFromVariableOutput VariableName="User.TargetTable"/>
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</ForEachFileLoop>
</Tasks>
</Package>
</Packages>
</Biml>
Right-click on the BIML file and select Generate SSIS Package. At this point, you should have a package named so_19957451 added to your current SSIS project.
Package configuration
There's no need for any configuration because it's already been done via BIML but moar screenshots make for better answers.
This is the basic package
Here are my variables
Configuration of the Foreach Loop, as called out in the MSDN article, along with my note to select the Traverse subfolders option
Assign the value generated per loop to the variable CurrentFileName
The flat file source has an expression applied to the ConnectionString property to ensure it uses the variable @[User::CurrentFileName]. This changes the source file on each iteration of the loop.
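For readers who think better in code than in designer dialogs, the loop behaves roughly like the Python sketch below. It is an analogy, not what SSIS runs internally: enumerate_files and load_all are hypothetical names, the import_file callback stands in for the data flow task, and the sorted order is only there to make the output repeatable (I am not claiming the Foreach File enumerator guarantees any order).

```python
import fnmatch
import os
import tempfile


def enumerate_files(source_folder, file_mask):
    """Return full paths of files under source_folder that match file_mask.

    Mirrors the loop settings: subfolders are traversed and the fully
    qualified name is returned. Sorting is only for repeatable output.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(source_folder):
        for name in filenames:
            # Compare in lower case so the mask is case-insensitive,
            # as file masks are on Windows.
            if fnmatch.fnmatch(name.lower(), file_mask.lower()):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def load_all(source_folder, file_mask, import_file):
    """Call import_file once per matching file and return how many were handled."""
    count = 0
    for current_file_name in enumerate_files(source_folder, file_mask):
        # In the package, this step is the variable mapping at index 0: the
        # connection manager expression then reads the new variable value.
        import_file(current_file_name)
        count += 1
    return count


if __name__ == "__main__":
    # Demo against a throwaway tree instead of C:\ssisdata\so\TEST.
    with tempfile.TemporaryDirectory() as root:
        for sub in ("201304", "201305"):
            os.makedirs(os.path.join(root, sub))
            for name in ("sample1.txt", "sample2.txt"):
                with open(os.path.join(root, sub, name), "w") as fh:
                    fh.write("ID,value\n")
        load_all(root, "*.txt", print)
```

The two arguments to load_all play the role of the SourceFolder and FileMask variables, and current_file_name plays the role of CurrentFileName.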
Execution results
The results from the database match the output from the package execution:
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has ended.
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has ended.
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has ended.
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has ended.
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has ended.
Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has started.
Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has ended.
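With only six files it is easy to eyeball the log, but with more folders a quick check helps. The sketch below is a hypothetical helper in Python (summarize and incomplete_files are my own names, nothing SSIS ships) that assumes the log lines keep the "The processing of file ... has started/ended." wording shown above and reports any file that lacks one of the two messages.

```python
import re
from collections import defaultdict

# Matches the message part of the Information lines shown above, e.g.
#   The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has started.
LINE = re.compile(
    r'The processing of file "(?P<path>[^"]+)" has (?P<state>started|ended)\.'
)


def summarize(log_text):
    """Map each file path to the set of states ('started', 'ended') seen for it."""
    states = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE.search(line)
        if match:
            states[match.group("path")].add(match.group("state"))
    return dict(states)


def incomplete_files(log_text):
    """Return, sorted, the files that are missing a start or an end message."""
    return sorted(
        path
        for path, seen in summarize(log_text).items()
        if seen != {"started", "ended"}
    )


if __name__ == "__main__":
    sample = "\n".join([
        r'Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has started.',
        r'Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has ended.',
    ])
    print(summarize(sample))
    print(incomplete_files(sample))
```

For the twelve lines above, summarize should yield six paths, each with both states, so incomplete_files would come back empty.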