Bulk Insert to Oracle using .NET
What is the fastest way to do a bulk insert to Oracle using .NET? I need to transfer about 160K records to Oracle using .NET. Currently, I'm using an INSERT statement and executing it 160K times. It takes about 25 minutes to complete. The source data is stored in a DataTable, as the result of a query from another database (MySQL).
Is there any better way to do this?
EDIT: I'm currently using System.Data.OracleClient, but I'm willing to accept solutions using another provider (ODP.NET, DevArt, etc.).
Solution 1:
I'm loading 50,000 records in 15 or so seconds using Array Binding in ODP.NET.
It works by repeatedly invoking a stored procedure you specify (and in which you can do updates/inserts/deletes), but it passes the multiple parameter values from .NET to the database in bulk.
Instead of specifying a single value for each parameter to the stored procedure you specify an array of values for each parameter.
Oracle passes the parameter arrays from .NET to the database in one go, and then repeatedly invokes the stored procedure you specify using the parameter values you specified.
http://www.oracle.com/technetwork/issue-archive/2009/09-sep/o59odpnet-085168.html
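A minimal sketch of array binding, assuming ODP.NET (Oracle.DataAccess.Client) is referenced. It binds the arrays to a plain INSERT statement rather than a stored procedure, and the table MY_TABLE(ID, NAME), the DataTable column names, and the helper ColumnToArray are made up for illustration:

```csharp
using System.Data;
using Oracle.DataAccess.Client; // ODP.NET; assumed to be installed and referenced

public static class ArrayBindingSketch
{
    // Copies one DataTable column into a typed array.
    // Assumes the column holds no DBNull values; the cast would throw on them.
    public static T[] ColumnToArray<T>(DataTable table, string columnName)
    {
        var values = new T[table.Rows.Count];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = (T)table.Rows[i][columnName];
        }
        return values;
    }

    // Inserts every row of 'source' into the hypothetical table MY_TABLE(ID, NAME)
    // with a single ExecuteNonQuery call.
    public static void InsertAll(OracleConnection connection, DataTable source)
    {
        if (source.Rows.Count == 0)
        {
            return; // nothing to bind
        }

        using (OracleCommand command = connection.CreateCommand())
        {
            command.CommandText = "INSERT INTO MY_TABLE (ID, NAME) VALUES (:id, :name)";
            command.BindByName = true;
            // Tells ODP.NET that each parameter value is an array with this many elements.
            command.ArrayBindCount = source.Rows.Count;

            var id = new OracleParameter("id", OracleDbType.Int32);
            id.Value = ColumnToArray<int>(source, "ID");
            var name = new OracleParameter("name", OracleDbType.Varchar2);
            name.Value = ColumnToArray<string>(source, "NAME");

            command.Parameters.Add(id);
            command.Parameters.Add(name);

            command.ExecuteNonQuery();
        }
    }
}
```

For very large tables you would probably want to send the rows in chunks (say, a few thousand at a time) rather than building one array per column for the whole DataTable.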
/Damian
Solution 2:
I recently discovered a specialized class that's awesome for bulk inserts with ODP.NET: Oracle.DataAccess.Client.OracleBulkCopy! You pass a DataTable to its WriteToServer method. It is very fast and effective, good luck!!
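Roughly, the call looks like the sketch below. The destination table name MY_TABLE, the batch size, and the timeout are placeholder values I picked, and the column mapping assumes the DataTable columns have the same names as the Oracle columns:

```csharp
using System.Data;
using Oracle.DataAccess.Client; // ODP.NET; assumed to be installed and referenced

public static class BulkCopySketch
{
    // Copies all rows of 'source' into the hypothetical table MY_TABLE.
    public static void CopyTable(string connectionString, DataTable source)
    {
        using (var connection = new OracleConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new OracleBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "MY_TABLE";
                bulkCopy.BatchSize = 5000;       // rows per batch; an arbitrary starting point
                bulkCopy.BulkCopyTimeout = 600;  // seconds

                // Map columns by name instead of relying on ordinal position.
                foreach (DataColumn column in source.Columns)
                {
                    bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }

                bulkCopy.WriteToServer(source);
            }
        }
    }
}
```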
Solution 3:
The solution of Rob Stevenson-Legget is slow because he doesn't bind his values but uses String.Format() instead.
When you ask Oracle to execute a SQL statement, it starts by calculating the hash value of the statement. It then looks in a hash table to see whether it already knows this statement. If it does, it can retrieve the execution plan from that table and execute the statement very quickly, because it has executed it before. This is called the library cache, and it doesn't work properly if you don't bind the values in your SQL statements: every distinct literal produces a different statement text, and therefore a different hash.
For example, don't do:
    int n;
    for (n = 0; n < 100000; n++)
    {
        // A new SQL text on every iteration, so Oracle has to parse each statement again.
        mycommand.CommandText = String.Format("INSERT INTO MyTable (MyId) VALUES ({0})", n + 1);
        mycommand.ExecuteNonQuery();
    }
but do:
    OracleParameter myparam = new OracleParameter();
    myparam.ParameterName = "MyId";
    int n;
    // Oracle bind variables are written as :name; the SQL text stays the same for every row.
    mycommand.CommandText = "INSERT INTO MyTable (MyId) VALUES (:MyId)";
    mycommand.Parameters.Add(myparam);
    for (n = 0; n < 100000; n++)
    {
        myparam.Value = n + 1;
        mycommand.ExecuteNonQuery();
    }
Not using parameters also exposes you to SQL injection.
Solution 4:
SQL Server's SqlBulkCopy is blindingly fast. Unfortunately, I found that OracleBulkCopy is far slower. It also has problems:
- You must be very sure that your input data is clean if you plan to use OracleBulkCopy. If a primary key violation occurs, an ORA-26026 is raised and it appears to be unrecoverable. Trying to rebuild the index does not help, and any subsequent insert on the table fails, including ordinary inserts.
- Even if the data is clean, I found that OracleBulkCopy sometimes gets stuck inside WriteToServer. The problem seems to depend on the batch size. With my test data, the problem happened at exactly the same point every time I repeated the test. With a larger or smaller batch size, the problem did not happen. I also saw that the speed is more irregular with larger batch sizes, which points to problems related to memory management.
Actually, System.Data.OracleClient.OracleDataAdapter is faster than OracleBulkCopy if you want to fill a table with small records but many rows. You need to tune the batch size though; the optimum batch size for OracleDataAdapter is smaller than for OracleBulkCopy.
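For reference, a batched insert through the data adapter can be sketched as follows. This is not my exact test code; the table MY_TABLE(ID, NAME), the column names, and the VARCHAR size of 100 are assumptions for illustration:

```csharp
using System.Data;
using System.Data.OracleClient; // deprecated, but still part of .NET Framework

public static class DataAdapterSketch
{
    // Inserts the rows of 'source' into the hypothetical table MY_TABLE(ID, NAME),
    // sending 'batchSize' rows per round trip.
    // Only rows whose RowState is Added are inserted; a table filled by another
    // adapter has Unchanged rows unless AcceptChangesDuringFill was set to false.
    public static void InsertBatched(OracleConnection connection, DataTable source, int batchSize)
    {
        using (var insert = new OracleCommand(
            "INSERT INTO MY_TABLE (ID, NAME) VALUES (:id, :name)", connection))
        using (var adapter = new OracleDataAdapter())
        {
            // The last argument is the DataTable column that feeds the parameter.
            insert.Parameters.Add("id", OracleType.Int32, 0, "ID");
            insert.Parameters.Add("name", OracleType.VarChar, 100, "NAME");

            // Batching requires that the command does not map results back to the rows.
            insert.UpdatedRowSource = UpdateRowSource.None;

            adapter.InsertCommand = insert;
            adapter.UpdateBatchSize = batchSize;
            adapter.Update(source);
        }
    }
}
```

The caller is expected to open the connection first. Batch sizes around 100 worked best in my test, but that will depend on row size and network.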
I ran my test on a Windows 7 machine with an x86 executable and the 32-bit ODP.NET client 2.112.1.0. The OracleDataAdapter is part of System.Data.OracleClient 2.0.0.0. My test set is about 600,000 rows with a record size of max. 102 bytes (average size 43 chars). The data source is a 25 MB text file, read in line by line as a stream.
In my test I built up the input data table to a fixed table size and then used either OracleBulkCopy or OracleDataAdapter to copy the data block to the server. I left BatchSize as 0 in OracleBulkCopy (so that the current table contents are copied as one batch) and set it to the table size in OracleDataAdapter (again, that should create a single batch internally). Best results:
- OracleBulkCopy: table size = 500, total duration 4'22"
- OracleDataAdapter: table size = 100, total duration 3'03"
For comparison:
- SqlBulkCopy: table size = 1000, total duration 0'15"
- SqlDataAdapter: table size = 1000, total duration 8'05"
Same client machine; the test server is SQL Server 2008 R2. For SQL Server, bulk copy is clearly the best way to go. Not only is it the fastest overall, but server load is also lower than when using a data adapter. It is a pity that OracleBulkCopy does not offer quite the same experience, because the BulkCopy API is much easier to use than DataAdapter.
Solution 5:
A really fast way to solve this problem is to make a database link from the Oracle database to the MySQL database. You can create database links to non-Oracle databases. After you have created the database link, you can retrieve your data from the MySQL database with a ... create table mydata as select * from ... statement. This is called heterogeneous connectivity. This way you don't have to do anything in your .NET application to move the data.
Another way is to use ODP.NET. In ODP.NET you can use the OracleBulkCopy class.
But I don't think that inserting 160k records into an Oracle table with System.Data.OracleClient should take 25 minutes. I think you commit too many times. And do you bind your values to the insert statement with parameters, or do you concatenate your values? Binding is much faster.
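To illustrate the last point, here is a rough sketch that binds the value and commits only once at the end. It is written against the generic System.Data interfaces so it works with either provider; the table MY_TABLE and the column ID are made-up names:

```csharp
using System.Data;

public static class SingleCommitSketch
{
    // Inserts every row of 'source' into the hypothetical table MY_TABLE(ID)
    // inside one transaction and returns the number of rows inserted.
    // The connection is expected to be open already.
    public static int InsertAll(IDbConnection connection, DataTable source)
    {
        using (IDbTransaction transaction = connection.BeginTransaction())
        using (IDbCommand command = connection.CreateCommand())
        {
            // Without an explicit transaction, each ExecuteNonQuery is committed on its own.
            command.Transaction = transaction;
            command.CommandText = "INSERT INTO MY_TABLE (ID) VALUES (:id)";

            // One parameter object, reused for every row, so the SQL text never changes.
            IDbDataParameter id = command.CreateParameter();
            id.ParameterName = "id";
            id.DbType = DbType.Int32;
            command.Parameters.Add(id);

            int inserted = 0;
            foreach (DataRow row in source.Rows)
            {
                id.Value = row["ID"];
                inserted += command.ExecuteNonQuery();
            }

            // Single commit for the whole load; an exception before this line
            // leaves the transaction uncommitted and it is rolled back on dispose.
            transaction.Commit();
            return inserted;
        }
    }
}
```

This still makes one round trip per row, so it will not match array binding, but it removes the per-row commit and the per-row hard parse.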