Most efficient way to insert Rows into MySQL Database

I've read a lot of questions about that but i couldn't find one that is fast enough. I think there are better ways to insert a lot of rows into a MySQL Database

I use the following code to insert 100k into my MySQL-Database:

public static void CSVToMySQL()
{
    string ConnectionString = "server=192.168.1xxx";
    string Command = "INSERT INTO User (FirstName, LastName ) VALUES (@FirstName, @LastName);";
    using (MySqlConnection mConnection = new MySqlConnection(ConnectionString))
    {
        mConnection.Open();

        for(int i =0;i< 100000;i++) //inserting 100k items
        using (MySqlCommand myCmd = new MySqlCommand(Command, mConnection))
        {
            myCmd.CommandType = CommandType.Text;
            myCmd.Parameters.AddWithValue("@FirstName", "test");
            myCmd.Parameters.AddWithValue("@LastName", "test");
            myCmd.ExecuteNonQuery();
        }
    }
}

This takes for 100k rows about 40 seconds. How can i make this faster or a little more efficient?

Might be faster to insert multiple rows via a DataTable/DataAdapter or at once:

INSERT INTO User (Fn, Ln) VALUES (@Fn1, @Ln1), (@Fn2, @Ln2)...

Due to security issues i can't load the data into a file and MySQLBulkLoad it.


Solution 1:

Here is my "multiple inserts"-code.

The insertion of 100k rows took instead of 40 seconds only 3 seconds!!

public static void BulkToMySQL()
{
    string ConnectionString = "server=192.168.1xxx";
    StringBuilder sCommand = new StringBuilder("INSERT INTO User (FirstName, LastName) VALUES ");           
    using (MySqlConnection mConnection = new MySqlConnection(ConnectionString))
    {
        List<string> Rows = new List<string>();
        for (int i = 0; i < 100000; i++)
        {
            Rows.Add(string.Format("('{0}','{1}')", MySqlHelper.EscapeString("test"), MySqlHelper.EscapeString("test")));
        }
        sCommand.Append(string.Join(",", Rows));
        sCommand.Append(";");
        mConnection.Open();
        using (MySqlCommand myCmd = new MySqlCommand(sCommand.ToString(), mConnection))
        {
            myCmd.CommandType = CommandType.Text;
            myCmd.ExecuteNonQuery();
        }
    }
}

The created SQL-statement looks like this:

INSERT INTO User (FirstName, LastName) VALUES ('test','test'),('test','test'),... ;

Update: Thanks Salman A I added MySQLHelper.EscapeString to avoid code injection which is internally used when you use parameters.

Solution 2:

I did a small test using three things MySqlDataAdapter,transactions and UpdateBatchSize. It is about 30 times faster than your first example. Mysql is running on separate box so there is latency involved. The batchsize might need some tuning. Code follows:

string ConnectionString = "server=xxx;Uid=xxx;Pwd=xxx;Database=xxx";

string Command = "INSERT INTO User2 (FirstName, LastName ) VALUES (@FirstName, @LastName);";


 using (var mConnection = new MySqlConnection(ConnectionString))
     {
         mConnection.Open();
         MySqlTransaction transaction = mConnection.BeginTransaction();

        //Obtain a dataset, obviously a "select *" is not the best way...
        var mySqlDataAdapterSelect = new MySqlDataAdapter("select * from User2", mConnection);

        var ds = new DataSet();

        mySqlDataAdapterSelect.Fill(ds, "User2");


        var mySqlDataAdapter = new MySqlDataAdapter();

        mySqlDataAdapter.InsertCommand = new MySqlCommand(Command, mConnection);


        mySqlDataAdapter.InsertCommand.Parameters.Add("@FirstName", MySqlDbType.VarChar, 32, "FirstName");
        mySqlDataAdapter.InsertCommand.Parameters.Add("@LastName", MySqlDbType.VarChar, 32, "LastName");
        mySqlDataAdapter.InsertCommand.UpdatedRowSource = UpdateRowSource.None;

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        for (int i = 0; i < 50000; i++)
        {
            DataRow row = ds.Tables["User2"].NewRow();
            row["FirstName"] = "1234";
            row["LastName"] = "1234";
            ds.Tables["User2"].Rows.Add(row);
        }

         mySqlDataAdapter.UpdateBatchSize = 100;
         mySqlDataAdapter.Update(ds, "User2");

         transaction.Commit();

         stopwatch.Stop();
         Debug.WriteLine(" inserts took " + stopwatch.ElapsedMilliseconds + "ms");
    }
}

Solution 3:

Execute command in a Transaction and reuse same instance of command for each iteration. For further performance optimization, send 100 queries in one command. Going for parallel execution could give better performance (Parallel.For) but make sure each parallel loop get its own MySqlCommand instance.

public static void CSVToMySQL()
{
    string ConnectionString = "server=192.168.1xxx";
    string Command = "INSERT INTO User (FirstName, LastName ) VALUES (@FirstName, @LastName);";
    using (MySqlConnection mConnection = new MySqlConnection(ConnectionString)) 
    {
        mConnection.Open();
        using (MySqlTransaction trans = mConnection.BeginTransaction()) 
        {
            using (MySqlCommand myCmd = new MySqlCommand(Command, mConnection, trans)) 
            {
                myCmd.CommandType = CommandType.Text;
                for (int i = 0; i <= 99999; i++) 
                {
                    //inserting 100k items
                    myCmd.Parameters.Clear();
                    myCmd.Parameters.AddWithValue("@FirstName", "test");
                    myCmd.Parameters.AddWithValue("@LastName", "test");
                    myCmd.ExecuteNonQuery();
                }
                trans.Commit();
            }
        }
    }
}

Solution 4:

If Add of AddWithValue does not escape strings, you must do such in advance to avoid SQL injection and syntax errors.

Build INSERT statements with only 1000 rows at a time. That should run easily 10 times as fast as what you started with (1 row per INSERT). Doing all 100K at once is risky and possibly slower. Risky because you might blow out some limit (packet size, etc); slower because of the need for a huge ROLLBACK log. COMMIT after each batch, or use autocommit=1.

Solution 5:

This way may not be faster than the stringbuilder approach, but it is parameterized:

/// <summary>
    /// Bulk insert some data, uses parameters
    /// </summary>
    /// <param name="table">The Table Name</param>
    /// <param name="inserts">Holds list of data to insert</param>
    /// <param name="batchSize">executes the insert after batch lines</param>
    /// <param name="progress">Progress reporting</param>
    public void BulkInsert(string table, MySQLBulkInsertData inserts, int batchSize = 100, IProgress<double> progress = null)
    {
        if (inserts.Count <= 0) throw new ArgumentException("Nothing to Insert");

        string insertcmd = string.Format("INSERT INTO `{0}` ({1}) VALUES ", table,
                                         inserts.Fields.Select(p => p.FieldName).ToCSV());
        StringBuilder sb = new StringBuilder(); 
        using (MySqlConnection conn = new MySqlConnection(ConnectionString))
        using (MySqlCommand sqlExecCommand = conn.CreateCommand())
        {
            conn.Open();
            sb.AppendLine(insertcmd);
            for (int i = 0; i < inserts.Count; i++)
            {
                sb.AppendLine(ToParameterCSV(inserts.Fields, i));
                for (int j = 0; j < inserts[i].Count(); j++)
                {
                    sqlExecCommand.Parameters.AddWithValue(string.Format("{0}{1}",inserts.Fields[j].FieldName,i), inserts[i][j]);
                }
                //commit if we are on the batch sizeor the last item
                if (i > 0 && (i%batchSize == 0 || i == inserts.Count - 1))
                {
                    sb.Append(";");
                    sqlExecCommand.CommandText = sb.ToString();
                    sqlExecCommand.ExecuteNonQuery();
                    //reset the stringBuilder
                    sb.Clear();
                    sb.AppendLine(insertcmd);
                    if (progress != null)
                    {
                        progress.Report((double)i/inserts.Count);
                    }
                }
                else
                {
                    sb.Append(",");
                }
            }
        }
    }

This uses the helper classes as below:

/// <summary>
/// Helper class to builk insert data into a table
/// </summary>
public struct MySQLFieldDefinition
{
    public MySQLFieldDefinition(string field, MySqlDbType type) : this()
    {
        FieldName = field;
        ParameterType = type;
    }

    public string FieldName { get; private set; }
    public MySqlDbType ParameterType { get; private set; }
}

///
///You need to ensure the fieldnames are in the same order as the object[] array
///
public class MySQLBulkInsertData : List<object[]>
{
    public MySQLBulkInsertData(params MySQLFieldDefinition[] fieldnames)
    {
        Fields = fieldnames;
    }

    public MySQLFieldDefinition[] Fields { get; private set; }
}

And this helper method:

    /// <summary>
    /// Return a CSV string of the values in the list
    /// </summary>
    /// <returns></returns>
    /// <exception cref="ArgumentNullException"></exception>
    private string ToParameterCSV(IEnumerable<MySQLFieldDefinition> p, int row)
    {
        string csv = p.Aggregate(string.Empty,
            (current, i) => string.IsNullOrEmpty(current)
                    ? string.Format("@{0}{1}",i.FieldName, row)
                    : string.Format("{0},@{2}{1}", current, row, i.FieldName));
        return string.Format("({0})", csv);
    }

Maybe not super elegant but it works well. I require Progress tracking so that is included for me, feel free to remove that part.

This will produce SQL commands similar to your desired output.

EDIT: ToCSV:

        /// <summary>
    /// Return a CSV string of the values in the list
    /// </summary>
    /// <param name="intValues"></param>
    /// <param name="separator"></param>
    /// <param name="encloser"></param>
    /// <returns></returns>
    /// <exception cref="ArgumentNullException"></exception>
    public static string ToCSV<T>(this IEnumerable<T> intValues, string separator = ",", string encloser = "")
    {
        string result = String.Empty;
        foreach (T value in intValues)
        {
            result = String.IsNullOrEmpty(result)
                ? string.Format("{1}{0}{1}", value, encloser)
                : String.Format("{0}{1}{3}{2}{3}", result, separator, value, encloser);
        }
        return result;
    }