How to read data of an Excel file using C#?

OK,

One of the more difficult concepts to grasp about Excel VSTO programming is that you don't refer to cells like an array, Worksheet[0][0] won't give you cell A1, it will error out on you. Even when you type into A1 when Excel is open, you are actually entering data into Range A1. Therefore you refer to cells as Named Ranges. Here's an example:

Excel.Worksheet sheet = workbook.Sheets["Sheet1"] as Excel.Worksheet; 
Excel.Range range = sheet.get_Range("A1", Missing.Value)

You can now literally type:

range.Text // this will give you the text the user sees
range.Value2 // this will give you the actual value stored by Excel (without rounding)

If you want to do something like this:

Excel.Range range = sheet.get_Range("A1:A5", Missing.Value)

if (range1 != null)
     foreach (Excel.Range r in range1)
     {
         string user = r.Text
         string value = r.Value2

     }

There might be a better way, but this has worked for me.

The reason you need to use Value2 and not Value is because the Value property is a parametrized and C# doesn't support them yet.

As for the cleanup code, i will post that when i get to work tomorrow, i don't have the code with me, but it's very boilerplate. You just close and release the objects in the reverse order you created them. You can't use a Using() block because the Excel.Application or Excel.Workbook doesn't implement IDisposable, and if you don't clean-up, you will be left with a hanging Excel objects in memory.

Note:

  • If you don't set the Visibility property Excel doesn't display, which can be disconcerting to your users, but if you want to just rip the data out, that is probably good enough
  • You could OleDb, that will work too.

I hope that gets you started, let me know if you need further clarification. I'll post a complete

here is a complete sample:

using System;
using System.IO;
using System.Reflection;
using NUnit.Framework;
using ExcelTools = Ms.Office;
using Excel = Microsoft.Office.Interop.Excel;

namespace Tests
{
    [TestFixture]
    public class ExcelSingle
    {
        [Test]
        public void ProcessWorkbook()
        {
            string file = @"C:\Users\Chris\Desktop\TestSheet.xls";
            Console.WriteLine(file);

            Excel.Application excel = null;
            Excel.Workbook wkb = null;

            try
            {
                excel = new Excel.Application();

                wkb = ExcelTools.OfficeUtil.OpenBook(excel, file);

                Excel.Worksheet sheet = wkb.Sheets["Data"] as Excel.Worksheet;

                Excel.Range range = null;

                if (sheet != null)
                    range = sheet.get_Range("A1", Missing.Value);

                string A1 = String.Empty;

                if( range != null )
                    A1 = range.Text.ToString();

                Console.WriteLine("A1 value: {0}", A1);

            }
            catch(Exception ex)
            {
                //if you need to handle stuff
                Console.WriteLine(ex.Message);
            }
            finally
            {
                if (wkb != null)
                    ExcelTools.OfficeUtil.ReleaseRCM(wkb);

                if (excel != null)
                    ExcelTools.OfficeUtil.ReleaseRCM(excel);
            }
        }
    }
}

I'll post the functions from ExcelTools tomorrow, I don't have that code with me either.

Edit: As promised, here are the Functions from ExcelTools you might need.

public static Excel.Workbook OpenBook(Excel.Application excelInstance, string fileName, bool readOnly, bool editable,
        bool updateLinks) {
        Excel.Workbook book = excelInstance.Workbooks.Open(
            fileName, updateLinks, readOnly,
            Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, editable, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, Type.Missing);
        return book;
    }

public static void ReleaseRCM(object o) {
        try {
            System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
        } catch {
        } finally {
            o = null;
        }
    }

To be frank, this stuff is much easier if you use VB.NET. It's in C# because I didn't write it. VB.NET does option parameters well, C# does not, hence the Type.Missing. Once you typed Type.Missing twice in a row, you run screaming from the room!

As for you question, you can try to following:

http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.range.find(VS.80).aspx

I will post an example when I get back from my meeting... cheers

Edit: Here is an example

range = sheet.Cells.Find("Value to Find",
                                                 Type.Missing,
                                                 Type.Missing,
                                                 Type.Missing,
                                                 Type.Missing,
                                                 Excel.XlSearchDirection.xlNext,
                                                 Type.Missing,
                                                 Type.Missing, Type.Missing);

range.Text; //give you the value found

Here is another example inspired by this site:

 range = sheet.Cells.Find("Value to find", Type.Missing, Type.Missing,Excel.XlLookAt.xlWhole,Excel.XlSearchOrder.xlByColumns,Excel.XlSearchDirection.xlNext,false, false, Type.Missing);

It helps to understand the parameters.

P.S. I'm one of those weird people who enjoys learning COM automation. All this code steamed from a tool I wrote for work which required me to process over 1000+ spreadsheets from the lab each Monday.


You can use Microsoft.Office.Interop.Excel assembly to process excel files.

  1. Right click on your project and go to Add reference. Add the Microsoft.Office.Interop.Excel assembly.
  2. Include using Microsoft.Office.Interop.Excel; to make use of assembly.

Here is the sample code:

    using Microsoft.Office.Interop.Excel;

    //create the Application object we can use in the member functions.
    Microsoft.Office.Interop.Excel.Application _excelApp = new Microsoft.Office.Interop.Excel.Application();
    _excelApp.Visible = true;

    string fileName = "C:\\sampleExcelFile.xlsx";

    //open the workbook
    Workbook workbook = _excelApp.Workbooks.Open(fileName,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing);

    //select the first sheet        
    Worksheet worksheet = (Worksheet)workbook.Worksheets[1];

    //find the used range in worksheet
    Range excelRange = worksheet.UsedRange;

    //get an object array of all of the cells in the worksheet (their values)
    object[,] valueArray = (object[,])excelRange.get_Value(
                XlRangeValueDataType.xlRangeValueDefault);

    //access the cells
    for (int row = 1;  row <= worksheet.UsedRange.Rows.Count; ++row)
    {
        for (int col = 1; col <= worksheet.UsedRange.Columns.Count; ++col)
        {
            //access each cell
            Debug.Print(valueArray[row, col].ToString());
        }
    }

    //clean up stuffs
    workbook.Close(false, Type.Missing, Type.Missing);
    Marshal.ReleaseComObject(workbook);

    _excelApp.Quit();
    Marshal.FinalReleaseComObject(_excelApp);

Why don't you create OleDbConnection? There are a lot of available resources in the Internet. Here is an example

OleDbConnection con = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+filename+";Extended Properties=Excel 8.0");
con.Open();
try
{
     //Create Dataset and fill with imformation from the Excel Spreadsheet for easier reference
     DataSet myDataSet = new DataSet();
     OleDbDataAdapter myCommand = new OleDbDataAdapter(" SELECT * FROM ["+listname+"$]" , con);
     myCommand.Fill(myDataSet);
     con.Close();
     richTextBox1.AppendText("\nDataSet Filled");

     //Travers through each row in the dataset
     foreach (DataRow myDataRow in myDataSet.Tables[0].Rows)
     {
          //Stores info in Datarow into an array
          Object[] cells = myDataRow.ItemArray;
          //Traverse through each array and put into object cellContent as type Object
          //Using Object as for some reason the Dataset reads some blank value which
          //causes a hissy fit when trying to read. By using object I can convert to
          //String at a later point.
          foreach (object cellContent in cells)
          {
               //Convert object cellContect into String to read whilst replacing Line Breaks with a defined character
               string cellText = cellContent.ToString();
               cellText = cellText.Replace("\n", "|");
               //Read the string and put into Array of characters chars
               richTextBox1.AppendText("\n"+cellText);
          }
     }
     //Thread.Sleep(15000);
}
catch (Exception ex)
{
     MessageBox.Show(ex.ToString());
     //Thread.Sleep(15000);
}
finally
{
     con.Close();
}

try
        {
            DataTable sheet1 = new DataTable("Excel Sheet");
            OleDbConnectionStringBuilder csbuilder = new OleDbConnectionStringBuilder();
            csbuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
            csbuilder.DataSource = fileLocation;
            csbuilder.Add("Extended Properties", "Excel 12.0 Xml;HDR=YES");
            string selectSql = @"SELECT * FROM [Sheet1$]";
            using (OleDbConnection connection = new OleDbConnection(csbuilder.ConnectionString))
            using (OleDbDataAdapter adapter = new OleDbDataAdapter(selectSql, connection))
            {
                connection.Open();
                adapter.Fill(sheet1);
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }

This worked for me. Please try it and let me know for queries.


First of all, it's important to know what you mean by "open an Excel file for reading and copy it to clipboard..."

This is very important because there are many ways you could do that depending just on what you intend to do. Let me explain:

  1. If you want to read a set of data and copy that in the clipboard and you know the data format (e.g. column names), I suggest you use an OleDbConnection to open the file, this way you can treat the xls file content as a Database Table, so you can read data with SQL instruction and treat the data as you want.

  2. If you want to do operations on the data with the Excel object model then open it in the way you began.

  3. Some time it's possible to treat an xls file as a kind of csv file, there are tools like File Helpers which permit you to treat and open an xls file in a simple way by mapping a structure on an arbitrary object.

Another important point is in which Excel version the file is.

I have, unfortunately I say, a strong experience working with Office automation in all ways, even if bounded in concepts like Application Automation, Data Management and Plugins, and generally I suggest only as the last resort, to using Excel automation or Office automation to read data; just if there aren't better ways to accomplish that task.

Working with automation could be heavy in performance, in terms of resource cost, could involve in other issues related for example to security and more, and last but not at least, working with COM interop it's not so "free".. So my suggestion is think and analyze the situation within your needs and then take the better way.