How to convert a Parquet file to CSV using .NET Core?

With Cinchoo ETL, an open-source library, you can convert a Parquet file to CSV easily.

Install the NuGet package

install-package ChoETL.Parquet

Sample code

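using System;
using System.Text;
using ChoETL;

// Accumulates the CSV output in memory.
StringBuilder csv = new StringBuilder();

// Read byte-array columns as strings rather than raw bytes.
using (var r = new ChoParquetReader(@"*** Your Parquet file ***")
    .ParquetOptions(o => o.TreatByteArrayAsString = true)
    )
{
    // Emit a header row with the column names.
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        .UseNestedKeyFormat(false)
        )
        w.Write(r);
}

Console.WriteLine(csv.ToString());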
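
If the Parquet file is large, holding the whole CSV in a StringBuilder may not be practical. A minimal variation, assuming the ChoCSVWriter constructor that takes an output file path, writes straight to disk instead. Both file names below are hypothetical placeholders, not values from the question.

```csharp
using ChoETL;

class Program
{
    static void Main()
    {
        // Hypothetical paths; replace with your own files.
        const string parquetPath = "input.parquet";
        const string csvPath = "output.csv";

        using (var r = new ChoParquetReader(parquetPath)
            .ParquetOptions(o => o.TreatByteArrayAsString = true))
        using (var w = new ChoCSVWriter(csvPath)   // assumption: file-path constructor
            .WithFirstLineHeader()
            .UseNestedKeyFormat(false))
        {
            // Write the records from the reader to the CSV file.
            w.Write(r);
        }
    }
}
```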

For more information, please see the CodeProject article.


I haven't given it a shot, but I wonder whether you could leverage / abuse the Microsoft Spark SQL libraries to your benefit.

There's

DataFrameReader.Parquet(String[])

https://docs.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.dataframereader.parquet?view=spark-dotnet

And also:

DataFrameWriter.Csv(String) Method

https://docs.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.dataframewriter.csv?view=spark-dotnet#Microsoft_Spark_Sql_DataFrameWriter_Csv_System_String_

I wonder whether you could use a DataFrame as an in-memory intermediary.
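
An untested sketch of that idea, assuming the Microsoft.Spark NuGet package, a local Apache Spark installation, and launching through spark-submit as .NET for Apache Spark requires. The app name and the two command-line arguments are hypothetical. Note that Spark writes a directory of part files rather than a single CSV file.

```csharp
using Microsoft.Spark.Sql;

class ParquetToCsv
{
    static void Main(string[] args)
    {
        // Hypothetical arguments: input Parquet path and output directory.
        string inputPath = args[0];
        string outputDir = args[1];

        SparkSession spark = SparkSession
            .Builder()
            .AppName("parquet-to-csv")   // arbitrary name, chosen for illustration
            .GetOrCreate();

        // Load the Parquet data into a DataFrame (the intermediary).
        DataFrame df = spark.Read().Parquet(inputPath);

        // Write it back out as CSV with a header row,
        // replacing any existing output directory.
        df.Write()
            .Option("header", true)
            .Mode(SaveMode.Overwrite)
            .Csv(outputDir);

        spark.Stop();
    }
}
```

Pulling in Spark just for a format conversion is heavy, so this probably only makes sense if Spark is already part of your setup.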

It's just a guess at the moment, as your question intrigued me. Perhaps I'll give it a shot once I've got some sleep. :-)