How to convert a Parquet file to CSV using .NET Core?
With Cinchoo ETL, an open-source library, you can easily convert a Parquet file to CSV.
Install the NuGet package
install-package ChoETL.Parquet
Sample code
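using System;
using System.Text;
using ChoETL;

// Collects the generated CSV text in memory.
StringBuilder csv = new StringBuilder();
// Open the Parquet file, reading byte-array columns as strings.
using (var r = new ChoParquetReader(@"*** Your Parquet file ***")
    .ParquetOptions(o => o.TreatByteArrayAsString = true)
    )
{
    // Write every record to the CSV, with a header line first.
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        .UseNestedKeyFormat(false)
        )
        w.Write(r);
}
Console.WriteLine(csv.ToString());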
For more information, see the CodeProject article.
I haven't given it a shot, but I wonder whether you could leverage / abuse the Microsoft Spark SQL libraries to your benefit.
There's
DataFrameReader.Parquet(String[])
https://docs.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.dataframereader.parquet?view=spark-dotnet
And also:
DataFrameWriter.Csv(String) Method
https://docs.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.dataframewriter.csv?view=spark-dotnet#Microsoft_Spark_Sql_DataFrameWriter_Csv_System_String_
I wonder whether you could use a DataFrame as an in-memory intermediary.
It's just a guess at the moment, but your question intrigued me; perhaps I'll give it a shot once I've got some sleep. :-)
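For what it's worth, the two methods linked above could be chained roughly as follows. This is an untested sketch, not a verified solution: it assumes the Microsoft.Spark NuGet package is referenced, that a local Apache Spark installation is available, and that the program is launched through spark-submit as .NET for Apache Spark requires. The paths `input.parquet` and `output_csv` are hypothetical placeholders.

```csharp
using Microsoft.Spark.Sql;

class ParquetToCsv
{
    static void Main(string[] args)
    {
        // Hypothetical paths; replace with your own.
        string inputPath = "input.parquet";
        string outputDir = "output_csv";

        // Create (or reuse) a Spark session for this job.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("ParquetToCsv")
            .GetOrCreate();

        // DataFrameReader.Parquet loads the file into a DataFrame,
        // which acts as the intermediary between the two formats.
        DataFrame df = spark.Read().Parquet(inputPath);

        // DataFrameWriter.Csv writes the DataFrame out as CSV,
        // with a header row and replacing any existing output.
        df.Write()
          .Mode("overwrite")
          .Option("header", true)
          .Csv(outputDir);

        spark.Stop();
    }
}
```

Note that Spark writes CSV output as a directory of part files rather than a single file, so `output_csv` ends up as a folder. For a one-off conversion of a small file, pulling in Spark is probably heavier than a dedicated library.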