How can this code be modified to work with CsvHelper/CsvReader to make the class more efficient for processing?

This question is a spin-off this one (Can this method be refactored to use a lamba?). In the comments to that question this was raised:

The refactoring needs to be done in a different place. It's not your function that sucks, it's the definition of CLMExplorerHistory.

The internal data structure does not need to represent the structure of the CSV file in a 1:1 fashion. Just make sure you can read and write in a compatible way.

I think that you're attempting to use the csv data structure directly rather than look at possible areas of duplication that would indicate the obvious place for implementing a class type in order to allow for code reuse. Could you post up a snippet of example data for your CSV file to indicate your starting point?


On the backend of those comments, here is a snippet of the example CSV data:

Date,Status,Weekly Bible Reading,Song1 #,Song1 Title,Song1 Scripture,Song2 #,Song2 Title,Song2 Scripture,Song3 #,Song3 Title,Song3 Scripture,Meeting?,CO Visit,Cancel Reason,# Classes,Chairman,PrayerOpen,PrayerClose,TreasuresTalk,TreasuresTalk_Theme,TreasuresDigging,BibleReading_A,BibleReading_Study_A,BibleReading_B,BibleReading_Study_B,BibleReading_C,BibleReading_Study_C,BibleReading_Source,Apply1_Description,Apply2_Description,Apply3_Description,Apply4_Description,Apply1_A,Apply1_Asst_A,Apply1_Study_A,Apply2_A,Apply2_Asst_A,Apply2_Study_A,Apply3_A,Apply3_Asst_A,Apply3_Study_A,Apply4_A,Apply4_Asst_A,Apply4_Study_A,Apply1_B,Apply1_Asst_B,Apply1_Study_B,Apply2_B,Apply2_Asst_B,Apply2_Study_B,Apply3_B,Apply3_Asst_B,Apply3_Study_B,Apply4_B,Apply4_Asst_B,Apply4_Study_B,Apply1_C,Apply1_Asst_C,Apply1_Study_C,Apply2_C,Apply2_Asst_C,Apply2_Study_C,Apply3_C,Apply3_Asst_C,Apply3_Study_C,Apply4_C,Apply4_Asst_C,Apply4_Study_C,LivingPart1,LivingPart1_Theme,LivingPart1_Length,LivingPart2,LivingPart2_Theme,LivingPart2_Length,CBS,CBS_Source,CBS_Read,Audience B,Audience C,AuxCounselor B,AuxCounselor C
01/03/2021,Registrato e completo,NUMERI 7-8,4,“Geova è il mio Pastore”,Salmo 23,54,“Questa è la via”,"Isaia 30:20, 21",127,Che tipo di persona sono?,2 Pietro 3:11,Y,N,,1,Italo De Gaeta,xxx,xxx,xxx,“L’accampamento d’Israele: lezioni utili”,xxx,xxx,5,,,,,Nu 7:1-17,Commemorazione,Visita ulteriore,Visita ulteriore,Visita ulteriore,xxx,xxx,11,xxx,xxx,6,xxx,xxx,12,xxx,xxx,17,,,,,,,,,,,,,,,,,,,,,,,,,xxx,Risultati raggiunti dall’organizzazione,(5 min),xxx,Bisogni locali,(10 min),xxx,"rr cap. 5 parr. 17-22, riquadro 5A",xxx,,,,

I use the CsvReader class to read it into the CLMExplorerHistory object. The class is defined like this:

public class CLMExplorerHistory
{
    [Format("dd/MM/yyyy")]
    [Name("Date")]
    public DateTime Date { get; set; }

    [BooleanFalseValues(new string[] { "N", "n" })]
    [BooleanTrueValues(new string[] { "Y", "y" })]
    [Name("Meeting?")]
    public bool Meeting { get; set; }

    [Name("# Classes")]
    public int Classes { get; set; }

    [Name("Chairman")]
    public string Chairman { get; set; }

    [Name("AuxCounselor B")]
    public string AuxCounsellor1 { get; set; }

    [Name("AuxCounselor C")]
    public string AuxCounsellor2 { get; set; }

    [Name("PrayerOpen")]
    public string PrayerOpen { get; set; }

    [Name("PrayerClose")]
    public string PrayerClose { get; set; }

    [Name("CBS")]
    public string CBSConductor { get; set; }

    [Name("CBS_Read")]
    public string CBSReader { get; set; }

    [Name("TreasuresTalk")]
    public string TreasuresTalkName { get; set; }

    [Name("TreasuresTalk_Theme")]
    public string TreasuresTalkTheme { get; set; }

    [Name("TreasuresDigging")]
    public string SpiritualGemsName { get; set; }

    [Name("LivingPart1")]
    public string LivingPart1Name { get; set; }

    [Name("LivingPart1_Theme")]
    public string LivingPart1Theme { get; set; }

    [Name("LivingPart2")]
    public string LivingPart2Name { get; set; }

    [Name("LivingPart2_Theme")]
    public string LivingPart2Theme { get; set; }

    [Name("BibleReading_A")]
    public string BibleReadingClass1Name { get; set; }

    [Name("BibleReading_B")]
    public string BibleReadingClass2Name { get; set; }

    [Name("BibleReading_C")]
    public string BibleReadingClass3Name { get; set; }

    [Name("BibleReading_Study_A")]
    public string BibleReadingStudy { get; set; }

    [Name("Apply1_Description")]
    public string StudentItem1Description { get; set; }

    [Name("Apply2_Description")]
    public string StudentItem2Description { get; set; }

    [Name("Apply3_Description")]
    public string StudentItem3Description { get; set; }

    [Name("Apply4_Description")]
    public string StudentItem4Description { get; set; }

    [Name("Apply1_A")]
    public string StudentItem1Class1StudentName { get; set; }

    [Name("Apply1_B")]
    public string StudentItem1Class2StudentName { get; set; }

    [Name("Apply1_C")]
    public string StudentItem1Class3StudentName { get; set; }

    [Name("Apply1_Asst_A")]
    public string StudentItem1Class1AssistantName { get; set; }

    [Name("Apply1_Asst_B")]
    public string StudentItem1Class2AssistantName { get; set; }

    [Name("Apply1_Asst_C")]
    public string StudentItem1Class3AssistantName { get; set; }

    [Name("Apply2_A")]
    public string StudentItem2Class1StudentName { get; set; }

    [Name("Apply2_B")]
    public string StudentItem2Class2StudentName { get; set; }

    [Name("Apply2_C")]
    public string StudentItem2Class3StudentName { get; set; }

    [Name("Apply2_Asst_A")]
    public string StudentItem2Class1AssistantName { get; set; }

    [Name("Apply2_Asst_B")]
    public string StudentItem2Class2AssistantName { get; set; }

    [Name("Apply2_Asst_C")]
    public string StudentItem2Class3AssistantName { get; set; }

    [Name("Apply3_A")]
    public string StudentItem3Class1StudentName { get; set; }

    [Name("Apply3_B")]
    public string StudentItem3Class2StudentName { get; set; }

    [Name("Apply3_C")]
    public string StudentItem3Class3StudentName { get; set; }

    [Name("Apply3_Asst_A")]
    public string StudentItem3Class1AssistantName { get; set; }

    [Name("Apply3_Asst_B")]
    public string StudentItem3Class2AssistantName { get; set; }

    [Name("Apply3_Asst_C")]
    public string StudentItem3Class3AssistantName { get; set; }

    [Name("Apply4_A")]
    public string StudentItem4Class1StudentName { get; set; }

    [Name("Apply4_B")]
    public string StudentItem4Class2StudentName { get; set; }

    [Name("Apply4_C")]
    public string StudentItem4Class3StudentName { get; set; }

    [Name("Apply4_Asst_A")]
    public string StudentItem4Class1AssistantName { get; set; }

    [Name("Apply4_Asst_B")]
    public string StudentItem4Class2AssistantName { get; set; }

    [Name("Apply4_Asst_C")]
    public string StudentItem4Class3AssistantName { get; set; }

    [Name("Apply1_Study_A")]
    public string StudentItem1Study { get; set; }

    [Name("Apply2_Study_A")]
    public string StudentItem2Study { get; set; }

    [Name("Apply3_Study_A")]
    public string StudentItem3Study { get; set; }

    [Name("Apply4_Study_A")]
    public string StudentItem4Study { get; set; }
}

And I read in the CSV file like this:

using (var reader = new StreamReader(_calendarDBPath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<CLMExplorerHistory>();

    foreach (var record in records)
    {
        ...

At the moment the CSV record is read into this list of public properties. I wanted to know if it could directly read the fields into a more complex object?

Specifcally, the # Classes field has a value of 1, 2 or 3. And for each of these classes (not C# classes - think school) there is similar data.

There are 4 items per class and the description / study is the same in each class. So these CSV fields are common:

[Name("Apply1_Description")]
[Name("Apply2_Description")]
[Name("Apply3_Description")]
[Name("Apply4_Description")]

[Name("BibleReading_Study_A")]
[Name("Apply1_Study_A")]
[Name("Apply2_Study_A")]
[Name("Apply3_Study_A")]
[Name("Apply4_Study_A")]

But the rest are specific to the 3 Classes (schools):

  • These fields belong to Class 1:
[Name("BibleReading_A")]
[Name("Apply1_A")]
[Name("Apply1_Asst_A")]
[Name("Apply2_A")]
[Name("Apply2_Asst_A")]
[Name("Apply3_A")]
[Name("Apply3_Asst_A")]
[Name("Apply4_A")]
[Name("Apply4_Asst_A")]
  • These fields belong to Class 2:
[Name("BibleReading_B")]
[Name("Apply1_B")]
[Name("Apply1_Asst_B")]
[Name("Apply2_B")]
[Name("Apply2_Asst_B")]
[Name("Apply3_B")]
[Name("Apply3_Asst_B")]
[Name("Apply4_B")]
[Name("Apply4_Asst_B")]
  • These fields belong to Class 3:
[Name("BibleReading_C")]
[Name("Apply1_C")]
[Name("Apply1_Asst_C")]
[Name("Apply2_C")]
[Name("Apply2_Asst_C")]
[Name("Apply3_C")]
[Name("Apply3_Asst_C")]
[Name("Apply4_C")]
[Name("Apply4_Asst_C")]

The 3 Classes ca be represented by 3 "StudentClass" objects in a Lists. Something like

StudentClasses
    NumClasses
    Item1Desc
    Item2Desc
    Item3Desc
    Item4Desc
    BibleReadingStudy
    Item1Study
    Item2Study
    Item3Study
    Item4Study
    List<StudentClass>
           BibleReading
           Item1Student
           Item1Assistant
           Item2Student
           Item2Assistant
           Item3Student
           Item3Assistant
           Item4Student
           Item4Assistant

The above is built from:

StudentClasses
    NumClasses:        # Classes
    Item1Desc:         Apply1_Description
    Item2Desc:         Apply2_Description
    Item3Desc:         Apply3_Description
    Item4Desc:         Apply4_Description
    BibleReadingStudy: BibleReading_Study_A
    Item1Study:        Apply1_Study_A
    Item2Study:        Apply2_Study_A
    Item3Study:        Apply3_Study_A
    Item4Study:        Apply4_Study_A
    List<StudentClass>
           BibleReading    BibleReading_A|B|C
           Item1Student    Apply1_A|B|C
           Item1Assistant  Apply1_Asst_A|B|C
           Item2Student    Apply2_A|B|C
           Item2Assistant  Apply2_Asst_A|B|C
           Item3Student    Apply3_A|B|C
           Item3Assistant  Apply3_Asst_A|B|C
           Item4Student    Apply4_A|B|C
           Item5Assistant  Apply4_Asst_A|B|C

Now you know the mapping between fields to the proposed object. Obviously I can create that C# class myself and I can manually transfer the fields into such a structure. But does CvsReader have the ability to directly read into a class like this? Rather than one big list of properties?


Solution 1:

But does CvsReader have the ability to directly read into a class like this? Rather than one big list of properties?

I'd say "yes", though I'm not 100% sure of what you're ultimately hoping to create, but you can get cute with what the properties do. I'll give a simple example.

Suppose we have a CSV that is listing Shops and 3 products that need to be purchased from each shop

Shop,Product1,P1Price,Product2,P2Price,Product3,P3Price
Walmart,Eggs,1,Milk,2,Bread,3
BestBuy,PC,4,iPad,5,iPhone,6

You could have your class that holds the props:

class ShopLine
{
  string Shop {get;set;}
  string Product1 {get;set;}
  string Product2 {get;set;}
  string Product3 {get;set;}
  int P1Price {get;set;}
  int P2Price {get;set;}
  int P3Price {get;set;}
}

But noone has a gun to your head saying the data storage has to be like this.. You could:

class Product
{
  string Name {get;set;}
  int Price {get;set;}
}

class ShopLineTrick
{
  string Shop {get;set;}
  string Product1 {get => products[0].Name; set => products[0].Name= value;}
  string Product2 {get => products[1].Name; set => products[1].Name= value;}
  string Product3 {get => products[2].Name; set => products[2].Name= value;}
  int P1Price {get => products[0].Price; set => products[0].Price = value;}
  int P2Price {get => products[1].Price; set => products[1].Price = value;}
  int P3Price {get => products[2].Price; set => products[2].Price = value;}

  List<Product> products;

  ShopLineTrick(){
     //precreate 3 products
     products = Enumerable.Range(1,3).Select(x => new Product()).ToList();
  }
}

At the end of the operation you have some collection of ShopLineTrick that each contain some built hierarchy of Product; Product are a "reusable class from somewhere else" so this operation is like a parsing and mapping in one. What was a POCO for receiving the data in the file before, has become adevice capable of building the hierarchy of your smaller, reusable classes from somewhere else. You could query with LINQ to get them, for example

shopLines.SelectMany(sl => sl.Products). // a straight list of every product

I've explained using an alternative example because I don't quite yet understand what you're hoping to achieve with the data you have: I'm describing an alternative data storage so you can apply it to further your goal. For a recent project, for example, I had to store every version of a property at some point in time and then let a process I didn't control alter some properties of the object, and then examine what had changed. That was easy to do in a loop, with every one of 10 properties using a single object[10] for its storage. At some point in time I would clone the array, let the process alter stuff and then look for array entries in the current that were different to the cloned copy. Backing the properties with an array made that check (and even reset/accept changes) trivial by looping the arrays


The only thing to watch out for is that some kinds of reading CsvH does, it reuses an object you pass, so if you're e.g. doing an async read just be aware that it could read all the lines of the file and youd end up with only the data from the last one (it would be on you to extract the Products list, and renew it each loop pass to prevent overwriting)