Read fixed width record from text file

I've got a text file full of records where each field in each record is a fixed width. My first approach would be to parse each record simply using string.Substring(). Is there a better way?

For example, the format could be described as:

<Field1(8)><Field2(16)><Field3(12)>

And an example file with two records could look like:

SomeData0000000000123456SomeMoreData
Data2   0000000000555555MoreData    

I just want to make sure I'm not overlooking a more elegant way than Substring().


Update: I ultimately went with a regex like Killersponge suggested:

private readonly Regex reLot = new Regex(REGEX_LOT, RegexOptions.Compiled);
const string REGEX_LOT = "^(?<Field1>.{8})" +
                        "(?<Field2>.{16})" +
                        "(?<Field3>.{12})";

I then use the following to access the fields:

Match match = reLot.Match(record);
string field1 = match.Groups["Field1"].Value;
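For completeness, here is a minimal sketch of the regex approach wrapped in a helper that also checks match.Success, assuming the 8/16/12 layout described above. The class name LotRecordParser and the TryParse signature are made up for illustration; trimming the padding with TrimEnd() is my own assumption about what you want from the values.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LotRecordParser
{
    // Widths follow the <Field1(8)><Field2(16)><Field3(12)> layout.
    private static readonly Regex ReLot = new Regex(
        "^(?<Field1>.{8})" +
        "(?<Field2>.{16})" +
        "(?<Field3>.{12})",
        RegexOptions.Compiled);

    // Returns false when the record is shorter than 36 characters,
    // because the pattern cannot match in that case.
    public static bool TryParse(string record,
                                out string field1,
                                out string field2,
                                out string field3)
    {
        field1 = field2 = field3 = string.Empty;

        Match match = ReLot.Match(record);
        if (!match.Success)
            return false;

        // Assumption: trailing padding is not wanted in the values.
        field1 = match.Groups["Field1"].Value.TrimEnd();
        field2 = match.Groups["Field2"].Value.TrimEnd();
        field3 = match.Groups["Field3"].Value.TrimEnd();
        return true;
    }
}
```

Calling LotRecordParser.TryParse on each line read from the file gives you either the three fields or a false for lines that are too short to be a full record.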

Solution 1:

Use FileHelpers.

Example:

[FixedLengthRecord()] 
public class MyData
{ 
  [FieldFixedLength(8)] 
  public string someData; 

  [FieldFixedLength(16)] 
  public long SomeNumber; // a 16-digit field can hold values larger than int.MaxValue

  [FieldFixedLength(12)] 
  [FieldTrim(TrimMode.Right)]
  public string someMoreData;
}

Then, it's as simple as this:

var engine = new FileHelperEngine<MyData>(); 

// To Read Use: 
var res = engine.ReadFile("FileIn.txt"); 

// To Write Use: 
engine.WriteFile("FileOut.txt", res); 

Solution 2:

Substring sounds good to me. The only downside I can immediately think of is that it means copying the data each time, but I wouldn't worry about that until you prove it's a bottleneck. Substring is simple :)

You could use a regex to match a whole record at a time and capture the fields, but I think that would be overkill.
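To show how little code the Substring route needs, here is a minimal sketch assuming the 8/16/12 widths from the question. SubstringParser and Split are names I made up, and stripping the padding with TrimEnd() is an assumption about the desired output.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class SubstringParser
{
    // Field widths from the question: <Field1(8)><Field2(16)><Field3(12)>.
    private static readonly int[] Widths = { 8, 16, 12 };

    // Assumes the record is fully padded (at least 36 characters);
    // Substring throws ArgumentOutOfRangeException otherwise.
    public static string[] Split(string record)
    {
        var fields = new string[Widths.Length];
        int offset = 0;
        for (int i = 0; i < Widths.Length; i++)
        {
            // Copy the field out and drop its trailing padding.
            fields[i] = record.Substring(offset, Widths[i]).TrimEnd();
            offset += Widths[i];
        }
        return fields;
    }
}
```

Keeping the widths in one array means a format change only touches that one line.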

Solution 3:

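Why reinvent the wheel? Use .NET's TextFieldParser class (in the Microsoft.VisualBasic.FileIO namespace; it works from C# too once you reference the Microsoft.VisualBasic assembly), as described in the Visual Basic how-to "How to read from fixed-width text files".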
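A rough C# sketch of that, assuming the 8/16/12 layout from the question. The wrapper class TextFieldParserExample and its ReadRecords method are hypothetical names; the TextFieldParser members used (TextFieldType, SetFieldWidths, EndOfData, ReadFields) are the real ones.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical wrapper around the real TextFieldParser class.
public static class TextFieldParserExample
{
    public static List<string[]> ReadRecords(TextReader reader)
    {
        var records = new List<string[]>();

        // Disposing the parser also closes the underlying reader.
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(8, 16, 12);

            while (!parser.EndOfData)
            {
                // TrimWhiteSpace defaults to true, so padding is stripped.
                // A line shorter than the declared widths raises
                // MalformedLineException.
                records.Add(parser.ReadFields());
            }
        }
        return records;
    }
}
```

For a file on disk you would pass new StreamReader("FileIn.txt") as the reader. According to the documentation, making the last width zero or negative marks the last field as variable width, which helps when the final field isn't padded.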

Solution 4:

One thing to watch out for: if the ends of the lines aren't padded with spaces to fill the last field, Substring will throw an ArgumentOutOfRangeException unless you first work out how much of the line is left to read. This of course only applies to the last field :)
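The "bit of fiddling" can be as small as clamping the length before calling Substring. A minimal sketch, where LenientSubstring and SafeSubstring are names I invented for illustration:

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class LenientSubstring
{
    // Returns up to `length` characters starting at `start`, or fewer
    // (possibly none) when the line ends early, instead of throwing.
    public static string SafeSubstring(string line, int start, int length)
    {
        if (start >= line.Length)
            return string.Empty;

        // Clamp the requested length to what is actually left in the line.
        return line.Substring(start, Math.Min(length, line.Length - start));
    }
}
```

With the second sample record stripped of its trailing spaces, SafeSubstring(line, 24, 12) hands back the 8 characters that are there rather than blowing up.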

Solution 5:

Unfortunately, out of the box the CLR only provides Substring for this.

Someone over at CodeProject made a custom parser that uses attributes to define fields; you might wanna look at that.
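I don't have that parser's code to hand, so the following is not it, just a minimal sketch of the general idea of attribute-defined fields. FixedWidthAttribute, LotRecord and AttributeParser are all hypothetical names, and the 8/16/12 layout is the one from the question.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute: marks a string property with its offset and width.
[AttributeUsage(AttributeTargets.Property)]
public sealed class FixedWidthAttribute : Attribute
{
    public int Start { get; }
    public int Length { get; }

    public FixedWidthAttribute(int start, int length)
    {
        Start = start;
        Length = length;
    }
}

// Hypothetical record type describing <Field1(8)><Field2(16)><Field3(12)>.
public class LotRecord
{
    [FixedWidth(0, 8)]   public string Field1 { get; set; } = "";
    [FixedWidth(8, 16)]  public string Field2 { get; set; } = "";
    [FixedWidth(24, 12)] public string Field3 { get; set; } = "";
}

public static class AttributeParser
{
    // Fills every string property tagged with [FixedWidth] from the line.
    // Assumes the line is fully padded; Substring throws otherwise.
    public static T Parse<T>(string line) where T : class, new()
    {
        var result = new T();
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var attr = prop.GetCustomAttribute<FixedWidthAttribute>();
            if (attr == null)
                continue; // untagged properties are left alone

            string value = line.Substring(attr.Start, attr.Length).TrimEnd();
            prop.SetValue(result, value);
        }
        return result;
    }
}
```

Each attribute carries an explicit start offset because reflection doesn't guarantee the order in which properties come back. Usage would be something like AttributeParser.Parse<LotRecord>(line) for each line of the file.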