Can I use a batch file to remove blank columns in a .csv file?

I have nine columns in my current .csv file, but the 2nd and 3rd columns are blank and I want to get rid of them. Is there a way to use batch commands to remove these empty columns from my .csv, or to export a new .csv without them?


Solution 1:

For /F can parse a .csv file. As a one-liner, to be run directly from the command prompt (in a batch file, write %%i and %%l instead of %i and %l):

for /F "tokens=1,2,3* delims=," %i in (input.csv) do @echo %i,%l>>output.csv

This traverses input.csv line by line and, using a comma as the delimiter, stores the first value in %i, the second in %j, the third in %k and the rest of the line (because of the asterisk in the list of tokens) in %l. Printing everything except the second and third values effectively removes those columns.

However, for /F does not count empty strings as tokens: it treats consecutive delimiters as one, so the command cannot tell two adjacent commas from a single comma. This solution works if the empty columns contain a whitespace character, but may skip the wrong columns otherwise.

To circumvent this restriction, we can write a batch file that uses string replacement to insert a space between two consecutive commas. The replacement is executed twice to cope with three consecutive commas, where a single pass still leaves one adjacent pair behind. The quotes in ^"!line!^" are required to indicate that for /F has to operate on a string instead of a file, and they are escaped to prevent !line! from being evaluated as a literal string instead of a variable.

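@echo off
setlocal EnableDelayedExpansion
rem Start from an empty output file, because the loop below appends to it.
if exist output.csv del output.csv
for /F "delims=" %%a in (input.csv) do (
    set "line=%%a"
    rem Insert a space between adjacent commas. The second pass handles three in a row.
    set "line=!line:,,=, ,!"
    set "line=!line:,,=, ,!"
    rem Keep the first field and everything from the fourth field onward.
    for /F "tokens=1,2,3* delims=," %%i in (^"!line!^") do (
        rem Redirection comes first so a value ending in a digit is not read as a handle number.
        >>output.csv echo %%i,%%l
    )
)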

The result is a new .csv file, identical to input.csv except that the second and third columns are omitted and any other empty field between two commas now contains a single space.

Solution 2:

You did not specify which OS you use. If yours comes with cut, use this:

cut -d',' -f1,4-9

-d sets the delimiter, in this case to a comma. Since a comma is not a special character in the shell, you can omit the quotes.
-f selects the fields, in this case field 1 and fields 4 through 9 (skipping fields 2 and 3).
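
Unlike for /F, cut treats every delimiter as a field boundary, so two adjacent commas count as an empty field and completely empty columns need no special handling. A minimal sketch, assuming a POSIX shell; the sample rows and the variable names rows and result are made up for illustration:

```shell
# Hypothetical sample rows: columns 2 and 3 are completely empty (adjacent commas).
rows='a,,,d,e,f,g,h,i
1,,,4,5,6,7,8,9'

# Keep field 1 and fields 4 through 9. With a real file, drop the printf and
# pass the file name to cut instead, e.g. cut -d, -f1,4-9 input.csv > output.csv
result=$(printf '%s\n' "$rows" | cut -d, -f1,4-9)

printf '%s\n' "$result"
# a,d,e,f,g,h,i
# 1,4,5,6,7,8,9
```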

Test file "example.csv"
foo1,   ,   , ba1, baz1, blah1, wibble1, wubble1
foo2,   ,   , ba2, baz2, blah2, wibble2, wubble2
foo2,   ,   , ba3,     , blah3, wibble3, wubble3
foo2,   ,   , ba4, baz4, blah4, wibble4, wubble4

An actual test of the command:

> cut -d',' -f1,4-9 example.csv

foo1, ba1, baz1, blah1, wibble1, wubble1
foo2, ba2, baz2, blah2, wibble2, wubble2
foo2, ba3,     , blah3, wibble3, wubble3
foo2, ba4, baz4, blah4, wibble4, wubble4
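
Before dropping the columns, it can be worth confirming that they really are blank in every row. A minimal sketch, assuming a POSIX shell with cut and grep; the variable names rows and status are made up for illustration, and the two rows are copied from the test file above:

```shell
# Two rows copied from the test file; columns 2 and 3 contain only spaces.
rows='foo1,   ,   , ba1, baz1, blah1, wibble1, wubble1
foo2,   ,   , ba2, baz2, blah2, wibble2, wubble2'

# Extract only fields 2 and 3, then search for any character that is neither
# whitespace nor a comma. grep -q exits with status 0 if it finds one.
if printf '%s\n' "$rows" | cut -d, -f2,3 | grep -q '[^[:space:],]'; then
    status="not blank"
else
    status="blank"
fi

echo "columns 2 and 3 are $status"
# columns 2 and 3 are blank
```

With a real file, the printf can be replaced by passing the file name directly to cut.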

cut comes preinstalled with most operating systems (e.g., macOS, Linux distributions, the BSDs, Solaris; just about everything except DOS and Windows).

If you have Windows, you can:

  1. Open the CSV file in a spreadsheet (e.g., OpenOffice Calc or MS Excel), delete the two columns, and save the file. This is hard to automate, so use it only if you need to do it once or rarely.
  2. Download Unix tools for Windows and put the cut command in a batch file.