Read a CSV with pandas when values are enclosed in double quotes and contain commas
There is no need to preprocess the CSV file; just use the Python parser engine:
dataset = pd.read_csv('sample.csv', sep=',', engine='python')
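As a quick check of the call above, here is a minimal sketch that reads an in-memory sample instead of a file. The column names and rows are made up for illustration; only `pd.read_csv` with `sep=','` and `engine='python'` comes from the answer.

```python
import io

import pandas as pd

# Hypothetical sample data: some fields contain commas but are wrapped
# in double quotes, so they should stay in a single column.
raw = (
    'id,name,city\n'
    '1,"Smith, John",Boston\n'
    '2,"Doe, Jane","Austin, TX"\n'
)

# io.StringIO stands in for 'sample.csv' so the sketch is self-contained.
dataset = pd.read_csv(io.StringIO(raw), sep=',', engine='python')

print(dataset.shape)            # (2, 3)
print(dataset.loc[0, 'name'])   # Smith, John
```

The quoted commas are kept as part of the value rather than treated as separators, so the frame has three columns, not five.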
In pandas, use sep=',\s*' instead of sep=',\s+'; this makes the space(s) after each comma optional:
file1 = pd.read_csv('sample.txt', sep=r',\s*', skipinitialspace=True, quoting=csv.QUOTE_ALL, engine='python')  # requires: import csv
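One caveat: the pandas documentation notes that regex delimiters are prone to ignoring quoted data. If the file mixes optional spaces after commas with quoted values that contain commas, a plain comma separator combined with skipinitialspace=True may be the safer route. The sketch below is an alternative to the regex separator, not the answer's own method, and the sample rows are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample: zero, one, or several spaces follow the commas,
# and some quoted values contain commas themselves.
raw = (
    'id, name, city\n'
    '1, "Smith, John", Boston\n'
    '2,"Doe, Jane",   "Austin, TX"\n'
)

# A plain ',' separator keeps the quote handling intact;
# skipinitialspace=True drops the whitespace that follows each delimiter.
spaced = pd.read_csv(io.StringIO(raw), sep=',', skipinitialspace=True)

print(list(spaced.columns))     # ['id', 'name', 'city']
print(spaced.loc[1, 'city'])    # Austin, TX
```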
A comma inside double quotes is fine; the RFC 4180 standard allows it.
As for " " inside data values (such as "value" "13"), you will need to clean up the source file before processing. If the double quotes sit together as "", that shouldn't be an issue because it complies with the CSV standard (these are called escaped double quotes), but if there is a space between the double quotes, you need to clean it up.
Use:
sed -r 's/\"\s+\"/\"\"/g' src.csv >cleared.csv
before feeding the CSV to pandas; this removes the whitespace between the quotes. Alternatively, run
sed -r 's/\"\s+\"//g' src.csv >cleared.csv
to remove internal quotes completely.
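If sed is not available, the same two substitutions can be sketched in Python with the standard re module. The helper names below are hypothetical, and the pattern mirrors the sed expressions above. Like the sed version, it would also rewrite a quoted field that contains only whitespace (for example " "), so treat it as a rough cleanup step rather than a full CSV repair tool.

```python
import re

# A double quote, one or more whitespace characters, then another double quote.
QUOTE_GAP = re.compile(r'"\s+"')


def collapse_quote_gaps(line: str) -> str:
    """Replace '" "' with '""', i.e. an escaped double quote."""
    return QUOTE_GAP.sub('""', line)


def drop_quote_gaps(line: str) -> str:
    """Remove the internal quotes and the whitespace between them."""
    return QUOTE_GAP.sub('', line)


sample = '1,"value" "13",x'
print(collapse_quote_gaps(sample))  # 1,"value""13",x
print(drop_quote_gaps(sample))      # 1,"value13",x
```

Applying one of these helpers to each line of src.csv and writing the result to cleared.csv would mirror the corresponding sed command.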