Why is 'r' needed before the path name when reading a CSV file with pandas?

Solution 1:

In Python string literals, a backslash starts an escape sequence, which is how special characters are written.

For example, "hello\nworld" -- the \n means a newline. Try printing it.

Path names on Windows tend to have backslashes in them. But we want them to mean actual backslashes, not special characters.

r stands for "raw" and will cause backslashes in the string to be interpreted as actual backslashes rather than special characters.

e.g. r"hello\nworld" literally means the characters "hello\nworld". Again, try printing it.
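
To make the difference concrete, here is a small sketch comparing a normal and a raw string. The path C:\temp\new.csv is a made-up example, chosen only because \t and \n happen to be escape sequences.

```python
normal = "hello\nworld"   # \n is one character: a newline
raw = r"hello\nworld"     # backslash and n stay as two characters

print(normal)             # printed on two lines
print(raw)                # printed on one line, backslash visible
print(len(normal), len(raw))

# Hypothetical Windows path: without r, \t becomes a tab and \n a newline.
broken = "C:\temp\new.csv"
intact = r"C:\temp\new.csv"
print(repr(broken))
print(repr(intact))
```

Passing a string like broken to a file-reading function would most likely fail, because the path no longer contains the backslashes that were typed.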

More info is in the Python docs. It's a good idea to search them for questions like these.

https://docs.python.org/3/tutorial/introduction.html#strings

Solution 2:

A raw string will handle backslashes in most cases, as in this example:

In [11]:
r'c:\path'

Out[11]:
'c:\\path'

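However, if there is a trailing backslash then it will break, because even in a raw string a backslash directly before the closing quote stops that quote from ending the literal: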

In [12]:
r'c:\path\'

  File "<ipython-input-12-9995c7b1654a>", line 1
    r'c:\path\'
               ^
SyntaxError: EOL while scanning string literal

Forward slashes don't have this problem:

In [13]:
r'c:/path/'

Out[13]:
'c:/path/'

The safe and portable approach is to always use forward slashes and, when building a full path, to use os.path so that the result works when the code runs on different operating systems:

In [14]:
import os
path = 'c:/'
folder = 'path/'
os.path.join(path, folder)

Out[14]:
'c:/path/'
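
If a path really does have to end in a backslash, one workaround (a sketch, not the only option) is to keep the trailing backslash out of the raw string and append it separately, or to let a join function insert the separator. The example uses ntpath, the Windows implementation behind os.path, only so that the result is the same on any operating system. On Windows itself, os.path.join should behave the same way. The file name test.csv is purely illustrative.

```python
import ntpath  # Windows flavour of os.path, importable on every OS

base = r'c:\path'        # raw string, no trailing backslash
with_sep = base + '\\'   # append one escaped backslash in a normal string

# Let the join function supply the separator instead of typing it.
joined = ntpath.join(base, 'test.csv')

print(with_sep)
print(joined)
```

Either way, the backslash that would have escaped the closing quote never appears at the end of a raw literal.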

Solution 3:

  • This solution by Denziloe does a perfect job of explaining why r may precede a path string.
    • r'C:\Users\username' works
    • r'C:\Users\username\' does not, because the trailing \ escapes the '.
      • r'C:\Users\username\' + file, where file = 'test.csv' also won't work
      • Results in SyntaxError: EOL while scanning string literal
  • pandas methods that read a file, such as pandas.read_csv, accept a str or a pathlib object as the file path.
  • If you need to iterate through a list of file names, you can insert each name into the path with an f-string. Combining the prefixes as rf'...' gives a raw f-string.
    • An f-string substitutes values into the string: with num = 6, f'I have {num} files' evaluates to 'I have 6 files'.
import pandas as pd

files = ['test1.csv', 'test2.csv', 'test3.csv']

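df_list = []  # one DataFrame per file
for file in files:
    # rf-string: backslashes are kept literally and {file} is substituted
    df_list.append(pd.read_csv(rf'C:\Users\username\{file}'))

df = pd.concat(df_list)  # combine all files into a single DataFrame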
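
As a sketch of the pathlib alternative mentioned above, the same paths can be built with the / operator instead of an rf-string. PureWindowsPath is used here only so the output is identical on any operating system. On a real Windows machine you would more likely use pathlib.Path and pass the result to pd.read_csv. The folder and file names are the same placeholders as in the loop above.

```python
from pathlib import PureWindowsPath

folder = PureWindowsPath(r'C:\Users\username')  # placeholder folder
files = ['test1.csv', 'test2.csv', 'test3.csv']

# The / operator joins path components with the correct separator.
paths = [folder / name for name in files]

for name, path in zip(files, paths):
    # The pathlib result matches what the rf-string would have produced.
    print(path, str(path) == rf'C:\Users\username\{name}')

# as_posix() gives the forward-slash form of the same path.
print(paths[0].as_posix())
```

Building paths this way avoids typing a separator after the folder name at all, so the trailing-backslash problem does not come up.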