How to read CSV file from GitHub using pandas

You should provide URL to raw content. Try using this:

import pandas as pd

url = 'https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv'
df = pd.read_csv(url, index_col=0)
print(df.head(5))

Output:

               alpha-2           ...            intermediate-region-code
name                             ...                                    
Afghanistan         AF           ...                                 NaN
Åland Islands       AX           ...                                 NaN
Albania             AL           ...                                 NaN
Algeria             DZ           ...                                 NaN
American Samoa      AS           ...                                 NaN

Add ?raw=true at the end of the GitHub URL to get the raw file link.

In your case,

import pandas as pd
url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv?raw=true'
df = pd.read_csv(url,index_col=0)

print(df.head(5))

Output:

               alpha-2 alpha-3  country-code     iso_3166-2   region  \
name                                                                   
Afghanistan         AF     AFG             4  ISO 3166-2:AF     Asia   
Åland Islands       AX     ALA           248  ISO 3166-2:AX   Europe   
Albania             AL     ALB             8  ISO 3166-2:AL   Europe   
Algeria             DZ     DZA            12  ISO 3166-2:DZ   Africa   
American Samoa      AS     ASM            16  ISO 3166-2:AS  Oceania   

                     sub-region intermediate-region  region-code  \
name                                                               
Afghanistan       Southern Asia                 NaN        142.0   
Åland Islands   Northern Europe                 NaN        150.0   
Albania         Southern Europe                 NaN        150.0   
Algeria         Northern Africa                 NaN          2.0   
American Samoa        Polynesia                 NaN          9.0   

                sub-region-code  intermediate-region-code  
name                                                       
Afghanistan                34.0                       NaN  
Åland Islands             154.0                       NaN  
Albania                    39.0                       NaN  
Algeria                    15.0                       NaN  
American Samoa             61.0                       NaN 

Note: This works only with GitHub links and not with GitLab or Bitbucket links.


You can copy/paste the url and change 2 things:

  1. Remove "blob"
  2. Replace github.com by raw.githubusercontent.com

For instance this link:

https://github.com/mwaskom/seaborn-data/blob/master/iris.csv

Works this way:

import pandas as pd

pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')