Apply Chi-Square to a dataset that contains categorical variables

Solution 1:

You can use pd.factorize to encode your categorical variables:

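import pandas as pd

# df is assumed to already hold the 'Voted?' and 'Political Category' columns
df['nVoted?'] = pd.factorize(df['Voted?'])[0]  # [0] keeps the integer codes
df['nCategory'] = pd.factorize(df['Political Category'])[0]
print(df)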

# Output
         Voted? Political Category  nVoted?  nCategory
0           Yes              Right        0          0
1            No               Left        1          1
2  Not Answered             Center        2          2
3           Yes              Right        0          0
4           Yes              Right        0          0
5            No              Right        1          0
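
For reference, pd.factorize returns a pair (codes, uniques), and the [0] above keeps only the codes. A small sketch using the 'Voted?' values from the table (the variable names here are only illustrative):

```python
import pandas as pd

# 'Voted?' values copied from the table above
voted = pd.Series(["Yes", "No", "Not Answered", "Yes", "Yes", "No"])

# codes: one integer per row, numbered in order of first appearance
# uniques: the distinct labels, indexed by those codes
codes, uniques = pd.factorize(voted)

print(codes.tolist())   # [0, 1, 2, 0, 0, 1]
print(list(uniques))    # ['Yes', 'No', 'Not Answered']
```

Keeping uniques around lets you map the codes back to the original labels later.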

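After that you can run the test with scipy.stats. Note that scipy.stats.chisquare is a goodness-of-fit test that expects observed frequency counts, not the integer codes themselves. To test whether the two variables are independent, build a contingency table (for example with pd.crosstab) and pass it to scipy.stats.chi2_contingency.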
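
A minimal sketch of a chi-square test of independence, assuming the goal is to check whether 'Voted?' and 'Political Category' are related. The DataFrame is rebuilt from the output shown above. With only six rows the expected counts are far too small for a trustworthy p-value, so treat this as an illustration of the calls only:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Data reconstructed from the printed output above
df = pd.DataFrame({
    "Voted?": ["Yes", "No", "Not Answered", "Yes", "Yes", "No"],
    "Political Category": ["Right", "Left", "Center", "Right", "Right", "Right"],
})

# Contingency table: rows are 'Voted?' levels, columns are category levels,
# cells are the number of respondents with that combination
table = pd.crosstab(df["Voted?"], df["Political Category"])
print(table)

# chi2_contingency returns the statistic, the p-value, the degrees of
# freedom, and the expected counts under independence
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, p={p:.3f}, dof={dof}")
```

pd.crosstab accepts the original string columns directly, so the factorized columns are not strictly needed for this step.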