Sort a pandas dataframe series by month name

You can use categorical data to enable proper sorting with pd.Categorical:

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", 
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
df['months'] = pd.Categorical(df['months'], categories=months, ordered=True)
df.sort_values(...)  # same as you have now; can use inplace=True

When you specify the categories, pandas remembers the order of specification as the default sort order.

Docs: Pandas categories > sorting & order.


You should consider re-indexing it based on axis 0 (indexes)

new_order = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

df1 = df.reindex(new_order, axis=0)

Thanks @Brad Solomon for offering a faster way to capitalize string!

Note 1 @Brad Solomon's answer using pd.categorical should save your resources more than my answer. He showed how to assign order to your categorical data. You should not miss it :P

Alternatively, you can use.

df = pd.DataFrame([["dec", 12], ["jan", 40], ["mar", 11], ["aug", 21],
                  ["aug", 11], ["jan", 11], ["jan", 1]], 
                   columns=["Month", "Price"])
# Preprocessing: capitalize `jan`, `dec` to `Jan` and `Dec`
df["Month"] = df["Month"].str.capitalize()

# Now the dataset should look like
#   Month Price
#   -----------
#    Dec    XX
#    Jan    XX
#    Apr    XX

# make it a datetime so that we can sort it: 
# use %b because the data use the abbreviation of month
df["Month"] = pd.to_datetime(df.Month, format='%b', errors='coerce').dt.month
df = df.sort_values(by="Month")

total = (df.groupby(df['Month'])['Price'].mean())

# total 
Month
1     17.333333
3     11.000000
8     16.000000
12    12.000000

Note 2 groupby by default will sort group keys for you. Be aware to use the same key to sort and groupby in the df = df.sort_values(by=SAME_KEY) and total = (df.groupby(df[SAME_KEY])['Price'].mean()). Otherwise, one may gets unintended behavior. See Groupby preserve order among groups? In which way? for more information.

Note 3 A more computationally efficient way is first compute mean and then do sorting on months. In this way, you only need to sort on 12 items rather than the whole df. It will reduce the computational cost if one don't need df to be sorted.

Note 4 For people already have month as index, and wonder how to make it categorical, take a look at pandas.CategoricalIndex @jezrael has a working example on making categorical index ordered in Pandas series sort by month index