Grouping non-sorted year column values to a single column using pandas

Solution 1:

Group your columns by virtual groups:

out = df.groupby(np.repeat(np.arange(len(df.columns) // 2), 2), axis=1) \
        .apply(lambda x: x.set_index('year')).droplevel(0, axis=1).reset_index()
print(out)

# Output
   year  Jan  Feb  March
0  1990   25   30     45
1  1991   27   34     47
2  1992   23   38     49
3  1993   21   36     43

With your data, use this code before:

df = pd.read_csv('England_SE_and_Central_S.txt', skiprows=5, sep='\s+')
df.columns = df.columns.str.split('.').str[0]

out = df.groupby(np.repeat(np.arange(len(df.columns) // 2), 2), axis=1) \
        .apply(lambda x: x.set_index('year')).droplevel(0, axis=1).reset_index()

Output:

>>> out
     year  jan  feb  mar  apr  may   jun   jul   aug   sep  oct  nov  dec    win   spr    sum   aut   ann
0    1884  3.5  2.6  2.5  2.5  6.3   9.0  11.6  11.5  10.5  4.8  2.1  2.1    ---  3.77  10.75  5.79  5.76
1    1885 -0.0  3.2  0.4  3.2  4.8   9.2  10.8   9.4   8.3  3.9  3.2  0.4   1.69  2.78   9.82  5.11  4.72
2    1886 -1.1 -1.8  0.2  3.4  6.1   8.6  11.0  11.1   9.9  7.8  3.1 -1.0  -0.80  3.23  10.27  6.91  4.81
3    1887 -1.6 -0.1 -0.1  1.2  5.3   9.2  11.4  10.0   7.6  2.7  1.7 -0.1  -0.92  2.13  10.23  3.98  3.96
4    1888 -0.3 -1.6 -0.1  2.0  5.3   8.9  10.1  10.1   8.5  2.8  5.6  2.0  -0.66  2.39   9.73  5.58  4.45
..    ...  ...  ...  ...  ...  ...   ...   ...   ...   ...  ...  ...  ...    ...   ...    ...   ...   ...
133  2017  0.2  3.6  5.5  4.2  8.7  11.9  13.2  11.8   9.8  9.1  3.3  2.0   2.07  6.14  12.29  7.44  6.96
134  2018  2.9 -0.5  2.1  6.7  8.0  11.1  13.7  12.5   9.3  6.9  5.3  4.4   1.55  5.60  12.47  7.16  6.92
135  2019  0.7  2.2  4.8  4.4  6.4  10.5  12.8  12.4  10.2  7.5  3.6  3.1   2.43  5.23  11.95  7.12  6.59
136  2020  3.8  3.8  3.2  5.1  7.1  10.9  11.7  14.1   9.9  7.8  5.9  2.8   3.55  5.12  12.24  7.87  7.18
137  2021  0.7  2.3  3.1  1.2  6.0  11.5  13.3  12.1  11.6  8.8  3.9  4.6   1.92  3.44  12.28  8.10  6.61

[138 rows x 18 columns]

Solution 2:

The code below fetches the data directly from the URL you provided and should generate the output you expect.

import pandas as pd

df = pd.read_csv(
    "https://www.metoffice.gov.uk/pub/data/weather/uk/climate/datasets/Tmin/ranked/England_SE_and_Central_S.txt",
    sep="\s+",
    skiprows=5,
    storage_options={"User-Agent": "Mozilla/5.0"}
)

months = []
for i in range(df.shape[1] // 2):
    tmp = pd.DataFrame(df.iloc[:, 2*i]).set_index(df.iloc[:, 2*i + 1])
    months.append(tmp)

final_df = pd.concat(months, axis=1)

Output:

>>> final_df
      jan  feb  mar  apr  may   jun   jul   aug   sep  oct  nov  dec    win   spr    sum   aut   ann
1884  3.5  2.6  2.5  2.5  6.3   9.0  11.6  11.5  10.5  4.8  2.1  2.1    ---  3.77  10.75  5.79  5.76
1885 -0.0  3.2  0.4  3.2  4.8   9.2  10.8   9.4   8.3  3.9  3.2  0.4   1.69  2.78   9.82  5.11  4.72
1886 -1.1 -1.8  0.2  3.4  6.1   8.6  11.0  11.1   9.9  7.8  3.1 -1.0  -0.80  3.23  10.27  6.91  4.81
1887 -1.6 -0.1 -0.1  1.2  5.3   9.2  11.4  10.0   7.6  2.7  1.7 -0.1  -0.92  2.13  10.23  3.98  3.96
1888 -0.3 -1.6 -0.1  2.0  5.3   8.9  10.1  10.1   8.5  2.8  5.6  2.0  -0.66  2.39   9.73  5.58  4.45
...   ...  ...  ...  ...  ...   ...   ...   ...   ...  ...  ...  ...    ...   ...    ...   ...   ...
2017  0.2  3.6  5.5  4.2  8.7  11.9  13.2  11.8   9.8  9.1  3.3  2.0   2.07  6.14  12.29  7.44  6.96
2018  2.9 -0.5  2.1  6.7  8.0  11.1  13.7  12.5   9.3  6.9  5.3  4.4   1.55  5.60  12.47  7.16  6.92
2019  0.7  2.2  4.8  4.4  6.4  10.5  12.8  12.4  10.2  7.5  3.6  3.1   2.43  5.23  11.95  7.12  6.59
2020  3.8  3.8  3.2  5.1  7.1  10.9  11.7  14.1   9.9  7.8  5.9  2.8   3.55  5.12  12.24  7.87  7.18
2021  0.7  2.3  3.1  1.2  6.0  11.5  13.3  12.1  11.6  8.8  3.9  4.6   1.92  3.44  12.28  8.10  6.61