Set up a KNN for a score

I am trying to fill the NaN values in the column "score" using a KNN (based on the values in columns X_100g, Y_100g and Z_100g).
Here is my df:

Product_Name   brand   score    X_100g    Y_100g    Z_100g
PA              abc      a        40        45         na
PB              def      b        27        27          8
PC              ghi      na       78        na         56
PD              klm      c        na        29         29
PE              nop      b        57         3         76
PF              qrs      na       45        42         33

What I tried is :

imputer = KNNImputer(n_neighbors=5)
dataknn = imputer.fit_transform(data.filter("score"))

It doesn't work and raises an error: "ValueError: at least one array or dtype is required"
Any help solving this?
Thanks!

I tried changing my initial code to:

imputer = SimpleImputer(strategy = "most_frequent")
dataimputed = imputer.fit_transform(data.filter(["score"]))

This gives the following error: "ValueError: cannot reindex from a duplicate axis"


Mistake 1:

df.filter('score') returns an empty dataframe.

This is because pandas expects a list-like object for the 'items' parameter (i.e., a list of the names of the columns you want to select); see the docs. However, you are supplying a str.

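Use df.filter(['score']), or just df[['score']], to extract the 'score' column as a dataframe. Note the double brackets: df['score'] with single brackets returns a 1-D Series, and the scikit-learn imputers expect 2-D input.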
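To make the difference between the selections concrete, here is a minimal sketch. The frame is made up to resemble the one in the question (the values are illustrative, not your real data):

```python
import numpy as np
import pandas as pd

# Illustrative frame modelled on the question; not the asker's real data.
df = pd.DataFrame({
    "score": ["a", "b", np.nan, "c", "b", np.nan],
    "X_100g": [40, 27, 78, np.nan, 57, 45],
})

as_frame = df.filter(["score"])   # list-like items: 2-D DataFrame with one column
also_frame = df[["score"]]        # double brackets: the same 2-D selection
as_series = df["score"]           # single brackets: 1-D Series

print(as_frame.shape, also_frame.shape, as_series.shape)
```

Only the two 2-D selections are suitable as input to fit_transform.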

Mistake 2:

You are using KNNImputer with categorical variables, which is not possible as it works only on numeric data.

Instead, use SimpleImputer with the 'most_frequent' or 'constant' strategy; both work with categorical (string) data. IterativeImputer, like KNNImputer, expects numeric input, so it would need the same encoding step first.
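A minimal sketch of the most-frequent approach, assuming a 'score' column of strings with NaN for the missing entries (the sample values are made up for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Illustrative data; object dtype so the strings and NaN are kept as-is.
df = pd.DataFrame(
    {"score": pd.Series(["a", "b", np.nan, "c", "b", np.nan], dtype=object)}
)

imputer = SimpleImputer(strategy="most_frequent")

# fit_transform needs 2-D input, hence df[["score"]]; it returns a 2-D array,
# which ravel() flattens so it can be assigned back to the column.
df["score"] = imputer.fit_transform(df[["score"]]).ravel()

print(df["score"].tolist())
```

Here "b" is the most frequent value, so both missing entries are filled with "b".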

If you really wish to use KNNImputer, first encode the 'score' column as numbers, impute the null values, and then convert the result back to the original labels.
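The encode, impute, decode route could look roughly like the sketch below. It rests on assumptions that are not stated in the question: that the scores are ordered grades (so mapping a, b, c to 0, 1, 2 is meaningful), that rounding the averaged neighbour codes to the nearest grade is acceptable, and that n_neighbors=2 suits such a tiny sample. The data and the names grades, to_code and score_code are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Illustrative frame modelled on the question; not the asker's real data.
df = pd.DataFrame({
    "score": pd.Series(["a", "b", np.nan, "c", "b", np.nan], dtype=object),
    "X_100g": [40, 27, 78, np.nan, 57, 45],
    "Y_100g": [45, 27, np.nan, 29, 3, 42],
    "Z_100g": [np.nan, 8, 56, 29, 76, 33],
})

# Assumption: the grades are ordinal, so a -> 0, b -> 1, c -> 2.
grades = ["a", "b", "c"]
to_code = {g: i for i, g in enumerate(grades)}

# 1. Encode: numeric features plus the encoded score (NaN stays NaN).
features = df[["X_100g", "Y_100g", "Z_100g"]].copy()
features["score_code"] = df["score"].map(to_code)

# 2. Impute: KNNImputer fills every NaN from the nearest rows.
imputer = KNNImputer(n_neighbors=2)
imputed = imputer.fit_transform(features)

# 3. Decode: round the (possibly fractional) code and map back to a label.
codes = np.rint(imputed[:, -1]).astype(int)
df["score"] = [grades[c] for c in codes]

print(df["score"].tolist())
```

Observed scores are left unchanged; only the missing ones are replaced. Because KNN relies on distances, you may also want to scale the feature columns first so that no single column dominates.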