Saving StandardScaler() model for use on new datasets
How do I save a fitted StandardScaler in scikit-learn? I need to make a model operational and don't want to load the training data again and again just so StandardScaler can re-learn its parameters before I apply it to new data I want to predict on.
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# standardize after splitting so the scaler is fitted on the training data only
X_train, X_test, y_train, y_test = train_test_split(data, target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)
Solution 1:
You can use joblib's dump function to save the fitted scaler. Here's a complete example for reference.
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
data, target = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(data, target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
If you want to save the fitted sc StandardScaler, use the following:
from joblib import dump, load
dump(sc, 'std_scaler.bin', compress=True)
This will create the file std_scaler.bin and save the fitted scaler in it.
To read the scaler back later, use load:
sc = load('std_scaler.bin')
Note: sklearn.externals.joblib is deprecated and has been removed from recent scikit-learn releases; install the standalone joblib package and import from it directly, as shown above.
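Once the scaler is saved, a production process can score new data without ever reloading the training set. A minimal sketch of that step; new_data is a hypothetical incoming sample, not part of the original answer, and must have the same feature layout as the training data (four iris features here):
import numpy as np
from joblib import load

# Load the already-fitted scaler; no training data is required.
sc = load('std_scaler.bin')

# new_data stands in for whatever rows you need to score.
new_data = np.array([[5.1, 3.5, 1.4, 0.2]])
new_data_std = sc.transform(new_data)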
Solution 2:
Or, if you prefer pickle:
import pickle

# Use context managers so the file handles are closed properly.
with open('file/path/scaler.pkl', 'wb') as f:
    pickle.dump(sc, f)

with open('file/path/scaler.pkl', 'rb') as f:
    sc = pickle.load(f)
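For deployment it can also help to pickle the scaler together with the downstream estimator, so production code loads a single artifact. A rough sketch, assuming a LogisticRegression trained on the standardized features (the classifier and the pipeline.pkl name are placeholders, not part of the original answer):
import pickle
from sklearn.linear_model import LogisticRegression

# Train a placeholder model on the standardized training data.
clf = LogisticRegression(max_iter=200).fit(X_train_std, y_train)

# Save the scaler and the model side by side in one file.
with open('pipeline.pkl', 'wb') as f:
    pickle.dump({'scaler': sc, 'model': clf}, f)

# Later: load both and apply them in order to new data.
with open('pipeline.pkl', 'rb') as f:
    artifacts = pickle.load(f)
preds = artifacts['model'].predict(artifacts['scaler'].transform(X_test))
Either format works for a small StandardScaler; joblib is generally the more efficient choice for objects that carry large NumPy arrays.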