vecxoz / vecstack

Python package for stacking (machine learning technique)

How to save model

zhaobin19941008 opened this issue · comments

I trained a two-level stacking regression model. How can I save it to predict new data, the way I can save a RandomForest model with joblib? Can I save the 1st-level and 2nd-level models separately?

You can save a StackingTransformer, or a pipeline containing any number of StackingTransformer objects (i.e. levels), using joblib, just like any other scikit-learn-like object. Please see the example.
The stacking function does not support saving models.
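For illustration, here is a minimal save/load sketch (the names stack, X_train, y_train and X_new, and the Ridge final estimator, are placeholders, not from your code):

import joblib
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Ridge

# Pipeline: a StackingTransformer followed by any final estimator
pipe = Pipeline([('stack', stack), ('final_estimator', Ridge())])
pipe = pipe.fit(X_train, y_train)

# Save all levels at once
joblib.dump(pipe, 'stacking_pipe.pkl')

# Later: load the pipeline and predict on new data
pipe_loaded = joblib.load('stacking_pipe.pkl')
y_pred_new = pipe_loaded.predict(X_new)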

Thanks for your answer, I will try this method. I think vecstack is a good package for stacking.

Thanks!

I am sorry to bother you again. When I use StackingTransformer and a pipeline to save the model, I run into a problem: I trained the 1st-level and 2nd-level models separately according to the instructions, but the pipeline result differs from the result of the 2nd-level model trained on the OOF predictions, even though all model parameters are the same.

I need more details. Please post your code example.

# Imports used throughout this example
import joblib
import numpy as np
from xgboost import XGBRegressor
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from vecstack import StackingTransformer
from keras.models import Sequential
from keras.layers import Dense, Dropout, LeakyReLU, BatchNormalization
from keras.optimizers import SGD
from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasRegressor

models1 = [
    ('XG', XGBRegressor(n_estimators=1200, learning_rate=0.05, min_child_weight=11,
                        max_depth=101, gamma=0, subsample=1, colsample_bytree=0.7,
                        reg_lambda=3, reg_alpha=0.05, n_jobs=6, silent=True,
                        objective='reg:linear', booster='gbtree')),
    ('ada', AdaBoostRegressor(DecisionTreeRegressor(), n_estimators=1000)),
    ('gb', GradientBoostingRegressor(n_estimators=500, learning_rate=0.09, max_depth=23,
                                     min_samples_split=43, min_samples_leaf=1,
                                     max_features=5, subsample=0.9, random_state=10))
]

Initialize StackingTransformer

stack = StackingTransformer(estimators=models1,        # base estimators
                            regression=True,           # regression task (if you need
                                                       # classification - set to False)
                            variant='A',               # oof for train set, predict test
                                                       # set in each fold and find mean
                            metric=mean_squared_error, # metric: callable
                            n_folds=5,                 # number of folds
                            shuffle=True,              # shuffle the data
                            random_state=0,            # ensure reproducibility
                            verbose=2)

Fit StackingTransformer

stack = stack.fit(X_train, y_train)
S_train = stack.transform(X_train)
S_test = stack.transform(X_test)

Apply 2nd level estimator

def make_model():
    model = Sequential()
    model.add(Dense(128, input_dim=3, init='uniform'))
    #model.add(Dropout(0.5))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(256, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    #model.add(Dropout(0.5))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(256, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    #model.add(Dropout(0.5))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(512, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(512, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(256, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(128, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dense(64, init='uniform'))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(axis=1))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='linear', name='out'))
    sgd = SGD(0.000000001)
    model.compile(optimizer=sgd, loss='mse')
    model.load_weights('test_s9.h5')
    return model

my_Regressor = KerasRegressor(build_fn=make_model, epochs=500,
                              batch_size=2048, verbose=0)
early_stopping = EarlyStopping(monitor='val_loss', patience=50, verbose=2)

Fit

my_Regressor.fit(S_train, y_train, epochs=500, batch_size=2048, verbose=1,
                 validation_data=(S_test, y_test), callbacks=[early_stopping])

Predict

y_pred = my_Regressor.predict(S_test)

Final prediction score

print('Final prediction score : [%.8f]' % r2_score(y_test, y_pred))

# Pipeline
steps = [('stack', stack),
         ('final_estimator', my_Regressor)]

Init Pipeline

pipe = Pipeline(steps)

Fit

pipe = pipe.fit(X_train, y_train)

Predict

y_pred_pipe = pipe.predict(X_test)

Final prediction score

print('Final prediction score using Pipeline: [%.8f]' % r2_score(y_test, y_pred_pipe))

Save Pipeline

joblib.dump(pipe, 'pipe_stack_fold_5.pkl')

'''

Load Pipeline

pipe_loaded = joblib.load('pipe_with_stack.pkl')

Predict using loaded Pipeline

y_pred_pipe_loaded = pipe_loaded.predict(X_test)

Final prediction score

print('Final prediction score using loaded Pipeline: [%.8f]' % r2_score(y_test, y_pred_pipe_loaded))
'''

When I output the results (S_train, S_test) from the 1st level and train the 2nd level on them, the r2_score differs from the one I get using the pipeline.

Please look at these lines from your code. You load a different dump (the file names differ):

joblib.dump(pipe, 'pipe_stack_fold_5.pkl')

pipe_loaded = joblib.load('pipe_with_stack.pkl')

Initialize 1st level model

models1 = [
    ('XG', XGBRegressor(n_estimators=1200, learning_rate=0.05, min_child_weight=11,
                        max_depth=101, gamma=0, subsample=1, colsample_bytree=0.7,
                        reg_lambda=3, reg_alpha=0.05, n_jobs=6, silent=True,
                        objective='reg:linear', booster='gbtree', random_state=10)),
    ('ada', AdaBoostRegressor(DecisionTreeRegressor(), n_estimators=1000, random_state=10)),
    ('gb', GradientBoostingRegressor(n_estimators=500, learning_rate=0.09, max_depth=23,
                                     min_samples_split=43, min_samples_leaf=1,
                                     max_features=5, subsample=0.9, random_state=10))
]

Initialize StackingTransformer

stack = StackingTransformer(estimators=models1,        # base estimators
                            regression=True,           # regression task (if you need
                                                       # classification - set to False)
                            variant='A',               # oof for train set, predict test
                                                       # set in each fold and find mean
                            metric=mean_squared_error, # metric: callable
                            n_folds=5,                 # number of folds
                            shuffle=True,              # shuffle the data
                            random_state=0,            # ensure reproducibility
                            verbose=2)

Fit StackingTransformer

stack = stack.fit(X_train, y_train)
S_train = stack.transform(X_train)
S_test = stack.transform(X_test)

np.savetxt("S_train_5.txt",S_train)
np.savetxt("y_train_5.txt",y_train)
np.savetxt("S_test_5.txt",S_test)
np.savetxt("y_test_5.txt",y_test)

Apply 2nd level estimator

# make_model() is identical to the definition in the previous code post

my_Regressor = KerasRegressor(build_fn=make_model, epochs=500,
                              batch_size=2048, verbose=0)
early_stopping = EarlyStopping(monitor='val_loss', patience=200, verbose=2)

Fit

my_Regressor.fit(S_train, y_train, epochs=500, batch_size=2048, verbose=1,
                 validation_data=(S_test, y_test), callbacks=[early_stopping])

Predict

y_pred = my_Regressor.predict(S_test)

Final prediction score

print('Final prediction score : [%.8f]' % r2_score(y_test, y_pred))

# Pipeline
steps = [('stack', stack),
         ('final_estimator', my_Regressor)]

Init Pipeline

pipe = Pipeline(steps)

Fit

pipe = pipe.fit(X_train, y_train)

Predict

y_pred_pipe = pipe.predict(X_test)

Final prediction score

print('Final prediction score using Pipeline: [%.8f]' % r2_score(y_test, y_pred_pipe))

Save Pipeline

joblib.dump(pipe, 'pipe_stack_fold_5.pkl')

The result of print('Final prediction score : [%.8f]' % r2_score(y_test, y_pred)) is different from the result I get when I train the 2nd level on S_train and S_test alone. I think the results should be the same, but they are different.

I don't think I understand your question. Please tell me which of the two cases you are asking about:

Case 1. Saved Pipeline

You train your pipeline and save it:

joblib.dump(pipe, 'pipe.pkl')

Then you load it:

pipe_loaded_1 = joblib.load('pipe.pkl')
y_pred_pipe_loaded_1 = pipe_loaded_1.predict(X_test)

Then you load the same pipeline again later. The prediction must be the same every time:

pipe_loaded_2 = joblib.load('pipe.pkl')
y_pred_pipe_loaded_2 = pipe_loaded_2.predict(X_test)

Predictions y_pred_pipe_loaded_1 and y_pred_pipe_loaded_2 are identical.

Case 2. Retraining

If you retrain the StackingTransformer, the top-level estimator, or the whole pipeline, the result will be different each time because of different random states. If you want to get the same prediction after retraining, you have to do the following:

  1. Carefully set random_state in each estimator. For example, in your code you set random_state for AdaBoostRegressor, but you forgot to set random_state for the inner DecisionTreeRegressor. It should be:
AdaBoostRegressor(DecisionTreeRegressor(random_state=10), n_estimators=1000, random_state=10)
  2. Set all random seeds for the neural network. Actually, it's hard to make a neural network reproducible, and the approach may depend on the particular Keras backend (a rough sketch follows below). For example, you can look at these instructions.
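As a sketch, assuming the TensorFlow backend (the exact calls depend on your Keras/TensorFlow versions):

import random
import numpy as np
import tensorflow as tf

SEED = 0
random.seed(SEED)          # Python's built-in RNG
np.random.seed(SEED)       # NumPy RNG, used by Keras weight initializers
tf.set_random_seed(SEED)   # graph-level seed (TF 1.x; in TF 2.x use tf.random.set_seed)

Note that even with all seeds fixed, some GPU operations may remain non-deterministic.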

The problem I encountered: I output S_train and S_test from both stacking and StackingTransformer:

stacking:
S_train, S_test = stacking(.......)

StackingTransformer:
stack = StackingTransformer(........)
stack = stack.fit(X_train, y_train)
S_train = stack.transform(X_train)
S_test = stack.transform(X_test)

Then I tune a neural network separately on S_train and S_test. Trained on the outputs of either method (stacking or StackingTransformer), the neural network scores almost the same, about 0.86. But when I plug the trained neural network in as the 2nd level, print('Final prediction score : [%.8f]' % r2_score(y_test, y_pred)) gives 0.84 or sometimes 0.81, and print('Final prediction score using Pipeline: [%.8f]' % r2_score(y_test, y_pred_pipe)) also gives about 0.84 or 0.81. I think the pipeline result (0.84 or 0.81) should be the same as that of the separately trained neural network (0.86), so I don't know why this happens.
I will upload the code of the neural network in a minute.
I will try to set all random seeds right now.

import pandas as pd
from sklearn.preprocessing import StandardScaler
from keras.models import Model
from keras.optimizers import Adam
from keras.callbacks import TensorBoard

trainFile = pd.read_csv('S_train_y.csv')
testFile = pd.read_csv('S_test_y.csv')

# .iloc replaces the deprecated .ix indexer
X_train = trainFile.iloc[:, :-1]
y_train = trainFile.iloc[:, -1].values.reshape(trainFile.shape[0], 1)
X_test = testFile.iloc[:, :-1]
y_test = testFile.iloc[:, -1].values.reshape(testFile.shape[0], 1)

ss_X = StandardScaler()
ss_y = StandardScaler()
X_train = ss_X.fit_transform(X_train)
X_test = ss_X.transform(X_test)  # transform only: reuse the scaler fitted on the train data

model = Sequential()
# (the layer stack is identical to the one in make_model() above)

adam = Adam(0.0000001)
sgd = SGD(0.000000001)
model.compile(optimizer=sgd, loss='mse')
model.load_weights('test_s9.h5')
model.fit(X_train, y_train, epochs=10, batch_size=2048, verbose=1,
          validation_data=(X_test, y_test), callbacks=[TensorBoard(log_dir='log')])
dense1_layer_model = Model(inputs=model.input, outputs=model.get_layer('out').output)
dense1_output = dense1_layer_model.predict(X_test)
print(r2_score(y_test, dense1_output))

I think the most probable reason is the random seeding of your neural net. Each time you call the fit method of your neural net or pipeline, there is a different initialization and therefore a different result.

Did you try to reproduce the prediction of your neural net alone, without stacking and the pipeline?

You should try to split the problem, i.e. reproduce the results of each estimator separately. Start with your neural net: train it on the train data without stacking and the pipeline, then predict the test data. Then train and predict again, and make sure you can reproduce the results and get the same score. Repeat this reproducibility check for each estimator in your pipeline. Then try stacking.
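For example, a minimal reproducibility check for a single estimator could look like this (a sketch; the estimator and the data names are placeholders):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def is_reproducible(make_estimator, X_train, y_train, X_test):
    # Train two independent copies on the same data and compare predictions
    pred_1 = make_estimator().fit(X_train, y_train).predict(X_test)
    pred_2 = make_estimator().fit(X_train, y_train).predict(X_test)
    return np.allclose(pred_1, pred_2)

# With a fixed random_state this should print True
print(is_reproducible(lambda: GradientBoostingRegressor(random_state=10),
                      X_train, y_train, X_test))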