Beat random forest with SGD elasticnet at 88% accuracy
lampts opened this issue · comments
Thanks for sharing your repo. I was able to beat the current tf-idf 5K result by using SGD with an elasticnet penalty, which gave 88% accuracy.
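from sklearn import metrics
from sklearn.linear_model import SGDClassifier

# X, labels: training set; X_test, y_test: held-out test set.
# Newer scikit-learn replaces n_iter with max_iter; tol=None keeps all 500 passes.
sgd = SGDClassifier(max_iter=500, tol=None, loss='modified_huber', penalty='elasticnet')
sgd.fit(X, labels)
y_pred = sgd.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))
print(metrics.classification_report(y_test, y_pred))
print(metrics.confusion_matrix(y_test, y_pred))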
Output
0.881024096386
             precision    recall  f1-score   support
        0.0       0.88      0.82      0.85       265
        1.0       0.88      0.92      0.90       399
avg / total       0.88      0.88      0.88       664
[[216  49]
 [ 30 369]]
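As a quick sanity check, the reported accuracy and per-class scores can be recomputed from the confusion matrix alone. This is only a sketch in plain Python: the variable names are made up for illustration, and it assumes the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
# Confusion matrix copied from the output above.
# Assumed layout (scikit-learn convention): rows = true class, columns = predicted class.
confusion = [[216, 49],
             [30, 369]]

total = sum(sum(row) for row in confusion)
accuracy = (confusion[0][0] + confusion[1][1]) / total

report = {}
for cls in (0, 1):
    true_pos = confusion[cls][cls]
    predicted = confusion[0][cls] + confusion[1][cls]  # column sum: everything predicted as cls
    actual = sum(confusion[cls])                       # row sum: everything that truly is cls
    precision = true_pos / predicted
    recall = true_pos / actual
    f1 = 2 * precision * recall / (precision + recall)
    report[cls] = {"precision": precision, "recall": recall, "f1": f1, "support": actual}

print(round(accuracy, 4))
for cls, scores in report.items():
    print(cls, {name: round(value, 2) for name, value in scores.items()})
```

The rounded values agree with the classification report: 585 correct predictions out of 664 is where the 0.881 comes from.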
That's awesome! :D
And great to see the F1 scores too. I'll need to add those to mine as well.
Want to create a pull request and add it to the script?
However, are you sure you fit the model solely on the training data?
sgd.fit(X, labels)
It looks like X could represent the entire dataset?
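To make the concern concrete, here is a minimal sketch of keeping the split explicit, so that the classifier is fit on training rows only. It runs on synthetic data from `make_classification`, not on the repo's dataset. The names `features`, `targets`, and `held_out_accuracy` are hypothetical, and the 80/20 split and fixed `random_state` are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real feature matrix and labels (assumption).
features, targets = make_classification(n_samples=1000, n_features=50, random_state=0)

# Split before fitting, so the test rows are never seen during training.
X, X_test, labels, y_test = train_test_split(
    features, targets, test_size=0.2, random_state=0, stratify=targets
)

sgd = SGDClassifier(max_iter=500, tol=None, loss="modified_huber",
                    penalty="elasticnet", random_state=0)
sgd.fit(X, labels)  # training rows only

# Accuracy on rows the model has not seen.
held_out_accuracy = accuracy_score(y_test, sgd.predict(X_test))
print(held_out_accuracy)
```

The same caution would apply to any preprocessing that learns from the data, such as a tf-idf vocabulary: fitting it on the full dataset before splitting can also leak information from the test rows.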
Hi,
X is only the training set. I can share my script soon by creating a pull request.
Laam
Gotcha! I'm actually adding your code to a new script which'll make it easier for others to expand upon it. Give me ten minutes, and it'll be ready so you can have a look and provide input, ok?
Sure, let's check it: https://github.com/lampts/data_science/blob/master/LeadQualifier-a88.ipynb
Voila, that's it. Thanks.