254 Stars 68 Forks MIT License 7 Commits 17 Opened issues

`pystacknet` is a light Python version of StackNet, which was originally made in Java.

It supports many of the original features, with some new elements.

    git clone https://github.com/h2oai/pystacknet
    cd pystacknet
    python setup.py install

pystacknet's main object is a 2-dimensional list of sklearn type of models. This list defines the StackNet structure. This is the equivalent of parameters in the Java version. A representative example could be:

    from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression

    models = [
        ######## First level ########
        [
            RandomForestClassifier(n_estimators=100, criterion="entropy", max_depth=5, max_features=0.5, random_state=1),
            ExtraTreesClassifier(n_estimators=100, criterion="entropy", max_depth=5, max_features=0.5, random_state=1),
            GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=5, max_features=0.5, random_state=1),
            LogisticRegression(random_state=1),
        ],
        ######## Second level ########
        [
            RandomForestClassifier(n_estimators=200, criterion="entropy", max_depth=5, max_features=0.5, random_state=1),
        ],
    ]
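To make the nesting explicit, here is a minimal sketch of a helper that summarises such a list level by level. It is not part of `pystacknet`: `describe_levels` and the placeholder classes are hypothetical names used only for illustration, and the checks reflect only the "list of lists" shape described above.

```python
def describe_levels(models):
    """Return one (level_number, [class names]) tuple per stacking level."""
    if not isinstance(models, (list, tuple)) or not models:
        raise ValueError("models must be a non-empty list of levels")
    summary = []
    for level_number, level in enumerate(models, start=1):
        if not isinstance(level, (list, tuple)) or not level:
            raise ValueError(f"level {level_number} must be a non-empty list of models")
        # The outer index is the stacking level; each inner entry is one model.
        summary.append((level_number, [type(model).__name__ for model in level]))
    return summary


# Placeholder classes standing in for sklearn-style estimators (hypothetical).
class FirstLevelModelA:
    pass


class FirstLevelModelB:
    pass


class SecondLevelModel:
    pass


example = [[FirstLevelModelA(), FirstLevelModelB()], [SecondLevelModel()]]
for level_number, names in describe_levels(example):
    print(level_number, names)
```

Called on the `models` list above, the same helper would report four models at level 1 and one model at level 2.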

`pystacknet` is not as strict as the `Java` version and can allow `Regressors`, `Classifiers` or even `Transformers` at any level of StackNet. In other words, the following could work just fine:

    from sklearn.ensemble import (RandomForestClassifier, RandomForestRegressor,
                                  ExtraTreesClassifier, ExtraTreesRegressor,
                                  GradientBoostingClassifier, GradientBoostingRegressor)
    from sklearn.linear_model import LogisticRegression, Ridge
    from sklearn.decomposition import PCA

    models = [
        [
            RandomForestClassifier(n_estimators=100, criterion="entropy", max_depth=5, max_features=0.5, random_state=1),
            ExtraTreesRegressor(n_estimators=100, max_depth=5, max_features=0.5, random_state=1),
            GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=5, max_features=0.5, random_state=1),
            LogisticRegression(random_state=1),
            PCA(n_components=4, random_state=1),
        ],
        [
            RandomForestClassifier(n_estimators=200, criterion="entropy", max_depth=5, max_features=0.5, random_state=1),
        ],
    ]

**Note** that not all transformers are meaningful in this context and you should use them at your own risk.

A typical usage for classification could be:

    from pystacknet.pystacknet import StackNetClassifier

    model = StackNetClassifier(models, metric="auc", folds=4,
                               restacking=False, use_retraining=True,
                               use_proba=True, random_state=12345,
                               n_jobs=1, verbose=1)

    # x, y are the training features and labels; x_test holds the rows to score.
    model.fit(x, y)
    preds = model.predict_proba(x_test)

Where:

Parameter | Explanation |
---|---|
`models` | List of models. This should be a 2-dimensional list: the first dimension defines the stacking level, and each entry within a level is a model. |
`metric` | Can be "auc", "logloss", "accuracy", "f1", "matthews" or your own custom metric, as long as it implements `(ytrue, ypred, sample_weight=)`. |
`folds` | Either an integer defining the number of folds used in `StackNet`, or an iterable yielding train/test splits. |
`restacking` | `True` for restacking, else `False`. |
`use_proba` | When evaluating the metric, probabilities are used instead of class predictions if `use_proba==True`. |
`use_retraining` | If `True`, one model is fitted on the whole training data in order to score the test data. Otherwise, the average of all models used in the folds is taken (however, this takes more memory and there is no guarantee that it will work better). |
`random_state` | Integer for randomised procedures. |
`n_jobs` | Number of models to run in parallel. This is independent of any extra threads allocated by the selected algorithms, e.g. it is possible to run 4 models in parallel where one is a random forest that runs on 10 threads (if selected). |
`verbose` | Integer value higher than zero to allow printing at the console. |
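
The `metric` and `folds` entries both accept user-supplied objects, so a small sketch may help. The code below is an illustration under assumptions, not code from `pystacknet`: `weighted_accuracy` and `kfold_splits` are hypothetical names, the metric assumes it receives class labels (that is, `use_proba=False`), and the splits are assumed to be pairs of index lists in the scikit-learn style.

```python
import random


def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Custom metric with the (ytrue, ypred, sample_weight=) call shape.

    Returns the weighted fraction of predictions that match the labels.
    """
    y_true = list(y_true)
    y_pred = list(y_pred)
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    else:
        sample_weight = list(sample_weight)
    total = sum(sample_weight)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / total


def kfold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for k in range(n_folds):
        # Every n_folds-th shuffled index goes to the held-out part of fold k.
        test = sorted(indices[k::n_folds])
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


if __name__ == "__main__":
    print(weighted_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of 4 correct -> 0.75
    for train, test in kfold_splits(n_samples=6, n_folds=3):
        print(train, test)
```

If the assumptions hold, these could be passed as `metric=weighted_accuracy` and `folds=list(kfold_splits(len(y), 4))`. Materialising the generator into a list avoids it being exhausted if the splits are iterated more than once; whether `pystacknet` does so is worth checking against its source.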