streamDM C++

Stream Machine Learning in C++

About streamDM C++

streamDM in C++ implements extremely fast streaming decision trees for big data streams. Its main advantages over other data stream libraries are the following:

    It is faster than VFML (written in C) and MOA (written in Java).
    Evaluation and learners are separated, not linked together.
    It contains several methods for learning numeric attributes.
    It is easy to extend and add new methods.
    The adaptive decision tree is more accurate and does not require an expert user to choose optimal parameters.
    It contains powerful ensemble methods.
    It is much faster and uses less memory.

Getting Started

First download and build streamDM C++:
git clone https://github.com/huawei-noah/streamDM-Cpp.git
cd streamDM-Cpp
make

Download a dataset:
wget "http://downloads.sourceforge.net/project/moa-datastream/Datasets/Classification/covtypeNorm.arff.zip"
unzip covtypeNorm.arff.zip

Evaluate the dataset:
./streamdm-cpp "EvaluatePrequential -l (HoeffdingTree -l NBAdaptive) -r ArffReader -ds covtypeNorm.arff -e (BasicClassificationEvaluator -f 100000)"

Documentation

streamDM in C++ executes tasks. A task can be an evaluation task such as "EvaluatePrequential" or "EvaluateHoldOut". The parameters an evaluation task needs are a learner, a stream reader, and an evaluator.
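
To make the roles of learner and evaluator concrete, here is a minimal sketch of a prequential (test-then-train) loop: each instance is first used to test the learner and only then to train it. All type and function names below (Instance, Learner, MajorityClassLearner, AccuracyEvaluator, evaluatePrequential) are hypothetical and chosen for illustration; they are not the streamDM C++ classes, and an in-memory vector stands in for a stream reader.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical types for illustration only; NOT the streamDM C++ API.
struct Instance {
    std::vector<double> features;
    int label;
};

// A learner only knows how to predict and how to train on one instance.
class Learner {
public:
    virtual ~Learner() = default;
    virtual int predict(const Instance& inst) const = 0;
    virtual void train(const Instance& inst) = 0;
};

// Majority-class learner: predicts the label seen most often so far
// (label 0 before any training).
class MajorityClassLearner : public Learner {
public:
    int predict(const Instance&) const override {
        int best = 0;
        std::size_t bestCount = 0;
        for (const auto& kv : counts_) {
            if (kv.second > bestCount) {
                best = kv.first;
                bestCount = kv.second;
            }
        }
        return best;
    }
    void train(const Instance& inst) override { ++counts_[inst.label]; }

private:
    std::map<int, std::size_t> counts_;
};

// The evaluator only records outcomes; it knows nothing about the learner.
class AccuracyEvaluator {
public:
    void add(int predicted, int actual) {
        ++seen_;
        if (predicted == actual) ++correct_;
    }
    double accuracy() const {
        assert(correct_ <= seen_);
        return seen_ == 0 ? 0.0 : static_cast<double>(correct_) / seen_;
    }

private:
    std::size_t seen_ = 0;
    std::size_t correct_ = 0;
};

// Prequential loop: predict first, then learn from the same instance.
double evaluatePrequential(Learner& learner, const std::vector<Instance>& stream) {
    AccuracyEvaluator evaluator;
    for (const Instance& inst : stream) {
        evaluator.add(learner.predict(inst), inst.label);
        learner.train(inst);
    }
    return evaluator.accuracy();
}
```

Because the loop touches the learner only through predict and train, and the evaluator only through add, either side can be swapped without changing the other. That is the kind of separation between evaluation and learners listed among the advantages above.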

The methods currently implemented are: Naive Bayes, Perceptron, Logistic Regression, Majority Class, Hoeffding Tree, Hoeffding Adaptive Tree, and Bagging.
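
As background on the Hoeffding Tree, the sketch below shows the standard Hoeffding bound that such trees use to decide when enough instances have been seen to split a leaf. It follows the textbook formulation and is not taken from the streamDM C++ source; the function names and the tieThreshold parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hoeffding bound: with probability 1 - delta, the true mean of a random
// variable with the given range lies within epsilon of the mean observed
// over n independent samples, where
//   epsilon = sqrt(range^2 * ln(1 / delta) / (2 * n)).
double hoeffdingBound(double range, double delta, double n) {
    assert(range > 0.0 && delta > 0.0 && delta < 1.0 && n > 0.0);
    return std::sqrt(range * range * std::log(1.0 / delta) / (2.0 * n));
}

// Textbook split rule (hypothetical helper): split when the best attribute
// beats the runner-up by more than epsilon, or when epsilon has shrunk
// below a tie threshold so that the two candidates are effectively tied.
bool shouldSplit(double bestGain, double secondBestGain,
                 double epsilon, double tieThreshold) {
    return (bestGain - secondBestGain > epsilon) || (epsilon < tieThreshold);
}
```

For example, with range 1, delta 1e-7, and 1000 instances, hoeffdingBound returns roughly 0.09, so a leaf would split only if the best attribute's gain exceeds the second best by more than that margin. When the split criterion is information gain over c classes, the range is usually taken as log2(c).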

The stream readers currently implemented support the ARFF, C4.5, and LibSVM formats.
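
To illustrate what a stream reader has to do for the ARFF format, here is a simplified sketch that skips the header and returns each data row as a list of raw string fields. The names readArffRows and isDataMarker are hypothetical and are not part of streamDM C++. The sketch assumes dense, comma-separated rows and does not handle quoted values, sparse rows, or attribute type information.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns true if the line starts with the "@data" keyword,
// compared case-insensitively.
bool isDataMarker(const std::string& line) {
    const std::string marker = "@data";
    if (line.size() < marker.size()) return false;
    for (std::size_t i = 0; i < marker.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(line[i])) != marker[i]) {
            return false;
        }
    }
    return true;
}

// Reads the data section of an ARFF stream. Header lines (@relation,
// @attribute), comment lines starting with '%', and blank lines are skipped.
// Each data row is split on commas into raw string fields.
std::vector<std::vector<std::string>> readArffRows(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    bool inData = false;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty() || line[0] == '%') continue;
        if (!inData) {
            if (isDataMarker(line)) inData = true;
            continue;
        }
        std::vector<std::string> fields;
        std::istringstream row(line);
        std::string field;
        while (std::getline(row, field, ',')) {
            fields.push_back(field);
        }
        rows.push_back(fields);
    }
    return rows;
}
```

A real reader would process one row at a time instead of collecting them all, so that memory use stays constant on long streams. The vector return value here only keeps the sketch short.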

Download streamDM C++

You can download the streamDM C++ code from GitHub: https://github.com/huawei-noah/streamDM-Cpp


Contributors

Albert Bifet, Wei Fan, Jiajin Zhang, Quan Liu, Dandan Tu, Silviu Maniu, Cheng He, Jianfeng Qian, and Jianfeng Zhang