http://www.cnblogs.com/wentingtu/archive/2011/12/13/2286212.html
A practical random forest example:
from sklearn.ensemble import RandomForestClassifier
import numpy as np
import pandas as pd

# Load the training and test sets; pd.read_csv consumes the header row itself
dataset = pd.read_csv("/Users/ml_data/train.csv")
target = dataset.iloc[:, 0].values  # first column holds the labels
train = dataset.iloc[:, 1:].values  # remaining columns are the features
test = pd.read_csv("/Users/ml_data/test.csv").values

# Create and train the random forest
# On multi-core CPUs you can use: rf = RandomForestClassifier(n_estimators=100, n_jobs=2)
rf = RandomForestClassifier(n_estimators=100)
rf.fit(train, target)
pred = rf.predict(test)

# Write the predictions as "ImageId,Label" rows, with ids starting at 1
np.savetxt('submission_rand_forest.csv',
           np.c_[np.arange(1, len(test) + 1), pred],
           delimiter=',', header='ImageId,Label', comments='', fmt='%d')
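The example above trains on the full training set and writes predictions straight to a file, so it gives no estimate of how well the model performs. A minimal sketch of a hold-out check follows. It assumes synthetic data from `make_classification` as a stand-in for `train.csv`, and the split ratio, `random_state` values, and variable names are illustrative choices that do not come from the original post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for train.csv (assumption: not the original data)
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Hold back 20% of the labelled rows for validation
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Same estimator settings as in the example above, with a fixed seed
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Fraction of held-out rows that were classified correctly
val_pred = rf.predict(X_val)
val_accuracy = float(np.mean(val_pred == y_val))
print(f"hold-out accuracy: {val_accuracy:.3f}")
```

With real data, `X` and `y` would be the `train` and `target` arrays, and the forest could be refit on all rows before predicting on the test set.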
Original article: http://www.cnblogs.com/SpeakSoftlyLove/p/5256131.html