How to create an R-style predictive model with multiple categorical variables in Python

Does anyone know how to build a predictive model with an ensemble classifier using an R-style formula, like this?

ded.fit(formula="X ~ Y + Z**2", data=fed)

My current code looks like this:

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=100, min_samples_leaf=10, random_state=1)
model.fit(x_train, y_train)

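You may ask why I need this.

I need to add more variables: not only X and Y, but also Z, P, Q and R.
As in R, I want to experiment and see whether applying an exponent, a multiplication or a division to a variable increases or decreases prediction accuracy, as in the formulas below:

"X ~ Y + Z^2" or "X ~ Y + Z + (P*2) + Q**2"

Any answer would be much appreciated. Thanks in advance.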

Answer:

Something like the following should work:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from pandas_ml import ConfusionMatrix  # third-party package, installed separately

X = pd.DataFrame(np.random.randint(0, 100, size=(100, 2)), columns=list('XZ'))
y = np.random.randint(2, size=100)  # labels for binary classification

X['Z2'] = X.Z**2  # add more features

print(X.head())  # note the added feature Z^2 (the data is random, so values vary between runs)
#     X   Z    Z2
# 0  88  90  8100
# 1  49  63  3969
# 2  27  23   529
# 3  47  71  5041
# 4  21  98  9604

train_samples = 80  # number of samples used to train the model
X_train = X[:train_samples]
X_test = X[train_samples:]
y_train = y[:train_samples]
y_test = y[train_samples:]

model = RandomForestClassifier(n_estimators=100, min_samples_leaf=10, random_state=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# print(confusion_matrix(y_test, y_pred))  # alternative: sklearn.metrics.confusion_matrix
cm = ConfusionMatrix(y_test, y_pred)
print(cm)
# Predicted  0   1  __all__
# Actual
# 0          3   4        7
# 1          4   9       13
# __all__    7  13       20

cm.plot()
plt.show()
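To get closer to the formula-style workflow asked about, here is a minimal sketch that turns a string such as "X ~ Y + Z**2" into a feature table using pandas' `DataFrame.eval` and scores it with cross-validation. The helper names `design_from_formula` and `score_formula`, the synthetic `fed` data and the rule used to generate the target are all assumptions made for illustration. The helper is not a real R formula parser: it only splits on top-level `+`, treats `*` and `/` as plain arithmetic, and expects powers written as `**` rather than `^`. A dedicated formula library such as patsy would be the more complete route.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def design_from_formula(formula, data):
    """Hypothetical helper: split 'target ~ term + term' into features and target.

    Each right-hand term is evaluated as column arithmetic with DataFrame.eval,
    so 'Z**2' or '(P*2)' become derived columns. Terms containing a '+' inside
    parentheses are not supported by this naive split.
    """
    target, rhs = (part.strip() for part in formula.split("~"))
    terms = [term.strip() for term in rhs.split("+")]
    features = pd.DataFrame({term: data.eval(term) for term in terms})
    return features, data[target]


def score_formula(formula, data, folds=5):
    """Mean cross-validated accuracy of a random forest for one formula."""
    features, target = design_from_formula(formula, data)
    model = RandomForestClassifier(n_estimators=100, min_samples_leaf=10, random_state=1)
    scores = cross_val_score(model, features, target, cv=folds, scoring="accuracy")
    return scores.mean()


# Synthetic data (an assumption): five integer predictors and a binary target X
# that depends on the product of Y and Z.
rng = np.random.default_rng(0)
fed = pd.DataFrame(rng.integers(0, 100, size=(200, 5)), columns=list("YZPQR"))
fed["X"] = (fed["Y"] * fed["Z"] > 2500).astype(int)

# Compare a few candidate formulas by cross-validated accuracy.
for formula in ["X ~ Y + Z", "X ~ Y + Z**2", "X ~ Y + Z + (P*2) + Q**2", "X ~ Y*Z + P"]:
    print(formula, round(score_formula(formula, fed), 3))
```

One thing to keep in mind when experimenting: tree-based models split on the ordering of each feature, so a monotonic transform of a single non-negative column (such as Z**2 or P*2) usually leaves a random forest's accuracy largely unchanged. Terms that combine columns, such as a product or ratio of two variables, are more likely to make a difference.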
