How do I get random forest feature importances with Spark 2.1.1 in Scala 2.10?

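I am trying to get the feature importances from a random forest regressor in Spark MLlib. The problem is that I train with a Pipeline object, and I don't know how to cast the fitted object to a RandomForestRegressionModel so that I can read the feature importances.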

The relevant part of my code is as follows:

val rf = new RandomForestRegressor()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setNumTrees(numTrees)
  .setFeatureSubsetStrategy(featureSubsetStrategy)
  .setImpurity(impurity)
  .setMaxDepth(maxDepth)
  .setMaxBins(maxBins)
  .setMaxMemoryInMB(maxMemoryInMB)

val pipeline = new Pipeline().setStages(Array(rf))
var model = pipeline.fit(trainingDataCached)

// Get the feature importances (this is the line that fails:
// `model` is a PipelineModel, which has no such member)
val featImp = model.featureImportance

What am I missing?

Thanks.

Edit

Could this be the right answer?

val featImp = model
  .asInstanceOf[RandomForestRegressionModel]
  .featureImportances

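Answer:

Could this be the right answer?

Almost. The fitted model is a PipelineModel, so take the random forest stage out of it first and cast that stage: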

model
  .stages
  .head
  .asInstanceOf[RandomForestRegressionModel]
  .featureImportances
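
For pipelines with more than one stage, `.head` may not be the random forest. A minimal sketch of a more defensive lookup is shown below; it pattern-matches on the stage type instead of casting. The object name `FeatureImportanceSketch`, the helper `rfImportances`, the toy data, and the local `SparkSession` settings are all assumptions made for illustration and are not part of the original question.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.regression.{RandomForestRegressionModel, RandomForestRegressor}
import org.apache.spark.sql.SparkSession

object FeatureImportanceSketch {

  // Hypothetical helper: returns the importances of the first
  // RandomForestRegressionModel stage in the fitted pipeline, if there is one.
  def rfImportances(model: PipelineModel): Option[Vector] =
    model.stages.collectFirst {
      case rf: RandomForestRegressionModel => rf.featureImportances
    }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rf-feature-importance")
      .getOrCreate()

    // Made-up toy data: (label, features) with three features per row.
    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.0, 2.0)),
      (2.0, Vectors.dense(1.0, 1.0, 0.0)),
      (3.0, Vectors.dense(2.0, 0.0, 1.0)),
      (4.0, Vectors.dense(3.0, 1.0, 2.0))
    )).toDF("label", "features")

    val rf = new RandomForestRegressor()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setNumTrees(10)

    val model = new Pipeline().setStages(Array(rf)).fit(training)

    // Print one importance value per feature index.
    rfImportances(model).foreach { importances =>
      importances.toArray.zipWithIndex.foreach { case (weight, index) =>
        println(s"feature $index: $weight")
      }
    }

    spark.stop()
  }
}
```

Returning an `Option` avoids the `ClassCastException` that `asInstanceOf` would throw if the selected stage turned out to be, say, a feature transformer.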
