categorical_features in OneHotEncoder is deprecated: how to transform a specific column

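I need to convert independent fields from strings to a numeric representation, and I am using OneHotEncoder for the conversion. My dataset has many independent columns, some of which look like this: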

Country     |    Age
--------------------------
Germany     |    23
Spain       |    25
Germany     |    24
Italy       |    30

I need to encode the Country column like this:

0     |    1     |     2     |     3
--------------------------------------
1     |    0     |     0     |     23
0     |    1     |     0     |     25
1     |    0     |     0     |     24
0     |    0     |     1     |     30

I managed to get the desired transformation with OneHotEncoder, as follows:

# Encoding the categorical data
from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
# we are dummy encoding as the machine learning algorithms will be
# confused with the values like Spain > Germany > France
from sklearn.preprocessing import OneHotEncoder
onehotencoder = OneHotEncoder(categorical_features=[0])
X = onehotencoder.fit_transform(X).toarray()

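Now I get a deprecation warning telling me to use categories='auto'. If I do that, the transformation is applied to all independent columns, such as country, age, salary, and so on.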

How can I apply the transformation to column 0 of the dataset only?

Answer:

There are actually two warnings:

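FutureWarning: The handling of integer data will change in version 0.22. Currently, the categories are determined based on the range [0, max(values)], while in the future they will be determined based on the unique values. If you want the future behaviour and silence this warning, you can specify "categories='auto'". In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.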

The second warning is:

DeprecationWarning: The 'categorical_features' keyword is deprecated in version 0.20 and will be removed in 0.22. You can use the ColumnTransformer instead.

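In the future, you should not define the columns directly in OneHotEncoder, unless you want to use "categories='auto'". The first message also tells you to use OneHotEncoder directly, without running LabelEncoder first. Finally, the second message tells you to use ColumnTransformer, which works like a pipeline for column transformations.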
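The point about skipping LabelEncoder can be checked with a minimal sketch. It assumes scikit-learn 0.20 or later, where OneHotEncoder accepts string categories directly; the `countries` array is just the Country column from the question, written out by hand.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Country column from the question, as a 2-D array of strings (one column)
countries = np.array([["Germany"], ["Spain"], ["Germany"], ["Italy"]])

# No LabelEncoder step: the encoder learns the string categories itself
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(countries).toarray()  # sparse -> dense

# Learned categories are sorted, so the column order is alphabetical
categories = encoder.categories_[0].tolist()
print(categories)
print(one_hot)
```

Because the categories are sorted, the one-hot columns come out in alphabetical order (Germany, Italy, Spain), which may differ from the column order sketched in the question.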

Here is the equivalent code for your case:

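from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# The last element of the tuple ([0]) is the list of columns you want to transform in this step
ct = ColumnTransformer([("Name_Of_Your_Step", OneHotEncoder(), [0])], remainder="passthrough")
ct.fit_transform(X)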

See also: the ColumnTransformer documentation
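To see what this produces on the data from the question, here is a small self-contained sketch. The array `X` is rebuilt by hand from the question's table, and `sparse_threshold=0` is added here (it is not in the answer) so that the result is always a dense array.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Sample rows from the question: column 0 is Country, column 1 is Age
X = np.array(
    [["Germany", 23], ["Spain", 25], ["Germany", 24], ["Italy", 30]],
    dtype=object,
)

ct = ColumnTransformer(
    [("country", OneHotEncoder(), [0])],  # one-hot encode column 0 only
    remainder="passthrough",              # keep Age (and any other columns) unchanged
    sparse_threshold=0,                   # assumption: force a dense result
)

# One-hot columns come first, passed-through columns are appended after them
X_encoded = ct.fit_transform(X).astype(float)
print(X_encoded)
```

Only column 0 is expanded into indicator columns; Age is appended unchanged as the last column, which matches the layout the question asks for (up to the alphabetical ordering of the country columns).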

For the above example:

Encoding the categorical data (basically converting text, i.e. the country names, into numerical data)

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Encode the Country column
# (label-encoding first is optional from scikit-learn 0.20 on, since OneHotEncoder accepts strings)
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
ct = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder='passthrough')
X = ct.fit_transform(X)
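If the data is held in a pandas DataFrame rather than a NumPy array (an assumption, since the thread only shows array indexing), ColumnTransformer can also select the column by name. The sketch below is illustrative only: the Salary values are made up, and `sparse_threshold=0` is added to force dense output.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "Country": ["Germany", "Spain", "Germany", "Italy"],
    "Age": [23, 25, 24, 30],
    "Salary": [50000, 42000, 61000, 48000],  # hypothetical values for illustration
})

ct = ColumnTransformer(
    [("country", OneHotEncoder(), ["Country"])],  # select the column by name
    remainder="passthrough",                      # Age and Salary pass through unchanged
    sparse_threshold=0,                           # assumption: force a dense result
)

encoded = ct.fit_transform(df)
print(encoded)
```

Selecting by name avoids having to track positional indices when the dataset has many independent columns.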
