My training data looks like this (image: training data).
To extract the categorical features from it, I ran the following code:
categorial=[c for c in train.columns if train.columns[c].dtype in ['object'] ]
But I got the following error:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-31-31eb7ac47e21> in <module>
----> 1 categorial=[c for c in train.columns if train.columns[c].dtype in ['object'] ]

<ipython-input-31-31eb7ac47e21> in <listcomp>(.0)
----> 1 categorial=[c for c in train.columns if train.columns[c].dtype in ['object'] ]

/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in __getitem__(self, key)
   4295         if is_scalar(key):
   4296             key = com.cast_scalar_indexer(key, warn_float=True)
-> 4297             return getitem(key)
   4298
   4299         if isinstance(key, slice):

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
What is a possible solution?
Answer:
Use the following code to select the columns whose dtype is 'object':
categorical = train.select_dtypes('object')
If you only want the column names:
categorical_cols = train.select_dtypes('object').columns.tolist()
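As for why the original list comprehension fails: train.columns is an Index holding the column labels, so indexing it with a label string such as train.columns[c] raises the IndexError shown in the traceback. The column itself, and therefore its dtype, is reached through train[c]. A minimal sketch of the corrected comprehension follows; the small DataFrame is a hypothetical stand-in for the asker's train, and its column names are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the asker's `train` DataFrame (names are illustrative).
train = pd.DataFrame({
    "age": [22, 35, 58],
    "city": pd.Series(["Paris", "Lima", "Oslo"], dtype=object),
    "income": [1800.0, 2500.5, 3100.0],
    "gender": pd.Series(["F", "M", "F"], dtype=object),
})

# train.columns yields the labels; train[c] is the column (a Series) with a dtype.
categorical_cols = [c for c in train.columns if train[c].dtype == "object"]

# The comprehension should agree with the select_dtypes approach.
same_as_select_dtypes = (
    categorical_cols == train.select_dtypes("object").columns.tolist()
)

print(categorical_cols)
print(same_as_select_dtypes)
```

Both approaches should give the same list of names; select_dtypes is usually the shorter choice, while the comprehension is convenient if you need extra per-column conditions.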