I want to convert every non-float column in my DataFrame to float. Is there a way to do this in one step? Here are the column dtypes:
longitude - float64
latitude - float64
housing_median_age - float64
total_rooms - float64
total_bedrooms - object
population - float64
households - float64
median_income - float64
rooms_per_household - float64
category_<1H OCEAN - uint8
category_INLAND - uint8
category_ISLAND - uint8
category_NEAR BAY - uint8
category_NEAR OCEAN - uint8
Here is my code snippet:
    import pandas as pd
    import numpy as np
    from sklearn.model_selection import KFold

    df = pd.DataFrame(housing)
    df['ocean_proximity'] = pd.Categorical(df['ocean_proximity'])  # type conversion
    dfDummies = pd.get_dummies(df['ocean_proximity'], prefix='category')
    df = pd.concat([df, dfDummies], axis=1)
    print(df.head())

    housingdata = df
    hf = housingdata.drop(['median_house_value', 'ocean_proximity'], axis=1)
    hl = housingdata[['median_house_value']]
    # mean must be called; passing the bound method hf.mean to fillna is an error
    hf = hf.fillna(hf.mean(numeric_only=True))
    hl = hl.fillna(hl.mean())
Answer:
If you don't need specific control over downcasting or error handling, a quick and easy approach is df = df.astype(float).
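As a rough illustration of that trade-off, the sketch below uses made-up frames (the column names a, b, c are hypothetical, not from the question). It shows astype(float) converting numeric strings and integers in one call, and failing outright when a value cannot be parsed.

```python
import pandas as pd

# Hypothetical data: numeric strings, ints and floats.
df = pd.DataFrame({"a": ["1.5", "2"], "b": [1, 2], "c": [0.5, 1.5]})
converted = df.astype(float)
print(converted.dtypes)  # every column should now be float64

# A single unparseable string makes the whole conversion fail.
bad = pd.DataFrame({"a": ["1.5", "n/a"]})
try:
    bad.astype(float)
    raised = False
except ValueError as exc:
    raised = True
    print("astype failed:", exc)
```

If the data may contain such values, the pd.to_numeric approach with errors='coerce' is likely the safer choice.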
For more control, use pd.DataFrame.select_dtypes to select columns by dtype, then apply pd.to_numeric to that subset of columns.
Setup
    df = pd.DataFrame([['656', 341.341, 4535], ['545', 4325.132, 562]],
                      columns=['col1', 'col2', 'col3'])

    print(df.dtypes)
    col1     object
    col2    float64
    col3      int64
    dtype: object
Solution
    cols = df.select_dtypes(exclude=['float']).columns
    df[cols] = df[cols].apply(pd.to_numeric, downcast='float', errors='coerce')
Result
    print(df.dtypes)
    col1    float32
    col2    float64
    col3    float32
    dtype: object

    print(df)
        col1      col2    col3
    0  656.0   341.341  4535.0
    1  545.0  4325.132   562.0
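Note that downcast='float' leaves the converted columns as float32 while the untouched column stays float64. If a uniform float64 result is preferred, one possible variation (a sketch, not part of the original answer) is to drop downcast and cast explicitly. The 'n/a' entry below is an invented value, added only to show what errors='coerce' does with unparseable input.

```python
import numpy as np
import pandas as pd

# Same shape as the setup above, with one deliberately unparseable string.
df = pd.DataFrame(
    [["656", 341.341, 4535], ["n/a", 4325.132, 562]],
    columns=["col1", "col2", "col3"],
)

# Select every column that is not already a float.
cols = df.select_dtypes(exclude=["float"]).columns

# Coerce unparseable values to NaN, then force float64 instead of downcasting.
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce").astype("float64")

print(df.dtypes)
print(df)
```

Coerced values become NaN, so a fillna step such as the one in the question would normally come after this conversion.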