I looked at the following official statsmodels documentation:
However, when I try to run this code on a practice dataset (with statsmodels.api already imported as sm):
    variance_inflation_factor = sm.stats.outliers_influence.variance_inflation_factor()
    vif = pd.DataFrame()
    vif['VIF'] = [variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])]
    vif['Predictors'] = X_train.columns
I get the error message: module 'statsmodels.stats.api' has no attribute 'outliers_influence'
Can anyone tell me the correct way to use this?
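Answer:
The line

    variance_inflation_factor = sm.stats.outliers_influence.variance_inflation_factor()

is not needed: a function is not defined by calling it with no arguments. Instead, import variance_inflation_factor directly from statsmodels.stats.outliers_influence. It is a function that takes two inputs: the array of predictors (exog) and the index of the column to evaluate (exog_idx).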
    import pandas as pd
    import numpy as np
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # Stand-in for the real training data: 1000 rows, 5 independent predictors
    X_train = pd.DataFrame(np.random.standard_normal((1000, 5)), columns=[f"x{i}" for i in range(5)])

    # Compute one VIF per column by passing the full array and the column index
    vif = pd.DataFrame()
    vif['VIF'] = [variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])]
    vif['Predictors'] = X_train.columns
    print(vif)
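This produces output like the following (the exact values vary from run to run because the data is random):

            VIF Predictors
    0  1.002882         x0
    1  1.004265         x1
    2  1.001945         x2
    3  1.004227         x3
    4  1.003989         x4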