How to correctly scale and predict a single sample

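I trained and tested a logistic regression model on standardized data (StandardScaler()) from a breast cancer dataset (5 features + 1 diagnosis column). I load the model with pickle: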

log = pickle.load(open('./log.pkl', 'rb'))

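I want to predict whether a new sample belongs to class 0 (benign) or class 1 (malignant).

The test data below belongs to class 1 (I tried several class 1 samples, and all of them were classified as class 0):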

radius = 11.41
texture = 10.82
perimeter = 73.34
area = 403.3
smoothness = 0.09373

To build the sample and get a prediction, I tried the following:

temp = [radius, texture, perimeter, area, smoothness]
temp = np.array(temp).reshape((len(temp), 1))
scaler = StandardScaler()
temp = scaler.fit_transform(temp)
# print(log.predict(temp))  # Result: ValueError: X has 1 features per sample; expecting 5
print(log.predict(temp.T))  # Result: [0], which is wrong
# print(log.predict_proba(temp))  # Result: ValueError: X has 1 features per sample; expecting 5
print(log.predict_proba(temp.T))  # Result: [[9.99999972e-01 2.78352951e-08]], seems wrong

I also tried the following:

new_sample = np.array([radius, texture, perimeter, area, smoothness])
# scaled_sample = scaler.fit_transform(new_sample.reshape(1, -1))  # Resulting array: array([[0., 0., 0., 0., 0.]])
# scaled_sample = scaler.fit_transform(new_sample.reshape(1, -1).T)  # Same as below
scaled_sample = scaler.fit_transform(new_sample[:, np.newaxis])
print(log.predict(scaled_sample.T))  # Result: [0], which is wrong
print(log.predict_proba(scaled_sample.T))  # Result: [[9.99999972e-01 2.78352951e-08]], differs from the predict_proba above, seems wrong

What is the correct way to make a prediction like this?

Thanks,

Best wishes, Birgitte

Answer:

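According to the scikit-learn documentation for predict, the input must be a 2D array of shape (n_samples, n_features), so your code can be much simpler:

temp = np.array([[radius, texture, perimeter, area, smoothness]])  # double brackets give shape (1, 5)
# scaler must be the StandardScaler that was fitted on the training data;
# calling fit_transform on this single row would turn it into all zeros
print(log.predict(scaler.transform(temp)))

This is the correct way to call it. The scaler has to be the one fitted on the training data, so save it alongside the model instead of creating a new StandardScaler() at prediction time. Note also that predict only returns the class label; it says nothing about how well the model fits.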
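The workflow can be sketched end to end as follows. This is a minimal illustration, not the asker's actual setup: it assumes scikit-learn's bundled breast cancer data (first five columns) as a stand-in for the original dataset, flips the labels so that 1 means malignant as in the question, and uses the hypothetical names `scaler`, `model`, and `bundle`. It shows that refitting a scaler on one row yields zeros, while the scaler fitted on the training data gives a usable input.

```python
import pickle

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled dataset stands in for the asker's data.
# Its first five columns are mean radius, texture, perimeter, area, smoothness.
data = load_breast_cancer()
X = data.data[:, :5]
y = 1 - data.target  # flip so that 0 = benign, 1 = malignant, as in the question

X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Fit the scaler on the training data only, then train on the scaled features.
scaler = StandardScaler().fit(X_train)
model = LogisticRegression().fit(scaler.transform(X_train), y_train)

# Persist the scaler together with the model (in memory here; a file works the same way).
bundle = pickle.loads(pickle.dumps({"scaler": scaler, "model": model}))

# One sample must be a 2D array of shape (1, n_features).
sample = np.array([[11.41, 10.82, 73.34, 403.3, 0.09373]])

# Refitting on a single row subtracts the row from itself: every feature becomes 0.
refit = StandardScaler().fit_transform(sample)
print(refit)

# Reusing the training-time scaler keeps the sample on the scale the model saw.
scaled = bundle["scaler"].transform(sample)
pred = bundle["model"].predict(scaled)
proba = bundle["model"].predict_proba(scaled)
print(pred, proba)
```

The other attempt in the question, fitting on the (5, 1) column vector and transposing, standardizes the five features against each other rather than against the training distribution, which is why the output looks confident but is meaningless. Wrapping both steps with `sklearn.pipeline.make_pipeline(StandardScaler(), LogisticRegression())` and pickling the pipeline is one way to avoid losing the scaler.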
