My dataset looks like this (sample only):
DATE_REF,MONTH,YEAR,DAY_OF_YEAR,DAY_OF_MONTH,WEEK_DAY,WEEK_DAY_1,WEEK_DAY_2,WEEK_DAY_3,WEEK_DAY_4,WEEK_DAY_5,WEEK_DAY_6,WEEK_DAY_7,WEEK_NUMBER_IN_MONTH,WEEKEND,WORK_DAY,AMOUNT_SOLD
20100101,1,2010,1,1,6,0,0,0,0,0,1,0,1,0,0,0
20100102,1,2010,2,2,7,0,0,0,0,0,0,1,1,1,0,2
20100103,1,2010,3,3,1,1,0,0,0,0,0,0,2,1,0,0
20100104,1,2010,4,4,2,0,1,0,0,0,0,0,2,0,1,12830
20100105,1,2010,5,5,3,0,0,1,0,0,0,0,2,0,1,19200
20100106,1,2010,6,6,4,0,0,0,1,0,0,0,2,0,1,22930
20100107,1,2010,7,7,5,0,0,0,0,1,0,0,2,0,1,23495
20100108,1,2010,8,8,6,0,0,0,0,0,1,0,2,0,1,23215
20100109,1,2010,9,9,7,0,0,0,0,0,0,1,2,1,0,172
20100110,1,2010,10,10,1,1,0,0,0,0,0,0,3,1,0,0
20100111,1,2010,11,11,2,0,1,0,0,0,0,0,3,0,1,18815
20100112,1,2010,12,12,3,0,0,1,0,0,0,0,3,0,1,25415
20100113,1,2010,13,13,4,0,0,0,1,0,0,0,3,0,1,25262
20100114,1,2010,14,14,5,0,0,0,0,1,0,0,3,0,1,27967
20100115,1,2010,15,15,6,0,0,0,0,0,1,0,3,0,1,26352
20100116,1,2010,16,16,7,0,0,0,0,0,0,1,3,1,0,202
20100117,1,2010,17,17,1,1,0,0,0,0,0,0,4,1,0,10
20100118,1,2010,18,18,2,0,1,0,0,0,0,0,4,0,1,20295
20100119,1,2010,19,19,3,0,0,1,0,0,0,0,4,0,1,25982
20100120,1,2010,20,20,4,0,0,0,1,0,0,0,4,0,1,24745
20100121,1,2010,21,21,5,0,0,0,0,1,0,0,4,0,1,28087
20100122,1,2010,22,22,6,0,0,0,0,0,1,0,4,0,1,28417
20100123,1,2010,23,23,7,0,0,0,0,0,0,1,4,1,0,115
20100124,1,2010,24,24,1,1,0,0,0,0,0,0,5,1,0,5
20100125,1,2010,25,25,2,0,1,0,0,0,0,0,5,0,1,20185
20100126,1,2010,26,26,3,0,0,1,0,0,0,0,5,0,1,25932
20100127,1,2010,27,27,4,0,0,0,1,0,0,0,5,0,1,31710
20100128,1,2010,28,28,5,0,0,0,0,1,0,0,5,0,1,21020
20100129,1,2010,29,29,6,0,0,0,0,0,1,0,5,0,1,51460
20100130,1,2010,30,30,7,0,0,0,0,0,0,1,5,1,0,670
20100131,1,2010,31,31,1,1,0,0,0,0,0,0,6,1,0,17
On Azure ML, I am trying to predict AMOUNT_SOLD for new dates (DATE_REF) using the following experiment:
I then deployed the web service and tested the prediction, but every result in the AMOUNT_SOLD column is zero.
What might I be missing?
Answer:
I wanted to replicate your Azure ML experiment, but I don't have enough data. Here is what I did instead:
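I copied your sample data and quadrupled it (Add Rows x 2). I then split the data (70%/30%) with a random seed of 7 (to get reproducible results). Boosted Decision Tree Regression uses its default parameters. In Tune Model Hyperparameters, I selected AMOUNT_SOLD as the label column. Score Model and Evaluate Model follow.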
The accuracy (coefficient of determination) is very good.
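For reference, the coefficient of determination compares the model's squared error with the squared error of always predicting the mean. The sketch below is a plain-Python illustration of that formula, not the Evaluate Model implementation. The function name is hypothetical, and the predictions are made-up values used only to exercise the function.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Actual values are taken from the sample data (4-7 Jan 2010);
# the predictions are invented for illustration only.
actual = [12830, 19200, 22930, 23495]
made_up_predictions = [13000, 19000, 23000, 23400]
print(r_squared(actual, made_up_predictions))
```

A value close to 1 means the predictions track the actual values closely; a value of 0 means the model does no better than predicting the mean.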
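After that, to deploy this as a web service, you first have to set up a predictive experiment from your training experiment: Set Up Web Service > Predictive Experiment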
Your experiment will rearrange itself as if by magic.
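By default, the Web service input module is placed at the top of the experiment. I moved it and connected it to the right-hand input of Score Model, so that the parameters you enter for the web service are scored with your trained model.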
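After the Score Model module, I added a Select Columns in Dataset module and selected only the column named Scored Labels. This column contains the model's predictions. I then used an Edit Metadata module to rename the Scored Labels column before passing it to the Web service output module.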
Your experiment is now ready to be deployed as a web service.
To predict a new value, I tested the web service using the details of the current date as input. (Although the DATE_REF input should have been 20170818 😀)
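Filling in the calendar columns by hand for each new date is error-prone, so here is a minimal Python sketch that derives them from a date. The conventions are inferred from the sample rows rather than confirmed by the asker: WEEK_DAY runs from Sunday=1 to Saturday=7, weeks within a month start on Sunday, and WORK_DAY is 0 on weekends and holidays. The function name and the `holidays` parameter are hypothetical; the holiday calendar has to be supplied by the caller.

```python
from datetime import date


def calendar_features(d, holidays=frozenset()):
    """Build the calendar input columns for one date.

    Assumed conventions (inferred from the sample data):
    Sunday=1 ... Saturday=7, and weeks in a month start on Sunday.
    """
    week_day = d.isoweekday() % 7 + 1  # Sunday=1 ... Saturday=7
    # Position of the first day of the month, Sunday=0 ... Saturday=6.
    first_offset = d.replace(day=1).isoweekday() % 7
    weekend = 1 if week_day in (1, 7) else 0

    row = {
        "DATE_REF": int(d.strftime("%Y%m%d")),
        "MONTH": d.month,
        "YEAR": d.year,
        "DAY_OF_YEAR": d.timetuple().tm_yday,
        "DAY_OF_MONTH": d.day,
        "WEEK_DAY": week_day,
    }
    # One-hot encoding of the weekday: WEEK_DAY_1 ... WEEK_DAY_7.
    for i in range(1, 8):
        row[f"WEEK_DAY_{i}"] = 1 if week_day == i else 0
    row["WEEK_NUMBER_IN_MONTH"] = (d.day + first_offset - 1) // 7 + 1
    row["WEEKEND"] = weekend
    # Assumption: a work day is any non-weekend day that is not a holiday.
    row["WORK_DAY"] = 0 if weekend or d in holidays else 1
    return row


# The date used in the test above.
print(calendar_features(date(2017, 8, 18)))
```

Passing `holidays={date(2010, 1, 1)}` reproduces the first row of the sample data, where 1 January is a Friday with WORK_DAY equal to 0.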
The output then looks like this:
Your web service can now predict new values.
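Once deployed, the service can also be called from code instead of the test dialog. The sketch below builds a request-response call using only the Python standard library. The payload shape (`Inputs`, `ColumnNames`, `Values`, `GlobalParameters`) and the `input1` name follow the sample code that Azure ML Studio typically generates, as far as I recall; treat them as assumptions and check them against the API help page of your own service. The URL and API key are placeholders, and `build_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_request(url, api_key, row, input_name="input1"):
    """Build a POST request for a single input row (column name -> value)."""
    payload = {
        "Inputs": {
            input_name: {
                "ColumnNames": list(row.keys()),
                # One inner list per row; values are sent as strings.
                "Values": [[str(v) for v in row.values()]],
            }
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder endpoint and key: replace with the values from your service.
req = build_request(
    "https://example.invalid/your-service/execute",
    "YOUR_API_KEY",
    {"DATE_REF": 20170818, "MONTH": 8, "YEAR": 2017},  # remaining columns omitted
)

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The input row has to contain every column the Web service input expects, in the names shown on the service's API help page.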