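I'm experimenting with Mahout. I built everything and started looking at the examples. I'm mainly interested in collaborative filtering, so I began with the recommendation example for the BookCrossing dataset. I got everything running, and the sample completes without errors. However, the output looks like this: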
INFO: Creating FileDataModel for file /tmp/taste.bookcrossing.
INFO: Reading file info...
INFO: Read lines: 433647
INFO: Processed 10000 users
INFO: Processed 20000 users
INFO: Processed 30000 users
INFO: Processed 40000 users
INFO: Processed 50000 users
INFO: Processed 60000 users
INFO: Processed 70000 users
INFO: Processed 77799 users
INFO: Beginning evaluation using 0.9 of BookCrossingDataModel
INFO: Processed 10000 users
INFO: Processed 20000 users
INFO: Processed 22090 users
INFO: Beginning evaluation of 4245 users
INFO: Starting timing of 4245 tasks in 2 threads
INFO: Average time per recommendation: 296ms
INFO: Approximate memory used: 115MB / 167MB
INFO: Unable to recommend in 1 cases
INFO: Average time per recommendation: 67ms
INFO: Approximate memory used: 107MB / 167MB
INFO: Unable to recommend in 2363 cases
INFO: Average time per recommendation: 72ms
INFO: Approximate memory used: 146MB / 167MB
INFO: Unable to recommend in 5095 cases
INFO: Average time per recommendation: 71ms
INFO: Approximate memory used: 113MB / 167MB
INFO: Unable to recommend in 7596 cases
INFO: Average time per recommendation: 71ms
INFO: Approximate memory used: 130MB / 167MB
INFO: Unable to recommend in 10896 cases
INFO: Evaluation result: 1.0895580110095793
When I look at the code, I can see that it does the following:
    RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
    File ratingsFile = TasteOptionParser.getRatings(args);
    DataModel model = ratingsFile == null
        ? new BookCrossingDataModel(true)
        : new BookCrossingDataModel(ratingsFile, true);
    IRStatistics evaluation = evaluator.evaluate(
        new BookCrossingBooleanRecommenderBuilder(),
        new BookCrossingDataModelBuilder(),
        model,
        null,
        3,
        Double.NEGATIVE_INFINITY,
        1.0);
    log.info(String.valueOf(evaluation));
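This looks correct, but I would like to see more detail about the generated recommendations and/or the similarities. The returned object is of type IRStatistics, which only exposes a few summary statistics about the results. Should I be looking somewhere else? Isn't this recommender meant for getting actual recommendations?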
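Answer:
You aren't actually generating recommendations here; you are only running an evaluation.
This example (link) from the book Mahout in Action should give you an idea of how to actually get recommendations.
That example requests recommendations for a single user only. In your case, you would iterate over all users, fetch the recommendations for each one, and then decide what to do with them, for instance writing them to a file.
Also, that example does not use a data model builder or a recommender builder, but you should have no trouble working out how to use them from the method signatures.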
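To make the loop over all users concrete, here is a minimal sketch rather than the code from the book or from the BookCrossing example. It assumes a plain comma-separated file of userID,itemID lines loaded with FileDataModel. The class name RecommendForAllUsers, the helper userBasedBuilder(), the log-likelihood similarity and the neighborhood size of 10 are illustrative choices of mine, not necessarily what BookCrossingBooleanRecommenderBuilder does internally. You should be able to pass your own builder and model to printRecommendations instead.

```java
import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class RecommendForAllUsers {

  // Hypothetical helper: a user-based recommender for boolean (no rating) data.
  // Similarity measure and neighborhood size are assumptions for illustration.
  static RecommenderBuilder userBasedBuilder() {
    return dataModel -> {
      UserSimilarity similarity = new LogLikelihoodSimilarity(dataModel);
      UserNeighborhood neighborhood =
          new NearestNUserNeighborhood(10, similarity, dataModel);
      return new GenericBooleanPrefUserBasedRecommender(dataModel, neighborhood, similarity);
    };
  }

  // Prints up to howMany recommendations per user as "userID <tab> itemID <tab> value"
  // and returns how many users received at least one recommendation.
  static int printRecommendations(RecommenderBuilder builder, DataModel model, int howMany)
      throws TasteException {
    Recommender recommender = builder.buildRecommender(model);
    int usersWithRecommendations = 0;
    LongPrimitiveIterator userIDs = model.getUserIDs();
    while (userIDs.hasNext()) {
      long userID = userIDs.nextLong();
      List<RecommendedItem> items = recommender.recommend(userID, howMany);
      if (!items.isEmpty()) {
        usersWithRecommendations++;
      }
      for (RecommendedItem item : items) {
        System.out.println(userID + "\t" + item.getItemID() + "\t" + item.getValue());
      }
    }
    return usersWithRecommendations;
  }

  public static void main(String[] args) throws Exception {
    // args[0]: path to the ratings file (one "userID,itemID" pair per line)
    DataModel model = new FileDataModel(new File(args[0]));
    printRecommendations(userBasedBuilder(), model, 3);
  }
}
```

Users for whom the recommender finds nothing simply produce no output lines, which is most likely what the "Unable to recommend" counts in your log are reporting. Redirect standard output to a file if you want to keep the results.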