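I'm building a model of the probability of success for NFL draft prospects, and I'm stuck: I can't find a way to print each player's name alongside the corresponding model output. For example, the output currently looks like "[79 22 36 72 20 48 2 68 16 36 11 68 68 16 22 17 60 62 15 17 11 68 0 8428 22 45 48 79 84 2 37 68]", and I'd like to print the players associated with those outputs as well. I'm using template code I found online to build the type of model I want. I'll post the code below.
Data link: https://docs.google.com/spreadsheets/d/1BQa34rfq7oC3jOO65c4xUqKTuhDGKf46pPwGmjSS3ko/edit?usp=sharing
During training, the "Player" column doesn't really matter, since this is historical draft data going back to 2004. But in the final output, when I ask the model to predict this year's prospects, I obviously need the names.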
    import pandas as pd
    import xgboost
    from sklearn import model_selection
    from sklearn.metrics import accuracy_score
    from sklearn.preprocessing import LabelEncoder

    # load data (index_col=0 uses the first column as the row index)
    data = pd.read_csv(r"C:\Users\yanke\Documents\NFLDraft\QBDataSet.csv", index_col=0)
    dataset = data

    # split data into X and y
    X = dataset.iloc[:, 0:4]
    Y = dataset.iloc[:, 4]

    # encode string class values as integers
    label_encoder = LabelEncoder()
    label_encoder = label_encoder.fit(Y)
    label_encoded_y = label_encoder.transform(Y)

    seed = 7
    test_size = 0.33
    X_train, X_test, y_train, y_test = model_selection.train_test_split(
        X, label_encoded_y, test_size=test_size, random_state=seed
    )

    # fit model on training data
    model = xgboost.XGBClassifier()
    model.fit(X_train, y_train)
    print(model)

    # make predictions for test data
    y_pred = model.predict(X_test)
    predictions = [round(value) for value in y_pred]

    # evaluate predictions
    accuracy = accuracy_score(y_test, predictions)
    print("Accuracy: %.2f%%" % (accuracy * 100.0))
    print(y_pred)
Answer:
Would this work?
    for player, prediction in zip(X_test.index, predictions):
        print(player, prediction)
Output:
    Colin Kaepernick 3
    Jeff Driskel 2
    Dwayne Haskins 1
    Colt McCoy 1
    Ryan Lindley 2
    Jameis Winston 2
    Sam Darnold 1
    Sam Bradford 1
    Troy Smith 1
    Johnny Manziel 1
    Matthew Stafford 3
    Kyler Murray 2
    Daniel Jones 2
    Gardner Minshew 1
    Joe Webb 2
    Curtis Painter 1
    Andrew Luck 1
    Josh Freeman 2
    Landry Jones 1
    Ryan Finley 1
    Deshaun Watson 1
    Marcus Mariota 1
    Dan Orlovsky 1
    Russell Wilson 2
    Nathan Peterman 1
    Kyle Orton 2
    Paxton Lynch 2
    Alex Smith 1
    Brodie Croyle 1
    Vince Young 2
    Brandon Weeden 1
    Teddy Bridgewater 1
    Brett Hundley 1
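The loop works because `train_test_split` keeps the row index of `X_test`, so the player names stay lined up with the predictions. If a table is more convenient than printed lines, something like the following might also work. This is only a minimal sketch: the small `X_test` frame, the placeholder names ("Player A" and so on), the feature columns, and the `y_pred` values are all made-up stand-ins for the real split and for `model.predict(X_test)`.

```python
import pandas as pd

# Hypothetical stand-in for X_test: player names are the index,
# as they would be after read_csv(..., index_col=0) and train_test_split.
X_test = pd.DataFrame(
    {"feature_a": [1.0, 2.0, 3.0], "feature_b": [0.5, 0.1, 0.9]},
    index=pd.Index(["Player A", "Player B", "Player C"], name="Player"),
)

# Hypothetical stand-in for model.predict(X_test); one value per row of X_test.
y_pred = [3, 1, 2]

# Reuse the index of X_test so each prediction is labelled with its player.
results = pd.DataFrame({"prediction": y_pred}, index=X_test.index)
print(results)

# Optionally sort so the highest predicted class comes first.
results_sorted = results.sort_values("prediction", ascending=False)
print(results_sorted)
```

Because the target was encoded with `LabelEncoder`, the predicted integers are encoded class ids; `label_encoder.inverse_transform(y_pred)` should map them back to the original labels if those are what you want to display.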