在数据集中跨多个日期解决相同的优化函数

我正在处理一个包含大约10万行每日篮球统计数据的数据集。我的项目目标是根据相同的约束条件,汇总每天的前9名得分手。当我在数据集的精简版本(仅限于一个日期)上运行以下代码时,我能够编译出最佳组合。

单一日期数据集

from pulp import *import numpy as npimport pandas as pddateparse = lambda dates: pd.datetime.strptime(dates, '%m/%d/%Y')players = pd.read_csv("Date.csv", parse_dates=['DATE'], index_col='DATE',date_parser=dateparse)players["PG"] = (players["POSITION"] == 'PG').astype(float)players["SG"] = (players["POSITION"] == 'SG').astype(float)players["SF"] = (players["POSITION"] == 'SF').astype(float)players["PF"] = (players["POSITION"] == 'PF').astype(float)players["C"] = (players["POSITION"] == 'C').astype(float)players["SALARY"] = players["SALARY"].astype(float)model = LpProblem("problem", LpMaximize)total_points = {}cost = {}PG = {}SG = {}SF = {}PF = {}C = {}number_of_players = {}for i, player in players.iterrows():    var_name = 'x' + str(i) # Create variable name    decision_var = pulp.LpVariable(var_name, cat='Binary') # Initialize Variables    total_points[decision_var] = player["POINTS"] # Create PPG Dictionary    cost[decision_var] = player["SALARY"] # Create Cost Dictionary        # Create Dictionary for Player Types    PG[decision_var] = player["PG"]    SG[decision_var] = player["SG"]    SF[decision_var] = player["SF"]    PF[decision_var] = player["PF"]    C[decision_var] = player["C"]    number_of_players[decision_var] = 1.0objective_function = pulp.LpAffineExpression(total_points)model += objective_function#Define cost constraint and add it to the modeltotal_cost = pulp.LpAffineExpression(cost)model += (total_cost <= 60000)PG_constraint = pulp.LpAffineExpression(PG)SG_constraint = pulp.LpAffineExpression(SG)SF_constraint = pulp.LpAffineExpression(SF)PF_constraint = pulp.LpAffineExpression(PF)C_constraint = pulp.LpAffineExpression(C)total_players = pulp.LpAffineExpression(number_of_players)model += (PG_constraint <= 2)model += (SG_constraint <= 2)model += (SF_constraint <= 2)model += (PF_constraint <= 2)model += (C_constraint <= 1)model += (total_players <= 9)model.solve()

当我的数据集只包含一个日期时,这段代码能够正常工作并解决问题。我希望能够将所有相关日期添加到范围内,并让优化器遍历每个日期,找出每一天的最佳组合。

我尝试创建一个循环,为每一天生成约束条件,但这导致了错误

from datetime import datetimestart_date = "2021-01-25"stop_date = "2021-01-27"start = datetime.strptime(start_date, "%Y-%m-%d")stop = datetime.strptime(stop_date, "%Y-%m-%d")from datetime import timedeltawhile start < stop:    objective_function = pulp.LpAffineExpression(total_points)    model += objective_function#Define cost constraint and add it to the model    total_cost = pulp.LpAffineExpression(cost)    model += (total_cost <= 60000)

是否有可能在一个数据集中为多个日期分配相同的目标和约束?这样做的结果将是为数据集中的每个日期找到9名最佳球员。


回答:

只需创建你的日期列表,然后遍历这些日期,并将按日期过滤后的数据框输入到你的求解器中:

from pulp import *import numpy as npimport pandas as pddateparse = lambda dates: pd.datetime.strptime(dates, '%m/%d/%Y')players = pd.read_csv("Date.csv", parse_dates=['DATE'], index_col='DATE',date_parser=dateparse)players["PG"] = (players["POSITION"] == 'PG').astype(float)players["SG"] = (players["POSITION"] == 'SG').astype(float)players["SF"] = (players["POSITION"] == 'SF').astype(float)players["PF"] = (players["POSITION"] == 'PF').astype(float)players["C"] = (players["POSITION"] == 'C').astype(float)players["SALARY"] = players["SALARY"].astype(float)players = players.reset_index()date_list = list(set(players['DATE'])) #<-- create the list of dates in the dataframedate_list.sort()  #<-- sort the datesfor var_date in date_list:    rows = []    model = LpProblem("problem", LpMaximize)    filter_players = players[players['DATE'] == var_date].reset_index(drop=True) #<-- filter by date    total_points = {}    cost = {}    PG = {}    SG = {}    SF = {}    PF = {}    C = {}    number_of_players = {}        for i, player in filter_players.iterrows():  #<--then run on that filtered dataframe        var_name = 'x' + str(i) # Create variable name        decision_var = pulp.LpVariable(var_name, cat='Binary') # Initialize Variables            total_points[decision_var] = player["POINTS"] # Create PPG Dictionary        cost[decision_var] = player["SALARY"] # Create Cost Dictionary                # Create Dictionary for Player Types        PG[decision_var] = player["PG"]        SG[decision_var] = player["SG"]        SF[decision_var] = player["SF"]        PF[decision_var] = player["PF"]        C[decision_var] = player["C"]        number_of_players[decision_var] = 1.0        objective_function = pulp.LpAffineExpression(total_points)    model += objective_function    #Define cost constraint and add it to the model    total_cost = pulp.LpAffineExpression(cost)    model += (total_cost <= 60000)        PG_constraint = pulp.LpAffineExpression(PG)    SG_constraint = pulp.LpAffineExpression(SG)    SF_constraint = pulp.LpAffineExpression(SF)    PF_constraint = pulp.LpAffineExpression(PF)    C_constraint = pulp.LpAffineExpression(C)    total_players = pulp.LpAffineExpression(number_of_players)        model += (PG_constraint <= 2)    model += (SG_constraint <= 2)    model += (SF_constraint <= 2)    model += (PF_constraint <= 2)    model += (C_constraint <= 1)    model += (total_players <= 9)        model.solve()        row = {}    for v in model.variables():        if v.varValue:            idx = int(v.name.replace('x',''))            player_name = filter_players.iloc[idx]['PLAYER']            position = filter_players.iloc[idx]['POSITION']            salary = filter_players.iloc[idx]['SALARY']            points = filter_players.iloc[idx]['POINTS']            row = {'DATE':var_date, 'PLAYER':player_name, 'POSITION':position, 'POINTS':points, 'SALARY':salary}            rows.append(row)                lineup = pd.DataFrame(rows)    lineup.loc[9,'PLAYER'] = ''    lineup.loc[9,'POSITION'] = 'Total'    lineup.loc[9,'POINTS'] = lineup['POINTS'].sum(axis=0)    lineup.loc[9,'SALARY'] = lineup['SALARY'].sum(axis=0)       print ('Lineup for: %s' %var_date.strftime('%m/%d/%Y'))       print (lineup.iloc[:,1:])    print ('\n\n')

Related Posts

使用LSTM在Python中预测未来值

这段代码可以预测指定股票的当前日期之前的值,但不能预测…

如何在gensim的word2vec模型中查找双词组的相似性

我有一个word2vec模型,假设我使用的是googl…

dask_xgboost.predict 可以工作但无法显示 – 数据必须是一维的

我试图使用 XGBoost 创建模型。 看起来我成功地…

ML Tuning – Cross Validation in Spark

我在https://spark.apache.org/…

如何在React JS中使用fetch从REST API获取预测

我正在开发一个应用程序,其中Flask REST AP…

如何分析ML.NET中多类分类预测得分数组?

我在ML.NET中创建了一个多类分类项目。该项目可以对…

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注