机器学习面试编程测试

我正在解决一个小型编程测试,这是面试之前提交给我的测试。

我不得不删除关于问题的信息,你可以在下面的链接中轻松找到它。

所以我尝试了几种直观的方法,成功与否各有不同。在一些研究中,我在GIT上发现了一个例子(https://github.com/miracode/Machine-Works),其中使用了一些节点。我决定将其实现到我的脚本中进行测试。结果证明它的速度比我的快得多,但仍然无法处理整个输入集。这是一个25MB的txt文件,包含54个不同的案例,其中一些案例每个测试用例有10000多台机器。我在其他GIT解决方案中也发现了同样的解决方案(而且只有这一个)。

所以当我运行自己的脚本时,我能理解它会在完成大输入测试之前崩溃我的电脑,但从GIT中获取一个解决方案却无法计算测试输入,这真是令人惊讶。

我的电脑有16GB的RAM,我从未见过它像这样崩溃,即使处理更大的数据集时也是如此。

这是我对他们解决方案的实现的副本:

from load_input2 import load as loadimport time"""Third version of project """""" Implementing decision object, inspired from GIT-found script """PATH = 'input2.txt'class TestCase(object):    def __init__(self, C, D, machines=[]):        self.budget = C        self.days = D        self.machines = sorted([Machine(i[0], i[1], i[2], i[3])                         for i in machines], key = lambda x : x.day)    def run(self):        choice = Decision()        (choice.machine, choice.budget, choice.day) = (None, self.budget, 0)        choices = [choice, ]        for machine in self.machines:            next_choice = []            for choice in choices:                choice.to_buy, choice.not_buy = Decision(), Decision()                choice.to_buy.day, choice.not_buy.day = machine.day, machine.day                potential_budget = choice.budget + choice.machine.p_sell + choice.machine.daily_profit * \                    (machine.day - choice.day -                     1) if choice.machine else choice.budget                if machine.p_buy <= potential_budget:                    choice.to_buy.budget = potential_budget - machine.p_buy                    choice.to_buy.machine = machine                    next_choice.append(choice.to_buy)                choice.not_buy.machine = choice.machine                try:                    choice.not_buy.budget = choice.budget + \                        choice.machine.daily_profit * \                        (machine.day - choice.day)                except AttributeError:                    choice.not_buy.budget = choice.budget                next_choice.append(choice.not_buy)            choices = next_choice        results = []        for choice in choices:            try:                results.append(choice.budget +                               choice.machine.daily_profit *                               (self.days -                                choice.day) +                               choice.machine.p_sell)            except AttributeError:                results.append(choice.budget)        return(max(results))class Machine(object):    def __init__(self, day, p_buy, p_sell, daily_profit):        self.p_buy, self.p_sell = p_buy, p_sell        self.day, self.daily_profit = day, daily_profitclass Decision(object):    def __init__(self):        self.to_buy, self.not_buy = None, None        self.machine, self.budget = None, None        self.day = Nonedef main():    start = time.time()    global PATH    testcases = load(PATH)    count = 1    for (case_data, data) in testcases:        machines = [i for i in data]        dolls = TestCase(case_data[2], case_data[3], machines).run()        print(            "Case {}: {}".format(case_data[0], dolls))    print("Effectue en  ", start - time.time())if __name__ == '__main__':    main()

Load_input2.py :

def load(path):    with open(path) as fil:        inp = fil.read().split('\n')  # 打开输入文件    testcases, results = {}, {}    count = 1    for line in inp:  # 分割并获取每个TestCase的结果        split = [int(i) for i in line.split()]        if len(split) == 3:            case = tuple([count]+split)            testcases[case] = []            count+=1        else:            if len(split) > 0:                testcases[case].append(split)    sort = sorted([(case,data) for case,data in testcases.items()] , key = lambda x : x[0][0])    #print([i[0] for i in sort])    return(sort)

如果你有任何建议或提示,我很乐意接受!

我并不真的想要一个可以直接粘贴的解决方案,因为这是面试问题,我希望它能真实反映我的能力,尽管我确实认为我的能力中包括在出色的社区中搜索的能力;)

谢谢你的关心!


编辑:整个输入测试集在这里可以找到:https://gitlab.com/InfoCode/Coding_Problems/raw/master/MachineWork/input.txt


编辑:我使用的原始脚本,肯定不是最优的,但在非常大的测试用例上我相信计算量要少得多,过程不同,在开头解释过

""" First version of the project"""""" Using a day-to-day approach to estimate best behavior"""""" On each day, this algorithm will complete :"""""" - Looking for each machine to be bought on this day and taking the more profitable one in long-term run"""""" - During all depreciation period (time required for the machine to be cost-effective), checking if the purchase of the machine won't interfer with some more profitable machine"""""" - Buying the machine and moving along to next day"""""" This algorithm allow a faster execution for input with large sets of machines to be sold"""""" Cannot yet found how to prevent him from choosing the machine 2 in case (6,10,20) which leads to a decrease of 1 dollar in profits"""PATH = 'input2.txt'# Defining the TestCase class which is used for iterating through the daysclass TestCase(object):    def __init__(self, C, D, machines=[]):        self.budget = C        self.days = D        self.machines = [Machine(self, i[0], i[1], i[2], i[3])                         for i in machines]        self.choices = []    # Main function for running the iteration through the days    def run_case(self):        for i in range(1, self.days + 1):            best = self.best_machine_on_day(i)            if (best is not None and self.should_buy(best[0], i)):                self.choices.append(best)        if len(self.choices) > 0:            self.choices[-1][0].buy_sell(self, self.days + 1, sell=True)        return(self.budget)    # Function to define the best machine on a specific day    def best_machine_on_day(self, n):        results = []        for machine in self.machines:            if n == machine.day:                results.append(machine.day_based_potential(self, n))        if len(results) == 0:            return(None)        elif len(results) == 1:            return(results[0])        else:            return(max(results, key=lambda x: x[2] * (self.days - n) - x[1]))    # To define rather an individual should buy or not a machine having a    # small look on the day aheads    def should_buy(self, machine, n):        potential_budget = self.budget + self.choices[-1][0].p_sell + self.choices[-1][0].daily_profit * (            n - self.choices[-1][0].day - 1) if len(self.choices) > 0 else self.budget        day_to_cover_cost = int(            machine.cost / machine.daily_profit) if machine.cost % machine.daily_profit != 0 else machine.cost / machine.daily_profit - 1        for day in range(day_to_cover_cost):            next_day = self.best_machine_on_day(n + day + 1)            if next_day is not None:                day_to_buy = next_day[0].day                if (                    machine.earnings_from_day(                        self, day_to_buy) < next_day[0].earnings_from_day(                        self, day_to_buy) or machine.cost >= machine.daily_profit * (                        next_day[0].day - machine.day)) and next_day[0].p_buy <= potential_budget:                    return(False)        if (potential_budget >= machine.p_buy and machine.earnings_from_day(                self, n) >= machine.p_buy):            if len(self.choices) > 0:                self.choices[-1][0].buy_sell(self, n, sell=True)            machine.buy_sell(self, n)            return(True)        else:            return(False)# Defining the machine objectclass Machine(object):    def __init__(self, case, day, p_buy, p_sell, daily_profit):        self.cost = p_buy - p_sell        self.p_buy, self.p_sell = p_buy, p_sell        self.day = day        self.daily_profit = daily_profit    # To compute the earnings from a starting day n to the end    def earnings_from_day(self, case, n):        if self.day <= n <= case.days:            return((case.days - n) * self.daily_profit - self.cost)        else:            return(0)    # Represent itself method    def day_based_potential(self, case, n):        return((self, self.cost, self.daily_profit))    # Actions on Budget    def buy_sell(self, case, n, sell=False):        if sell:            case.budget += self.p_sell + self.daily_profit * (n - self.day - 1)        else:            case.budget -= self.p_buydef main():    global PATH    testcases = load(PATH)    count = 1    for case_data, data in testcases.items():        machines = [i for i in data]        dolls = TestCase(case_data[1], case_data[2], machines).run_case()        print(            "Case {}: {}".format(count, dolls))        count += 1if __name__ == '__main__':    main()

回答:

Related Posts

L1-L2正则化的不同系数

我想对网络的权重同时应用L1和L2正则化。然而,我找不…

使用scikit-learn的无监督方法将列表分类成不同组别,有没有办法?

我有一系列实例,每个实例都有一份列表,代表它所遵循的不…

f1_score metric in lightgbm

我想使用自定义指标f1_score来训练一个lgb模型…

通过相关系数矩阵进行特征选择

我在测试不同的算法时,如逻辑回归、高斯朴素贝叶斯、随机…

可以将机器学习库用于流式输入和输出吗?

已关闭。此问题需要更加聚焦。目前不接受回答。 想要改进…

在TensorFlow中,queue.dequeue_up_to()方法的用途是什么?

我对这个方法感到非常困惑,特别是当我发现这个令人费解的…

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注