Generated data is not saved to the JSONL file – always getting the message "Wrote 0 examples to finetuning_events.jsonl"

Problem description

When generating JSONL data with Llama Index, everything runs smoothly until the final step of saving the results to a JSONL file. Every attempt to save appears to fail: I always get the message "Wrote 0 examples to finetuning_events.jsonl". I'm not sure what is causing this.

Steps to reproduce

  1. Generate JSONL data with Llama Index (this step succeeds).
  2. Attempt to save the results to a JSONL file.
  3. Receive the message "Wrote 0 examples to finetuning_events.jsonl".

Additional information

  • Llama Index version: 0.10.22
  • Operating system: Windows

Log

Wrote 0 examples to ./dataset_data/finetuning_events.jsonl
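Before digging into the callback wiring, it can help to confirm what actually landed in the file. This is a minimal, hypothetical helper (plain Python, not part of Llama Index; the name `count_jsonl_examples` is my own) that counts how many valid JSON lines a JSONL file contains:

```python
import json

def count_jsonl_examples(lines):
    """Count non-empty lines that parse as JSON (one example per line)."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        json.loads(line)  # raises ValueError on a malformed line
        count += 1
    return count
```

Running it against the output file, e.g. `count_jsonl_examples(open('./dataset_data/finetuning_events.jsonl', encoding='utf-8'))`, distinguishes "the handler captured no events" (count is 0, file is empty) from "the write itself failed".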

My code:

    def jsonl_generation(self):
        """
        Generate JSONL file for fine-tuning events and perform model refinement.
        """
        # Initialize OpenAI FineTuningHandler and CallbackManager
        finetuning_handler = OpenAIFineTuningHandler()
        callback_manager = CallbackManager([finetuning_handler])
        self.llm.callback_manager = callback_manager

        # Load questions for fine-tuning from a file
        questions = []
        with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
            for line in f:
                questions.append(line.strip())

        try:
            # Generate responses to the questions using GPT-4 and save
            # the fine-tuning events to a JSONL file
            index = VectorStoreIndex.from_documents(
                self.documents
            )
            query_engine = index.as_query_engine(similarity_top_k=2, llm=self.llm)
            for question in questions:
                response = query_engine.query(question)
        except Exception as e:
            # Handle the exception here; you might want to log the error
            # or take appropriate action
            print(f"An error occurred: {e}")
        finally:
            # Save the fine-tuning events to a JSONL file
            finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')

Answer:

I just solved this issue. Here is my solution; it now writes the dataset to the JSONL file correctly.

    def jsonl_generation(self):
        """
        Generate JSONL file for fine-tuning events and perform model refinement.
        """
        from llama_index.core import VectorStoreIndex

        # Initialize OpenAI FineTuningHandler and CallbackManager
        finetuning_handler = OpenAIFineTuningHandler()
        callback_manager = CallbackManager([finetuning_handler])
        llm = OpenAI(model="gpt-4", temperature=0.3)
        # Register the callback manager globally so every component emits
        # events to the fine-tuning handler
        Settings.callback_manager = callback_manager

        # Load questions for fine-tuning from a file
        questions = []
        with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
            for line in f:
                questions.append(line.strip())

        try:
            # Generate responses to the questions using GPT-4 and save
            # the fine-tuning events to a JSONL file
            index = VectorStoreIndex.from_documents(
                self.documents
            )
            query_engine = index.as_query_engine(similarity_top_k=2, llm=llm)
            for question in questions:
                response = query_engine.query(question)
        except Exception as e:
            # Handle the exception here; you might want to log the error
            # or take appropriate action
            print(f"An error occurred: {e}")
        finally:
            # Save the fine-tuning events to a JSONL file
            finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')
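The key change is registering the callback manager globally via `Settings` rather than on a single LLM instance. A plausible reading (my interpretation, not confirmed by the Llama Index docs here): events emitted by pipeline components that were constructed without the handler attached are only captured once the handler is registered globally. A toy sketch of that pattern, using illustrative class names that are not Llama Index APIs:

```python
class Handler:
    """Records events, like a fine-tuning handler."""
    def __init__(self):
        self.events = []
    def record(self, event):
        self.events.append(event)

class GlobalSettings:
    """Stands in for the global Settings.callback_manager."""
    handler = None

class QueryEngine:
    def __init__(self, local_handler=None):
        self.local_handler = local_handler
    def query(self, q):
        # An event reaches a handler only if one is wired in,
        # locally or globally
        h = self.local_handler or GlobalSettings.handler
        if h:
            h.record(q)
        return f"answer to {q}"

# Per-object attachment misses engines built without the handler:
h1 = Handler()
engine_a = QueryEngine(local_handler=h1)
engine_b = QueryEngine()      # constructed elsewhere, no local handler
engine_b.query("q1")          # h1 records nothing

# Global registration catches events from every engine:
h2 = Handler()
GlobalSettings.handler = h2
engine_b.query("q2")          # now recorded by h2
```

In the broken version, only `self.llm` had the callback manager, so any work routed through other components produced no recorded events and the saved file ended up with 0 examples.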

