Cannot save generated data to a JSONL file – always get the message "Wrote 0 examples to finetuning_events.jsonl"

Problem Description

When generating JSONL data with Llama Index, everything runs smoothly until the final step of saving the results to a JSONL file. Every attempt to save the data seems to fail, because I always receive the message "Wrote 0 examples to finetuning_events.jsonl". I am not sure what is causing this.

Steps to Reproduce

  1. Successfully generate JSONL data with Llama Index.
  2. Attempt to save the results to a JSONL file.
  3. Receive the message "Wrote 0 examples to finetuning_events.jsonl".

Additional Information

  • Llama Index version: 0.10.22
  • Operating system: Windows

Log

Wrote 0 examples to ./dataset_data/finetuning_events.jsonl

My code:

    def jsonl_generation(self):
        """
        Generate JSONL file for fine-tuning events and perform model refinement.
        """
        # Initialize OpenAI FineTuningHandler and CallbackManager
        finetuning_handler = OpenAIFineTuningHandler()
        callback_manager = CallbackManager([finetuning_handler])
        self.llm.callback_manager = callback_manager

        # Load questions for fine-tuning from a file
        questions = []
        with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
            for line in f:
                questions.append(line.strip())

        try:
            # Generate responses to the questions using GPT-4 and save the fine-tuning events to a JSONL file
            index = VectorStoreIndex.from_documents(
                self.documents
            )
            query_engine = index.as_query_engine(similarity_top_k=2, llm=self.llm)
            for question in questions:
                response = query_engine.query(question)
        except Exception as e:
            # Handle the exception here, you might want to log the error or take appropriate action
            print(f"An error occurred: {e}")
        finally:
            # Save the fine-tuning events to a JSONL file
            finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')

Answer:

I just solved this. Here is my solution; it now writes the dataset to the JSONL file. The key change is registering the CallbackManager globally via `Settings.callback_manager` instead of only on `self.llm`, so the `OpenAIFineTuningHandler` actually receives the query events.

    # Module-level imports (llama-index 0.10.x):
    # from llama_index.core import Settings, VectorStoreIndex
    # from llama_index.core.callbacks import CallbackManager
    # from llama_index.llms.openai import OpenAI
    # from llama_index.finetuning.callbacks import OpenAIFineTuningHandler

    def jsonl_generation(self):
        """
        Generate JSONL file for fine-tuning events and perform model refinement.
        """
        # Initialize OpenAI FineTuningHandler and CallbackManager
        finetuning_handler = OpenAIFineTuningHandler()
        callback_manager = CallbackManager([finetuning_handler])
        llm = OpenAI(model="gpt-4", temperature=0.3)
        # Register the callback manager globally so the handler sees all query events
        Settings.callback_manager = callback_manager

        # Load questions for fine-tuning from a file
        questions = []
        with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
            for line in f:
                questions.append(line.strip())

        try:
            # Generate responses to the questions using GPT-4 and record the fine-tuning events
            index = VectorStoreIndex.from_documents(
                self.documents
            )
            query_engine = index.as_query_engine(similarity_top_k=2, llm=llm)
            for question in questions:
                response = query_engine.query(question)
        except Exception as e:
            # Handle the exception here, you might want to log the error or take appropriate action
            print(f"An error occurred: {e}")
        finally:
            # Save the fine-tuning events to a JSONL file
            finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')
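To confirm the fix took effect, it helps to inspect the resulting file directly rather than relying on the "Wrote N examples" message. A minimal sketch of such a check (the helper name `count_jsonl_examples` is my own; the path matches the one in the question):

```python
import json

def count_jsonl_examples(path):
    """Count the JSON objects in a JSONL file (one object per non-empty line)."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            json.loads(line)  # raises ValueError if a line is not valid JSON
            count += 1
    return count

# e.g. count_jsonl_examples('./dataset_data/finetuning_events.jsonl')
# should now report a non-zero count instead of the 0 seen above
```

If the count is still 0, the handler never captured any events, which points back at the callback-manager wiring rather than the save step.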

