Problem description
While generating JSONL data with Llama Index, everything runs smoothly until the final step of saving the results to a JSONL file. Every save attempt apparently fails: I always get the message "Wrote 0 examples to finetuning_events.jsonl". I'm not sure what is causing this.
Steps to reproduce
- Generate JSONL data with Llama Index successfully.
- Attempt to save the results to a JSONL file.
- Receive the message "Wrote 0 examples to finetuning_events.jsonl".
Additional information
- Llama Index version: 0.10.22
- Operating system: Windows
Logs
Wrote 0 examples to ./dataset_data/finetuning_events.jsonl
My code:
def jsonl_generation(self):
    """
    Generate JSONL file for fine-tuning events and perform model refinement.
    """
    # Initialize OpenAI FineTuningHandler and CallbackManager
    finetuning_handler = OpenAIFineTuningHandler()
    callback_manager = CallbackManager([finetuning_handler])
    self.llm.callback_manager = callback_manager

    # Load questions for fine-tuning from a file
    questions = []
    with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
        for line in f:
            questions.append(line.strip())

    try:
        # Generate responses to the questions using GPT-4 and save the
        # fine-tuning events to a JSONL file
        index = VectorStoreIndex.from_documents(self.documents)
        query_engine = index.as_query_engine(similarity_top_k=2, llm=self.llm)
        for question in questions:
            response = query_engine.query(question)
    except Exception as e:
        # Handle the exception here; you might want to log the error or take appropriate action
        print(f"An error occurred: {e}")
    finally:
        # Save the fine-tuning events to a JSONL file
        finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')
Answer:
I just solved this problem; here is my solution. The dataset is now being written to the JSONL file. The key change is registering the CallbackManager globally via Settings instead of only on self.llm, so the fine-tuning handler actually observes the LLM calls made by the query engine.
def jsonl_generation(self):
    """
    Generate JSONL file for fine-tuning events and perform model refinement.
    """
    from llama_index.core import Settings, VectorStoreIndex
    from llama_index.core.callbacks import CallbackManager
    from llama_index.finetuning.callbacks import OpenAIFineTuningHandler
    from llama_index.llms.openai import OpenAI

    # Initialize OpenAI FineTuningHandler and CallbackManager
    finetuning_handler = OpenAIFineTuningHandler()
    callback_manager = CallbackManager([finetuning_handler])
    llm = OpenAI(model="gpt-4", temperature=0.3)
    # Register the callback manager globally so the handler observes the
    # LLM calls made inside the query engine
    Settings.callback_manager = callback_manager

    # Load questions for fine-tuning from a file
    questions = []
    with open(f'{self.dataset_path}/train_questions.txt', "r", encoding='utf-8') as f:
        for line in f:
            questions.append(line.strip())

    try:
        # Generate responses to the questions using GPT-4; the handler
        # records each request/response pair as a fine-tuning event
        index = VectorStoreIndex.from_documents(self.documents)
        query_engine = index.as_query_engine(similarity_top_k=2, llm=llm)
        for question in questions:
            response = query_engine.query(question)
    except Exception as e:
        # Handle the exception here; you might want to log the error or take appropriate action
        print(f"An error occurred: {e}")
    finally:
        # Save the fine-tuning events to a JSONL file
        finetuning_handler.save_finetuning_events(f'{self.dataset_path}/finetuning_events.jsonl')
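To confirm the fix actually wrote data, you can count the examples in the generated file using only the standard library. This is a minimal sketch assuming the usual JSONL layout of one JSON object per line; `count_jsonl_examples` is a hypothetical helper, not part of Llama Index:

```python
import json

def count_jsonl_examples(path):
    """Count valid JSON objects in a JSONL file (one object per line)."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises ValueError if a line is malformed
            count += 1
    return count
```

If this returns 0 even though the query loop ran without errors, the handler never received any events, which points back at how the callback manager was attached.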