在langchain中使用Flask前端无法检索Chroma文档

我正在使用langchain创建一个chroma数据库来存储通过Flask前端的PDF文件。当从命令行运行Python文件时，我能够查询数据库并成功检索数据。

然而，我试图通过Flask使用这个功能，到目前为止我无法使其工作。这是我在使用来查询数据库的类：

from langchain.embeddings import OpenAIEmbeddingsfrom langchain.chains import RetrievalQAfrom langchain.llms import OpenAIfrom langchain.vectorstores import Chromaclass Chat_db:    def __init__(self):        persist_directory = 'chromadb'        embedding = OpenAIEmbeddings()        vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)        retriever = vectordb.as_retriever(search_kwargs={"k": 2})        self.qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff",                                         retriever= retriever)            def chat_over_documents(self, query):        result = self.qa({"query": query})        return result['result'] #if __name__ == "__main__":#    query = "Where was the US declaration of independence signed?"#    vector_db = Chat_db()#    result= vector_db.chat_over_documents(query)#    print(result)

如果我取消注释“main”部分，我会得到我期望的结果。如果我通过’app.py’运行查询，我会得到langchain的预设结果“我很抱歉，我不知道”。这是我在’app.py’中调用它的方式

## app.py...chat_db = Chat_db()@app.route('/message', methods=['POST'])def message():    user_message = request.json['message']    if "document" in user_message.lower():        response = chat_db.chat_over_documents(user_message)        return jsonify({"message": response})    else:        response = assistant.chat_with_gpt(user_message)        return jsonify({"message": response})

我不确定我做错了什么；任何帮助都非常感激。

我尝试使用浏览器的开发者工具和终端进行调试。从日志来看，我的怀疑是langchain在Chroma db完成初始化之前就已经进行了两次openai api调用。

回答：

最终我选择了一个纯Python和Chroma db的解决方案，移除了Langchain（仅从这部分移除，因为我在项目的其他部分仍然使用langchain。）

有效的解决方案如下：

import chromadbfrom chromadb.config import Settingsclass Chat_db:    def __init__(self, persist_directory="../data/Chromadb"):        self.persist_directory = persist_directory                     self.client_settings = Settings(is_persistent= True, persist_directory= persist_directory, anonymized_telemetry=False)        self.persistent_client = chromadb.Client(settings= self.client_settings)        self.doc_collection = self.persistent_client.get_or_create_collection(name = "books")               #我以为我可以直接使用self.persistent_client进行查询，但它不起作用。         # 必须创建一个PersistentClient来用于查询。        self.queryclient = chromadb.PersistentClient(path= persist_directory, settings= self.client_settings)            def chat_over_documents(self, collection_name, query, k = 5):        collection = self.queryclient.get_collection(name=collection_name)        results = collection.query(query_texts= query, n_results= k)        flat_results = [item for sublist in results['documents'] for item in sublist]         return flat_results   ...   #省略了其他用于预处理和添加文档到Chroma集合的方法

在app.py中

...document_handler = Chat_db()...@app.route('/message', methods=['POST'])def message():    user_message = request.json['message']    if "/document" in user_message.lower():        keywords = assistant.preprocess_keywords(user_message)        vector_mgs = document_handler.chat_over_documents("books", keywords)        response = assistant.query_stored_documents(user_message, vector_mgs)        return jsonify({"message": response})    else:        response = assistant.chat_with_gpt(user_message)        return jsonify({"message": response})

注意：assistant类使用langchain进行聊天和其他功能。然而，它的query_stored_documents()方法使用系统和用户消息构建提示，并查询openai聊天完成端点以获得所需的结果。

学技术

在langchain中使用Flask前端无法检索Chroma文档

发表回复取消回复

相关文章：

Related Posts

使用LSTM在Python中预测未来值

如何在gensim的word2vec模型中查找双词组的相似性

dask_xgboost.predict 可以工作但无法显示 – 数据必须是一维的

ML Tuning – Cross Validation in Spark

如何在React JS中使用fetch从REST API获取预测

如何分析ML.NET中多类分类预测得分数组？

发表回复 取消回复

发表回复取消回复