What is the best way to define an output schema for nested JSON? The approach I'm currently using doesn't feel ideal.
# Added to the planner -> from langchain.experimental.plan_and_execute import load_chat_planner
refinement_response_schemas = [
ResponseSchema(name="plan", description="""{'1': {'step': '','tools': [],'data_sources': [],'sub_steps_needed': bool}, '2': {'step': '','tools': [<empty list>],'data_sources': [<>], 'sub_steps_needed': bool},}"""),
] # Defining the JSON schema inside the description works, but it doesn't feel right
refinement_output_parser = StructuredOutputParser.from_response_schemas(refinement_response_schemas)
refinement_format_instructions = refinement_output_parser.get_format_instructions()
refinement_output_parser.parse(output)
The result looks like this:
{'plan': {'1': {'step': 'Identify the top 5 strikers in La Liga',
'tools': [],
'data_sources': ['sports websites', 'official league statistics'],
'sub_steps_needed': False},
'2': {'step': 'Identify the top 5 strikers in the Premier League',
'tools': [],
'data_sources': ['sports websites', 'official league statistics'],
'sub_steps_needed': False},
...
'6': {'step': 'Given the above steps taken, please respond to the users original question',
'tools': [],
'data_sources': [],
'sub_steps_needed': False}}}
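It works, but I'd like to know whether there is a better way to handle this.
Answer:
As far as I know, the recommended approach is to use the Pydantic output parser rather than the structured output parser… python.langchain.com/docs/modules/model_io/output_parsers/… (handling of nesting is explained here… youtube.com/watch?v=yD_oDTeObJY).
For example: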
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List, Optional
...
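class PlanItem(BaseModel):
    step: str
    # Lists of strings that default to empty when the model omits them
    tools: List[str] = Field(default_factory=list)
    data_sources: List[str] = Field(default_factory=list)
    # Boolean flag, matching the True/False values in the question's output
    sub_steps_needed: bool

class Plan(BaseModel):
    # One PlanItem per step, in order
    plan: List[PlanItem]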
parser = PydanticOutputParser(pydantic_object=Plan)
parser.get_format_instructions()
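To see what the nested models buy you, here is a minimal sketch that leaves LangChain and the LLM call out entirely and only exercises the Pydantic validation step, which is roughly what the parser does with the model's JSON. The `raw_output` string is a hand-written stand-in for model output, and `KeyedPlan` is a hypothetical variant I added to mirror the question's original shape (steps keyed by '1', '2', …); neither comes from the LangChain docs.

```python
import json
from typing import Dict, List

from pydantic import BaseModel, Field, ValidationError


class PlanItem(BaseModel):
    step: str
    tools: List[str] = Field(default_factory=list)
    data_sources: List[str] = Field(default_factory=list)
    sub_steps_needed: bool


class Plan(BaseModel):
    # Steps as an ordered list
    plan: List[PlanItem]


class KeyedPlan(BaseModel):
    # Hypothetical variant: steps keyed by step number, as in the question
    plan: Dict[str, PlanItem]


# Stand-in for the text an LLM would return (assumption, written by hand)
raw_output = """
{"plan": [
  {"step": "Identify the top 5 strikers in La Liga",
   "tools": [],
   "data_sources": ["sports websites", "official league statistics"],
   "sub_steps_needed": false},
  {"step": "Given the above steps taken, please respond to the users original question",
   "sub_steps_needed": false}
]}
"""

# Nested dicts are validated and converted into PlanItem instances
parsed = Plan(**json.loads(raw_output))
first = parsed.plan[0]
print(first.step, first.data_sources)

# Omitted list fields fall back to empty lists
print(parsed.plan[1].tools)

# A step missing required fields is rejected instead of passing through silently
try:
    Plan(**json.loads('{"plan": [{"tools": []}]}'))
    rejected = False
except ValidationError:
    rejected = True
print("invalid plan rejected:", rejected)

# The keyed shape from the question validates the same way
keyed = KeyedPlan(**{"plan": {"1": {"step": "Identify the top 5 strikers in La Liga",
                                    "sub_steps_needed": False}}})
print(keyed.plan["1"].sub_steps_needed)
```

The list form is usually easier to iterate over, while the keyed form keeps the exact structure the question was already producing; either can be passed as `pydantic_object` in the same way as `Plan` above.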