I converted a model from Hugging Face to Onnx using the provided tool:
optimum-cli export onnx --model deepset/roberta-base-squad2 "roberta-base-squad2" --framework pt
The conversion completed without any errors.
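Before wiring the model up, it can be worth dumping the exported graph's expected inputs. This is a minimal sketch (assuming only the Microsoft.ML.OnnxRuntime package) that prints each input's name, element type, and shape, which shows at a glance whether the graph expects token_type_ids:
// Sketch: list the exported model's inputs (name, element type, dimensions).
using System;
using Microsoft.ML.OnnxRuntime;

using var session = new InferenceSession(@"C:\Bert\dslimL\model.onnx");
foreach (var input in session.InputMetadata)
    Console.WriteLine($"{input.Key}: {input.Value.ElementType} [{string.Join(", ", input.Value.Dimensions)}]");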
I run inference with the following code:
// QnA service configuration:
// Site: https://huggingface.co/deepset/roberta-base-squad2
Configuration QnAConfig = new Configuration(@"C:\Bert\dslimL\model.onnx")
{
    ConfigurationFileName = "config.json",
    HasTokenTypeIds = false,
    IsCasedModel = false,
    MaximumNumberOfTokens = 1,
    MergesFileName = "merges.txt",
    NumberOfTokens = 5,
    Repository = "deepset/roberta-large-squad2",
    TokenizerName = TokenizerName.Tokenizer,
    VocabularyFileName = "vocab.json"
};
Configuration = QnAConfig;
labelsCount = Configuration.ModelConfiguration.IdTolabel.Count;

var sessionOptions = new SessionOptions()
{
    ExecutionMode = ExecutionMode.ORT_PARALLEL,
    EnableCpuMemArena = true,
    EnableMemoryPattern = true,
    EnableProfiling = true,
    GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL,
    InterOpNumThreads = 10
};
sessionOptions.AppendExecutionProvider_CPU(0);
Session = new InferenceSession(Configuration.ModelPath, sessionOptions);
// Set the question and the context:
Schema.Question = sentence;
Schema.Context = Context;
// Process the sub context:
Schema.Sentence = $"'question': '{sentence}',"
                + $" 'context': '{Context}'";
//
var Start = Configuration.Tokenizer.Model.TokenToId("<s>");
var End = Configuration.Tokenizer.Model.TokenToId("</s>");
var Pad = Configuration.Tokenizer.Model.TokenToId("<pad>");
//
var result = Configuration.Tokenizer.Encode(Schema.Sentence);
var decode = Configuration.Tokenizer.Decode(result.Ids);
//
var inputArray = result.Ids.ToLongArray((long)Start, (long)End);
var MyAttentionMask = AttentionMaskHelpers.BuildMask(20, 20, -0); // note: built here but not used below
//
// Attend to the first half of the positions (1) and mask out the second half (0):
long[] attMask = new long[inputArray.Length];
for (int i = 0; i < inputArray.Length; i++)
    if (i < inputArray.Length * 0.5)
        attMask[i] = 1;
    else
        attMask[i] = 0;
//
var tensorInputIds = TensorExtensions.ConvertToTensor(inputArray, inputArray.Length);
var attention_Mask = TensorExtensions.ConvertToTensor(attMask, inputArray.Length);
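// TensorExtensions.ConvertToTensor and AttentionMaskHelpers.BuildMask are custom helpers
// (not shown here); ConvertToTensor presumably wraps the long[] in a DenseTensor<long>
// of shape [1, length] to match the model's (batch, sequence) input layout.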
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input_ids", tensorInputIds),
    NamedOnnxValue.CreateFromTensor("attention_mask", attention_Mask),
    //NamedOnnxValue.CreateFromTensor("token_type_ids", attention_Mask)
};
////////////////////////////////////////////////////////////////////////////////////////
/// # Convert the answer (tokens) back to the original text
/// # Score: the model's score
/// # Start: index of the first character of the answer in the context string
/// # End: index of the character after the last character of the answer in the context string
/// # Answer: the plain text of the answer
/// See: https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/question_answering.py
/// Run the session and infer the outputs:
var inputMeta = Session.InputMetadata;
var output = Session.Run(inputs);
// Initialize a new list of answer IDs:
List<int> AnswerIds = new List<int>();
// Output logits:
List<float> startLogits = output[0].AsEnumerable<float>().ToList();
List<float> endLogits = output[1].AsEnumerable<float>().ToList();
// Get the maximum start and end logits:
float start = startLogits.Max();
float end = endLogits.Max();
Schema.Score = ((start + end) / 10.0f);
// Get the indices of the highest scores:
Schema.StartIndex = startLogits.IndexOf(start);
Schema.EndIndex = endLogits.IndexOf(end);
// Tokenize the sentence:
TokenizerResult Tokens = Configuration.Tokenizer.Encode(Schema.Sentence);
// Collect the answer token IDs:
for (int i = Schema.StartIndex; i <= Schema.EndIndex; i++)
    AnswerIds.Add(Convert.ToInt32(inputArray[i]));
// Get the answer:
Schema.Answer = Configuration.Tokenizer.Decode(AnswerIds).Trim();
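For reference, the Hugging Face pipeline linked above does not take the two argmaxes independently; it scores every (start, end) pair with end >= start, up to a maximum answer length, and picks the best joint pair. A simplified sketch of that selection over the startLogits/endLogits lists above (the length cap is an assumption, and the real pipeline additionally applies softmax and masks out the question tokens):
// Pick the (start, end) pair with the highest combined logit, requiring end >= start.
// maxAnswerLength is an assumed cap, analogous to the pipeline's max_answer_len.
int bestStart = 0, bestEnd = 0;
float bestScore = float.NegativeInfinity;
const int maxAnswerLength = 30;
for (int s = 0; s < startLogits.Count; s++)
{
    int eLimit = Math.Min(endLogits.Count, s + maxAnswerLength);
    for (int e = s; e < eLimit; e++)
    {
        float score = startLogits[s] + endLogits[e];
        if (score > bestScore)
        {
            bestScore = score;
            bestStart = s;
            bestEnd = e;
        }
    }
}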
I am using:
using Microsoft.ML.Tokenizers;
The tokenizer works fine:
Tokenizer = new Tokenizer(new EnglishRoberta(vocabFilePath, mergeFilePath, dictPath), RobertaPreTokenizer.Instance);
I have checked the tokens and IDs, and they match, so the tokenization step is fine.
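A minimal round-trip sketch of that check, using only the calls already shown above (the sample sentence is arbitrary):
// Round-trip sketch: encode a sample sentence, print the IDs, and decode them back.
var checkTokenizer = new Tokenizer(new EnglishRoberta(vocabFilePath, mergeFilePath, dictPath),
                                   RobertaPreTokenizer.Instance);
TokenizerResult encoded = checkTokenizer.Encode("Why is model conversion important?");
Console.WriteLine(string.Join(", ", encoded.Ids));     // the token IDs
Console.WriteLine(checkTokenizer.Decode(encoded.Ids)); // should reproduce the input text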
The problem: the start and end indices are zero for every prediction, which leaves me with the start token as the result; so if the start token is <s>, the answer is <s> no matter whether the score is good or bad.
The model runs fine from a Python script, with no errors!
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/roberta-base-squad2"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
res = nlp(QA_input)
print(res)
I think this problem is directly related to the Onnx and Onnxruntime environment, unless I am doing something wrong. That said, I have other models that work fine with similar code.
I tried other models:
- deepset/roberta-base-squad2
- deepset/roberta-large-squad2
All of them show the same problem.
I think onnxruntime is buggier than most people are willing to admit, which is a shame!
Answer:
My advice: don't waste your time on Onnx!
Spend that time learning Python instead, and use Flask to build an IIS web API module rather than wasting it on onnx, which is far too buggy and unreliable!
There are plenty of good examples to learn from!
from flask import Flask, jsonify, request  # import objects from the Flask module
from keras.models import load_model
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

app = Flask(__name__)  # define the application using Flask
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer)

# The original snippet stops here; a minimal route to expose the pipeline could look
# like this (the endpoint name and JSON shape are assumptions):
@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()   # expects {"text": "..."}
    return jsonify(pipe(data["text"]))
Don't waste your time on onnx!