I am fine-tuning a BERT model following the instructions here.
Here is my code:
from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses, evaluation
from torch.utils.data import DataLoader

# load model
embedder = SentenceTransformer('bert-large-nli-mean-tokens')
print("embedder loaded...")

# define your train dataset, the dataloader, and the train loss
train_dataset = SentencesDataset(x_sample["input"].tolist(), embedder)
train_dataloader = DataLoader(train_dataset, shuffle=False, batch_size=16)
train_loss = losses.CosineSimilarityLoss(embedder)

sentences1 = ['This list contains the first column', 'With your sentences', 'You want your model to evaluate on']
sentences2 = ['Sentences contains the other column', 'The evaluator matches sentences1[i] with sentences2[i]', 'Compute the cosine similarity and compares it to scores[i]']
scores = [0.3, 0.6, 0.2]
evaluator = evaluation.EmbeddingSimilarityEvaluator(sentences1, sentences2, scores)

# tune the model
embedder.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100, evaluator=evaluator, evaluation_steps=1)
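For context: x_sample["input"] is my own data, and SentencesDataset together with losses.CosineSimilarityLoss expects a list of InputExample objects (a pair of texts plus a float label), so the list is assumed to look roughly like this (the pairs and labels below are made up):

train_examples = [
    InputExample(texts=['First sentence of pair', 'Second sentence of pair'], label=0.8),
    InputExample(texts=['Another first sentence', 'Another second sentence'], label=0.3),
]
train_dataset = SentencesDataset(train_examples, embedder)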
At 4%, training stops and the program exits with no warning or error. There is no output.
I don't know how to troubleshoot this; any help would be very welcome.
Edit: changed the title from "fails" to "stops/quits", because I don't know whether it actually failed.
Here is what I see in the terminal: Epoch: 0%|Killedtion: 0%|
The word "Killed" is printed over the word "iteration"… could it be a memory problem? I'm running this in the VS Code terminal, in WSL with an Ubuntu VM on Windows.
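If it is the kernel's OOM killer (the overwritten "Killed" points that way), one thing worth checking is WSL's memory cap: WSL2 limits how much RAM the Linux VM can use by default. Assuming WSL2, the cap can be raised in %UserProfile%\.wslconfig on the Windows side (12GB is just an example value), followed by wsl --shutdown to restart the VM:

[wsl2]
memory=12GB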
Found this related issue on GitHub: https://github.com/ElderResearch/gpu_docker/issues/38
Answer:
My solution was to set the batch size and the number of workers to 1. It's slow, but it works:
train_dataloader = DataLoader(train_dataset, shuffle=False, batch_size=1, num_workers=1)
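If batch_size=1 is too slow, a couple of other memory levers may help (untested on my side, so treat these as suggestions): cap the tokenized sequence length, or switch from bert-large to the smaller base model.

# truncate inputs to reduce activation memory (128 is a guess; tune for your data)
embedder.max_seq_length = 128
# or load the smaller variant of the model instead:
# embedder = SentenceTransformer('bert-base-nli-mean-tokens')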