How to train few-shot-vid2vid (PyTorch) on a custom dataset

I want to train the few-shot-vid2vid model on a custom dataset I created from FaceForensics videos. I generated image sequences with ffmpeg and keypoints with dlib. When I try to launch the training script, I get the error below. Where exactly is the problem? The small sample dataset that ships with the repo trains fine for me.

CustomDatasetDataLoader
485 sequences
dataset [FaceDataset] was created
Resuming from epoch 1 at iteration 0
create web directory ./checkpoints/face/web...
---------- Networks initialized -------------
---------- Optimizers initialized -------------
./checkpoints/face/latest_net_G.pth not exists yet!
./checkpoints/face/latest_net_D.pth not exists yet!
model [Vid2VidModel] was created
Traceback (most recent call last):
  File "train.py", line 73, in <module>
    train()
  File "train.py", line 40, in train
    for idx, data in enumerate(dataset, start=trainer.epoch_iter):
  File "/home/keno/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/keno/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/keno/.local/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/keno/.local/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/keno/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/keno/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/keno/repos/few-shot-vid2vid/data/fewshot_face_dataset.py", line 103, in __getitem__
    Li = self.get_face_image(keypoints, transform_L, ref_img.size)
  File "/home/keno/repos/few-shot-vid2vid/data/fewshot_face_dataset.py", line 168, in get_face_image
    x = keypoints[sub_edge, 0]
IndexError: index 82 is out of bounds for axis 0 with size 82

Edit: how I set up my dataset. I created image sequences from the videos with ffmpeg -i _video_ %05d.jpg (ffmpeg takes the output pattern as a positional argument, not via a flag), following the directory structure of the provided sample dataset. I then generated the keypoints with dlib's landmark detection, based on the code sample from the dlib website. I extended the sample code to 68 points and saved them to .txt files:

import re
import sys
import os
import dlib
import glob

# if len(sys.argv) != 4:
#     print(
#         "Give the path to the trained shape predictor model as the first "
#         "argument and then the directory containing the facial images.\n"
#         "For example, if you are in the python_examples folder then "
#         "execute this program by running:\n"
#         "    ./face_landmark_detection.py shape_predictor_68_face_landmarks.dat ../examples/faces\n"
#         "You can download a trained facial shape predictor from:\n"
#         "    http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2")
#     exit()

predictor_path = sys.argv[1]
faces_folder_path = sys.argv[2]
text_file_path = sys.argv[3]

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(predictor_path)
win = dlib.image_window()

for f in glob.glob(os.path.join(faces_folder_path, "*.jpg")):
    file_number = os.path.split(f)
    print(file_number[1])
    file_number = os.path.splitext(file_number[1])
    file_number = file_number[0]
    export_path = os.path.join(text_file_path, '%s.txt' % file_number)
    text = open(export_path, "w+")

    print("Processing file: {}".format(f))
    img = dlib.load_rgb_image(f)

    win.clear_overlay()
    win.set_image(img)

    # Ask the detector to find the bounding boxes of each face. The 1 in the
    # second argument indicates that we should upsample the image 1 time. This
    # will make everything bigger and allow us to detect more faces.
    dets = detector(img, 1)
    print("Number of faces detected: {}".format(len(dets)))
    for k, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            k, d.left(), d.top(), d.right(), d.bottom()))
        # Get the landmarks/parts for the face in box d.
        shape = predictor(img, d)
        for i in range(67):
            result = str(shape.part(i))
            result = result.strip("()")
            print(result)
            text.write(result + '\n')
        # Draw the face landmarks on the screen.
        win.add_overlay(shape)

    text.close()
    win.add_overlay(dets)
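Before launching training, it can be worth validating the generated keypoint files directly. A minimal sketch of such a check (the 68-lines-per-file expectation and the helper name are assumptions based on the export script above, not something from few-shot-vid2vid itself):

```python
import glob
import os

# Each keypoint file should hold one "x, y" line per dlib landmark -- 68 in
# total for the 68-point shape predictor. Files with fewer lines (e.g. 67,
# as produced by a range(67) export loop) will break the dataset loader later.
def find_short_keypoint_files(folder, expected=68):
    bad = []
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        with open(path) as f:
            n = sum(1 for line in f if line.strip())
        if n != expected:
            bad.append((path, n))
    return bad
```

Run over the keypoint directory, this would have flagged every file as holding 67 lines instead of 68 before train.py ever started.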

Answer:

for i in range(67):

This is wrong; you should use range(68) to get all 68 facial landmarks. You can verify this with python -c "for i in range(67): print(i)", which only counts from 0 to 66 (67 numbers in total). python -c "for i in range(68): print(i)" counts from 0 to 67 (68 items) and covers the full set of landmarks.
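The off-by-one can be demonstrated in isolation, along with the exact IndexError from the traceback (the array sizes 82 and index 82 here are copied from the error message above, not derived from the repo's internal keypoint layout):

```python
import numpy as np

# range(67) yields indices 0..66 -- only 67 of the 68 dlib landmarks.
assert len(list(range(67))) == 67
assert len(list(range(68))) == 68

# Indexing one row past the end of a keypoint array reproduces the failure
# reported by get_face_image:
keypoints = np.zeros((82, 2))
try:
    keypoints[np.array([81, 82]), 0]
except IndexError as e:
    print(e)  # index 82 is out of bounds for axis 0 with size 82
```

With one landmark missing from every .txt file, the keypoint array the dataset builds ends up one row shorter than the edge indices in get_face_image expect, which is exactly the crash shown.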
