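Keras image preprocessing: tuple index out of range

The goal of this script is to augment video data using Keras's existing image preprocessing module. In this prototype, a sample video is split into a sequence of frames and processed, and the final steps apply random rotation, shift, shear, and zoom: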

    from keras import backend as K
    from keras.preprocessing.image import random_rotation, random_shift, random_shear, random_zoom

    K.set_image_dim_ordering("th")

    import cv2
    import numpy as np

    video_file_path = "./training-data/yes/1.mov"
    samples_generated_per_sample = 10
    self_rows = 100
    self_columns = 150
    self_frames_per_sequence = 45

    # haar cascades for localizing oral region
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
    mouth_cascade = cv2.CascadeClassifier('haarcascade_mcs_mouth.xml')

    video = cv2.VideoCapture(video_file_path)
    success, frame = video.read()

    frames = []
    success = True

    # convert to grayscale, localize oral region, equalize dimensions,
    # normalize pixels, equalize lengths, and accumulate valid frames
    while success:
        success, frame = video.read()
        if success:
            # convert to grayscale
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

            # localize single facial region
            faces_coords = face_cascade.detectMultiScale(frame, 1.3, 5)
            if len(faces_coords) == 1:
                face_x, face_y, face_w, face_h = faces_coords[0]
                frame = frame[face_y:face_y + face_h, face_x:face_x + face_w]

                # localize oral region
                mouth_coords = mouth_cascade.detectMultiScale(frame, 1.3, 5)
                threshold = 0
                for (mouth_x, mouth_y, mouth_w, mouth_h) in mouth_coords:
                    if (mouth_y > threshold):
                        threshold = mouth_y
                        valid_mouth_coords = (mouth_x, mouth_y, mouth_w, mouth_h)
                    else:
                        pass
                mouth_x, mouth_y, mouth_w, mouth_h = valid_mouth_coords
                frame = frame[mouth_y:mouth_y + mouth_h, mouth_x:mouth_x + mouth_w]
                frames.append(frame)
            else:
                # ignore multiple facial region detections
                pass

    # pre-pad short sequences and equalize sequence lengths
    if len(frames) < self_frames_per_sequence:
        frames = [frames[0]]*(self_frames_per_sequence - len(frames)) + frames
    frames = frames[0:self_frames_per_sequence]
    frames = np.asarray(frames)

    rotated_frames = random_rotation(frames, rg=45)
    shifted_frames = random_shift(rotated_frames, wrg=0.25, hrg=0.25)
    sheared_frames = random_shear(shifted_frames, intensity=0.79)
    zoomed_frames = random_zoom(sheared_frames, zoom_range=(1.25, 1.25))

Running the script raises the error named in the title: tuple index out of range.


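Answer:

The problem is that the frames do not all have the same dimensions. Each mouth crop takes the size of its detection box, so np.asarray cannot build a regular (frames, rows, columns) array, and the transforms most likely fail when they index the row and column axes of its shape. The solution is to resize every frame to a common size before applying the transforms: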

    # pre-pad short sequences, equalize frame dimensions, and equalize sequence lengths
    if len(frames) < self_frames_per_sequence:
        frames = [frames[0]]*(self_frames_per_sequence - len(frames)) + frames
    frames = frames[0:self_frames_per_sequence]
    frames = [cv2.resize(frame, (self_columns, self_rows)).astype('float32') for frame in frames]
    frames = np.asarray(frames)

    rotated_frames = random_rotation(frames, rg=45)
    shifted_frames = random_shift(rotated_frames, wrg=0.25, hrg=0.25)
    sheared_frames = random_shear(shifted_frames, intensity=0.79)
    zoomed_frames = random_zoom(sheared_frames, zoom_range=(1.25, 1.25))
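To see why mismatched frame sizes can surface as "tuple index out of range", here is a minimal NumPy-only sketch. It is an illustration under stated assumptions, not the original poster's code: `as_object_array` and `stack_frames` are hypothetical helpers, and the zero-filled arrays stand in for real mouth crops. The first helper builds by hand the 1-D object array that older NumPy versions produced for frames of unequal size (recent releases reject such ragged input instead); the second fails early with a readable message when the sizes differ.

```python
import numpy as np


def as_object_array(frames):
    """Hypothetical helper: build a 1-D object array holding one frame per slot,
    similar to what older NumPy returned for frames of unequal size."""
    arr = np.empty(len(frames), dtype=object)
    for i, frame in enumerate(frames):
        arr[i] = frame
    return arr


def stack_frames(frames):
    """Hypothetical helper: stack 2-D frames into (n_frames, rows, cols),
    raising a clear error if the frame sizes differ."""
    shapes = {frame.shape for frame in frames}
    if len(shapes) != 1:
        raise ValueError("frames have inconsistent shapes: %s" % sorted(shapes))
    return np.stack(frames)


# Two stand-in mouth crops of different sizes.
ragged = [np.zeros((30, 50)), np.zeros((28, 47))]
ragged_array = as_object_array(ragged)
print(ragged_array.shape)  # (2,): there is no row or column axis

try:
    ragged_array.shape[1]  # asking for the row axis of a 1-D shape
except IndexError as exc:
    print("IndexError:", exc)

# Frames of one common size stack into a regular 3-D array.
uniform = [np.zeros((100, 150), dtype="float32") for _ in range(3)]
print(stack_frames(uniform).shape)  # (3, 100, 150)
```

Calling a check like `stack_frames` in place of a bare `np.asarray` would report the size mismatch at the point where it happens. Note also that `cv2.resize` takes its target size as (width, height), which is why the answer passes `(self_columns, self_rows)`.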
