I am currently working with the Smiles dataset and trying to use deep learning to classify a smile as positive or negative. The machine I am using is a Raspberry Pi 3, running Python 3.7 (not 2.7).
My training set contains 13,165 images in total, and I want to store all of them in a single array. However, I am running into a problem: memory cannot be allocated for an array with shape (13165, 32, 32, 3).
Here is the source code (shallownet_smile.py):
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from pyimagesearch.preprocessing import ImageToArrayPreprocessor
from pyimagesearch.preprocessing import SimplePreprocessor
from pyimagesearch.datasets import SimpleDatasetLoader
from pyimagesearch.nn.conv.shallownet import ShallowNet
from keras.optimizers import SGD
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True, help="path to input dataset")
args = vars(ap.parse_args())

# grab the list of images we'll be describing
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))

sp = SimplePreprocessor(32, 32)
iap = ImageToArrayPreprocessor()
sdl = SimpleDatasetLoader(preprocessors=[sp, iap])
(data, labels) = sdl.load(imagePaths, verbose=1)

# convert values to between 0-1
data = data.astype("float") / 255.0

# partition our data into training and test sets
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.25, random_state=42)

# convert the labels from integers to vectors
trainY = LabelBinarizer().fit_transform(trainY)
testY = LabelBinarizer().fit_transform(testY)

# initialize the optimizer and model
print("[INFO] compiling model...")
# initialize stochastic gradient descent with learning rate of 0.005
opt = SGD(lr=0.005)
model = ShallowNet.build(width=32, height=32, depth=3, classes=2)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

# train the network
print("[INFO] training network...")
H = model.fit(trainX, trainY, validation_data=(testX, testY), batch_size=32, epochs=100, verbose=1)

print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=32)
print(classification_report(
    testY.argmax(axis=1),
    predictions.argmax(axis=1),
    target_names=["positive", "negative"]))

plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, 100), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, 100), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, 100), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, 100), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.show()
Assume the dataset is in my current directory. This is the command that triggers the error:
python3 shallownet_smile.py -d=datasets/Smiles
I am still confused about what is going wrong, and I would really appreciate it if someone experienced in deep learning/machine learning could explain what I am doing wrong.
Thank you for your help and attention.
Answer:
First of all, your system has very little memory to work with (the Raspberry Pi 3 has 1 GB of RAM), so try using smaller images.
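For illustration, the input size is set in two places in shallownet_smile.py, and both must agree; 24x24 here is an arbitrary smaller choice, not something your current code uses:

# a 24x24 input stores 576 instead of 1024 pixels per channel,
# cutting the data array's memory footprint by roughly 44%
sp = SimplePreprocessor(24, 24)
# ...
model = ShallowNet.build(width=24, height=24, depth=3, classes=2)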
The error comes mainly from this line:

data = data.astype("float") / 255.0
The reason is that data is already a uint8 NumPy array, and astype("float") allocates a second, floating-point copy of it (NumPy's "float" is float64, eight bytes per element instead of one), so for a moment the script needs memory for both arrays at once.
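As a rough sanity check (plain NumPy arithmetic, independent of the project code):

import numpy as np

shape = (13165, 32, 32, 3)
n = np.prod(shape)     # 40,442,880 values
print(n * 1 / 2**20)   # uint8 array:   ~38.6 MiB
print(n * 8 / 2**20)   # float64 copy: ~308.6 MiB, requested in one piece

On a Pi 3 with 1 GB of RAM, once the OS, Python, and Keras/TensorFlow have taken their share, a single allocation of roughly 300 MiB can easily fail.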
I will modify parts of the SimpleDatasetLoader so that you can train.
Go to the import from pyimagesearch.datasets import SimpleDatasetLoader. The class should live in pyimagesearch/datasets/simpledatasetloader.py (example code: https://github.com/whydna/Deep-Learning-For-Computer-Vision/blob/master/pyimagesearch/datasets/simpledatasetloader.py).
Replace that .py file with my code below and adjust the value of max_image (reduce it unless your memory can handle the current count). Also delete the line data = data.astype("float") / 255.0 from shallownet_smile.py, because the load function now returns the already-normalized array.
# import the necessary packages
import numpy as np
import cv2
import os

max_image = 1000

class SimpleDatasetLoader:
    def __init__(self, preprocessors=None):
        # store the image preprocessor
        self.preprocessors = preprocessors

        # if the preprocessors are None, initialize them as an
        # empty list
        if self.preprocessors is None:
            self.preprocessors = []

    def load(self, imagePaths, verbose=-1):
        # initialize the list of features and labels
        data = []
        labels = []
        cnt = 0

        # loop over the input images
        for (i, imagePath) in enumerate(imagePaths):
            if cnt >= max_image:
                break

            # load the image and extract the class label assuming
            # that our path has the following format:
            # /path/to/dataset/{class}/{image}.jpg
            image = cv2.imread(imagePath)
            label = imagePath.split(os.path.sep)[-2]

            # check to see if our preprocessors are not None
            if self.preprocessors is not None:
                # loop over the preprocessors and apply each to
                # the image
                for p in self.preprocessors:
                    image = p.preprocess(image)

            # treat our processed image as a "feature vector"
            # by updating the data list followed by the labels
            data.append(image)
            labels.append(label)

            # show an update every `verbose` images
            cnt += 1
            if verbose > 0 and i > 0 and (i + 1) % verbose == 0:
                print("[INFO] processed {}/{}".format(i + 1, len(imagePaths)))

        # return a tuple of the data and labels
        return (np.array(data, dtype='float32') / 255., np.array(labels))
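With max_image = 1000 the returned array is only about 1000 x 32 x 32 x 3 x 4 bytes, roughly 11.7 MiB, and returning float32 instead of the float64 that astype("float") produced halves the per-element cost on top of that.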
If you still run into memory problems, reduce the batch_size here:
H = model.fit(trainX, trainY, validation_data=(testX, testY), batch_size=4, epochs=100, verbose=1)