How do I detect a memory leak in Python code?

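I'm fairly new to both machine learning and Python! I want my code to predict objects, mainly cars. When I start the script it runs smoothly, but after processing about 20 images the system hangs because of a memory leak. I want the script to run over my whole database, which contains far more than 20 images.

I tried using pympler's SummaryTracker to track which objects are using the most memory.

Here is the code I'm running to predict the objects in the images: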

    from imageai.Prediction import ImagePrediction
    import os
    import urllib.request
    import mysql.connector
    from pympler.tracker import SummaryTracker

    tracker = SummaryTracker()

    mydb = mysql.connector.connect(
      host="localhost",
      user="phpmyadmin",
      passwd="anshu",
      database="python_test"
    )
    counter = 0
    mycursor = mydb.cursor()
    sql = "SELECT id, image_url FROM `used_cars` " \
          "WHERE is_processed = '0' AND image_url IS NOT NULL LIMIT 1"
    mycursor.execute(sql)
    result = mycursor.fetchall()

    def dl_img(url, filepath, filename):
        fullpath = filepath + filename
        urllib.request.urlretrieve(url, fullpath)

    for eachfile in result:
        id = eachfile[0]
        print(id)
        filename = "image.jpg"
        url = eachfile[1]
        filepath = "/home/priyanshu/PycharmProjects/untitled/images/"
        print(filename)
        print(url)
        print(filepath)
        dl_img(url, filepath, filename)

        execution_path = "/home/priyanshu/PycharmProjects/untitled/images/"
        prediction = ImagePrediction()
        prediction.setModelTypeAsResNet()
        prediction.setModelPath(os.path.join(execution_path,
            "/home/priyanshu/Downloads/resnet50_weights_tf_dim_ordering_tf_kernels.h5"))
        prediction.loadModel()

        predictions, probabilities = prediction.predictImage(
            os.path.join(execution_path, "image.jpg"), result_count=1)
        for eachPrediction, eachProbability in zip(predictions, probabilities):
            per = 0.00
            label = ""
            print(eachPrediction, " : ", eachProbability)
            label = eachPrediction
            per = eachProbability

        print("Label: " + label)
        print("Per:" + str(per))
        counter = counter + 1
        print("Picture Number: " + str(counter))
        sql1 = "UPDATE used_cars SET is_processed = '1' WHERE id = '%s'" % id
        sql2 = "INSERT into label (used_car_image_id, object_label, percentage) " \
               "VALUE ('%s', '%s', '%s') " % (id, label, per)
        print("done")
        mycursor.execute(sql1)
        mycursor.execute(sql2)
        mydb.commit()
        tracker.print_diff()

This is the output I get for a single image; after a few iterations the script consumes all of the RAM. What should I change to stop the memory leak?

    seat_belt  :  12.617655098438263
    Label: seat_belt
    Per:12.617655098438263
    Picture Number: 1
    done
    types | objects | total size
    <class 'tuple |  130920 |   11.98 MB
    <class 'dict |   24002 |    6.82 MB
    <class 'list |   56597 |    5.75 MB
    <class 'int |  175920 |    4.70 MB
    <class 'str |   26047 |    1.92 MB
    <class 'set |     740 |  464.38 KB
    <class 'tensorflow.python.framework.ops.Tensor |    6515 |  356.29 KB
    <class 'tensorflow.python.framework.ops.Operation._InputList |    6097 |  333.43 KB
    <class 'tensorflow.python.framework.ops.Operation |    6097 |  333.43 KB
    <class 'SwigPyObject |    6098 |  285.84 KB
    <class 'tensorflow.python.pywrap_tensorflow_internal.TF_Output |    4656 |  254.62 KB
    <class 'tensorflow.python.framework.traceable_stack.TraceableObject |    3309 |  180.96 KB
    <class 'tensorflow.python.framework.tensor_shape.Dimension |    1767 |   96.63 KB
    <class 'tensorflow.python.framework.tensor_shape.TensorShapeV1 |    1298 |   70.98 KB
    <class 'weakref |     807 |   63.05 KB

Answer:

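In this case the model is loaded on every iteration of the for loop that processes the images. Each pass creates a new ImagePrediction object and calls loadModel() again, which fits the thousands of tensorflow Tensor and Operation objects in the tracker output after a single image. Load the model once, outside the for loop, so that it is not rebuilt for every image and does not add to the memory the program already holds. The code should look like this: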

    import os
    from imageai.Prediction import ImagePrediction

    # The database connection, `result`, `counter`, `dl_img` and `tracker`
    # are set up exactly as in the question.

    # Load the model once, before the loop.
    execution_path = "/home/priyanshu/PycharmProjects/untitled/images/"
    prediction = ImagePrediction()
    prediction.setModelTypeAsResNet()
    prediction.setModelPath(os.path.join(execution_path,
        "/home/priyanshu/Downloads/resnet50_weights_tf_dim_ordering_tf_kernels.h5"))
    prediction.loadModel()

    for eachfile in result:
        id = eachfile[0]
        print(id)
        filename = "image.jpg"
        url = eachfile[1]
        filepath = "/home/priyanshu/PycharmProjects/untitled/images/"
        print(filename)
        print(url)
        print(filepath)
        dl_img(url, filepath, filename)

        # Reuse the already loaded model for every image.
        predictions, probabilities = prediction.predictImage(
            os.path.join(execution_path, "image.jpg"), result_count=1)
        for eachPrediction, eachProbability in zip(predictions, probabilities):
            per = 0.00
            label = ""
            print(eachPrediction, " : ", eachProbability)
            label = eachPrediction
            per = eachProbability

        print("Label: " + label)
        print("Per:" + str(per))
        counter = counter + 1
        print("Picture Number: " + str(counter))
        sql1 = "UPDATE used_cars SET is_processed = '1' WHERE id = '%s'" % id
        sql2 = "INSERT into label (used_car_image_id, object_label, percentage) " \
               "VALUE ('%s', '%s', '%s') " % (id, label, per)
        print("done")
        mycursor.execute(sql1)
        mycursor.execute(sql2)
        mydb.commit()
        tracker.print_diff()
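
To check where memory is going before and after a change like this, one option besides pympler is the standard library's tracemalloc module. The following is a minimal sketch, not the code from the question: FakeModel and _GLOBAL_GRAPH are hypothetical stand-ins for an expensive model and for a framework-level registry that keeps every loaded model alive, and the sizes are arbitrary. It compares how much memory stays allocated when the model is created inside the loop and when it is created once before the loop.

```python
import tracemalloc

# Hypothetical stand-in for a framework-level registry (for example a global
# graph) that keeps a reference to every model that has been loaded.
_GLOBAL_GRAPH = []


class FakeModel:
    """Hypothetical stand-in for an expensive model; not part of ImageAI."""

    def __init__(self):
        self.weights = [0.0] * 50_000   # arbitrary size, just to be measurable
        _GLOBAL_GRAPH.append(self)      # the registry keeps the model alive

    def predict(self, image):
        return len(image)


def traced_growth(run, n_images):
    """Return the bytes still allocated after run(n_images), per tracemalloc."""
    _GLOBAL_GRAPH.clear()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    run(n_images)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


def load_inside_loop(n_images):
    for _ in range(n_images):
        model = FakeModel()             # a new model per image: memory grows
        model.predict("image.jpg")


def load_once(n_images):
    model = FakeModel()                 # one model, reused for every image
    for _ in range(n_images):
        model.predict("image.jpg")


leaky = traced_growth(load_inside_loop, 20)
fixed = traced_growth(load_once, 20)
print(f"load inside loop: {leaky / 1024:.0f} KiB retained")
print(f"load once:        {fixed / 1024:.0f} KiB retained")
```

The first figure corresponds to 20 retained models and the second to one. Keep in mind that tracemalloc only sees allocations made through Python's memory allocators, so memory allocated natively inside TensorFlow will not show up in it; watching the process's resident memory alongside it is a reasonable complement.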
