【Question title】: How to detect a memory leak in Python code?
【Posted】: 2019-08-18 21:40:54
【Question description】:

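I am new to machine learning and Python! I want my code to predict objects, which in my case are mostly cars. When I start the script it runs smoothly, but after about 20 pictures it hangs my system because of a memory leak. I want this script to run over my whole database, which has more than 20 pictures.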

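I have already tried the pympler tracker to see which objects take up the most memory.

This is the code I am trying to run to predict the objects in the pictures: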

from imageai.Prediction import ImagePrediction
import os
import urllib.request
import mysql.connector
from pympler.tracker import SummaryTracker
tracker = SummaryTracker()

mydb = mysql.connector.connect(
  host="localhost",
  user="phpmyadmin",
  passwd="anshu",
  database="python_test"
)
counter = 0
mycursor = mydb.cursor()

sql = "SELECT id, image_url FROM `used_cars` " \
      "WHERE is_processed = '0' AND image_url IS NOT NULL LIMIT 1"
mycursor.execute(sql)
result = mycursor.fetchall()



def dl_img(url, filepath, filename):
    fullpath = filepath + filename
    urllib.request.urlretrieve(url,fullpath)

for eachfile in result:
    id = eachfile[0]
    print(id)
    filename = "image.jpg"
    url = eachfile[1]
    filepath = "/home/priyanshu/PycharmProjects/untitled/images/"
    print(filename)
    print(url)
    print(filepath)
    dl_img(url, filepath, filename)

    execution_path = "/home/priyanshu/PycharmProjects/untitled/images/"

    prediction = ImagePrediction()
    prediction.setModelTypeAsResNet()
    prediction.setModelPath(os.path.join(execution_path, "/home/priyanshu/Downloads/resnet50_weights_tf_dim_ordering_tf_kernels.h5"))
    prediction.loadModel()

    predictions, probabilities = prediction.predictImage(os.path.join(execution_path, "image.jpg"), result_count=1)
    for eachPrediction, eachProbability in zip(predictions, probabilities):
        per = 0.00
        label = ""
        print(eachPrediction, " : ", eachProbability)
        label = eachPrediction
        per = eachProbability

    print("Label: " + label)
    print("Per:" + str(per))
    counter = counter + 1
    print("Picture Number: " + str(counter))

    sql1 = "UPDATE used_cars SET is_processed = '1' WHERE id = '%s'" % id
    sql2 = "INSERT into label (used_car_image_id, object_label, percentage) " \
           "VALUE ('%s', '%s', '%s') " % (id, label, per)
    print("done")

    mycursor.execute(sql1)
    mycursor.execute(sql2)

    mydb.commit()
    tracker.print_diff()

This is the output I get for a single picture. After a few iterations the script consumes all of the RAM. What should I change to stop the leak?

seat_belt  :  12.617655098438263
Label: seat_belt
Per:12.617655098438263
Picture Number: 1
done
types |    objects |   total size
<class 'tuple |     130920 |     11.98 MB
<class 'dict |      24002 |      6.82 MB
<class 'list |      56597 |      5.75 MB
<class 'int |     175920 |      4.70 MB
<class 'str |      26047 |      1.92 MB
<class 'set |        740 |    464.38 KB
<class 'tensorflow.python.framework.ops.Tensor |       6515 |    356.29 KB
<class 'tensorflow.python.framework.ops.Operation._InputList |       6097 |    333.43 KB
<class 'tensorflow.python.framework.ops.Operation |       6097 |    333.43 KB
<class 'SwigPyObject |       6098 |    285.84 KB
<class 'tensorflow.python.pywrap_tensorflow_internal.TF_Output |       4656 |    254.62 KB
<class 'tensorflow.python.framework.traceable_stack.TraceableObject |       3309 |    180.96 KB
<class 'tensorflow.python.framework.tensor_shape.Dimension |       1767 |     96.63 KB
<class 'tensorflow.python.framework.tensor_shape.TensorShapeV1 |       1298 |     70.98 KB
<class 'weakref |        807 |     63.05 KB
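
A per-iteration diff like the one above can also be taken with the standard library's `tracemalloc` module, which reports growth by source line instead of by type. The following is only a minimal sketch: the `leaky` list and the `top_growth` helper are hypothetical stand-ins, not part of the script in the question. Note that `tracemalloc` only sees allocations made through Python's own allocators, so memory held by native TensorFlow code would likely not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed most between two snapshots."""
    # compare_to() sorts by the absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()

leaky = []  # hypothetical stand-in for state that accumulates across iterations
before = tracemalloc.take_snapshot()

for _ in range(1000):
    leaky.append(bytearray(1024))  # simulated per-iteration leak of about 1 KB

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
tracemalloc.stop()

# Each entry shows file, line number, and how much memory that line gained.
for stat in growth:
    print(stat)
```

In a real loop, taking a snapshot every few iterations and comparing it with the previous one shows whether the same lines keep growing.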

【Question discussion】:

    Tags: python python-3.x machine-learning


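    【Solution 1】:

    In this case, the model is created and loaded inside the for loop, once for every image. The model setup should be moved outside the for loop, so that it is loaded only once, is not re-initialized on each iteration, and does not keep taking more memory as the program runs. The code should work this way ->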

    execution_path = "/home/priyanshu/PycharmProjects/untitled/images/"

    # Create and load the model once, before the loop starts.
    prediction = ImagePrediction()
    prediction.setModelTypeAsResNet()
    prediction.setModelPath(os.path.join(execution_path, "/home/priyanshu/Downloads/resnet50_weights_tf_dim_ordering_tf_kernels.h5"))
    prediction.loadModel()

    for eachfile in result:
        id = eachfile[0]
        print(id)
        filename = "image.jpg"
        url = eachfile[1]
        filepath = "/home/priyanshu/PycharmProjects/untitled/images/"
        print(filename)
        print(url)
        print(filepath)
        dl_img(url, filepath, filename)

        # Reuse the already loaded model for every image.
        predictions, probabilities = prediction.predictImage(os.path.join(execution_path, "image.jpg"), result_count=1)
        label = ""
        per = 0.00
        for eachPrediction, eachProbability in zip(predictions, probabilities):
            print(eachPrediction, " : ", eachProbability)
            label = eachPrediction
            per = eachProbability

        print("Label: " + label)
        print("Per:" + str(per))
        counter = counter + 1
        print("Picture Number: " + str(counter))

        sql1 = "UPDATE used_cars SET is_processed = '1' WHERE id = '%s'" % id
        sql2 = "INSERT into label (used_car_image_id, object_label, percentage) " \
               "VALUE ('%s', '%s', '%s') " % (id, label, per)
        print("done")

        mycursor.execute(sql1)
        mycursor.execute(sql2)

        mydb.commit()
        tracker.print_diff()
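
The effect of moving the model out of the loop can be illustrated without ImageAI or TensorFlow. The sketch below uses a hypothetical `FakeModel` class whose `load()` appends to shared class-level state. This is an assumption meant to mimic a global graph that grows on every load, as the Tensor and Operation counts in the tracker output suggest; it is not how ImageAI is actually implemented.

```python
class FakeModel:
    """Hypothetical stand-in for an expensive model; not the ImageAI API."""

    graph_nodes = []  # shared state that only grows, like a global graph

    def load(self):
        # Every load registers another 1000 "nodes" in the shared structure.
        FakeModel.graph_nodes.extend(range(1000))

    def predict(self, item):
        return "label-for-" + item


def predict_reloading(items):
    """Anti-pattern: creates and loads a model for every item."""
    results = []
    for item in items:
        model = FakeModel()
        model.load()
        results.append(model.predict(item))
    return results


def predict_load_once(items):
    """Loads the model once and reuses it for every item."""
    model = FakeModel()
    model.load()
    return [model.predict(item) for item in items]


items = ["a.jpg", "b.jpg", "c.jpg"]

FakeModel.graph_nodes.clear()
reloaded = predict_reloading(items)
nodes_when_reloading = len(FakeModel.graph_nodes)

FakeModel.graph_nodes.clear()
once = predict_load_once(items)
nodes_when_loading_once = len(FakeModel.graph_nodes)

# Same predictions either way, but the shared state grows with the number
# of items only when the model is reloaded inside the loop.
print(nodes_when_reloading, nodes_when_loading_once)
```

With reloading, the shared state scales with the number of images processed; with a single load it stays constant, which matches the behaviour the answer is aiming for.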
    

    【Discussion】:
