【Question title】: Condor runs Python successfully, but doesn't show output files
【Posted】: 2021-09-24 18:36:04
【Question description】:

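I am new to HTCondor and I am trying to run a Python script on a Condor system. I want to use cv2 and numpy in my code, and to be able to read my print output and my pickled data once the job has finished.

The current code runs and finishes (log file: return value 0). However, condor_bin.out, where my print output should appear, is empty, and the file random_dat.pickle is not transferred back.

Am I doing something wrong?

Python script: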

import numpy as np
import pickle
import cv2 as cv

print('test')
# setup cv2
sift = cv.SIFT_create()
img = cv.imread("0.jpg", cv.IMREAD_GRAYSCALE)

for i in range(25):
    # calc cv2
    kp, des = sift.detectAndCompute(img, None)
    # calc np
    norms = np.linalg.norm(des, axis=1)

# calc normal? python
index = []
for p in kp:
    temp = (p.pt, p.size, p.angle, p.response, p.octave, p.class_id)
    index.append(temp)

with open('./random_dat.pickle', 'wb') as handle:
    pickle.dump((123456, index, des, norms), handle)
    
print("finished")

Condor submit file (test.info):

#Normal execution
Universe = vanilla

#I need just one CPU (which is the default)
RequestCpus    = 1
#No GPU
RequestGPUs    = 0
#I need 150 MB of disk space
RequestDisk = 150MB
#I need 150 MB of RAM (resident memory)
RequestMemory  = 150MB
#It will not run longer than 1 day
+RequestWalltime = 100

#retrieve data
#should_transfer_files = YES
#when_to_transfer_output = ON_EXIT

#I'm a nice person, I think...
NiceUser = true
#Mail me only if something is wrong
Notification = Always

# The job will 'cd' to this directory before starting, be sure you can _write_ here.
initialdir = /users/students/r0xxxxxx/Documents/testing_condor/
# This is the executable or script I want to run
executable = /users/students/r0xxxxxx/Documents/testing_condor/main.py

#Output of condors handling of the jobs, will be in 'initialdir'
Log          = condor_bin.log
#Standard output of the 'executable', in 'initialdir'
Output       = condor_bin.out
#Standard error of the 'executable', in 'initialdir'
Error        = condor_bin.err

# Start just 1 instance of the job
Queue 1

I submitted it with condor_submit test.info, which produced the following entries in condor_bin.log:

...
000 (356.000.000) 2021-07-15 18:23:28 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
000 (357.000.000) 2021-07-15 18:24:19 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
040 (356.000.000) 2021-07-15 18:24:21 Started transferring input files
    Transferring to host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=slot1_1_123445_eb75_5374>
...
040 (356.000.000) 2021-07-15 18:24:21 Finished transferring input files
...
001 (356.000.000) 2021-07-15 18:24:22 Job executing on host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=startd_2178_815c>
...
006 (356.000.000) 2021-07-15 18:24:22 Image size of job updated: 1
    0  -  MemoryUsage of job (MB)
    0  -  ResidentSetSize of job (KB)
...
040 (356.000.000) 2021-07-15 18:24:22 Started transferring output files
...
040 (356.000.000) 2021-07-15 18:24:22 Finished transferring output files
...
005 (356.000.000) 2021-07-15 18:24:22 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
    0  -  Run Bytes Sent By Job
    803  -  Run Bytes Received By Job
    0  -  Total Bytes Sent By Job
    803  -  Total Bytes Received By Job
    Partitionable Resources :    Usage  Request Allocated 
       Cpus                 :                 1         1 
       Disk (KB)            :       13   153600    782129 
       Gpus (Average)       :                 0         0 
       Memory (MB)          :        0      150       256 

    Job terminated of its own accord at 2021-07-15T16:24:22Z.
...
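
An event log like the one above can be condensed with a few lines of Python. This is a minimal sketch, assuming the log keeps the layout shown (a three-digit event code, the job ID in parentheses, then a timestamp); `summarize_log` and the two regular expressions are illustrative names, not part of HTCondor.

```python
import re

# Event header, e.g. "005 (356.000.000) 2021-07-15 18:24:22 Job terminated."
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\S+ \S+) (.*)$")
# Detail line printed under a termination event
RETURN_VALUE = re.compile(r"Normal termination \(return value (\d+)\)")


def summarize_log(log_text):
    """Return (events, return_values) extracted from a job event log."""
    events = []
    return_values = {}
    current_job = None
    for line in log_text.splitlines():
        header = EVENT_HEADER.match(line)
        if header:
            code, cluster, proc, _subproc, timestamp, message = header.groups()
            current_job = f"{cluster}.{int(proc)}"
            events.append((current_job, code, timestamp, message))
            continue
        found = RETURN_VALUE.search(line)
        if found and current_job is not None:
            # Attach the return value to the most recent event's job
            return_values[current_job] = int(found.group(1))
    return events, return_values


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py condor_bin.log
    for path in sys.argv[1:]:
        with open(path) as handle:
            events, return_values = summarize_log(handle.read())
        for job, code, timestamp, message in events:
            print(job, code, timestamp, message)
        print("return values:", return_values)
```

A return value of 0 only says that the process started by HTCondor exited cleanly; it does not by itself prove that the Python code in the script ran.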

As you can see in test.info, I tried using

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

but that did not work.
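
One detail worth checking: in the test.info listing above, both transfer lines start with #, and HTCondor treats such lines as comments, so they have no effect until the # is removed. The sketch below is a simplified, hypothetical parser (not HTCondor's own; it ignores macros, line continuations and the Queue statement) that lists which commands are actually active in a submit description.

```python
def active_settings(submit_text):
    """Return the active key = value commands, skipping blank and comment lines."""
    settings = {}
    for raw_line in submit_text.splitlines():
        line = raw_line.strip()
        # Lines starting with '#' are comments; lines without '=' (e.g. Queue) are skipped
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Submit command names are case-insensitive, so normalise to lower case
        settings[key.strip().lower()] = value.strip()
    return settings


# Excerpt in the same form as the test.info shown in the question
example = """\
#should_transfer_files = YES
#when_to_transfer_output = ON_EXIT
Output       = condor_bin.out
"""

settings = active_settings(example)
for name in ("should_transfer_files", "when_to_transfer_output", "output"):
    print(name, "->", settings.get(name, "not set (commented out or absent)"))
```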

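How can I see my print statements, and how can I view my pickled data once the job has finished?

Thank you very much for your help!

【Comments】:

    Tags: python transfer condor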


    【Answer 1】:

    Adding #!/usr/bin/python, as @Greg suggested, led to the following error:

    Executable file 'my_file/path' is a script with CRLF (DOS/Windows) line endings.
    This generally doesn't work, and you should probably run 'dos2unix myfile/path' -- or a similar tool -- before you resubmit.
    

    I created a new Python file on my Linux system, with the following lines added at the top:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    

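    With should_transfer_files = YES and when_to_transfer_output = ON_EXIT set in the test.info Condor file, the job then ran successfully on Condor.

    TL;DR: Python code written on Windows can fail on a Condor system running Linux because of its CRLF line endings. Fix: write or copy your code into a Python file created on Linux, or convert the line endings with dos2unix or a similar tool, as the error message suggests.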
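
If dos2unix is not available, the same conversion can be done in Python. This is a minimal sketch, assuming the script is small enough to read into memory; `has_crlf` and `convert_to_lf` are illustrative helper names, and the file is rewritten in place, so keep a copy if in doubt.

```python
from pathlib import Path


def has_crlf(path):
    """Return True if the file contains Windows (CRLF) line endings."""
    return b"\r\n" in Path(path).read_bytes()


def convert_to_lf(path):
    """Rewrite the file in place with Unix (LF) line endings."""
    data = Path(path).read_bytes()
    Path(path).write_bytes(data.replace(b"\r\n", b"\n"))


if __name__ == "__main__":
    import sys

    # Usage: python fix_line_endings.py main.py
    for name in sys.argv[1:]:
        if has_crlf(name):
            convert_to_lf(name)
            print(f"{name}: converted CRLF to LF")
        else:
            print(f"{name}: no CRLF line endings found")
```

Working on bytes rather than text avoids any dependence on the file's encoding and leaves everything except the line endings untouched.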

      【Answer 2】:

      Try starting the script with

      #!/usr/bin/python
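
Before resubmitting, the script can be checked for the things that commonly stop a Unix system from running it directly. This is a rough sketch under stated assumptions: `check_script` is a hypothetical helper, and whether the executable bit matters for a given job depends on how the pool transfers and launches the executable.

```python
import os
import stat


def check_script(path):
    """Return a list of problems that would stop the OS from running the script directly."""
    problems = []
    with open(path, "rb") as handle:
        first_line = handle.readline()
    # The interpreter is selected from the first line, which must start with '#!'
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line (e.g. #!/usr/bin/env python3)")
    # A trailing carriage return becomes part of the interpreter path
    if first_line.endswith(b"\r\n"):
        problems.append("first line ends with CRLF")
    # Assumption: the file is launched directly, so its owner needs execute permission
    if not os.stat(path).st_mode & stat.S_IXUSR:
        problems.append("file is not executable by its owner (chmod u+x)")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_script.py main.py
    for name in sys.argv[1:]:
        issues = check_script(name)
        print(name, "->", issues if issues else "looks runnable")
```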
