
How can I change the settings in condor so that users don't have to modify their experiment scripts?

  • Charlie Parker  · 4 years ago

    chmod a+x /home/user/automl-meta-learning/results_plots/main.py
    


    Here is an example of a user's submit script:

    ####################
    #
    # Experiments script
    # Simple HTCondor submit description file
    #
    # chmod a+x test_condor.py
    # chmod a+x experiments_meta_model_optimization.py
    # chmod a+x meta_learning_experiments_submission.py
    # chmod a+x download_miniImagenet.py
    # chmod a+x ~/meta-learning-lstm-pytorch/main.py
    # chmod a+x /home/user/automl-meta-learning/automl-proj/meta_learning/datasets/rand_fc_nn_vec_mu_ls_gen.py
    # chmod a+x /home/user/automl-meta-learning/automl-proj/experiments/meta_learning/supervised_experiments_submission.py
    # chmod a+x /home/user/automl-meta-learning/results_plots/main.py
    # condor_submit -i
    # condor_submit job.sub
    #
    ####################
    
    # Executable = /home/user/automl-meta-learning/automl-proj/experiments/meta_learning/supervised_experiments_submission.py
    Executable = /home/user/automl-meta-learning/automl-proj/experiments/meta_learning/meta_learning_experiments_submission.py
    # Executable = /home/user/meta-learning-lstm-pytorch/main.py
    # Executable = /home/user/automl-meta-learning/automl-proj/meta_learning/datasets/rand_fc_nn_vec_mu_ls_gen.py
    
    ## Output Files
    Log          = experiment_output_job.$(CLUSTER).log.out
    Output       = experiment_output_job.$(CLUSTER).out.out
    Error        = experiment_output_job.$(CLUSTER).err.out
    
    # Use this to make sure 1 gpu is available. The key words are case insensitive.
    REquest_gpus = 1
    requirements = (CUDADeviceName != "Tesla K40m")
    # requirements = (CUDADeviceName == "Quadro RTX 6000")
    
    # requirements = ((CUDADeviceName = "Tesla K40m")) && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.Cpus >= RequestCpus) && (TARGET.gpus >= Requestgpus) && ((TARGET.FileSystemDomain == MY.FileSystemDomain) || (TARGET.HasFileTransfer))
    # requirements = (CUDADeviceName == "Tesla K40m")
    # requirements = (CUDADeviceName == "GeForce GTX TITAN X")
    
    # Note: to use multiple CPUs instead of the default (one CPU), use request_cpus as well
    Request_cpus = 4
    # Request_cpus = 16
    
    # E-mail option
    Notify_user = me@gmail.com
    Notification = always
    
    Environment = MY_CONDOR_JOB_ID= $(CLUSTER)
    
    # "Queue" means add the setup until this line to the queue (needs to be at the end of script).
    Queue
    
  •   Greg    4 years ago

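    To execute a .py file directly, its execute bit must be set. This is a Linux/Unix convention and has little to do with HTCondor. That is, on the command line, if you want to run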

    $ ./foo.py
    

    then foo.py needs to have the execute bit set. To get around this, you can pass foo.py as an argument to python and run

    $ python foo.py
    

    in which case the execute bit does not need to be set. To emulate this in HTCondor, set /usr/bin/python or /usr/bin/python3 as the executable and foo.py as the argument, for example:

    executable = /usr/bin/python
    arguments = foo.py
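
    The difference between the two invocations above can be checked locally. The following is a minimal sketch, assuming a Linux/Unix machine with Python 3; the function name `run_both_ways` and the temporary `foo.py` are made up for illustration and are not part of HTCondor.

```python
import os
import stat
import subprocess
import sys
import tempfile


def run_both_ways(script_body="print('hello from foo.py')\n"):
    """Run a script directly (needs the execute bit) and via the interpreter (does not)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.py")
        with open(path, "w") as f:
            # The shebang line only matters for direct execution.
            f.write(f"#!{sys.executable}\n" + script_body)
        # Owner read/write only, so no execute bit is set.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

        # Equivalent of `./foo.py`: the OS refuses to execute the file.
        try:
            subprocess.run([path], check=True, capture_output=True)
            direct_ok = True
        except PermissionError:
            direct_ok = False

        # Equivalent of `python foo.py`: the interpreter only needs to read the file.
        via_interpreter = subprocess.run(
            [sys.executable, path], check=True, capture_output=True, text=True
        )
        return direct_ok, via_interpreter.stdout.strip()


if __name__ == "__main__":
    print(run_both_ways())
```

    The direct run fails with a permission error while the interpreter run succeeds, which is the behaviour that `executable = /usr/bin/python` with `arguments = foo.py` relies on.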
    

    This all assumes you have a shared file system. If you use HTCondor's file transfer to send data to the worker nodes, a few more lines are needed.
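
    The answer does not say which extra lines it has in mind. As a hedged sketch, the helper below (a hypothetical `make_submit_description`, not an HTCondor API) writes out one plausible set using the standard submit commands `should_transfer_files`, `transfer_input_files`, `when_to_transfer_output` and `transfer_executable`. Setting `transfer_executable = False` is an assumption made here so that the interpreter already installed on the worker node is used instead of a copy shipped from the submit machine.

```python
def make_submit_description(script="foo.py", interpreter="/usr/bin/python3"):
    """Build a minimal submit description that runs `script` via `interpreter`
    and ships the script to the worker node with HTCondor file transfer."""
    lines = [
        f"executable = {interpreter}",
        f"arguments = {script}",
        # Assumption: use the interpreter on the worker node, do not copy it over.
        "transfer_executable = False",
        # Turn on file transfer and send the script itself as an input file.
        "should_transfer_files = YES",
        f"transfer_input_files = {script}",
        "when_to_transfer_output = ON_EXIT",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_submit_description(), end="")
```

    The printed text can be saved as, say, job.sub and submitted with `condor_submit job.sub`. Output, error and log settings such as those in the question's submit file would still need to be added.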