
Error when running a command from the Dockerfile, but it runs fine from inside the container

  • Nikolay Dyankov  · 2 years ago

    I am setting up a Docker image for the OneFormer ML model, following the project's environment setup instructions.

    The Dockerfile I am using works except for the last command (the # Setup MSDeformAttn block, shown commented out here):

    FROM nvidia/cuda:11.3.1-devel-ubuntu20.04
    
    # Set environment variables
    ENV DEBIAN_FRONTEND=noninteractive
    ENV LANG=C.UTF-8
    ENV LC_ALL=C.UTF-8
    
    # Update package list and install dependencies
    RUN apt-get update && apt-get install -y --no-install-recommends \
        wget \
        ca-certificates \
        git \
        build-essential \
        libglib2.0-0 \
        libsm6 \
        libxext6 \
        libxrender1 \
        libyaml-cpp-dev \
        libopencv-dev \
        && rm -rf /var/lib/apt/lists/*
    
    # Install GCC, G++ 9
    RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc-9 \
        g++-9 \
        && rm -rf /var/lib/apt/lists/* \
        && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 100 \
        && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 100
    
    # Install conda 4.12.0
    RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh -O miniconda.sh \
        && chmod +x miniconda.sh \
        && ./miniconda.sh -b -p /opt/conda \
        && rm miniconda.sh \
        && /opt/conda/bin/conda clean -tipsy \
        && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
        && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
        && echo "conda activate base" >> ~/.bashrc
    
    # Set some environment variables
    ENV PATH=/opt/conda/bin:$PATH
    ENV WANDB_API_KEY=...
    ENV CUDA_HOME=/usr/local/cuda
    ENV FORCE_CUDA=1
    
    # Clone OneFormer repository and set working directory
    RUN git clone https://github.com/SHI-Labs/OneFormer.git /OneFormer
    WORKDIR /OneFormer
    
    # Install dependencies
    RUN conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge
    RUN pip3 install -U opencv-python
    RUN python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
    RUN pip3 install git+https://github.com/cocodataset/panopticapi.git
    RUN pip3 install git+https://github.com/mcordts/cityscapesScripts.git
    RUN pip3 install -r requirements.txt
    # RUN pip3 install wandb
    # RUN wandb login
    
    # Setup MSDeformAttn
    # RUN cd oneformer/modeling/pixel_decoder/ops && \
    #     sh ./make.sh
    
    # Set the default command to run when starting the container
    CMD ["/bin/bash"]
    

    When the build reaches the # Setup MSDeformAttn block (with its RUN step uncommented), I get this error:

    #0 13.59 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    #0 15.97 /opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py:381: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
    #0 15.97   warnings.warn(msg.format('we could not find ninja.'))
    #0 15.97 Traceback (most recent call last):
    #0 15.97   File "setup.py", line 69, in <module>
    #0 15.97     setup(
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    #0 15.97     return distutils.core.setup(**attrs)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/core.py", line 148, in setup
    #0 15.97     dist.run_commands()
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/dist.py", line 966, in run_commands
    #0 15.97     self.run_command(cmd)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/dist.py", line 985, in run_command
    #0 15.97     cmd_obj.run()
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/command/build.py", line 135, in run
    #0 15.97     self.run_command(cmd_name)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/cmd.py", line 313, in run_command
    #0 15.97     self.distribution.run_command(command)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/dist.py", line 985, in run_command
    #0 15.97     cmd_obj.run()
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
    #0 15.97     _build_ext.run(self)
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    #0 15.97     _build_ext.build_ext.run(self)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/command/build_ext.py", line 340, in run
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    #0 15.97     _build_ext.build_extension(self, ext)
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/command/build_ext.py", line 528, in build_extension
    #0 15.97     objects = self.compiler.compile(sources,
    #0 15.97   File "/opt/conda/lib/python3.8/distutils/ccompiler.py", line 574, in compile
    #0 15.97     self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 483, in unix_wrap_single_compile
    #0 15.97     cflags = unix_cuda_flags(cflags)
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 450, in unix_cuda_flags
    #0 15.97     cflags + _get_cuda_arch_flags(cflags))
    #0 15.97   File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1606, in _get_cuda_arch_flags
    #0 15.97     arch_list[-1] += '+PTX'
    #0 15.97 IndexError: list index out of range
    ------
    failed to solve: executor failed running [/bin/sh -c cd oneformer/modeling/pixel_decoder/ops &&     sh ./make.sh]: exit code: 1
    

    But when I comment that command out, start the container, and run the following:

    cd oneformer/modeling/pixel_decoder/ops
    bash make.sh
    

    it builds without errors and the demo code runs. Why does it refuse to run from the Dockerfile? The error message is not much help.
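
    A likely explanation, judging from the traceback: _get_cuda_arch_flags in torch/utils/cpp_extension.py takes its architecture list from the TORCH_CUDA_ARCH_LIST environment variable or, when that is unset, from the GPUs visible to PyTorch. During docker build no GPU is normally visible, so the list stays empty and arch_list[-1] raises IndexError. In a running container that was presumably started with GPU access, the list is populated and the same script succeeds. The sketch below assumes that setting TORCH_CUDA_ARCH_LIST before the build step is enough. The value 8.6 is only a placeholder, not something taken from the question, and should be replaced with the compute capability of the target GPU.

    ```dockerfile
    # Assumption: 8.6 is a placeholder; use your GPU's compute capability.
    # With this set, PyTorch does not have to query a GPU to pick build targets.
    ENV TORCH_CUDA_ARCH_LIST="8.6"

    # Setup MSDeformAttn
    RUN cd oneformer/modeling/pixel_decoder/ops && \
        sh ./make.sh
    ```

    The ENV line has to appear before the RUN step that calls make.sh. The capability of the actual GPU can be read inside the running container with python -c "import torch; print(torch.cuda.get_device_capability(0))".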
