
No stages are running, but numRunningTasks != 0

  •  4
  • gsamaras a Data Head  · Tech community  · 9 years ago

    WARN ExecutorAllocationManager: No stages are running, but numRunningTasks != 0

    In Spark's internal code I found this:

        // If this is the last stage with pending tasks, mark the scheduler queue as empty
        // This is needed in case the stage is aborted for any reason
        if (stageIdToNumTasks.isEmpty) {
          allocationManager.onSchedulerQueueEmpty()
          if (numRunningTasks != 0) {
            logWarning("No stages are running, but numRunningTasks != 0")
            numRunningTasks = 0
          }
        }
    

    Can someone explain what this means?


    I am talking about the task with Id 0.



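    I can report that this behavior shows up in KMeans() of Spark's MLlib, where one of the two takeSample jobs is reported as completed with fewer tasks than expected. I am not sure whether the job is going to fail.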

    2  takeSample at KMeans.scala:355 2016/08/27 21:39:04   7 s 1/1 9600/9600
    1  takeSample at KMeans.scala:355 2016/08/27 21:38:57   6 s 1/1 6608/9600
    

    The input set is 100m points, each with 256 dimensions.

    Some of the parameters for PySpark (master is yarn, mode is cluster):

    spark.dynamicAllocation.enabled             false
    # Better serializer - https://spark.apache.org/docs/latest/tuning.html#data-serialization
    spark.serializer                            org.apache.spark.serializer.KryoSerializer
    spark.kryoserializer.buffer.max             2000m
    
    # Bigger PermGen space, use 4 byte pointers (since we have < 32GB of memory)
    spark.executor.extraJavaOptions             -XX:MaxPermSize=512m -XX:+UseCompressedOops
    
    # More memory overhead
    spark.yarn.executor.memoryOverhead          4096
    spark.yarn.driver.memoryOverhead            8192
    
    spark.executor.cores                        8
    spark.executor.memory                       8G
    
    spark.driver.cores                          8
    spark.driver.memory                         8G
    spark.driver.maxResultSize                  4G
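
To make the memory settings above concrete, here is a minimal sketch that adds each heap size to its overhead, which is the amount Spark asks YARN for per container (before any rounding YARN applies). The helper names `parse_conf`, `to_mib` and `container_mib` are made up for this illustration, and a bare number is assumed to already be in MiB, which is how the overhead values above are written.

```python
# Subset of the configuration from the question, in spark-defaults.conf style.
CONF_TEXT = """
spark.dynamicAllocation.enabled             false
# More memory overhead
spark.yarn.executor.memoryOverhead          4096
spark.yarn.driver.memoryOverhead            8192
spark.executor.memory                       8G
spark.driver.memory                         8G
"""


def parse_conf(text):
    """Parse 'key  value' lines, skipping blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        conf[key] = value.strip()
    return conf


def to_mib(size):
    """Convert sizes such as '8G' or '2000m' to MiB.

    Assumption for this sketch: a bare number (e.g. '4096') is already MiB.
    """
    s = size.strip().lower()
    factors = {"m": 1, "g": 1024}
    if s[-1] in factors:
        return int(s[:-1]) * factors[s[-1]]
    return int(s)


def container_mib(conf, role):
    """Heap plus overhead for role 'executor' or 'driver', in MiB."""
    heap = to_mib(conf[f"spark.{role}.memory"])
    overhead = to_mib(conf[f"spark.yarn.{role}.memoryOverhead"])
    return heap + overhead


if __name__ == "__main__":
    conf = parse_conf(CONF_TEXT)
    for role in ("executor", "driver"):
        print(role, container_mib(conf, role), "MiB")
```

With these values each executor container needs 8 GiB of heap plus 4 GiB of overhead, and the driver (which runs inside the cluster in cluster mode) needs 8 GiB plus 8 GiB.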
    
    1 answer  |  last active 8 years ago
  •  2
  •   gsamaras a Data Head    9 years ago

    What we are hitting is this code:

        ...
        // If this is the last stage with pending tasks, mark the scheduler queue as empty
        // This is needed in case the stage is aborted for any reason
        if (stageIdToNumTasks.isEmpty) {
          allocationManager.onSchedulerQueueEmpty()
          if (numRunningTasks != 0) {
            logWarning("No stages are running, but numRunningTasks != 0")
            numRunningTasks = 0
          }
        }
      }
    }
    

    It comes from Spark's GitHub, and its comments are the best explanation so far.
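
To see when that branch can fire, here is a toy model in plain Python. It is only a sketch of the bookkeeping visible in the snippet, not Spark's actual listener: the two counters mirror `stageIdToNumTasks` and `numRunningTasks`, while the class name and the `on_*` methods are hypothetical. The task counts are borrowed from the job table in the question purely for illustration; whether this is what actually happened in that job is not established.

```python
import logging

logger = logging.getLogger("toy.ExecutorAllocationManager")


class ToyAllocationListener:
    """Toy bookkeeping modelled on the Scala snippet; not Spark's class."""

    def __init__(self):
        self.stage_id_to_num_tasks = {}  # mirrors stageIdToNumTasks
        self.num_running_tasks = 0       # mirrors numRunningTasks
        self.warnings = []               # kept so the behavior can be inspected

    def on_stage_submitted(self, stage_id, num_tasks):
        self.stage_id_to_num_tasks[stage_id] = num_tasks

    def on_task_start(self):
        self.num_running_tasks += 1

    def on_task_end(self):
        self.num_running_tasks -= 1

    def on_stage_completed(self, stage_id):
        self.stage_id_to_num_tasks.pop(stage_id, None)
        # Same check as in the snippet: once no stage is left,
        # the running-task counter is expected to be zero.
        if not self.stage_id_to_num_tasks:
            if self.num_running_tasks != 0:
                msg = "No stages are running, but numRunningTasks != 0"
                logger.warning(msg)
                self.warnings.append(msg)
                self.num_running_tasks = 0  # reset, as the snippet does


def simulate(num_tasks, num_finished):
    """Start num_tasks tasks, end num_finished of them, then complete the stage."""
    listener = ToyAllocationListener()
    listener.on_stage_submitted(stage_id=1, num_tasks=num_tasks)
    for _ in range(num_tasks):
        listener.on_task_start()
    for _ in range(num_finished):
        listener.on_task_end()
    listener.on_stage_completed(stage_id=1)
    return listener


if __name__ == "__main__":
    print(simulate(9600, 9600).warnings)  # every started task also ended
    print(simulate(9600, 6608).warnings)  # stage completed with tasks still counted
```

In this model the warning appears only when the last stage is removed while some started tasks have not yet been counted as ended, and the counter is then reset to 0 so later bookkeeping starts clean.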