ECS EC2 auto scaling gets stuck

amazon-ecs amazon-ec2 amazon-web-services

Peter · Tech Community · 3 years ago

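I am trying to learn how to auto scale ECS with the EC2 launch type.

Without the auto scaling part, everything works fine.

When I add the auto scaling part, that is the Scalable Target, the Alarm, and the Policy, for both directions (scaling out and scaling in), the service gets stuck on the following event: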

service ecs-service was unable to place a task because no container instance met all of its requirements. The closest matching container-instance XXX has insufficient CPU units available.

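If I look at the service, the desired count is stuck at 4, with 0 pending and 1 running. As for the alarms, the high cpu usage alarm is OK and the low cpu usage alarm is In alarm.

The Task Definition has 1024 CPU units and 1024 MB of memory assigned. The container has 1024 CPU units and 1024 MB of memory assigned.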
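For reference, the placement error above comes down to simple arithmetic: each task reserves its CPU units and memory on a container instance, so the number of identical tasks that fit is bounded by whichever resource runs out first. Below is a minimal sketch, assuming an instance registers 1024 CPU units per vCPU. The function name and the instance sizes are hypothetical and are not taken from this cluster.

```python
def max_tasks_per_instance(instance_cpu_units, instance_memory_mb,
                           task_cpu_units=1024, task_memory_mb=1024):
    """Upper bound on identical tasks that one container instance can hold."""
    by_cpu = instance_cpu_units // task_cpu_units      # tasks that fit by CPU
    by_memory = instance_memory_mb // task_memory_mb   # tasks that fit by memory
    return min(by_cpu, by_memory)


# Hypothetical instance with 1 vCPU (1024 units) and 2048 MB registered memory
print(max_tasks_per_instance(1024, 2048))  # 1
# Hypothetical instance with 4 vCPUs (4096 units) and 8192 MB registered memory
print(max_tasks_per_instance(4096, 8192))  # 4
```

If the first case matches the cluster, a desired count of 4 cannot be placed on a single instance no matter what the alarms do.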

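I have been waiting for 40 minutes.

What would I expect? I set a low threshold for high CPU usage (20%) so that the alarm reacts easily. I then expect the count to increase to 4, depending on the percentage of CPU used.

It should work both ways, when adding and when removing. So when the high alarm fires, it should scale out to 4, and when the low alarm fires, it should scale in to 1.

Below is the whole chain of events, without task IDs, dates, and event IDs, to simplify reading.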

service ecs-service was unable to place a task because no container instance met all of its requirements. The closest matching container-instance XXX has insufficient CPU units available. For more 

service ecs-service registered 1 targets in target-group ecs-target

service ecs-service was unable to place a task because no container instance met all of its requirements. The closest matching container-instance XXX has insufficient CPU units available. For more information, see the Troubleshooting section.

Message: Successfully set desired count to 4. Waiting for change to be fulfilled by ecs. Cause: monitor alarm high-cpu-usage in state ALARM triggered policy ecs-high-policy

service ecs-service has started 1 tasks: task

service ecs-service has stopped 1 running tasks: task

service ecs-service deregistered 1 targets in target-group ecs-target

service ecs-service (instance XXX) (port 8080) is unhealthy in target-group ecs-target due to (reason Health checks failed)

service ecs-service has started 1 tasks: task 

service ecs-service was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster. For more information, see the Troubleshooting section.

Message: Successfully set desired count to 4. Found it was later changed to 0. Cause: monitor alarm high-cpu-usage in state ALARM triggered policy ecs-high-policy

Message: Successfully set desired count to 4. Found it was later changed to 0. Cause: monitor alarm high-cpu-usage in state ALARM triggered policy ecs-high-policy

Message: Successfully set desired count to 3. Change successfully fulfilled by ecs. Cause: monitor alarm high-cpu-usage in state ALARM triggered policy ecs-high-policy

Message: Successfully set desired count to 2. Change successfully fulfilled by ecs. Cause: monitor alarm high-cpu-usage in state ALARM triggered policy ecs-high-policy

Here are my scalable target, alarms, and policies:

The service uses a load balancer.

  ServiceScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    DependsOn: Service
    Properties:
      MaxCapacity: !Ref MaxSize
      MinCapacity: !Ref MinSize
      ResourceId:
        Fn::Join:
        - '/'
        - - 'service'
          - Ref: Cluster
          - Fn::GetAtt:
            - Service
            - 'Name'
      RoleARN:
        Fn::ImportValue: !Ref ECSAutoScalingRole
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs

  HighCpuUsageAlarm:
    Type: AWS::CloudWatch::Alarm
    DependsOn: ScalingPolicyHigh
    Properties:
      AlarmName: high-cpu
      MetricName: CPUUtilization
      Namespace: AWS/ECS
      Dimensions:
        - Name: ServiceName
          Value: !Ref ServiceName
        - Name: ClusterName
          Value: !Ref Cluster
      Statistic: Average
      Period: 300
      EvaluationPeriods: 1
      Threshold: 20
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - !Ref ScalingPolicyHigh


  ScalingPolicyHigh: 
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: policy-high
      PolicyType: StepScaling
      ScalingTargetId:
        Ref: ServiceScalableTarget
      StepScalingPolicyConfiguration:
        AdjustmentType: ChangeInCapacity
        Cooldown: 600
        MetricAggregationType: Average
        StepAdjustments:
        - MetricIntervalLowerBound: 0
          MetricIntervalUpperBound: 15
          ScalingAdjustment: 1
        - MetricIntervalLowerBound: 15
          MetricIntervalUpperBound: 25
          ScalingAdjustment: 2
        - MetricIntervalLowerBound: 25
          ScalingAdjustment: 3

  LowCpuUsageAlarm:
    Type: AWS::CloudWatch::Alarm
    DependsOn: ScalingPolicyLow
    Properties:
      AlarmName: low-cpu
      MetricName: CPUUtilization
      Namespace: AWS/ECS
      Dimensions:
        - Name: ServiceName
          Value: !Ref ServiceName
        - Name: ClusterName
          Value: !Ref Cluster
      Statistic: Average
      Period: 300
      EvaluationPeriods: 2
      Threshold: 15
      ComparisonOperator: LessThanOrEqualToThreshold
      AlarmActions:
        - !Ref ScalingPolicyLow

  ScalingPolicyLow: 
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: policy-low
      PolicyType: StepScaling
      ScalingTargetId:
        Ref: ServiceScalableTarget
      StepScalingPolicyConfiguration:
        AdjustmentType: ChangeInCapacity
        Cooldown: 600
        MetricAggregationType: Average
        StepAdjustments:
        - MetricIntervalLowerBound: -15
          MetricIntervalUpperBound: 0
          ScalingAdjustment: -1
        - MetricIntervalLowerBound: -25
          MetricIntervalUpperBound: -15
          ScalingAdjustment: -2
        - MetricIntervalUpperBound: -25
          ScalingAdjustment: -3
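
To make the step adjustments above easier to reason about, here is a minimal sketch of how a step is selected from the distance between the metric and the alarm threshold. The function name is hypothetical. The boundary handling (lower bound inclusive above the threshold, upper bound inclusive below it) is an assumption based on how step scaling is documented, so treat it as an illustration rather than the service's actual implementation.

```python
import math


def step_adjustment(metric, threshold, steps, scale_out):
    """Return the ScalingAdjustment of the step whose interval contains the breach.

    steps: list of (lower_bound, upper_bound, adjustment); None means unbounded.
    Bounds are offsets from the alarm threshold, as in StepAdjustments.
    """
    breach = metric - threshold
    for lower, upper, adjustment in steps:
        lo = -math.inf if lower is None else lower
        hi = math.inf if upper is None else upper
        # Assumed boundary handling: [lo, hi) above the threshold, (lo, hi] below it.
        inside = (lo <= breach < hi) if scale_out else (lo < breach <= hi)
        if inside:
            return adjustment
    return 0  # no step matched


HIGH_STEPS = [(0, 15, 1), (15, 25, 2), (25, None, 3)]        # alarm threshold 20
LOW_STEPS = [(-15, 0, -1), (-25, -15, -2), (None, -25, -3)]  # alarm threshold 15

print(step_adjustment(30, 20, HIGH_STEPS, scale_out=True))   # 1
print(step_adjustment(50, 20, HIGH_STEPS, scale_out=True))   # 3
print(step_adjustment(5, 15, LOW_STEPS, scale_out=False))    # -1
```

With ChangeInCapacity, the selected adjustment is added to the current desired count, and the result stays within MinCapacity and MaxCapacity.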

I would appreciate your help. I can't get it to work properly.
