
Struggling to switch to IPv6 in my VPC with EKS and RDS Aurora

  • Ercan Koc  · Tech Community  · 9 months ago

    I've been trying to switch my VPC to IPv6 to save the costs associated with IPv4 usage. My setup includes EKS and RDS Aurora, and I provision everything with Terraform.

    However, when I try to create an IPv6-only VPC with public and private subnets for EKS, I run into the following error:

    "At least one subnet in each AZ should have 2 free IPs. Invalid AZs: { [eu-central-1a, eu-central-1b] }, provided subnets: { subnet-06a43f*, subnet-05350*}"
    

    On the other hand, if I set up dual-stack IPv6 subnets for EKS, the NAT Gateway requires IPv4. And when I try to deploy EKS without an IPv4 NAT Gateway, I get this error:

    "Error: waiting for EKS Node Group (-eks-cluster:-eks-workers) to be created: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. Last error: i-0bb3*: NodeCreationFailure: Instances failed to join the Kubernetes cluster."
    

    It seems the only way to make this work is to enable a NAT Gateway that uses IPv4, which unfortunately defeats my goal of cutting costs by switching to IPv6.

    Has anyone else run into this? Any advice on how to transition to IPv6 effectively without hitting these problems?

    module "vpc_and_subnets" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "5.13.0"
    
      name = local.name
      cidr = local.vpc_cidr
    
      azs = local.azs
      private_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 3, k)]
      public_subnets   = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 3, k + length(local.azs))]
      database_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 3, k + 2*length(local.azs))]
    
    
      enable_ipv6            = true
      create_egress_only_igw = true

      # ipv6-only subnets (currently disabled)
      #public_subnet_ipv6_native  = true
      #private_subnet_ipv6_native = true

      public_subnet_ipv6_prefixes   = [for k, v in local.azs : k]
      private_subnet_ipv6_prefixes  = [for k, v in local.azs : k + length(local.azs)]
      database_subnet_ipv6_prefixes = [for k, v in local.azs : k + 2 * length(local.azs)]

      private_subnet_assign_ipv6_address_on_creation = true
      public_subnet_assign_ipv6_address_on_creation  = true

      # create nat gateways
      enable_nat_gateway = var.enable_nat_gateway
      #single_nat_gateway     = var.single_nat_gateway
      #one_nat_gateway_per_az = var.one_nat_gateway_per_az
    
      # enable dns hostnames and support
      enable_dns_hostnames = var.enable_dns_hostnames
      enable_dns_support   = var.enable_dns_support
      
    
      # tags for public, private subnets and vpc
      tags                = var.tags
      public_subnet_tags  = var.additional_public_subnet_tags
      private_subnet_tags = var.additional_private_subnet_tags
    
      # create internet gateway
      #create_igw       = var.create_igw
      instance_tenancy = var.instance_tenancy
    
      create_database_subnet_group           = true
      create_database_subnet_route_table     = true
      create_database_internet_gateway_route = true
      database_subnet_group_name = "${var.name}-${var.database_subnet_group_name}"
      
    }
    
    module "eks" {
      # invoke public eks module
      source  = "terraform-aws-modules/eks/aws"
      version = "20.8.3"
    
      # eks cluster name and version
      cluster_name    = var.eks_cluster_name
      cluster_version = var.k8s_version
      # vpc id where the eks cluster security group needs to be created
      vpc_id = var.vpc_id
      cluster_ip_family = var.cluster_ip_family
      create_cni_ipv6_iam_policy = true
    
      # subnets where the eks cluster needs to be created
      control_plane_subnet_ids = var.control_plane_subnet_ids
    
      enable_cluster_creator_admin_permissions = true
    
      # to enable public and private access for eks cluster endpoint
      cluster_endpoint_private_access      = true
      cluster_endpoint_public_access       = true
      cluster_endpoint_public_access_cidrs = var.public_access_cidrs
    
      # create an OpenID Connect Provider for EKS to enable IRSA
      enable_irsa = true
    
      # install eks managed addons
      # more details are here - https://docs.aws.amazon.com/eks/latest/userguide/eks-add-ons.html
      cluster_addons = {
        # extensible DNS server that can serve as the Kubernetes cluster DNS
        coredns = {
          preserve    = true
          most_recent = true
        }
    
        # maintains network rules on each Amazon EC2 node. It enables network communication to your Pods
        kube-proxy = {
          most_recent = true
        }
    
        # a Kubernetes container network interface (CNI) plugin that provides native VPC networking for your cluster
        vpc-cni = {
          most_recent = true
        }
        aws-ebs-csi-driver = {
          most_recent = true
        }
        aws-efs-csi-driver = {
          most_recent = true
        }     
      }
      # Extend cluster security group rules
      cluster_security_group_additional_rules = {
        egress_nodes_ephemeral_ports_tcp = {
          description                = "To node 1025-65535"
          protocol                   = "tcp"
          from_port                  = 1025
          to_port                    = 65535
          type                       = "egress"
          source_node_security_group = true
        }
      }
        
      # Extend node-to-node security group rules
      node_security_group_additional_rules = {
        ingress_self_all = {
          description = "Node to node all ports/protocols"
          protocol    = "-1"
          from_port   = 0
          to_port     = 0
          type        = "ingress"
          self        = true
        }
      }
      # subnets where the eks node groups needs to be created
      subnet_ids = var.eks_node_groups_subnet_ids
    
    
      # eks managed node group named worker
      eks_managed_node_groups         = var.eks_managed_node_groups
      eks_managed_node_group_defaults = var.eks_managed_node_group_defaults
    }
    
    
    resource "aws_security_group_rule" "allow_worker_nodes" {
      security_group_id = module.eks.cluster_primary_security_group_id
      type              = "ingress"
      from_port         = 443
      to_port           = 443
      protocol          = "tcp"
      source_security_group_id = module.eks.node_security_group_id
    }
    
    1 Reply  |  as of 9 months ago
  •   Lloyd Osabutey-Anikon  · 9 months ago

    An IPv6-only setup in AWS can be challenging because some AWS services still depend on IPv4, particularly NAT Gateways and certain AWS-managed services such as EKS.

    Understanding the problem:

    1. IPv6-only VPC with EKS:

      • The error you're seeing indicates that the EKS control plane requires subnets with available IPv4 addresses in each Availability Zone (AZ). EKS does not currently support fully IPv6-only clusters; it relies on IPv4 for some internal communication.
    2. Dual-stack setup without an IPv4 NAT Gateway:

      • In dual-stack mode, even though your VPC and subnets have IPv6 addresses, the nodes still need IPv4 connectivity, in particular to reach public services that aren't IPv6-enabled (for example, pulling images from public registries). Without an IPv4 NAT Gateway those nodes have no route to the internet, so EKS node group creation fails.
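
    To make point 1 concrete: the control plane only needs a handful of free IPv4 addresses per AZ, so one sketch of a workaround is to reserve tiny dedicated dual-stack subnets just for the control plane ENIs. The resource below is hypothetical; `local.vpc_cidr` and the module outputs come from the question's config, and the `cidrsubnet` netnums must be chosen so they don't overlap the module-managed subnets:

    # Hypothetical control-plane subnets: a small IPv4 range plus an IPv6 /64 per AZ.
    # A /28 gives 11 usable IPv4 addresses, enough for the EKS control plane ENIs.
    resource "aws_subnet" "control_plane" {
      for_each = { "eu-central-1a" = 0, "eu-central-1b" = 1 }

      vpc_id            = module.vpc_and_subnets.vpc_id
      availability_zone = each.key

      # carve a /28 out of the VPC CIDR (assumes a /16); pick netnums well away
      # from the module-managed subnets to avoid overlap
      cidr_block = cidrsubnet(local.vpc_cidr, 12, 4000 + each.value)

      ipv6_cidr_block                 = cidrsubnet(module.vpc_and_subnets.vpc_ipv6_cidr_block, 8, 200 + each.value)
      assign_ipv6_address_on_creation = true
    }

    Passing these via `control_plane_subnet_ids` keeps the large node subnets free to be as IPv6-centric as EKS allows.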

    Possible solutions:

    1. Hybrid approach:

      • Dual-stack subnets: keep using dual-stack (IPv4 and IPv6) subnets for the EKS cluster. You can minimize NAT Gateway cost by running a single NAT Gateway for the region or by reducing outbound traffic to IPv4 endpoints.
      • Egress-only internet gateway: for IPv6 traffic, use an egress-only internet gateway, which handles outbound IPv6 traffic at no additional charge.
    2. IPv4 for the control plane, IPv6 for the nodes:

      • Consider keeping IPv4 for the EKS control plane and its subnets while enabling IPv6 for worker nodes and application traffic. This reduces your IPv4 footprint while preserving the IPv4 connectivity the control plane needs.
    3. Reduce NAT Gateway cost:

      • If IPv4 connectivity is essential for some components, optimize your NAT Gateway usage:
        • For smaller workloads, use a NAT instance (an EC2 instance configured to perform NAT) instead of a managed NAT Gateway.
        • Use VPC endpoints to connect directly to AWS services, cutting the traffic that flows through the NAT Gateway.
    4. Watch for AWS updates:

      • Keep an eye on AWS announcements, since IPv6 support is gradually improving across services. Once EKS and the other services you rely on fully support IPv6, the transition will be much smoother.
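
    Applied to the VPC module from the question, the hybrid approach mostly amounts to flipping a few inputs (a sketch against terraform-aws-modules/vpc 5.x, with the remaining arguments kept as posted):

    module "vpc_and_subnets" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "5.13.0"

      # ... name, cidr, azs, subnets and IPv6 prefixes as in the question ...

      # outbound IPv6 is free of charge via the egress-only internet gateway
      enable_ipv6            = true
      create_egress_only_igw = true

      # keep IPv4 egress, but pay for exactly one NAT gateway instead of one per AZ
      enable_nat_gateway = true
      single_nat_gateway = true
    }

    The trade-off of `single_nat_gateway` is that IPv4 egress loses AZ redundancy, which is usually acceptable while the bulk of traffic moves to IPv6.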

    Terraform adjustments:

    • Make sure your Terraform setup manages the dual-stack subnets correctly and that you have a deliberate strategy for IPv4 connectivity.
    • If possible, configure a single NAT Gateway for the region to minimize cost.
    • Ensure the EKS module and any dependent resources (such as security groups) are configured to support the hybrid or dual-stack model.
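
    For the VPC-endpoint suggestion, a minimal sketch that keeps image pulls and S3 traffic off the NAT Gateway (resource names and the endpoint security group are hypothetical; the `module.vpc_and_subnets` outputs come from the question's config):

    data "aws_region" "current" {}

    # S3 gateway endpoint: no hourly charge, and ECR image layers are served from S3
    resource "aws_vpc_endpoint" "s3" {
      vpc_id            = module.vpc_and_subnets.vpc_id
      service_name      = "com.amazonaws.${data.aws_region.current.name}.s3"
      vpc_endpoint_type = "Gateway"
      route_table_ids   = module.vpc_and_subnets.private_route_table_ids
    }

    # ECR interface endpoints so nodes can pull images without crossing the NAT
    resource "aws_vpc_endpoint" "ecr" {
      for_each = toset(["ecr.api", "ecr.dkr"])

      vpc_id              = module.vpc_and_subnets.vpc_id
      service_name        = "com.amazonaws.${data.aws_region.current.name}.${each.value}"
      vpc_endpoint_type   = "Interface"
      subnet_ids          = module.vpc_and_subnets.private_subnets
      private_dns_enabled = true

      # hypothetical security group allowing 443 from the node subnets
      security_group_ids = [aws_security_group.vpc_endpoints.id]
    }

    Note that interface endpoints carry their own hourly per-AZ charge, so this mainly pays off when NAT Gateway data-processing fees dominate your bill.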