Nowsath for AWS Community Builders


Troubleshooting: GitHub Actions + Terraform + EKS + Helm

In this post, I've compiled a list of issues I ran into while setting up an EKS Fargate cluster with Terraform and GitHub Actions, along with their solutions.

Error 1: Connect: connection refused for service account.



│ Error: Get "http://localhost/api/v1/namespaces/kube-system/serviceaccounts/aws-load-balancer-controller": dial tcp [::1]:80: connect: connection refused

│ 
│   with kubernetes_service_account.service-account,
│   on aws-alb-controller.tf line 50, in resource "kubernetes_service_account" "service-account":
│   50: resource "kubernetes_service_account" "service-account" {


Error 2: Connect: connection refused for namespace.



│ Error: Get "http://localhost/api/v1/namespaces/aws-observability": dial tcp [::1]:80: connect: connection refused
│ 
│   with kubernetes_namespace.aws_observability,
│   on namespaces.tf line 1, in resource "kubernetes_namespace" "aws_observability":
│    1: resource "kubernetes_namespace" "aws_observability" {




Error 3: Kubernetes cluster unreachable.



│ Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
│ 
│   with helm_release.alb-controller,
│   on aws-alb-controller.tf line 69, in resource "helm_release" "alb-controller":
│   69: resource "helm_release" "alb-controller" {




Solution for errors 1, 2, and 3:
These issues typically occur when you run Terraform again to update an EKS cluster after it was created with the EKS module.

If you've configured the Kubernetes provider as shown below, you're likely to hit these errors.



provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_name
}



To resolve this, you should configure the Kubernetes provider as follows:



provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_name
}



In this configuration, the Kubernetes provider's host and cluster_ca_certificate values are sourced directly from the outputs of the eks module.

  • Pre-computed Outputs: module.eks.cluster_endpoint and module.eks.cluster_certificate_authority_data are computed when the EKS module is applied and stored in state. Once the module is applied, these values remain stable unless the cluster configuration changes.

  • No Additional Data Source Calls: Since the endpoint and certificate authority data come from the module's outputs, Terraform doesn't need an extra API call to AWS to retrieve them on every terraform plan or apply. This matters because when those data sources can't be resolved during planning, the provider's host is unknown and it falls back to localhost, which is exactly the connection-refused error above.
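
If you also install charts with the Helm provider (as in Error 3), configure it the same way so it doesn't fall back to localhost. A minimal sketch using the Helm provider 2.x syntax, assuming the same eks module and data source as above:



provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}
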


Error 4: Missing required argument.



│ Error: Missing required argument
│ 
│   with data.aws_eks_cluster_auth.cluster,
│   on aws-alb-controller.tf line 21, in data "aws_eks_cluster_auth" "cluster":
│   21:   name = module.eks.cluster_id
│ 
│ The argument "name" is required, but no definition was found.



Solutions:

This error indicates that the name argument is required for the aws_eks_cluster_auth data source but is not being provided correctly: module.eks.cluster_id is either not returning a value, or the value it returns is not the name of the EKS cluster.

i. Check the output of module.eks.cluster_id: Ensure that module.eks.cluster_id actually resolves to the cluster name the data source expects. It should be an output of the EKS module; verify that the cluster_id output is defined in the module version you're using.



output "cluster_id" {
  value = module.eks.cluster_id
}



ii. Verify the EKS Module Version: Make sure you are using the correct version of the terraform-aws-modules/eks/aws module. Newer versions reworked the outputs (from v19 onward the cluster name is exposed as cluster_name), so an older snippet that relies on cluster_id may no longer work.

iii. Direct Reference: If the module output is correctly defined but still causing issues, reference the cluster_name output directly instead of cluster_id:



data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_name
}


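
To confirm which outputs your module version actually exposes, you can evaluate them interactively with terraform console after an apply (the value shown is a placeholder):



terraform console
> module.eks.cluster_name
"my-eks-cluster"
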

Error 5: Required plugins are not installed.




│ Error: Required plugins are not installed
│ 
│ The installed provider plugins are not consistent with the packages
│ selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/null: there is no package for registry.terraform.io/hashicorp/null 3.2.2 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/time: there is no package for registry.terraform.io/hashicorp/time 0.11.2 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/tls: there is no package for registry.terraform.io/hashicorp/tls 4.0.5 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/aws: there is no package for registry.terraform.io/hashicorp/aws 5.53.0 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/cloudinit: there is no package for registry.terraform.io/hashicorp/cloudinit 2.3.4 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/helm: there is no package for registry.terraform.io/hashicorp/helm 2.14.0 cached in .terraform/providers
│   - registry.terraform.io/hashicorp/kubernetes: there is no package for registry.terraform.io/hashicorp/kubernetes 2.31.0 cached in .terraform/providers
│ 
│ Terraform uses external plugins to integrate with a variety of different
│ infrastructure services. To download the plugins required for this
│ configuration, run:
│   terraform init



Solutions:

The error you're seeing indicates that Terraform's provider plugins are not correctly installed or do not match the versions specified in your dependency lock file (.terraform.lock.hcl). This can happen if you haven't run terraform init after modifying your Terraform configuration or if the plugins have been removed from the cache.

Here’s how to resolve this issue:

i. Run terraform init: The error message itself suggests running terraform init. This command initializes the Terraform working directory by downloading the required provider plugins and setting up the necessary backend.



terraform init



This should download and install all the required plugins based on your configuration and lock file.

ii. Force Re-initialization: If running terraform init alone does not solve the problem, you can try forcing a reinitialization, which will re-download the providers and update the .terraform directory:



terraform init -upgrade



The -upgrade flag makes Terraform select the newest provider versions allowed by your version constraints and update both the local plugin cache and the dependency lock file accordingly.

iii. Clear and Re-initialize the Terraform Directory: If the above steps still don’t resolve the issue, you may want to clear out the .terraform directory and reinitialize from scratch:



rm -rf .terraform
terraform init



This deletes the existing .terraform directory (where Terraform caches provider plugins and modules and keeps backend configuration data; your infrastructure state itself lives in the configured backend, not here) and then reinitializes the working directory.
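
If this error shows up only in CI (for example, on a Linux GitHub Actions runner while you develop on macOS), the lock file may simply be missing checksums for the runner's platform. You can record hashes for several platforms at once; the platforms below are examples, so adjust them to yours:



# Add provider checksums for both your workstation and the CI runner
terraform providers lock \
  -platform=darwin_arm64 \
  -platform=linux_amd64
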


Error 6: Backend configuration changed.



│ Error: Backend configuration changed
│ 
│ A change in the backend configuration has been detected, which may require migrating existing state.
│ 
│ If you wish to attempt automatic migration of the state, use "terraform init -migrate-state".
│ If you wish to store the current configuration with no changes to the state, use "terraform init -reconfigure".



Solutions:

The error message you're seeing indicates that Terraform has detected a change in your backend configuration. The backend is where Terraform stores the state file, which keeps track of your infrastructure resources. When the backend configuration changes, Terraform needs to decide how to handle the existing state.

You have two options depending on your situation:
i. Automatic State Migration:

  • When to Use: If you've changed the backend configuration (e.g., switched from local state to a remote backend like S3 or changed the backend's parameters), and you want Terraform to automatically migrate your existing state to the new backend.


terraform init -migrate-state


  • What It Does: This command will attempt to migrate your existing state file to the new backend. It's a safe option if you're confident that the backend change is intentional and you want to keep your infrastructure's state intact in the new location.

ii. Reconfigure Without Migrating State:

  • When to Use: If you made changes to the backend configuration but do not need to migrate the state (e.g., you're working in a different environment, or you've changed a configuration that doesn't affect the state).


terraform init -reconfigure


  • What It Does: This command reconfigures the backend without migrating the state. It is useful if you're setting up Terraform in a new environment or if the state migration is unnecessary or handled separately.
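
For reference, this check fires when anything inside the backend block differs from what was recorded at the last init. A hypothetical S3 backend (all names below are placeholders); changing the bucket, key, region, or dynamodb_table here would trigger the error:



terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "eks/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
  }
}
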

Error 7: Helm release "" was created but has a failed status.



│ Warning: Helm release "" was created but has a failed status. Use the `helm` command to investigate the error, correct it, then run Terraform again.
│ 
│   with helm_release.alb-controller,
│   on aws-alb-controller.tf line 75, in resource "helm_release" "alb-controller":
│   75: resource "helm_release" "alb-controller" {


│ Error: context deadline exceeded
│ 
│   with helm_release.alb-controller,
│   on aws-alb-controller.tf line 75, in resource "helm_release" "alb-controller":
│   75: resource "helm_release" "alb-controller" {



Solutions:

The error you're encountering indicates that the Helm release for the ALB (Application Load Balancer) controller was created but ended up in a failed state. Additionally, the "context deadline exceeded" error suggests that the operation took too long to complete, possibly due to network issues, resource constraints, or misconfiguration.

To fix this, run terraform apply again: Terraform marks the failed release as tainted and automatically replaces helm_release.alb-controller on the next run. If the failure was a timeout, you can also raise the timeout argument on the helm_release resource (the default is 300 seconds).
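
Before re-applying, you can dig into why the release failed using Helm and kubectl. The release name and namespace below are the ALB controller chart defaults, which is an assumption about your setup:



# List all releases, including failed ones
helm list --all-namespaces --all

# Show the status of the suspect release
helm status aws-load-balancer-controller -n kube-system

# Check whether the controller pods actually scheduled (on Fargate this
# often surfaces as pods stuck in Pending)
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
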



Error 8: No value for required variable.



│ Error: No value for required variable
│ 
│   on variables.tf line 85:
│   85: variable "route53_access_key" {
│ 
│ The root module input variable "route53_access_key" is not set, and has no
│ default value. Use a -var or -var-file command line argument to provide a
│ value for this variable.



Solutions:

The error message indicates that the Terraform configuration requires an input variable named route53_access_key, which has not been given a value and has no default in your variables.tf file. You can supply the value in any of the ways shown below.
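
A few ways to provide it (the placeholder value is yours to fill in; in GitHub Actions, the environment-variable form pairs naturally with a repository secret):



# Option 1: TF_VAR_-prefixed environment variable
export TF_VAR_route53_access_key="<your-access-key>"
terraform plan

# Option 2: pass it inline
terraform plan -var 'route53_access_key=<your-access-key>'

# Option 3: a *.auto.tfvars file that Terraform loads automatically
echo 'route53_access_key = "<your-access-key>"' > secrets.auto.tfvars
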


Error 9: InvalidArgument: The parameter Origin DomainName does not refer to a valid S3 bucket.



│ Error: updating CloudFront Distribution (E1B4377XXXX): operation error CloudFront: UpdateDistribution, https response error StatusCode: 400, RequestID: 0b25a733-e79-4xx7-bxx6-5exxx1cb405, InvalidArgument: The parameter Origin DomainName does not refer to a valid S3 bucket.
│ 
│   with aws_cloudfront_distribution.cms_cf,
│   on cloudfronts.tf line 1, in resource "aws_cloudfront_distribution" "cms_cf":
│    1: resource "aws_cloudfront_distribution" "cms_cf" {



Solutions:

The error message indicates that the Origin DomainName specified in your CloudFront distribution configuration does not point to a valid S3 bucket. This commonly happens when:

i. Incorrect S3 bucket name: Ensure that the S3 bucket name you are using in your CloudFront distribution is correct and exists.

ii. S3 bucket URL format: The Origin DomainName for an S3 bucket should follow one of these patterns:

  • Global endpoint: bucket-name.s3.amazonaws.com
  • Regional endpoint: bucket-name.s3.<region>.amazonaws.com

iii. Bucket permissions: Ensure the S3 bucket has the correct permissions to allow access from CloudFront. The bucket policy should allow CloudFront access.

You can check the origin block in your CloudFront resource, which should look like this:



resource "aws_cloudfront_distribution" "cms_cf" {
  origin {
    domain_name = "your-bucket-name.s3.amazonaws.com"
    origin_id   = "S3-your-bucket-name"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.example.cloudfront_access_identity_path
    }
  }

  // Other CloudFront configuration
}


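
Rather than hard-coding the domain, you can reference the bucket resource's regional domain name, which avoids typos. A sketch, assuming a bucket resource named aws_s3_bucket.cms (the name is a placeholder):



  origin {
    # bucket_regional_domain_name resolves to bucket-name.s3.<region>.amazonaws.com
    domain_name = aws_s3_bucket.cms.bucket_regional_domain_name
    origin_id   = "S3-${aws_s3_bucket.cms.id}"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.example.cloudfront_access_identity_path
    }
  }
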

Error 10: Reference to undeclared input variable.



│ Error: Reference to undeclared input variable
│ 
│   on eks.tf line 65, in module "eks":
│   65:       namespace = "${var.namespace[0]}"
│ 
│ An input variable with the name "namespace" has not been declared. Did you mean "namespaces"?



Solutions:

As the error message indicates, eks.tf references a variable named namespace that has not been declared.

Check whether the intended variable is namespace or namespaces, and adjust the reference and the declaration to match. Also ensure the variable is declared in variables.tf or in the same file where it's used, as in the sketch below.
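
A minimal declaration, assuming the plural namespaces was intended (the type and default here are assumptions; match them to how eks.tf indexes the value, e.g. var.namespaces[0]):



variable "namespaces" {
  description = "Namespaces referenced by the EKS module"
  type        = list(string)
  default     = ["aws-observability"]
}
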


Thanks for reading! I'll update this post regularly. Please share any other issues in the comments. 🤝😊
