Mike Tyson of the Cloud for Brainboard

Detect Infrastructure Drift with Brainboard

Managing infrastructure drift in cloud environments is a critical aspect of maintaining the reliability and security of cloud architectures.

Here's an overview of drift detection and remediation:

Understanding Infrastructure Drift:

  • Source of Truth: Modern cloud infrastructure is managed by multiple people, so it needs a single source of truth, such as Brainboard, Git, or local files, to track changes and troubleshoot errors effectively. Constant monitoring is needed to ensure the real infrastructure hasn't drifted from that source.
  • Definition of Drift: Drift occurs when the actual state of the deployed infrastructure diverges from the expected state described in the code. It typically happens when someone changes the deployed infrastructure directly or manually, outside the control of infrastructure-as-code (IaC) tools.
  • Types of Drift: There are two main types of drift: between environments (differences in deployed infrastructure across environments such as staging and production) and between the code and the infrastructure (differences between the provisioned cloud infrastructure and its code representation).
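
To make the definition concrete, here is a minimal sketch of drift as a difference between two state snapshots. It assumes, purely for illustration, that each state can be flattened into a dictionary of attributes; the function name `diff_state`, the `"<absent>"` marker, and the sample attributes are hypothetical and are not part of Brainboard.

```python
from typing import Any

ABSENT = "<absent>"  # hypothetical marker for an attribute that exists on one side only


def diff_state(expected: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {attribute: (expected_value, actual_value)} for every attribute that differs."""
    drift: dict[str, tuple[Any, Any]] = {}
    for key in sorted(expected.keys() | actual.keys()):
        exp = expected.get(key, ABSENT)
        act = actual.get(key, ABSENT)
        if exp != act:
            drift[key] = (exp, act)
    return drift


# Expected state as described in the code (illustrative values).
code_state = {"instance_type": "t3.micro", "port": 443}
# Actual state after a manual change in the provider's console (illustrative values).
deployed_state = {"instance_type": "t3.large", "port": 443, "public_ip": True}

drift = diff_state(code_state, deployed_state)
for attribute, (exp, act) in drift.items():
    print(f"{attribute}: expected {exp!r}, found {act!r}")
```

The same comparison covers both types of drift: pass the code and the deployed infrastructure to compare code against infrastructure, or pass the staging and production snapshots to compare environments.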

Drift Detection:

  • Overview: Brainboard offers two ways to detect drift in cloud infrastructure: a manual workflow and an automated, scheduled one.
  • Manual Workflow: To check for drift manually, go to the CI/CD page, create a new workflow or use a public template, add a drift detection task, and run the pipeline.
  • Scheduled Automatic Detection: To detect drift automatically, create a new workflow, open the workflow settings, activate the cron schedule, and enable notifications for drift detection failures.
  • Output and Best Practices: When drift is detected, the workflow is marked as failed. Scheduled automatic detection is good practice for both critical and non-critical workloads.
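
The "fail the workflow on drift" behaviour can be sketched outside Brainboard as well. The example below is not how Brainboard's drift detection task is implemented, which the article does not describe; it assumes a Terraform project and relies on the `-detailed-exitcode` flag of `terraform plan`, which returns 0 when there are no changes, 1 on error, and 2 when the plan contains changes. The function names are hypothetical.

```python
import subprocess


def classify_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return "in_sync"  # plan succeeded with no changes
    if code == 2:
        return "drift"    # plan succeeded and found differences
    return "error"        # the plan itself failed


def check_drift(workdir: str) -> str:
    """Run a plan in `workdir` (assumes terraform is installed and initialised there)."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)


def pipeline_exit_code(status: str) -> int:
    """Exit non-zero on drift or error so a scheduler marks the run as failed."""
    return 0 if status == "in_sync" else 1
```

A scheduled job could call `check_drift` on a cron schedule and exit with `pipeline_exit_code(status)`, so that any failure notification fires on drift as well as on errors.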

Remediation Strategies:

  • Override the Infrastructure: Redeploy the code that describes the infrastructure, either automatically within the drift detection workflow or manually after inspecting the output of the drift detection. Use automatic remediation cautiously; it typically requires coordination within the team.
  • Bring Changes to the Code: If the changes made to the provisioned infrastructure are legitimate (e.g., made during a security incident), incorporate them into the code. This approach fits situations where someone had to act immediately in the cloud provider's console.
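
The two strategies differ only in which side wins. As a rough sketch, again assuming each state is a plain dictionary of attributes and using a hypothetical `remediate` function, the choice looks like this:

```python
from typing import Any

State = dict[str, Any]


def remediate(code: State, infra: State, change_is_legitimate: bool) -> tuple[State, State]:
    """Return the (code, infrastructure) states after remediation.

    Legitimate change: bring it into the code, so the code adopts the infrastructure state.
    Otherwise: override the infrastructure by redeploying what the code describes.
    """
    winner = infra if change_is_legitimate else code
    return dict(winner), dict(winner)


# Illustrative values only.
code = {"ingress_cidr": "10.0.0.0/8"}
infra = {"ingress_cidr": "10.1.0.0/16"}  # tightened by hand during an incident

# The manual change was legitimate, so the code is updated to match.
new_code, new_infra = remediate(code, infra, change_is_legitimate=True)
```

Whether a change counts as legitimate is a human decision, which is one reason to inspect the drift detection output before remediating automatically.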

In summary, effectively managing drift involves having a clear source of truth, regularly detecting drift through manual or automated workflows, and implementing appropriate remediation strategies. These practices help maintain the integrity of the cloud infrastructure, ensuring it remains aligned with the defined code and architectural standards.
