A logging strategy in Kubernetes

Traditionally we store logs on the server's drives, but for containers running in pods, keeping logs inside the pod is not a good idea: the logs are lost if the pod is deleted.
Storing logs inside pods in a Kubernetes cluster can be problematic because, by default, logs are ephemeral and will be lost if the pod is deleted. This behavior is due to the nature of containers and pods in Kubernetes, which prioritize statelessness and easy scaling.

To address this issue and ensure log persistence and centralization in a Kubernetes environment, it's common to implement a logging strategy that involves the following components:

  • Log Aggregation and Forwarding: Use a log shipper or agent like Fluentd, Fluent Bit, Filebeat, or others to collect and forward logs from containers running in pods to a centralized logging destination. These agents can continuously stream logs, ensuring that they are captured even if a pod is deleted.
  • Centralized Log Storage: Store logs in a centralized location, such as Elasticsearch, a dedicated log storage service, or a cloud-based log management solution. Centralization makes it easier to search, analyze, and retain logs for longer periods.
  • Log Rotation and Retention: Implement log rotation and retention policies within your log aggregator to manage log file sizes and retention periods. This ensures that you don't exhaust storage resources with excessive logs.
  • Security: Ensure that your log forwarding and storage are secure. Use encryption and access controls to protect sensitive log data.
  • Monitoring and Alerting: Set up monitoring and alerting for your log aggregation system to proactively identify issues or anomalies in your applications.

By implementing this logging strategy, you can address the challenge of losing logs when pods are deleted and maintain a centralized and persistent log storage solution that is more suitable for containerized environments like Kubernetes.

Centralized logging with Fluentd, Elasticsearch, and Kibana (EFK stack)

Here's a step-by-step example of how to implement centralized logging in a Kubernetes cluster using Fluentd, Elasticsearch, and Kibana (EFK stack) to address the issue of logs being lost when pods are deleted:

  1. Set Up Elasticsearch and Kibana:
  • Deploy Elasticsearch and Kibana in your Kubernetes cluster using YAML manifests or Helm charts. Ensure that they are accessible and properly configured for your environment.
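  For illustration, a minimal single-node Elasticsearch Deployment and Service might look like the sketch below (the image tag, logging namespace, and single-node setup are assumptions for a demo; a production cluster needs persistent storage, more memory, and security settings). Kibana can be deployed the same way, pointed at this Service.
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: elasticsearch
     namespace: logging
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: elasticsearch
     template:
       metadata:
         labels:
           app: elasticsearch
       spec:
         containers:
         - name: elasticsearch
           image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
           env:
           # Single-node mode for demo purposes only
           - name: discovery.type
             value: single-node
           ports:
           - containerPort: 9200
   ---
   apiVersion: v1
   kind: Service
   metadata:
     name: elasticsearch
     namespace: logging
   spec:
     selector:
       app: elasticsearch
     ports:
     - port: 9200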
  2. Deploy Fluentd DaemonSet:
  • Create a Fluentd DaemonSet in your Kubernetes cluster. This DaemonSet ensures that a Fluentd pod runs on every node in your cluster and collects logs from all running pods.
   apiVersion: apps/v1
   kind: DaemonSet
   metadata:
     name: fluentd
     namespace: logging
   spec:
     selector:
       matchLabels:
         app: fluentd-logging
     template:
       metadata:
         labels:
           app: fluentd-logging
       spec:
         containers:
         - name: fluentd
           image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
           env:
           # Elasticsearch endpoint; the service name here is an assumption,
           # adjust it to match your cluster
           - name: FLUENT_ELASTICSEARCH_HOST
             value: "elasticsearch.logging.svc.cluster.local"
           - name: FLUENT_ELASTICSEARCH_PORT
             value: "9200"
           volumeMounts:
           # Container stdout/stderr logs are written by the node under /var/log
           - name: varlog
             mountPath: /var/log
         volumes:
         - name: varlog
           hostPath:
             path: /var/log
  3. Configure Fluentd for Log Forwarding:
  • In your Fluentd configuration, specify the Elasticsearch endpoint where logs should be sent. Fluentd will collect logs from all pods and forward them to Elasticsearch.
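  If you manage the Fluentd configuration yourself (rather than relying on the FLUENT_ELASTICSEARCH_* environment variables baked into the image), the output section might look like this sketch, mounted into the DaemonSet from a ConfigMap; the ConfigMap name and the Elasticsearch service address are assumptions:
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: fluentd-config
     namespace: logging
   data:
     fluent.conf: |
       # Forward everything Fluentd collects to Elasticsearch
       <match **>
         @type elasticsearch
         host elasticsearch.logging.svc.cluster.local
         port 9200
         logstash_format true  # write daily logstash-YYYY.MM.DD indices
       </match>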
  4. Deploy Applications with Log Configuration:
  • Deploy your applications in Kubernetes pods. Ensure that your applications are configured to write logs to stdout and stderr. This is a common practice in containerized applications.
  5. Monitor Logs in Kibana:
  • Access Kibana to monitor and analyze your logs. Create visualizations and dashboards to gain insights from your log data.
  6. Log Retention and Management:
  • Implement log retention policies in Elasticsearch to manage log data storage and cleanup.
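  One way to do this is with an Elasticsearch Curator action file that deletes old daily indices, as in the sketch below; the 14-day window and the logstash- index prefix are assumptions, and on recent Elasticsearch versions the built-in index lifecycle management (ILM) feature is an alternative:
   actions:
     1:
       action: delete_indices
       description: Delete log indices older than 14 days
       options:
         ignore_empty_list: True
       filters:
       - filtertype: pattern
         kind: prefix
         value: logstash-
       - filtertype: age
         source: name
         direction: older
         timestring: '%Y.%m.%d'
         unit: days
         unit_count: 14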

With this setup, Fluentd collects logs from all running pods, including logs from your applications. These logs are forwarded to Elasticsearch, which acts as the central log storage and indexing engine. Kibana provides a user-friendly interface for searching, analyzing, and visualizing your log data.

The key advantage of this approach is that logs are not lost when pods are deleted or recreated. They are continuously collected and stored centrally, making it easier to troubleshoot issues, monitor application behavior, and gain insights into your Kubernetes environment.

Please note that the specific configurations may vary depending on your Kubernetes cluster and the versions of the components you use. Be sure to refer to official documentation and guides for each component for detailed setup instructions tailored to your environment.

Configure your application

Here's how you can adapt a Spring Boot application for this setup:

  1. Use Spring Boot Logging Framework:

    • Spring Boot ships with a built-in logging abstraction whose default backend is Logback (Log4j2 and Java Util Logging are also supported). Ensure that your application uses one of these frameworks, as they can all be configured to log to stdout.
  2. Configure Logback for Stdout (if applicable):

    • If your Spring Boot application uses Logback for logging, configure it to log to stdout by specifying an appropriate Logback configuration file (e.g., logback-spring.xml).
   <configuration>
     <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
       <encoder>
         <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
       </encoder>
     </appender>
     <root level="INFO">
       <appender-ref ref="STDOUT" />
     </root>
   </configuration>

This configuration directs all log output to stdout. Note that Spring Boot's default Logback setup already logs to the console, so a custom file like this is mainly needed when you want to control the log pattern.

  3. Use Structured Logging (Optional):

     • Consider using structured logging in your Spring Boot application to format log entries as JSON objects. This makes it much easier for Fluentd to parse and forward logs to Elasticsearch.
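     A common way to do this (assuming you can add a dependency to your build) is the logstash-logback-encoder library: add net.logstash.logback:logstash-logback-encoder and switch the console appender's encoder in logback-spring.xml:
   <configuration>
     <appender name="JSON_STDOUT" class="ch.qos.logback.core.ConsoleAppender">
       <!-- Emits each log event as a single JSON object per line on stdout -->
       <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
     </appender>
     <root level="INFO">
       <appender-ref ref="JSON_STDOUT" />
     </root>
   </configuration>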
  4. Test Logging Configuration Locally:

    • Before deploying your Spring Boot application to Kubernetes, test its logging configuration locally to ensure that logs are appearing in the console or stdout as expected.
  5. Dockerize Your Spring Boot Application:

    • Create a Docker image for your Spring Boot application, ensuring that it includes the necessary configurations, dependencies, and libraries. Be sure to base your Docker image on a suitable base image with a Java runtime.
  6. Deploy to Kubernetes:

    • Deploy your Dockerized Spring Boot application to Kubernetes as a pod or deployment. Ensure that the Fluentd DaemonSet is running in your Kubernetes cluster to collect and forward logs.
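     A minimal Deployment might look like the sketch below; the image name and container port are placeholders. Note that nothing logging-specific is needed in the manifest: because the application writes to stdout, the node-level Fluentd DaemonSet picks up its logs automatically.
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: my-spring-boot-app
   spec:
     replicas: 2
     selector:
       matchLabels:
         app: my-spring-boot-app
     template:
       metadata:
         labels:
           app: my-spring-boot-app
       spec:
         containers:
         - name: app
           image: my-spring-boot-app:latest  # placeholder image name
           ports:
           - containerPort: 8080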
  7. Monitor Logs in Kibana:

    • Access Kibana to monitor and analyze logs generated by your Spring Boot application. Create visualizations and dashboards to gain insights into your application's behavior.

By configuring your Spring Boot application to log to stdout and following best practices for containerized logging, you'll be able to seamlessly integrate it into the Fluentd-based centralized logging setup in Kubernetes. This allows for efficient log collection and centralized storage and analysis using Elasticsearch and Kibana.

A log shipper or agent

When the application only logs to stdout, how do other services collect those logs? What actually collects them?

When you configure your application to log to standard output (stdout) within a containerized environment like Kubernetes, you typically rely on a log collector or agent to collect those logs and forward them to a centralized logging system. One common choice for log collection in Kubernetes is to use a log shipper or agent like Fluentd, Fluent Bit, or Filebeat. Here's how it works:

  1. Application Configuration:

    • Your Spring Boot application is configured to log messages to stdout. This is a standard practice for containerized applications.
  2. Log Shipper/Agent:

    • You deploy a log shipper or agent (e.g., Fluentd, Fluent Bit, or Filebeat), either as a DaemonSet that runs one agent per node or as a sidecar container in the same pod as your application. For stdout logs, the container runtime writes each container's output to files under /var/log/containers on the node, and a node-level agent tails those files.
  3. Log Forwarding:

    • The log shipper/agent continuously reads the captured log output and forwards it to a centralized logging destination. This destination can be an Elasticsearch cluster, a log management service, or another log aggregation system.
  4. Centralized Logging System:

    • The centralized logging system, such as Elasticsearch with Kibana (the EFK stack) or a cloud-based log management service, stores and indexes the log data.
  5. Visualization and Analysis:

    • You can access the centralized logging system to search, analyze, and visualize your log data using tools like Kibana or dedicated log management dashboards.

Here's a simplified example of a Kubernetes Pod with an application container and a log shipper/agent sidecar container. Because one container cannot read another's stdout directly, the sidecar pattern typically tails log files that the application writes to a shared volume (stdout logs, by contrast, are collected from the node by a DaemonSet):

apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
spec:
  containers:
    - name: my-app-container
      image: my-spring-boot-app:latest
      # The application writes its log file into the shared volume below;
      # a sidecar cannot read another container's stdout directly
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app

    - name: log-shipper
      image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
      # The sidecar tails files in the shared volume and forwards them
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true

  volumes:
    # Scratch volume shared by both containers, deleted with the pod
    - name: app-logs
      emptyDir: {}

In this setup, the log shipper/agent sidecar collects the Spring Boot application's log files from the shared volume and forwards them to the designated logging destination, which could be Elasticsearch. The centralized logging system then indexes and stores these logs for further analysis.

This approach allows you to collect and centralize logs from multiple pods and containers in your Kubernetes cluster, making it easier to manage and analyze log data effectively.

Fluentd and Elasticsearch

  1. Fluentd:

    • Fluentd is a log collector and log forwarding agent.
    • Its primary role is to collect log data from various sources, including the standard output (stdout) of applications running in containers.
    • Fluentd can transform and enrich log data and then forward it to different destinations, such as Elasticsearch, databases, or cloud-based log management services.
    • In Kubernetes, Fluentd is commonly used to collect logs from containers and forward them to Elasticsearch for storage and indexing.
  2. Elasticsearch:

    • Elasticsearch is a distributed, RESTful search and analytics engine.
    • It's designed for storing, searching, and analyzing large volumes of data, including log data.
    • Elasticsearch is often used as a central repository for log storage and indexing. It can efficiently index and query logs, making it suitable for log analysis and searching.
    • Elasticsearch can be combined with Kibana, a data visualization and exploration tool, to create dashboards and visualizations for log data.

In a typical log aggregation and analysis setup:

  • Fluentd is responsible for collecting and forwarding log data from various sources (e.g., containers) to a central destination, such as Elasticsearch.
  • Elasticsearch is used as the central storage and indexing system for log data. It stores logs in a structured format, making them easily searchable and analyzable.
  • Kibana can be used to interactively visualize and explore log data stored in Elasticsearch, allowing users to create dashboards, perform queries, and gain insights from the log information.

Together, Fluentd and Elasticsearch form a common combination for log management in Kubernetes environments. Fluentd collects, transforms, and forwards logs, while Elasticsearch stores, indexes, and allows you to search and analyze those logs effectively.
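
As a small example of the "transform and enrich" step, Fluentd's built-in record_transformer filter can add fields to every record before it is forwarded; the tag pattern and the field value below are placeholders:

# Add a static field to every record whose tag starts with "kubernetes."
<filter kubernetes.**>
  @type record_transformer
  <record>
    cluster "production-cluster"
  </record>
</filter>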
