Shuhei Kitagawa

Posted on Apr 17, 2023

Kubernetes Event Aggregation and Spam Filtering in client-go

#kubernetes #cloudnative #go

TL;DR

clint-go has features for event aggregation, which groups similar events into one, and spam filtering, which applies rate limits to events.
Event aggregation uses an aggregation key generated by the EventAggregatorByReasonFunc as a key, and if the same event is published 10 or more times within 10 minutes, instead of posting a new event, update the existing to save etcd capacity.
Spam filtering performs using the value generated by the getSpamKey function as a key. The spam filter uses a rate limiter based on the token bucket algorithm, with an initial 25 tokens and a refill rate of 1 token per 5 minutes. If an event is published beyond the limit, the event is discarded.

Background

When I wanted to create a CustomController that detects the completion of a child Job of a CronJob and performs an action, I noticed that CronJobController (v2) published an Event called SawCompletedJob when the child Job completed. I thought, "Oh, I can simply use the Event as a cue."

However, when I created a CronJob that ran once a minute in a local cluster (kind), I discovered that a SawCompletedJob event became unobservable from the middle of the run (about the 10th run). Specifically, the Events looked like the ones below.

$ kubectl alpha events --for cronjob/hello
...
8m29s (x2 over 8m29s)   Normal   SawCompletedJob    CronJob/hello   Saw completed job: hello-28023907, status: Complete
8m29s                   Normal   SuccessfulDelete   CronJob/hello   Deleted job hello-28023904
7m35s                   Normal   SuccessfulCreate   CronJob/hello   Created job hello-28023908
7m28s (x2 over 7m28s)   Normal   SawCompletedJob    CronJob/hello   Saw completed job: hello-28023908, status: Complete
7m28s                   Normal   SuccessfulDelete   CronJob/hello   Deleted job hello-28023905
6m35s                   Normal   SuccessfulCreate   CronJob/hello   Created job hello-28023909
6m28s                   Normal   SawCompletedJob    CronJob/hello   Saw completed job: hello-28023909, status: Complete
2m35s (x3 over 4m35s)   Normal   SuccessfulCreate   CronJob/hello   (combined from similar events): Created job hello-28023913

SawCompletedJob events were published until 6m28s, but they became unobservable after that. SuccessfulCreate events were published in an aggregated form, but we cannot see all of them. By the way, all events from child Jobs and child Pods are observable.

$ kubectl alpha events

4m18s                 Normal    Scheduled                 Pod/hello-28023914-frb94   Successfully assigned default/hello-28023914-frb94 to kind-control-plane
4m11s                 Normal    Completed                 Job/hello-28023914         Job completed
3m18s                 Normal    Started                   Pod/hello-28023915-5fsh5   Started container hello
3m18s                 Normal    SuccessfulCreate          Job/hello-28023915         Created pod: hello-28023915-5fsh5
3m18s                 Normal    Created                   Pod/hello-28023915-5fsh5   Created container hello
3m18s                 Normal    Pulled                    Pod/hello-28023915-5fsh5   Container image "busybox:1.28" already present on machine
3m18s                 Normal    Scheduled                 Pod/hello-28023915-5fsh5   Successfully assigned default/hello-28023915-5fsh5 to kind-control-plane
3m11s                 Normal    Completed                 Job/hello-28023915         Job completed
2m18s                 Normal    Started                   Pod/hello-28023916-qbqqk   Started container hello
2m18s                 Normal    Pulled                    Pod/hello-28023916-qbqqk   Container image "busybox:1.28" already present on machine
2m18s                 Normal    Created                   Pod/hello-28023916-qbqqk   Created container hello
2m18s                 Normal    SuccessfulCreate          Job/hello-28023916         Created pod: hello-28023916-qbqqk
2m18s                 Normal    Scheduled                 Pod/hello-28023916-qbqqk   Successfully assigned default/hello-28023916-qbqqk to kind-control-plane
2m11s                 Normal    Completed                 Job/hello-28023916         Job completed
78s                   Normal    SuccessfulCreate          Job/hello-28023917         Created pod: hello-28023917-kpxvn
78s                   Normal    Created                   Pod/hello-28023917-kpxvn   Created container hello
78s                   Normal    Pulled                    Pod/hello-28023917-kpxvn   Container image "busybox:1.28" already present on machine
78s                   Normal    Started                   Pod/hello-28023917-kpxvn   Started container hello
78s                   Normal    Scheduled                 Pod/hello-28023917-kpxvn   Successfully assigned default/hello-28023917-kpxvn to kind-control-plane
71s                   Normal    Completed                 Job/hello-28023917         Job completed
18s (x8 over 7m18s)   Normal    SuccessfulCreate          CronJob/hello              (combined from similar events): Created job hello-28023918
18s                   Normal    Started                   Pod/hello-28023918-grvbz   Started container hello
18s                   Normal    Created                   Pod/hello-28023918-grvbz   Created container hello
18s                   Normal    Pulled                    Pod/hello-28023918-grvbz   Container image "busybox:1.28" already present on machine
18s                   Normal    SuccessfulCreate          Job/hello-28023918         Created pod: hello-28023918-grvbz
18s                   Normal    Scheduled                 Pod/hello-28023918-grvbz   Successfully assigned default/hello-28023918-grvbz to kind-control-plane
11s                   Normal    Completed                 Job/hello-28023918         Job completed

As I couldn't find any specific description of this behavior in the Kubernetes official documentation, I investigated it by reading the source code to figure out what was happening.

Kubernetes Event Publication Flow

client-go sends Kubernetes Events to kube-apiserver through and etcd stores them. client-go is responsible for event aggregation and spam filtering, but the flow of client-go sending Events to kube-apiserver is quite complex, so we will explain it first. Note that we will use the SawCompleteJob Event of CronJobController as an example, but please note that the details may vary on each controller.

CronJobController publishes an Event via the Recorder's Eventf method. The method internally calls the Broadcaster's ActionOrDrop method and sends the Event to the incoming channel.
The Event in the incoming channel is retrieved by the loop goroutine and forwarded to each Watcher's result channel.
The Event Watcher goroutine calls the eventHandler for the Event received from the result channel. The eventHandler calls recordToSink in the eventBroadcasterImpl, where EventCorrelator performs event aggregation and spam filtering and then calls recordEvent to post an Event (or update the Event if it has been aggregated).

Note: Starting loop goroutine

The loop goroutine starts through the NewLongQueueBroadcaster function, which is called through the NewBroadcaster function. The NewBroadcaster function is called in the NewControllerV2 function.

Note: Starting Event Watcher

The CronJobController calls the StartRecordingToSink method of eventBroadcasterImpl, which starts the Event Watcher from the StartEventWatcher method. The StartEventWatcher method initializes and registers the Watcher through the Watch method of the Broadcaster. What I found interesting is that the registration process of the Watcher itself is sent to the incoming channel, and the loop goroutine executes it, making the events published before the start of the Watcher invisible to it (the comment calls it as a "terrible hack" though).

EventCorrelator

EventCorrelator implements the core logic for Kubernetes Event aggregation and spam filtering. EventCorrelator is initialized through the NewEventCorrelatorWithOptions function called in the StartRecordingToSink method of eventBroadcasterImpl. Note that e.options is empty, so the Controller uses the default values. eventBroadcasterImpl's recordToSink method calls EventCorrelate method, which aggregates Events and applies tge spam filter.

Aggregation

The EventCorrelator's EventAggregate method and the eventObserve method of eventLogger are used for Event aggregation. The source code has detailed comments, so it's recommended to refer to it directly for more information, but here's a brief overview of the process:

Calculate aggregationKey and localKey using EventAggregatorByReasonFunc. aggregationKey consists of event.Source, event.InvolvedObject, event.Type, event.Reason, event.ReportingController, and event.ReportingInstance, while localKey is event.Message.
Search the cache of EventAggregator using the the aggregationKey. The cache value is aggregateRecord. If the number of the localKeys within the maxIntervalInSeconds (default 600 seconds) is greater than or equal to the maxEvents (default 10), return aggregationKey as the key, otherwise return the eventKey as the key. Note that the eventKey should be unique to each Event.
Call the eventObserve method of eventLogger with the key returned from the EventCorrelate method, and search the cache of eventLogger. The cache value is eventLog. If it hits the cache (i.e., the key is aggregationKey), compute the patch to update the Event.

Spam Filtering

For spam filtering, the filterFunc of EventCorrelator is called to apply it. The actual implementation of filterFunc is the Filter method of EventSourceObjectSpamFilter. Again, it's recommended to refer to the source code for details, but here's a brief overview of the process:

Calculate the eventKey from the Event using the getSpamKey function.
Search the cache of EventSourceObjectSpamFilter using the eventKey. The cache value is spamRecord, which contains the rate limiter. The default values for qps and burst of the rate limiter are 1/300 and 25, respectively. According to the comment, the rate limiter uses the token bucket algorithm, so there are initially 25 tokens, and then one token is refilled every 5 minutes.
Call the TryAccept method of tokenBucketPassiveRateLimiter to check the rate limit. If it exceeds it, discard the Event.

Why did SawCompletedJob Event become unobservable?

Taking the above into consideration, let's think about why the SawCompleteJob Event became unobservable. In short, it is likely due to be caused by the spam filter.

CronJobController issues three Events, SuccessfulCreate, SawCompletedJob, and SuccessfulDelete, per child Job every minute (Strictly speaking, it publishes SuccessfulDelete only when it reaches the HistoryLimit).
The Controller uses a spam filter whose the key is solely based on the Source and InvolvedObject (See getSpamKey function). Therefore, these thee types of Events are identified as the same Event.
The Controller consumed the first 25 tokens at a rate of three tokens per minute. One token is refilled every five minutes, but around nine minutes, the tokens started running out. After that, A toke was refilled every five minutes, but it was consumed by a (aggregated) SuccessfulCreate Event, so SawCompletedJob and SuccessfulDelete were never published thereafter.

Note: Event Types

I'll list SuccessfulCreate、 SawCompletedJob、SuccessfulDelete events' involvedObject and source below.

SuccessfulCreate Event

apiVersion: v1
kind: Event
involvedObject:
  apiVersion: batch/v1
  kind: CronJob
  name: hello
  namespace: default
  resourceVersion: "289520"
  uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
  component: cronjob-controller
...
reason: SuccessfulCreate
message: Created job hello-28025408

SawCompletedJob Event

apiVersion: v1
kind: Event
involvedObject:
  apiVersion: batch/v1
  kind: CronJob
  name: hello
  namespace: default
  resourceVersion: "289020"
  uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
  component: cronjob-controller
...
reason: SawCompletedJob
message: 'Saw completed job: hello-28025408, status: Complete'

SuccessfulDelete Event

apiVersion: v1
kind: Event
involvedObject:
  apiVersion: batch/v1
  kind: CronJob
  name: hello
  namespace: default
  resourceVersion: "289520"
  uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
  component: cronjob-controller
...
reason: SuccessfulDelete
message: Deleted job hello-28025408

DEV Community