Centralized logging is (paradoxically) a necessity in the distributed systems world. How does one go about doing that?
This post covers some approaches we tried using the OpenTelemetry collector and AppDynamics (as the logging data store and visualization layer).
1. PULL model - Log aggregation through agents
- Made popular by the Java way of doing things
- Suitable for webserver/daemon kind of long running systems
- The agent needs permissions, could create READ locks, consumes resources, and can be a network hog
- Agents can process and filter the logs before sending, thereby reducing the load on the collector (see the sketch after this list)
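To make the filtering point concrete, here is a minimal promtail-style agent config sketch; the paths, labels, and Loki URL are placeholders, not our actual setup:

```yaml
# promtail agent: tail files, filter locally, ship the rest to Loki
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml   # remembers how far each file was read

clients:
  - url: http://loki:3100/loki/api/v1/push   # placeholder Loki endpoint

scrape_configs:
  - job_name: app-logs
    static_configs:
      - targets: [localhost]
        labels:
          job: app
          __path__: /var/log/app/*.log   # placeholder path glob
    pipeline_stages:
      - drop:
          expression: ".*DEBUG.*"        # drop noise before it hits the wire
```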
2. PUSH model
- Logs are shipped out by the applications using configurable SDKs and libraries
- Decoupled in nature
- Suitable for transient, run-anywhere, script kind of systems
- The intermediary itself can become a resource hog
We proceeded with the PUSH model, using Kafka as the intermediary, due to its simplicity and reliability
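A minimal sketch of what such a push handler looks like, assuming the kafka-python package; the broker address, topic, and JSON payload shape are illustrative, not our in-house python-kafka-log-handler:

```python
import json
import logging

from kafka import KafkaProducer  # assumes the kafka-python package


class KafkaLogHandler(logging.Handler):
    """Illustrative handler that PUSHes each log record to a Kafka topic."""

    def __init__(self, bootstrap_servers, topic):
        super().__init__()
        self.topic = topic
        self.producer = KafkaProducer(
            bootstrap_servers=bootstrap_servers,
            value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        )

    def emit(self, record):
        try:
            # Ship a small structured payload; downstream can parse it as JSON
            self.producer.send(self.topic, {
                "ts": record.created,
                "level": record.levelname,
                "logger": record.name,
                "message": self.format(record),
            })
        except Exception:
            self.handleError(record)


logger = logging.getLogger("app")
logger.addHandler(KafkaLogHandler("kafka:9092", "app-logs"))  # placeholder broker/topic
logger.setLevel(logging.INFO)
logger.info("shipped straight to Kafka")
```

The application stays decoupled: it only knows about a Kafka topic, and whatever sits downstream can change without touching the app.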
First cut
Tried our in-house python-kafka-log-handler, with promtail as the agent and Loki as the processor
- Didn't work: Loki could not EXPORT logs out in the OTel format
Second try
PUSH logs directly from Loki to the AppDynamics backend
- Didn't work: Loki and AppD don't speak the same language
One more try - Otel collector to the rescue
Avoid all intermediaries: the OTel collector supports Kafka as a receiver and OTLP/HTTP (otlphttp) as an exporter to push to the AppD backend (which is fully OTel compliant)
- This worked like a charm!
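The whole pipeline boils down to a small collector config. A sketch along these lines (the broker, topic, endpoint, and auth header are placeholders; AppD's actual ingestion URL and API-key header come from their docs):

```yaml
receivers:
  kafka:
    brokers: ["kafka:9092"]   # placeholder broker
    topic: app-logs           # same topic the applications push to
    encoding: json            # parse each Kafka message as a JSON log record

processors:
  batch:                      # batch records before export to cut request count

exporters:
  otlphttp:
    endpoint: https://appd.example.com/otlp      # placeholder AppD OTLP endpoint
    headers:
      x-api-key: ${env:APPD_API_KEY}             # placeholder auth header

service:
  pipelines:
    logs:
      receivers: [kafka]
      processors: [batch]
      exporters: [otlphttp]
```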
Footnotes:
- OpenTelemetry has two flavors of collectors: opentelemetry-collector, the minimal stable distribution, and opentelemetry-collector-contrib, which supports a ton of receivers and exporters
- opentelemetry-collector-contrib has Loki support as both a receiver and an exporter; but we could not figure out a simpler way of pushing logs from applications to Loki (the processor/aggregator)