Hey community,
I have been using otel-collector for my org's observability (x Tbs/day) in a k8s setup for some time. Here is my experience so far. Did you see the same issues, or was it different for you, and how did you overcome them?
Long term (6+ months of use):
Poor data-loss detection. I have been losing data, with no good way to see it. The agent/collector pods print error logs, but since the pipeline itself is what is failing, those logs never reach the log system.
No UI to view/monitor my existing connections, and no pick-and-drop functionality.
No easy way to inject transformers. For example, I need to change the format of some data for SIEM/Snowflake and drop some log data, the way Cribl does. I should be able to do that within otel itself.
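On the data-loss point, the closest thing to a workaround I can sketch is to scrape the collector's own metrics endpoint out of band, so the check does not depend on the broken pipeline. This is a rough sketch, not a definitive setup. It assumes the collector exposes Prometheus-format self-telemetry on port 8888 (the usual default), and that loss shows up in counters whose names contain `send_failed`, `enqueue_failed`, or `refused`. Exact metric names vary by collector version, so treat the markers and the URL as assumptions to adjust.

```python
import urllib.request

# Substrings of collector self-telemetry counter names that hint at data loss.
# Assumption: exact names differ between collector versions, so match loosely.
LOSS_MARKERS = ("send_failed", "enqueue_failed", "refused")


def loss_counters(metrics_text):
    """Return {series: value} for non-zero loss-related counters.

    Expects Prometheus text exposition format without trailing timestamps.
    """
    found = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like '<name>{labels} <value>'.
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        if not any(marker in name for marker in LOSS_MARKERS):
            continue
        try:
            number = float(value)
        except ValueError:
            continue  # not a numeric sample, ignore it
        if number > 0:
            found[series] = number
    return found


def fetch_metrics(url="http://localhost:8888/metrics", timeout=5):
    """Fetch the raw metrics text. The URL is an assumed default."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `loss_counters(fetch_metrics())` from a cron job or sidecar and alerting on a non-empty result would at least surface failed or refused records, though it still does not tell you which data was lost.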
Short term (during setup):
No gRPC-native load balancer in otel. Horizontal scaling became an issue: the agent talks gRPC, and with no native gRPC load balancer operating directly over otel, I ended up oversizing my clusters unnecessarily.
Distributed tracing needs more automation. I had to manually stitch traces together in various places.
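To illustrate what manual stitching tends to involve, here is a minimal sketch of carrying trace context across one hop by hand. It assumes the W3C Trace Context `traceparent` header format (`version-traceid-spanid-flags`). The helper name `child_traceparent` is hypothetical, and it only builds the outgoing header; it does not record or export any spans.

```python
import re
import secrets

# W3C traceparent: 2-hex version, 32-hex trace id, 16-hex parent span id, 2-hex flags.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def child_traceparent(incoming=None):
    """Build the traceparent header for an outgoing call (hypothetical helper).

    Reuses the trace id and flags of a valid incoming header so both hops
    land in the same trace; otherwise starts a new, sampled trace.
    """
    match = _TRACEPARENT.match(incoming.strip().lower()) if incoming else None
    # All-zero trace or span ids are invalid per the spec.
    if match and match.group(2) != "0" * 32 and match.group(3) != "0" * 16:
        trace_id, flags = match.group(2), match.group(4)
    else:
        trace_id, flags = secrets.token_hex(16), "01"
    span_id = secrets.token_hex(8)  # fresh span id for this hop
    return f"00-{trace_id}-{span_id}-{flags}"
```

Every place where the header is not forwarded automatically (queues, batch jobs, custom clients) needs something like this, which is the repetitive part that better automation would remove.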
Has anyone else faced similar issues, or other ones?