Couchbase is a popular NoSQL database. At Trendyol, we use Couchbase to store all of our microservice data (product metadata, stock, personalization).
This post covers how to tune the Couchbase Java client for higher throughput, lower memory usage, and lower latency.
When we profiled our application, the Couchbase client consumed 8.3% CPU while idle, and SimplePauseDetector created 3 worker threads to track latency stats.
1. The Couchbase library uses LatencyUtils, which tracks your application's latency. At Trendyol we use New Relic to track application latency, so we don't need Couchbase's internal latency tracking. We disabled the Runtime Metrics Collector.
DefaultCouchbaseEnvironment.builder().runtimeMetricsCollectorConfig(DefaultMetricsCollectorConfig.disabled())
2. Another latency tracker is the Network Latency Metrics Collector, which we disabled as well.
DefaultCouchbaseEnvironment.builder().networkLatencyMetricsCollectorConfig(DefaultLatencyMetricsCollectorConfig.disabled())
3. Increase the configPollFloorInterval value. Its default is 50 ms.
This property sets the minimum interval at which the client polls the server for the cluster configuration in order to detect topology changes in a timely fashion. We changed this value to 200 ms.
DefaultCouchbaseEnvironment.builder().configPollFloorInterval(200)
4. Decrease the key-value timeout (default 2500 ms). We generally use 500 ms.
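The step above has no snippet, so here is a minimal sketch of how the timeout could be set. It assumes the 2.x Java SDK used elsewhere in this post, where the environment builder exposes kvTimeout in milliseconds. The class name KvTimeoutSketch is only for illustration.

```java
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

public class KvTimeoutSketch {
    public static void main(String[] args) {
        // Assumption: Couchbase Java SDK 2.x on the classpath.
        // kvTimeout takes milliseconds; 500 is the value we generally use.
        CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
                .kvTimeout(500)
                .build();

        // Print the effective key-value timeout of the environment.
        System.out.println(env.kvTimeout());

        // Release the environment's threads when done.
        env.shutdown();
    }
}
```

A shorter timeout makes slow key-value calls fail fast, so it pairs naturally with a fallback path such as the replica read in step 8.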
5. The Couchbase Java client uses the Netty framework for asynchronous network operations. You can find more details here.
For high-throughput scenarios on Linux, Netty provides a way to use edge-triggered epoll instead of going through JVM NIO. This gives better throughput, lower latency, and less garbage. We switched to the Linux native transport.
DefaultCouchbaseEnvironment.Builder builder = DefaultCouchbaseEnvironment.builder();
EpollEventLoopGroup elg = new EpollEventLoopGroup(DefaultCouchbaseEnvironment.IO_POOL_SIZE, new DefaultThreadFactory("cb-io", false));
ShutdownHook hook = new IoPoolShutdownHook(elg);
builder.ioPool(elg, hook);
builder.kvIoPool(elg, hook);
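Since epoll only exists on Linux, it can help to fall back to NIO on other platforms (a developer laptop, for example). The sketch below is one way to do that. It assumes the 2.x SDK, where Netty is repackaged under com.couchbase.client.deps.io.netty, so the import paths are an assumption you should check against your client version. The class and method names IoPoolSketch and buildEnvironment are hypothetical.

```java
import com.couchbase.client.core.env.resources.IoPoolShutdownHook;
import com.couchbase.client.deps.io.netty.channel.EventLoopGroup;
import com.couchbase.client.deps.io.netty.channel.epoll.Epoll;
import com.couchbase.client.deps.io.netty.channel.epoll.EpollEventLoopGroup;
import com.couchbase.client.deps.io.netty.channel.nio.NioEventLoopGroup;
import com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

public class IoPoolSketch {

    // Builds an environment that uses native epoll when available, NIO otherwise.
    static DefaultCouchbaseEnvironment buildEnvironment() {
        int threads = DefaultCouchbaseEnvironment.IO_POOL_SIZE;
        DefaultThreadFactory factory = new DefaultThreadFactory("cb-io", false);

        // Epoll.isAvailable() is false when the native transport cannot be loaded.
        EventLoopGroup elg = Epoll.isAvailable()
                ? new EpollEventLoopGroup(threads, factory)
                : new NioEventLoopGroup(threads, factory);

        // Same pool and hook for both general and key-value I/O, as in the snippet above.
        IoPoolShutdownHook hook = new IoPoolShutdownHook(elg);
        return DefaultCouchbaseEnvironment.builder()
                .ioPool(elg, hook)
                .kvIoPool(elg, hook)
                .build();
    }
}
```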
6. Disable operation tracing and the server duration tracing that goes with it. These properties provide scope management for traced operations.
DefaultCouchbaseEnvironment.builder()
    .operationTracingEnabled(false)
    .operationTracingServerDurationEnabled(false)
7. Couchbase uses the Jackson library for encoding and decoding. It doesn't use the Afterburner module by default.
The module adds dynamic bytecode generation for standard Jackson POJO serializers and deserializers, eliminating the majority of the remaining data-binding overhead.
Always use JsonObject for low latency and low memory usage. JsonObject gives us a generic model that we can then parse into a POJO class.
Note: The Couchbase 3.0.0-beta.2 client uses Afterburner by default. https://issues.couchbase.com/browse/JCBC-1487
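For reference, this is roughly what enabling Afterburner looks like on an application-side Jackson ObjectMapper. It is a sketch that assumes jackson-databind and jackson-module-afterburner are on the classpath. It does not show how to wire the module into the client's internal mapper, which this post does not cover. The Product class is a hypothetical POJO.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

public class AfterburnerSketch {

    // Hypothetical POJO used only for this example.
    public static class Product {
        public String id;
        public int stock;
    }

    public static void main(String[] args) throws Exception {
        // Registering the module swaps reflection-based accessors
        // for generated bytecode where possible.
        ObjectMapper mapper = new ObjectMapper()
                .registerModule(new AfterburnerModule());

        // Decode a JSON string into the POJO, then encode it again.
        Product product = mapper.readValue("{\"id\":\"p-1\",\"stock\":7}", Product.class);
        System.out.println(mapper.writeValueAsString(product));
    }
}
```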
8. Use RxJava for batching operations.
Observable
    .from(ids)
    .observeOn(Schedulers.io())
    .flatMap(asyncBucket::get)
    .timeout(BULK_TIMEOUT_IN_MS, TimeUnit.MILLISECONDS)
    .onErrorResumeNext(throwable -> {
        LOGGER.warn("Data fetched from replica");
        return Observable.from(ids)
            .flatMap(id -> asyncBucket.getFromReplica(id, ReplicaMode.FIRST))
            .timeout(BULK_TIMEOUT_IN_MS, TimeUnit.MILLISECONDS);
    })
    .toList();
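The pattern in the snippet above is: fan out one asynchronous get per id, apply a timeout to the whole batch, and on any error retry the batch against replicas. To make that control flow easy to run without a cluster, here is the same idea sketched with the JDK's CompletableFuture (Java 9+ for orTimeout). This is a swapped-in illustration, not the RxJava or Couchbase API. BulkFetchSketch, fetchAll, fetchWithFallback, and the two lookup functions are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.stream.Collectors;

public class BulkFetchSketch {

    // Start one asynchronous lookup per id and gather the results in input order.
    static <T> CompletableFuture<List<T>> fetchAll(
            List<String> ids, Function<String, CompletableFuture<T>> fetch) {
        List<CompletableFuture<T>> pending =
                ids.stream().map(fetch).collect(Collectors.toList());
        return CompletableFuture
                .allOf(pending.toArray(new CompletableFuture<?>[0]))
                .thenApply(done -> pending.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    // Try the primary lookup; on timeout or any other error, retry via the replica lookup.
    static <T> CompletableFuture<List<T>> fetchWithFallback(
            List<String> ids,
            Function<String, CompletableFuture<T>> primary,
            Function<String, CompletableFuture<T>> replica,
            long timeoutMs) {
        CompletableFuture<List<T>> result = new CompletableFuture<>();
        fetchAll(ids, primary)
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                .whenComplete((values, error) -> {
                    if (error == null) {
                        result.complete(values);
                        return;
                    }
                    // Fallback path, with its own timeout like the RxJava version.
                    fetchAll(ids, replica)
                            .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                            .whenComplete((replicaValues, replicaError) -> {
                                if (replicaError == null) {
                                    result.complete(replicaValues);
                                } else {
                                    result.completeExceptionally(replicaError);
                                }
                            });
                });
        return result;
    }

    public static void main(String[] args) {
        // Simulated primary that never answers, and a replica that answers at once.
        Function<String, CompletableFuture<String>> stalled = id -> new CompletableFuture<>();
        Function<String, CompletableFuture<String>> replica =
                id -> CompletableFuture.completedFuture("replica:" + id);
        System.out.println(fetchWithFallback(List.of("a", "b"), stalled, replica, 50).join());
    }
}
```

As in the RxJava version, the fallback is all-or-nothing: if any single get fails or the batch times out, every id is read again from a replica.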
Measuring Performance
We ran all tests on a single Kubernetes pod with 1 GB of memory and no CPU limit. Before running the tests, we warmed up the container.
Untuned Java Client Result
With the standard, untuned client, we get around 4328.13 operations per second. Max rpm is 265k.
Average response time is 8.41 ms, with 3.5 GB read.
Resource usage
Tuned Java Client Result
With the tuned client, we get around 5203.49 operations per second.
Max rpm is 313k.
Average response time is 6.12 ms, with 4.21 GB read.
Resource usage
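To put the two runs side by side, the relative change can be computed directly from the figures reported above. The small sketch below does that arithmetic. TuningGain and its method names are only for illustration. It works out to roughly 20% more operations per second and roughly 27% lower average response time.

```java
public class TuningGain {

    // Relative change from "before" to "after", in percent.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    // Converts operations per second to requests per minute.
    static double perMinute(double opsPerSecond) {
        return opsPerSecond * 60.0;
    }

    public static void main(String[] args) {
        // Figures taken from the untuned and tuned results in this post.
        System.out.println("throughput change (%): " + percentChange(4328.13, 5203.49));
        System.out.println("response time change (%): " + percentChange(8.41, 6.12));
        System.out.println("untuned average rpm: " + perMinute(4328.13));
        System.out.println("tuned average rpm: " + perMinute(5203.49));
    }
}
```

The average requests per minute derived this way (about 260k and 312k) sit just under the reported maximums of 265k and 313k, which is what you would expect from a steady load.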
Conclusion:
There are several tuning options you can adapt to your own case. With a tuned Couchbase Java client, we handle more than 100k rpm with low latency and low CPU and memory usage.
Original article: https://medium.com/trendyol-tech/couchbase-java-client-tuning-b98690f5a675