This week I had the opportunity to work on DiceDB, a drop-in replacement for Redis that is also reported to be much faster.
Background
Redis is an open-source, in-memory database that can be used for storing data, caching, or as a message broker.
DiceDB improves on Redis's benchmarks and also differs from it in two aspects:
- It is multi-threaded
- It listens to SQL queries and informs the client about changes as soon as possible
My Contribution
Issue
I saw an issue about auditing the documentation for the PFMERGE command.
To briefly describe what the issue was:
- The documentation for the PFMERGE command might have become stale; audit and fix it
- As the tool is a drop-in replacement for Redis, check if the functionality of the command matches the functionality described in the Redis documentation
- Make the documentation for the command consistent with the new proposed format: make appropriate use of headers, add terminal examples, and use a proper table format for arguments and error output types
My Approach
I've been wanting to learn Redis for a while but had never taken the time to do so. This issue helped me learn more about how Redis operates and how DiceDB differs from it.
To get acquainted with the technology, I went to the official Redis documentation, read through the different data types and the quickstart guide, and then ran Redis in a Docker container on my local machine to experiment with it.
After I was done exploring Redis, I moved on to getting DiceDB working on my machine. I ran most of my DiceDB commands through the redis-cli tool, which let me connect to the Docker instance of DiceDB running on my machine.
PFMERGE
I was acquainted with common data types like JSON, String, Sets, and Lists. This time around I came across a new data type known as HyperLogLog, a probabilistic data structure that estimates the cardinality (the number of distinct elements) of a set.
Any command that starts with PF most likely operates on the HyperLogLog data structure. My task was to see how PFMERGE behaved: what output it gave and how it handled errors.
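To get a feel for what "estimating cardinality" and "merging" mean here, a simplified HyperLogLog can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name `TinyHLL`, the helper `hll_of`, the SHA-1 based hash, and the 4096-register size are all made up for the example, and this is not the encoding that Redis or DiceDB actually uses. The key idea it shows is that a merge is just a register-wise maximum, so merging behaves like a set union.

```python
import hashlib
import math


class TinyHLL:
    """Simplified HyperLogLog for illustration (hypothetical, not the Redis/DiceDB format)."""

    def __init__(self, p: int = 12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Derive a 64-bit hash from the first 8 bytes of SHA-1.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits choose the register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> int:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(m * math.log(m / zeros))
        return round(raw)

    def merge(self, *others: "TinyHLL") -> None:
        # Merging keeps the maximum of each register, which acts like a set union.
        for other in others:
            self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


def hll_of(*items: str) -> TinyHLL:
    hll = TinyHLL()
    for item in items:
        hll.add(item)
    return hll


hll1 = hll_of("a", "b", "c")
hll2 = hll_of("c", "d", "e")
hll3 = hll_of("e", "f", "g")

merged = TinyHLL()
merged.merge(hll1, hll2, hll3)
print(merged.count())  # an estimate close to 7, the number of distinct elements
```

Because the structure only stores small per-register values rather than the elements themselves, the result is an estimate, and merging the same inputs twice leaves the registers unchanged.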
Some of the examples I ran here were:
127.0.0.1:7379> PFADD hll1 "a" "b" "c"
(integer) 1
127.0.0.1:7379> PFADD hll2 "c" "d" "e"
(integer) 1
127.0.0.1:7379> PFADD hll3 "e" "f" "g"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7
127.0.0.1:7379> PFADD hll_merged "x" "y" "z"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 non_existent_key
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 5
All of these examples gave me an overview of how this command worked.
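To reason about the counts in that session, it helps to model the keys with exact Python sets instead of HyperLogLogs. The sketch below is hypothetical: the function names are mine, and it is not how DiceDB or Redis implements the command. It replays the same steps under two possible merge semantics. The Redis documentation describes an existing destination as being treated as one of the source sets, while the counts I observed (7, 7, 5) are what you would get if the destination's previous contents were not carried over.

```python
def merge_including_destination(store, dest, *sources):
    """Union of the sources plus whatever dest already holds."""
    merged = set(store.get(dest, set()))
    for key in sources:
        merged |= store.get(key, set())    # a missing key acts as an empty set
    store[dest] = merged


def merge_overwriting_destination(store, dest, *sources):
    """Union of the sources only; the old contents of dest are discarded."""
    merged = set()
    for key in sources:
        merged |= store.get(key, set())    # a missing key acts as an empty set
    store[dest] = merged


def replay(merge):
    """Replay the session above with exact sets and return the three counts."""
    store = {
        "hll1": {"a", "b", "c"},
        "hll2": {"c", "d", "e"},
        "hll3": {"e", "f", "g"},
    }
    counts = []
    merge(store, "hll_merged", "hll1", "hll2", "hll3")
    counts.append(len(store["hll_merged"]))
    store["hll_merged"] |= {"x", "y", "z"}           # stands in for PFADD x y z
    merge(store, "hll_merged", "hll1", "hll2", "hll3")
    counts.append(len(store["hll_merged"]))
    merge(store, "hll_merged", "hll1", "hll2", "non_existent_key")
    counts.append(len(store["hll_merged"]))
    return counts


print(replay(merge_including_destination))    # [7, 10, 10]
print(replay(merge_overwriting_destination))  # [7, 7, 5]
```

Comparing the two lists against a real session is a quick way to tell which behavior a server follows, which is the kind of check the audit called for.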
Making Changes
Once I was done with the prerequisites, I went on to audit the documentation. My changes were as follows:
- Changed the terminal examples to reflect the correct port number used to access the Docker instance
- Added another example in the example usage section, demonstrating invalid usage of the command
- Converted the Return Values and Parameters sections to a table format
- Modified the expected behavior to match how the command actually worked
A full descriptive view of my PR can be found here
Conclusion
Contributing to DiceDB this week provided me with valuable insights into in-memory databases and the HyperLogLog data structure. By auditing and updating the documentation for the PFMERGE command, I ensured that DiceDB's documentation remains accurate and user-friendly