The main limitation of Lamport Clocks was that they were unable to assist us in determining the causal relationship between two events.
If you're not sure what a causal relationship is, read my article on Message Ordering first.
The reason for this is that we had a single timestamp that was being updated globally, which did not provide us with a complete picture of the sequence of events that occurred.
Vector Clocks
Instead of each node maintaining a single timestamp, we maintain a list of timestamps - one for each node in every node.
For example, if we have a three-node system, the vector timestamp would look like [ T1, T2, T3 ], where T1 is the logical time of Node 1, T2 is the logical time of Node 2, and T3 is the logical time of Node 3.
The entire list forms a single logical clock, known as vector clocks.
How do Vector Clocks work?
- When a node boots up, it determines the number of nodes in the cluster (N) and fills an array of size N with 0's.
- Before executing an event, the node increments the clock of that node's time in the list.
N[i] = N[i] + 1
- When a message is sent, the time for that node is incremented, and the vector clock is attached to the message.
- When a message is received,
- It performs the action
- Update each list element by taking the maximum of that element's own list and the list received.
We say Event A caused Event B if A's vector timestamp is strictly less than B's vector timestamp.
We can also say that Event A and Event B are concurrent if their timestamps cannot be compared.
Pseudocode
on initialisation at node Ni do
T := [0,0,..0] //One element for each node
end on
on any event occuring at node Ni do
T[i] := T[i] + 1
end on
on request to send message m from node Ni do
T[i] := T[i] + 1
send_message(m, T)
end on
on receiving message (m, T') at node Ni do
T[j] := max(T[j], T'[j]) for j in (1..n)
T[i] := T[i] + 1;
process_message(m)
end on
Issues with Vector Clocks
The main problem with vector clocks is that they require a large amount of storage because each node must store N timestamps based on the number of nodes.
Furthermore, as the number of nodes increases, we will be sending a significant amount of data because the entire vector clock with all node timestamps must be sent even if only two nodes are interacting.
Top comments (0)