Originally posted at dylanpaulus.com
In Postgres: The Graph Database You Didn't Know You Had we covered how we can store a DAG (Directed Acyclic Graph) in a relational database.
We looked at one representation of a graph data structure called an adjacency list--a table with two columns representing how nodes connect.
The columns store connections from the source or "from" node to a destination or "to" node.
Edges | |
---|---|
Source | Destination |
A | B |
A | C |
A | B | C | |
---|---|---|---|
A | 0 | 1 | 1 |
B | 0 | 0 | 0 |
C | 0 | 0 | 0 |
A | B | C | |
---|---|---|---|
A | 0 | 1 | 1 |
B | 1 | 0 | 0 |
C | 1 | 0 | 0 |
The connections mirror each other across this line. Our matrix is considered symmetric when this occurs.
Symmetric matrices appear in all undirected graphs, but not DAGs!
This is great, but why would we use an Adjacency Matrix?
Matrices are incredibly fast at checking if two nodes are connected
Instead of having to loop through all the edges in an adjacency list to find if two nodes are connected, an adjacency matrix acts
as a hash table. In our example graph above, if we wanted to check if A has an edge to B it would be as simple as checking the [B][A]
position
in the matrix.
Note: Since we're working with DAG's and 2D arrays, the order we look up a connection is flipped: [Destination][Source]
.
A = 0
B = 1
has_connection = adjacency_matrix[B][A]
Removing and adding a connection between nodes is similar and run in constant time.
Need to connect two nodes? Modify the position in the matrix to be 1
(adjacency_matrix[destiation][source] = 1
).
What about deleting an edge? Update the value to be 0
(adjacency_matrix[destiation][source] = 0
)!
Why would I not use an Adjacency Matrix?
They're huge!
If we look at an adjacency matrix we need to store every bit of information about the graph, the connected nodes (1's), and not connected nodes(0's).
Compared to an adjacency list which stores only the connections; adjacency matrices take up a lot of space!
It's expensive to add/remove nodes
We talked about how adjacency matrices are fast at looking up, adding, and removing connections between nodes.
On the flip side, adjacency matrices are slow when adding or removing nodes from the graph.
To add a node we need to copy our matrix to a new matrix with an additional row and column(of size N+1 x N+1).
Then, populate the matrix with the connections to or from the new node.
To compare, adding a new node to an adjacency list is a matter of pushing a new node to the list of nodes.
End
There are many different ways to represent and visualize a graph.
A different representation may be better based on your personal use cases and requirements.
Need to quickly look up which nodes are connected or need to quickly add or remove edges? Then the adjacency matrix may be the best choice.
Are you constantly adding nodes to the graph and are limited in your memory footprint? The adjacency list is the representation for you!
Top comments (0)