Graphs and Graph Algorithms

The concept of graphs comes from the branch of mathematics called graph theory. The graphs we'll be talking about here are not to be confused with things like bar graphs or line graphs.

Graphs are used to solve a number of problems in computing. They don't have as much structure as the other data structures we've looked at in this book and as such, carrying out things such as traversals are more unconventional.

By the end of this chapter you should be able to do the following:

Graph Structures

I haven't gone into much mathematical detail in this book but I think it is fitting to do so here as I believe the more detailed explanation of them will benefit a lot of you in your understanding. However, before I do that, I'll give an informal definition of what graphs are.

A graph is a set of vertices (nodes) and edges that form connections between vertices. That is essentially what they are - much like trees (which are in fact graphs) in the way that the nodes of the tree are connected by "lines", which we now know to be edges.

Formally, a graph G is a set V of vertices and a collection E of pairs of vertices from V, called edges. A graph is a way of representing connections or relationships between pairs of objects from some set of vertices, V.

There are three main types of graphs - undirected graphs, directed graphs, and weighted graphs. There are also combinations of those which we'll come to naturally but we'll look at the three main types in detail first. However, before we do that, take a look at the below diagram of a basic graph and we'll define some terminology before we continue any further:




Let's go through 7 terms used in describing the above graphs:

Now that we have the terminology out of the way, let's begin by looking at different types of graphs we can encounter.

Undirected and Directed Graphs

Graphs can fit into two categories, undirected or directed graphs. In an undirected graph, the edges are simply represented as lines between vertices. There is no additional information supplied about the relationship between the vertices other than the fact they are connected in some way. The graph from the previous section is an example of an undirected graph. More formally, a graph is said to be undirected if all edges are not ordered. In other words if an edge (u, v) is not ordered.




In a directed graph, the edges provide us with more information. They are represented as lines with arrows which will point in the direction the edge connects two vertices. The arrows represent the flow of direction. In the below diagram, one can only move from A to B and not B to A as we could with undirected graphs.




Weighted Graphs

A weighted graph adds even more information to the edges. This can be a numerical value (and will be throughout this book) but it can be whatever you see fit. The below diagram is an undirected weighted graph that represented the distances between Earth, Mars and Saturn. The distances are represented as numerical values between the vertices and are in astronomical units.




When describing graphs we often represent the relationship as something like EM which in our case represents the edge connecting the vertex E and the vertex M. From the graph we can tell that EM is 8 astronomical units closer than ES.

In the above diagram, ES and EMS represent two different path. A path is simply a sequence of edges you pass through between nodes. Following these paths we can tell that the journey from Earth to Saturn is a distance of 8.52 astronomical units, but the journey to Saturn from Earth, passing Mars on the way is 8.51 astronomical units. This tells us that we would arrive at Saturn faster if we went to Mars then Saturn rather than straight to Saturn from Earth.

Graphs in Code

We represent graphs slightly differently to how we have represented other data structures. Graphs can be represented in two main ways. One way is to use something called an adjacency matrix, the other way is an adjacency list.

Adjacency Lists

A simple dictionary can be used to represent a graph. The keys of the dictionary are each of the vertices in the graph. The value at each key is a list that connects the vertex. For demonstration, take the below graph and we'll represent it as an adjacency list.




The above graph would be represented as:

Below is how we represent this in code:

Now we can easily that vertex D has the adjacent vertices A and C whereas vertex E only has vertex B as it's neighbour.

Adjacency Matrix

An adjacency matrix is a way of representing a graph G = {V, E} as a matrix of Booleans. A matrix is a two dimensional array. Another way of viewing this is as a table. The idea here is to represent the cells with a 1 or 0 depending on whether two vertices are connected by an edge.

The size of the matrix is V x V where V is the number of vertices in the graph and the value of an entry, is either 1 or 0 depending on whether there is an edge from vertex i to vertex j.

Take the below graph as an example



The adjacency matrix for the above graph would look as follows




There are a wide range of pros to using an adjacency matrix to represent a graph. The basic operations such as adding an edge, removing an edge, and checking whether there is an edge between two vertices are very time efficient, constant time operations.

If the graph is dense and the number of edges is large, an adjacency matrix should be the first choice. Even if the graph and the adjacency matrix is sparse, we can represent it using data structures for sparse matrices.

Perhaps the biggest advantage of representing a graph as an adjacency matrix comes from recent advances in computing hardware. The power of newer GPUs enable us to perform even expensive matrix operations in a time efficient manner.

Due to the nature of matrices, we can get important insights into the nature of the graph and the relationships between the vertices within it.

Given an adjacency list, it should be possible to create an adjacency matrix.

The adjacency list for the above graph is as follows:

To create the adjacency matrix we need to create a two dimensional array of V x V elements. To do this we'll firstly convert the keys of our adjacency list dictionary to a list then define the number of rows and columns from this.

If we print this out we'll end up with the following

Now we need to populate the cells with the correct values. To help us do this we'll create an edge list. This is a list of all edges in our graph

Printing the edge list will give:

Now that we have our edge list, we can easily populate our adjacency matrix with the correct values:

Now if we print our adjacency matrix, we get the following:

The above adjacency matrix matches the adjacency matrix we looked at earlier.

The Graph Class

With other data structures we've looked at we had classes to represent the data structures. We can do this here to! Firstly, let's take a look at the Vertex class.

You should understand most of the above code with the exception of the special hash method. This is a method we can override to allow objects of the class to be keys of dictionaries or sets. The hash() function is a Python built-in function that generates a hash - similar to the hash function we created in the Hash Tables chapter. The Python built-in hash function uses a combines ideas from both polynomial hash codes and a cyclic-shift hash codes. It works on string and tuple types.

In Python, all objects are assigned a unique ID when created. The id() function returns the unique ID of an object. In this case, we are getting the ID for the object itself. In total, we are hashing the ID of the object.

Now let's look at the Edge class, this is a little more complicated than the Vertex class.

Again the above code should be self explanatory, however, you may be confused by the element attribute. We include this for the case our edge is a weighted edge.

Next we'll need to code our Graph class. This can get a little complicated as we'll also have to account for both undirected and directed graphs.

A graph will have two dictionary attributes - incoming and outgoing. In the case of directed graphs, we'll have a the incoming dictionary, otherwise we'll just need the outgoing dictionary

There is a lot going on above so let me break down what's happening and then we'll look at how to use the class. The graph has two dictionaries. The outgoing dictionary is a dictionary whose keys are the vertices in the graph and whose values are a dictionary of the vertices each connects to whose value are the edges that connect the two vertices. Now that's a mouthful and probably still confusing so let me show the outgoing and incoming dictionaries for the below directed and undirected graphs:




The outgoing dictionary for the undirected graph on the left above is:

As this is an undirected graph, the incoming dictionary is also the same.

I should note, that if you try to print these out, you'll get a dictionary of pointers to the vertices and edges. I have replaced the pointers with the values of the vertices and Edge(x, y) for the pointers to the edges.

The outgoing dictionary for the directed graph on the left above is:

The incoming dictionary for the directed graph is:

I hope this has made what's going on with these dictionaries much clearer.

The methods for checking if the graph is directed and getting the vertex count are pretty straight forward and self explanatory. The method for inserting a vertex should also be easily understand now that you understand how the incoming and outgoing dictionaries are structured.

The method for retrieving the incident edges for a given vector may cause some confusion so I'll explain how it works. Firstly, it's a generator function which allows us to repeatedly call next() on it to retrieve the next edge for a given vertex. We first get the dictionary of adjacent vertices to a given vertex. Then we yield the values which are the edges between the given vertex and each adjacent vertex.

The method for inserting an edge is quite neat as we don't need to worry about whether a graph is directed or undirect as that was handled for us in the constructor for the graph.

Next, we have a method for generating the adjacency list of the matrix. To build the list we are iterating over each of the vertices in the graph and adding the elements of each vertex to a dictionary. The values at each of those keys is a list of elements of each vertex that is on the opposite side of the edge connecting the two vertices.

As we have a method to generate the adjacency list, the method to generate the adjacency matrix is unchanged from earlier!

Let's look at how to use the class. We'll be building the graph from above:

Which outputs:

To generate the directed version of the graph from above all we need to do is pass True when creating the instance of the graph:

Now when making calls to generating both the adjacency list and adjacency matrix we'll be returned the list and matrix having accounted for the fact the graph is directed!

Graph Traversals

We have written algorithms to traverse many types of linked lists and trees in this book. As with all types of those data structures we have look at, they have a well defined structure the algorithms for traversing them were relatively straight forward, both to write and understand and were quite intuitive. However, as we've seen with graphs, they don't have a well defined structure in the sense that any "node" can be connected to any other node, this can end up forming loops as we've seen and in large graphs these loops could appear anywhere in the structure. The ability for loops to form in the graph is just one issue we'll encounter when trying to come up with a traversal algorithm.

Thankfully, there are many traversal algorithms for graphs that have been developed. In this section we'll be taking a look at two of them. A common strategy is to follow a path until a dead end is reached, then walking back up the path we've travelled until we reach a point where we can explore a path that we haven't already taken. This is called back-tracking. We can also iteratively move from one node to another in order to traverse the full graph or part of it. To remember the paths we've taken and the nodes we've visited along the way, we normally track which vertices have been visited and which have not.

Breadth-first Search

The first traversal algorithm, or more specifically, the first search algorithm we'll look at is called breadth-first search. The algorithm works by picking a vertex in the graph as it's root. It then visits each of it's neighbouring vertices, after which it explores the neighbours of those vertices and so on until it has explored all vertices or we reach a stopping condition (finding the vertex we were searching for as an example). We keep track of the nodes we've visited along the way so we don't revisit them or enter an infinite loop of revisiting the same nodes.

To help in our understanding let's take a look at the below diagram:



We start by picking a vertex, A in this case and add it to the visited list. We then add the neighbours of A to the queue of nodes to visit, which are B and C here. Next, we remove the first vertex from the queue which is B and we add that to the visited list. We then add B's neighbours to the queue. We should add the vertex A to the queue but it's already been visited so we don't add it. Similarly, we should also add C to the queue but it's already in the queue so we skip it so we just add the vertex D. Next, we remove C from the queue and add it to the visited list. We then add the vertices that are neighbours of C that haven't already been visited or in the queue to be visited, which is just E in this case. Now our queue contains the vertices D and E so we repeat this process. We remove D from the queue and add it to the visited list. As D has no neighbours that haven't been visited or are in the queue to be visited there is nothing further we can do so we take the next vertex from the queue which is E. As there are no neighbours of E that we haven't visited or are in the queue to be visited we don't do anything. The queue is now empty so we stop. When the queue becomes empty we know we have traversed the graph.

In the case of searching for a specific vertex in the graph, we check the value of the vertex before we add it to the visited queue. If it is the vertex we are searching for then we can simply stop.

To implement this algorithm, we can use the Queue you developed earlier on in the book, but for simplicity I will be using a list and using pop(0) to remove the vertex from the front of the list.

The above algorithm works by passing a graph and a starting vertex, s, in the graph. We then start our traversal from that point. The nice thing about the two traversal algorithms we'll look at, is that we can start the traversal from anywhere in our graph.

Although the algorithm works, it's not very useful in terms of searching for a vertex so let's modify the above algorithm to add the capability to stop when we find a vertex we're looking for. In this case, we'll be looking for the element of the vertex.

In the above modified algorithm, we're searching for an element of a vertex. If we find it we return the vertex and the length of the visited list, otherwise we return None and the length of the visited list.

Let's look at how we can use this algorithm. Firstly, we should reconstruct the graph from the above diagrams:

Now that we have our graph, let's search for the vertex whose element is "D" and we'll start from vertex "A".

The above code will output:

Now, let's search for an element, "L". There are no vertices in the graph with that element so we should see the message from the else block:

Which is as we expect!

An interesting thing about the nature of breadth first search is that in un-weighted graph where all paths are equal, it will find the shortest path between any two nodes.

Depth-first Search

The next traversal algorithm we're going to look at is depth-first search. The way depth-first search works is, instead of visiting the vertices in the order we discovered them, we visited the first vertex we find, then we visit the first neighbouring vertex of that vertex, and so on until we reach a dead end. When we reach a dead end we back-track until we can explore a path we haven't followed yet. This is illustrated below:




In the above illustration, we start from vertex A. We add vertex A to the visited list then we add the first neighbouring vertex to the stack which is B. We then visit by and add it to the visited list. We then add the first neighbouring vertex to the stack. This would be A, but we've already visited it so we skip it and add C instead. We pop from the stack which gives C so we visit C and add the first vertex to the stack which is neighbouring it. This would be A but it's already been visited, so we move to B which has already been visited so we skip it too and add D instead. We then add D to the visited list and choose the next unvisited vertex for which there are none so we must backtrack to search for a path we can follow so we move back C and look for next unvisited neighbouring vertex which is E so add it to the stack and visit it. We add E to the visited list and add the next unvisited node to the stack for which there are none so we move back C and add the next unvisited vertex to the stack for there are none so we move back to B and add the next unvisited vertex to the stack for which there are none so we move back to A and add the next unvisited vertex to the stack for which there are none. We can't backtrack any further and the stack is empty so we are done.

We can implement this the exact same way as we did with breadth-first search but we swap a queue for a stack.

Let's take a look at the implementation of depth-first search:

Notice the only change we needed to make: We swapped renamed queue to stack and replaced pop(0) with pop() (which pops from the end).

When you use depth first search and search for the vertex "B" for example, you'd expect to find this after visiting 2 nodes (A and B), however in my case, the algorithm visited B last. Why is this? In the above diagrams we visited the nodes in alphabetical order but remember, our graph is a dictionary which means we don't have any order as our hashes can change. When I ran this algorithm, A was visited first, then C, then E, then it back-tracked to C and visited D, then back-tracked to C and visited B. However, it sill followed a depth first approach, that is, follow a path until a dead end is reached, then back-track to take another path until all possible paths are exhausted.

Which output:

Both of these traversal algorithms work with both directed and undirected graphs. Also, both our traversal algorithms are fit to work on our Graph class. However, the principles from both can be applied to any form of a graph so we can fit the algorithms to work on adjacency lists and adjacency matrices. As Trees are a special case of graphs, we can apply these algorithms to trees too, however as trees tend to have a better defined structure we have better search algorithms for them.

Shortest Path

We discussed in the previous section how breadth-first search can be used to find the shortest path between any two nodes in a graph were all edges are considered equal, however, this strategy doesn't work in many situations. Consider a weighted graph were the vertices represent cities and the weighted edges represent the distances between the cities. Likewise, we can use a graph to represent a computer network (such as the Internet), and we might be interested in finding the fastest way to route a data packet between two computers. In this case, it might not be appropriate for all edges to be equal. For example, one edge might represent a low-bandwidth connection while another might represent a high-speed fibre-optic connection.

Consider the illustration below of a partial map of Europe with some major airports and distances between them:



For the purposes of this section, it would be better to give a mathematical definition of the shortest path in a weighted graph. Let G be a weighted graph. The length (or weight) of the path is the sum of the weights of the edges of the path P. That is, if P = ((V0, V1), (V1, V2), ..., (Vn-1, Vn)), then the length of P, denoted w(P), is defined as:

The distance from any two vertices, is the length of a minimum-length path (also called the shortest path) from both vertices, if such a path exists.

Dijkstra's Algorithm

Before we look at Dijkstra's algorithm, we need to gain an understanding of another data structure. We won't be looking at this data structure in detail, instead we'll be using the built in version that comes with Python.

This new algorithm is called a Heap Queue, also known as the priority queue algorithm. A heap is a binary tree for which every parent node has a value less than or equal to any of its children.

A priority queue is a data structure similar to a queue in which every element in the queue also has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with a lower priority. We can use the Python built-in heapq to provide a priority queue for us.

Dijkstra's Algorithm allows you to calculate the shortest path between one node and every other node in the graph. The algorithm works on weighted graphs.

Below is a description of the algorithm:

  1. Mark the selected initial vertex with a current distance of 0 and the rest of the vertices have a distance of infinity

  2. Set the non-visited node with the smallest current distance as the current vertex, C

  3. For each neighbour, N, of the current vertex, C:

    1. Add the current distance of C with the weight of the edge connecting C to N.
    2. If the sum of the current distance and the weight is less than or equal to the current distance of N, set it as the new current distance of N
  4. Mark the current node C as visited

  5. If there are non-visited nodes, go to step 2.

That may seem a little abstract so let's take a look at an example. There are 12 diagrams below starting from top left, moving from left to right, row by row, I'll explain what happens at each of the 12 steps.


Below is a step by step of what's happening in each diagram. Our initial vertex is C:

  1. We start off in our initial state - a weighted graph.
  2. We mark every vertex with it's minimum distance to vertex C (the initial vertex). For vertex C, this distance is 0. For the rest of the vertices, as we still don't know the minimum distance, it starts as infinity (Inf). The red dot indicates that this is the current vertex we're looking at.
  3. Next, we check the neighbours of the current vertex (A, B, and D) in no specific order. Let's begin with B. We add the minimum distance of the current vertex (in this case, 0) with the weight of the edge that connects our current vertex with B (in this case, 7), and we obtain 0 + 7 = 7. We compare that value with the minimum distance of B (infinity). The minimum value of the two is the one that remains the minimum distance of B. (In this case, 7 is less than infinity). We repeat that process for vertices A and D. Once we have, the minimum distances for A and D are 1 and 2 respectively.
  4. Now, as we have checked all the neighbouring vertices of C and calculated the minimum distances for them, we can mark C as complete - this is indicated by a green circle beside the vertex.
  5. We now need to pick a new current vertex. We select this from the unvisited vertices with the smallest minimum distance which is A in this case. I have now marked that vertex with a red dot to indicate it is the new current vertex. Now we repeat the algorithm. We check the neighbours of the current vertex, ignoring the visited vertices (A in this case). This means we only check B. For B, we add 1 (the minimum distance of A, our current vertex) with 3 (the weight of the edge connecting A and B) to obtain 4. We compare 4 with the minimum distance of B (which is currently 7). As 4 is less than 7, 4 becomes the new minimum distance for B.
  6. We mark this the vertex A as complete and pick a new current vertex. Again, to do this we select the un-visited vertex with the smallest minimum distance which is D. We repeat the algorithm again. This time we check B, and E. For B, we obtain 2 + 5 = 7. We compare the value of B's minimum distance with 7. As 4 is less than 7 we leave the minimum distance for B as it is. For E, we obtain 2 + 9, compare it with the minimum distance of E (infinity) and set the minimum distance for E as the smallest of the two, which is 9.
  7. We mark the vertex D as complete and pick a new current vertex. Again, we select the vertex from the un-visited vertices with the smallest minimum distance which is B. Now we only need to check E. We obtain 4 + 1 = 5 as a new distance to E, which is less than E's minimum distance of 9, so the minimum distance for E becomes 5. We mark mark B as complete and select the next vertex from the un-visited vertices with the smallest minimum distance which is E and that becomes our new current vertex.
  8. As there are no un-visited neighbours to E, we can mark it as visited.
  9. Now we are done as there are no more un-visited vertices. If there were we'd continue on doing what we've been doing. Now the minimum distance of each vertex actually represents the minimum distance from C to that vertex.

As you may have guessed, we'll be using the priority queue (heapq) to keep track of what vertices to visit next. The vertex with the smallest minimum distance will be at the top of the queue and the vertex with the largest minimum distance will be at the bottom.

Below is the code for Dijkstra's algorithm. I have commented the code and I'll give an explanation of what's happening with the heap below the code but read through it line by line and try gain an understanding of what's happening:

There is a lot going on here but I think you should understand just about all of it, except for stuff related to the heap, so I'll explain what's going on with those pieces of code.

The heapq library provides a heapify() function. This will create a heap (which is just a list but is ordered according to the heap property) from the unvisited list in-place.

The unvisited list is a list of tuples. The reason this must be a list of tuples is because we want to keep pointers to the vertex objects in there along with the distance information. However, we have also included the id() of the vertices. We need to do this because the heapify function needs to compare the tuples in order to sort them correctly. The heapify() function will compare tuples using <=.

A problem arises when comparing tuples however, as, when the first items of the tuples are not unique, Python will compare the second items in the tuples. If our tuples were only (distances[v], v) and if comparing two tuples and the distances were the same, heapify() would look at the second items in the tuples it's comparing. As we didn't override the __lte__() method in the Vertex class, the function would fail to compare the two vertex objects and we would get an error. To stop this from happening I've included the unique id(v), of each vertex. This way, if the distances are equal, we can distinguish the tuple pairs by the IDs of the vertices they represent. We have also included the pointers to the vertices in the tuple because, when we want to fetch the next item from the priority queue (the heap) we can also get the vertex.

This line pops the highest priority item (the vertex with the smallest minimum distance) from the heap.

Now that we have updated the the minimum distances for all neighbours of the current vertex and marked the vertex as complete, we need to rebuild the heap as each of the tuples in it contain outdated distance information.

To demonstrate the function, I have recreated the graph from the walkthrough example. This time, we've added weights to the edges:

We will get the following output:

Which are the distances we ended up with at the end of our walkthrough example.

You may be thinking that it's great that we can find the shortest distances between the a given vertex and all others in a graph, however we don't know the paths that were taken. We can solve this by keeping a dictionary of paths. The dictionaries structure will be the exact same as the adjacency list matrix but they keys are the vertices and the values are a list of vertices - the shortest path.

Let's look at the revised Dijkstra's algorithm:

The change here to also return the shortest paths is quite simple. First we initialize a paths dictionary the same way we did with the distances dictionary, except we set the initial values to None:

Next, we set the path for the source vertex as itself. The paths will be list of vertices taken to arrive at that vertex.

And finally, when we are updating the distance, this is when we update the paths. To update the path to the latest shortest path (as it can change when we progress through the algorithm) we set the path for the vertex we're updating the distance for as the path taken for the current vertex plus the vertex itself.

From the walkthrough diagrams, in the case of the vertex B, it's initial path (from fig 3 of the walkthrough) would be C -> B. However, when we visit A, we find out that going to A first, then B is actually faster than going straight to B from C so we updated the distance for B to 4. We also update the path at this point as the shortest path is no longer C -> B it's the path to the current vertex (A) which is C -> A, plus B tagged onto the end of the list giving us the final path of C -> A -> B. This path is final because it is never updated again in the algorithm but it could have been if we discovered a shorter path down the line.

I am also returning the paths in the revised function.

We can now use our function to show the shortest paths as follows:

Which will give us the following outputs:

You can verify these against the walkthrough examples.

Dijkstra's algorithm has a number of real world applications and really important ones too. It's used in network routing protocols. It's used in "Intermediate System to Intermediate System" (IS-IS) which is a routing protocol designed to move information efficiently within a computer network, a group of physically connected computers or similar devices. It accomplishes this by determining the best route for data through a packet switching network. It's also used in Open Shortest Path First (OSPF) which is a routing protocol for Internet Protocol (IP) networks (which plays a big role in linking all the networks that make up the internet).

Least-cost paths are calculated for instance to establish tracks of electricity lines or oil pipelines. Dijkstra's algorithm can be used to solve these problems. The algorithm has also been used to calculate optimal long-distance footpaths in Ethiopia and contrast them with the situation on the ground.