CS141 Lab 6

Introduction to Graph Theory

A graph is a set of vertices or nodes, connected by edges or arcs. Depending on the applications, edges may or may not have a direction, and vertices and/or edges may be assigned weights, i.e. numbers.

If the edges have a direction associated with them (indicated by an arrow in the graphical representation) we have a directed graph.

Sometimes the edges in a graph have weights associated with them. These are weighted graphs (both directed and undirected).

Figure 1,  Undirected Graph

Figure 2, Directed Graph

Graph G is a pair (V, E) of sets V and E, where V is a set of vertices (nodes) in the graph, and E is a set of pairs (a, b), where a, b Î V.

In the above example,

G = ( V, E )   V = ( 1,2,3,4 )    E = ( (1,2),(2,3),(1,4),(2,4) )

 

An edge connects two vertices; these two vertices are said to be incident to the edge.

The degree of a vertex is the number of edges incident to it, with loops being counted twice.

In a directed graph, we distinguish the out degree (=the number of edges leaving a vertex) and the in degree (=the number of edges entering a vertex). The degree of a vertex is equal to the sum of the out degree and the in degree.

Two vertices are considered adjacent if an edge exists between them.

The set of neighbors for a vertex consists of all vertices adjacent to it.

A path is a sequence of vertices in that each vertex is adjacent to both the preceding and succeeding vertex. A path is considered simple if none of the nodes in the path are repeated.

The length of a path is the number of edges that the path uses, counting multiple edges multiple times.

If it is possible to establish a path from any vertex to any other vertex of a graph, the graph is said to be connected.

A cycle is a path that begins and ends with the same vertex and has length at least two. A simple cycle is a cycle in which the beginning node only appears once more, as the ending node and other nodes appear only once.

A graph is called acyclic if it contains no simple cycles.

 

In computer programs, graphs are usually represented in two ways: as a collection of adjacency lists or as an adjacency matrix. 

This is a compact way of representing sparse graphs – those graphs where |E| << |V|2. The graph is represented with an array of |V| lists. Each vertex v has a list of the vertices that are adjacent to v in G. Adjacency list representation of our graph from Figure 1 looks like this:

 

                                             Figure 3.

An adjacency list representation of the directed graph from Figure 2 looks like this:

 

                                            Figure 4.

 

The sum of lenghts of all the adjacency lists for an undirected graph is 2|E|, since for every edge (a,b), there will be a list element “b” on the a’s list and a list element “a” on the b’s list.

The sum of lenghts of all the adjacency lists for a directed graph is |E|, since for every edge(a,b), there will only be a list element “b” on the a’s list.

Weigthed graphs are easy to represent with adjacency lists: if the weight of the edge (a,b) is W, just store W along with the “b” element on the a’s list. 

Sometimes, it is advantageous to represent graphs with an adjacency matrix. If the graph is dense (that is, |E| is close to |V|2), adjacency matrix might be more memory efficient. Moreover, if your graph algorithm needs a fast check whether an edge (a,b) is in the graph, adjacency matrix is preferred.

Adjacency matrix M is a |V| ´ |V| matrix, where the dimensions represent vertices (numbered in some arbitrary manner), and the elements are

 M(i,j) = 1 if (i,j)ÎE, 0 otherwise.

An adjacency matrix of the graph from Figure 1. is:

 

1

2

3

4

1

0

1

0

1

2

1

0

1

1

3

0

1

0

0

4

1

1

0

0

Adjacency matrix of the directed graph from Figure 2 is:

 

1

2

3

4

1

0

1

0

1

2

0

0

1

1

3

0

0

0

0

4

0

0

0

0

The size of the adjacency matrix is always |V|2, regardless of the size of |E|. If |E| is close to |V|2, adjacency matrix can be much more efficient, since only one bit is needed for every vertice pair, as opossed to one pointer (usually 4 bytes) for every edge.

Weighted graphs are also easy to represent with adjacency matrices. Just store the weight in the matrix instead of only 0’s and 1’s.

(We may cover some of them in the following weeks)

The shortest path problem in graph theory is the following: Given a weighted graph, (that is a set N of nodes, a set E of edges and a real-valued function f : E -> R), and given further two elements n, n' of N, find a path P from n to n', so that

   ∑ f(p)
 p ∈ P

is minimal among all paths connecting n to n'.

The most important algorithm for solving this problem in case all edge weights are greater than or equal to zero is Dijkstra's algorithm.

A related problem is the traveling salesman problem, which is the problem of finding the shortest path that goes through every node exactly once, and returns to the start.

A minimum spanning tree is a tree formed from a subset of the edges in a given undirected graph, with two properties:

One example would be a cable TV company laying cable to a new neighborhood. If it is constrained to bury the cable only along certain paths, then there would be a graph representing which points are connected by those paths. Some of those paths might be more expensive, because they are longer, or require the cable to be buried deeper. A spanning tree for that graph would be a subset of those paths that has no cycles but still connects to every house. There might be several spanning trees possible. A minimum spanning tree would be one with the lowest total cost. In case of a tie, there could be several minimum spanning trees.

There are now two algorithms commonly used, Prim's algorithm and Kruskal's algorithm.