Depth-first search (C Plus Plus)

From LiteratePrograms

Jump to: navigation, search
Other implementations: C++ | Java | Python

Depth-first search is simple to implement in a procedural language like C++ because it always expands the node most recently seen, making it ideal to implement with a stack (such as the procedural call stack).

Contents

Graph representation

Since we'll need to traverse through all the children of each node we visit, we choose an adjacency-list representation for the graph, templating over the node data type and creating a std::list of successor vertices:

<<type definition headers>>=
#include <list>
<<vertex class>>=
template <typename T>
class vertex
{
private:
    T data;
    std::list<vertex<T>*> _successors;
public:
    vertex public methods
};

Each vertex is initialized with its vertex data or label, which we provide an accessor for:

<<vertex public methods>>=
vertex<T>(const T data) { this->data = data; }
T get_data() const { return data; }

For this simple example, we provide direct access to the successors list, although this violates encapsulation:

<<vertex public methods>>=
typedef typename std::list<vertex<T>*>::const_iterator successor_list_iterator;
std::list<vertex<T>*>& successors() { return _successors; }

We also include a typedef to briefen long iterator type names. Now we're prepared to write our depth-first search.

Depth first search

The search will take three parameters:

  • A start node, indicating the first node to expand
  • A goal predicate, which determines whether a given vertex is a goal vertex (a vertex on which the search stops successfully)
  • A partial result list, describing the path (sequence of vertices) that the search has followed so far

In outline, the search looks like this:

<<depth first search>>=
template <typename T, typename GoalFunction>
bool depth_first_search(vertex<T>& start,
                        GoalFunction is_goal,
                        std::list< vertex<T>* >& result)
{
    ensure we're not stuck in a cycle
    result.push_back(&start);
    check if we've found the goal
    expand each child node in order, returning if we find the goal
    // No path was found
    result.pop_back();
    return false; 
}

To expand each child, we simply call depth first search recursively on each child in order. If any of them return true, the final path will be in result and we also return true:

<<expand each child node in order, returning if we find the goal>>=
typename vertex<T>::successor_list_iterator iter = start.successors().begin();
while (iter != start.successors().end())
{
    if (depth_first_search(**iter, is_goal, result))
    {
        return true;
    }
    iter++;
}

As our terminating condition, we check if the start node is a goal vertex by applying the goal predicate to it. If so, we return true immediately and the result is propagated back up to the initial caller:

<<check if we've found the goal>>=
if (is_goal(start))
{
    return true;
}

Finally, if the graph we're processing might contain directed cycles (it's not a directed acyclic graph), we need to ensure we don't get stuck in an infinite recursion following a cycle. To do this, we simply search the list of visited nodes in result for the start node and return false immediately if we've followed a cycle:

<<ensure we're not stuck in a cycle>>=
if (std::find(result.begin(), result.end(), &start) != result.end())
{
    return false;
}

It's important of course that this be done before appending &start to result. This completes the implementation.

Sample code

Petersen graph

This small test driver demonstrates use of the search function on the Petersen graph, a small graph with 10 vertices and diameter 2. We'll see that the path found by depth-first search can be much longer than the minimum 2 vertices.

First we have the code that generates the graph as a simple vector of vertices, labelling the vertices with sequential integers:

<<sample header files>>=
#include <vector>
<<sample using statements>>=
using namespace std;
<<generate Petersen graph>>=
vector< vertex<int> > PetersenGraph()
{
    vector< vertex<int> > v;
    for(int i=0; i<10; i++)
    {
        v.push_back(vertex<int>(i));
    }
    add Petersen graph edges
    return v;
}

Now we add the edges. We simply create a table of the edges and iterate over it. Note that since our data structure describes directed graphs, we need an edge in each direction for each edge of the graph:

<<add Petersen graph edges>>=
struct { int source; int target; } edges[] =
    {{0,1}, {1,0}, {1,2}, {2,1}, {2,3}, {3,2}, {3,4}, {4,3}, {4,0}, {0,4},
     {5,6}, {6,5}, {6,7}, {7,6}, {7,8}, {8,7}, {8,9}, {9,8}, {9,5}, {5,9},
     {5,0}, {0,5}, {6,2}, {2,6}, {7,4}, {4,7}, {8,1}, {1,8}, {9,3}, {3,9}};
for (unsigned int i=0; i<sizeof(edges)/sizeof(edges[0]); i++)
{
    v[edges[i].source].successors().push_back(&v[edges[i].target]);
}

Now we can complete our sample:

<<dfs_sample.cpp>>=
#include <iostream>
sample header files
type definition headers
sample using statements
vertex class
depth first search
generate Petersen graph
bool is_goal(vertex<int> v)
{
    return v.get_data() == 7;
}
int main(void)
{
    vector< vertex<int> > v = PetersenGraph();
    list< vertex<int>* > path;
    if (depth_first_search(v[0], is_goal, path))
    {
        cout << "Found path: ";
        list< vertex<int>* >::iterator iter = path.begin();
        while (iter != path.end())
        {
            cout << (*iter)->get_data() << " ";
            iter++;
        }
        cout << endl;
    }
    else
    {
        cout << "No path found" << endl;
    }
    return 0;
}

When run, this produces the following path:

Found path: 0 1 2 3 4 7

This is far from the optimal path, which is "0 4 7".

Improvements and tradeoffs

On large graphs, the cycle-checking step can become expensive, because the result list may become very long. One solution is to also maintain a set data structure holding the nodes visited so far, such as a std::set. Another more common solution is to give each node a visited flag, which has the advantage that no node is ever expanded twice, but has the disadvantage that we need to make an initial pass to clear the flag on every node of the graph. Visited flags also make it difficult to make the algorithm threadsafe.

Another issue on large graphs is that depth-first search can easily form a very long call chain following a useless path. It's useful to include a threshold on how deep we want to search. By searching with incrementally larger thresholds, we get iterative deepening depth-first search, which finds much shorter paths in general than ordinary depth-first search.

Download code
Views