University of Washington
Department of Geography
Professor Harrington
Network  Analysis
Contents:
Urban transportation planning models
Network analysis
Routing algorithms


These notes provide an overview of some basic analytic approaches for
- analyzing aggregate transportation demand (e.g., in the form that transportation planners must do),
- assessing the adequacy of transportation systems, and
- analyzing logistics for individual carriers.
Each approach has tremendous demands for geographic information, and each approach is eased by using GIS.  In fact, it is possible to use pre-packaged programs and algorithms that use these approaches, and yet make them totally invisible to the user.  These notes will help you look inside the black box of transportation analysis, to be able to use the results more intelligently.
 


URBAN  TRANSPORTATION  PLANNING  MODELS

Component One:  Trip generation
Transportation activity is viewed as derived demand:

The total number of trips from each O can be estimated or forecast based on some set of existing data, using regression analysis.
 

Component Two:  Trip distribution
The total number of trips to each D can be estimated or forecast based on some set of existing data, using regression analysis.

Each OD flow can be estimated or forecast based on some set of existing data, using regression analysis applied to a gravity model:
Iij = k OiDj dij-a
The model above would be specified by applying actual data for I, O, D, and d for a set of places to a statistical model that looks like:
Ln Iij   =   ln k + ln Oi + ln Dj -a ln dij
This would give you values for k and a, which you could apply in the first formula to other places O and D.
 

Component Three:  Modal split
The proportion of OiDj trips using private automobiles is seen to be a function of:

The actual functions are estimated from some existing data (e.g., sample surveys) and applied to other circumstances (e.g., all zones and trips modeled for the urban area).
 

Component Four:  Route assignment
What proportion of the auto or truck trips from Oi to Dj will use each of the possible road routes between the two zones?  What proportion of the transit trips from Oi to Dj will use each of the possible transit routes between the two zones?

This is seen to be a function of travel time;  most trips are assumed to take the shortest time path.

To increase accuracy, capacity constraints are placed on each route, and when the simulation model reaches capacity on any particular route, the remaining mode-specific trips are assigned to other routes between specific zones.

To understand route assignment better, we need to review network analysis.
 


NETWORK  ANALYSIS

Network:  a number of people or places among which there are one-on-one interactions – friendships, airline routes, telephone calls, automobile trips, roads

How can we systematically select the shortest distance routes between Oi and Dj?  (Note that for transportation-system modeling or for an individual's decision making, we need to know alternative routes, ordered according to distance).

First, we must abstract the transport system into a graph or matrix.
The network graph illustrates the network in schematic fashion, with each possible O and D presented as a point (“node” or “vertex”) and each possible direct link between an O and a D presented as a straight line segment (“linkage” or “edge”).
(See page 217 figure in Chou;  Figure 9.1 in Taaffe, Gauthier, and O'Kelly [TGO])

The connectivity matrix abstracts the network as a table, with each possible node (vertex) presented as a row (an origin) and as a column (a destination).  Each possible direct link between an O and a D is presented as a “1” in the appropriate cell;  if there is no direct OiDj link, a “0” appears in the appropriate cell.  Thus, network graphs or matrices are two formats for topological data (TOPOLOGY:  connections and distances among spatial data elements).  Within GIS, including topology implies providing the GIS with a spatial data matrix of explicit connections among points, adjacencies among areas, etc.
(See TGO’s Figure 9.8;  see this link)

For a simple system, the matrix tells you nothing you can’t see in the graph.  The matrix is necessary for three reasons:


Matrix manipulation for network analysis:  connectivity
If we multiply the matrix above by itself (see note 3), yielding C2 (or a squared connectivity matrix), we have a new matrix that tells us the number of two-edge routes from each O to each D.
(Compare Chou's figures on pages 223-5;  See TGO’s Figure 9.11, showing how to multiply a matrix).
 
 

If we multiply C2 by C1  to yield C3 , we have the number of three-link connections between each O and D.
(See Chou's page 226;  See TGO's Figure 9.12)
 

If we add the entries, cell by cell, in C1 , C2 , C3, and C4 , we get a matrix of the total number of possible ways to get from each O to each D.  This is generally called the T-matrix, for total accessibility.
(See Chou's page 227 or TGO's Figure 9.13;  note the row totals, which are measures of relative accessibility of each origin).
 
 

In some cases, when actual distance is not as very meaningful as the number of connections (e.g., airline travel, from the perspective of the traveler;  or connections within an integrated circuit;  or connections among linked computers on the Internet), these topological distances are all we need.

However, for walking or driving routes within a city, we care about the total distance or time, not the number of connections.

(See Chou's page 230;  TGO’s Figure 9.19).
 

 We can derive a total valued matrix by adding the L matrix to itself, repeatedly until we cover the diameter (maximum distance -- see Note 1) of the network.   This tells us the minimum distance (in time or ground distance) between each O and each D:  a very important piece of information.
(See TGO’s Figures 9.20 and 9.21;  pages 220-2 in Chou)

 These measures are useful in several ways.

1. They provide us with the dij we need for any kind of transportation modeling.
2. They show us the minimum distance between any O and D, expressed in number of links, in time, or in ground distance.
3. They allow us to understand how an additional edge (link) or the removal of an edge will affect the accessibility of a node and the connectivity of the network.



ROUTING  ALGORITHMS

What we’ve done so far, however, does not tell us exactly what set of routes we need to follow to get from O to D most efficiently.  We need to develop a routing algorithm.  This algorithm (see note 2) will allow us (or a computer) to use the matrix of direct links between nodes to establish and record exactly which links are to be used to get from Oi to Dj.

(See Chou's page 235;  TGO Figure 13.7;  J.P. Rodrigue, Transportation Geography on the Web)
We will build a new matrix, R4 .  Each cell entry (OiDj) of R4 tells us the first node that we have to go through to get from Oi to Dj.

We ask the computer to keep track of the links we’ve traced and the distances involved, and it can return to us the shortest route and its total (time or mileage) distance.
(See TGO Figure 13.8)
 

How can a GIS make use of this insight to construct minimum-distance routes?

The network of possible routes is entered into the GIS as a topological matrix:  what nodes link directly to what nodes.

The distances don’t have to be explicitly entered, because the GIS has the actual location of each node;  it can calculate the distance along each direct link.

1. It can take the approach outlined above:


2. It can take the approach outlined in Chou, pages 230-235 -- note that this approach requires that the network we're using is not the entire network, but the "minimum spanning tree" (pp. 230-233) that starts at the origin point of the route we want to optimize.


In either case, the GIS can link multiple destinations, to establish the minimum distance for a delivery route, such as we’ll be doing in the first case.  We’ll come up with a set of customers, and develop a route among them.

Read Chou, pages 236-248, for a presentation of such a routing problem. The essential logic (which is all that you need remember) is that each iteration identifies the link (or edge) between two nodes that will save the most resources out of all the remaining options.  Each time a specific link is identified in an iteration, that (direction-specific) pair of nodes is removed from the options for the next iteration -- because we're assuming we don't want to go from (e.g.) E to F twice.

We’ll then use our information about our customers, and our spatial information about this route, to help us identify similar businesses that are near are route.  Perhaps they might be potential customers?  If so, we could serve them without much additional cost. (See notes on targeting).

Next:  how to determine potential customers from the characteristics of current customers.



NOTE:

1.  Chou (p. 221) defines diameter as “the maximum number of steps required to move from any node to any other node through the shortest possible routes within a connected network.”  This is perhaps more explicit than, but the same implication as Taaffe, Gauthier, and O’Kelly (p. 252) definition of diameter as “the number of linkages needed to connect the two most remote nodes on the network.”

2.  "An algorithm is a set of mathematical expressions or logical rules that can be followed repeatedly to find the solution to a question" [Chou, p. 234]

3.  Each cell (x,y) in a new matrix (AB) which is the product of two other matrices (A and B), is the sum of:

Note that the rows of matrix A must have the same length as the columns of matrix B.  In this simple network analysis, we're multiplying a square matrix by itself, so that's not a problem.
In a connectivity matrix, cell (3,1) is 0 if there's no direct link between 3 and 1;  1 if there is.  Cell (1,4) is 0 if there's no direct link between 1 and 4;  1 if there is.  What does the product of cell (3,1) and cell (1,4) tell us?  Why would we be interested in adding this product to the product of (3,2)(2,4), to the product of (3,3)(3,4), to the product of (3,4)(4,4), to the product of (3,5)(5,4)?


copyright James W. Harrington, Jr.
revised 28 January 2004