So you want to make a model of a system, but you don't think you have all of the data you need? This is actually a very common situation in modeling. Before going on a quest to search for more data, I often remind researchers that a good model might not need as much data as they think. Personally, I think elegant math models operate with minimal data. The less data you need, the easier it is to deploy, calibrate, and scale the use of your model in the Real World, after all.
In this post we will review a data-light technique for modeling flows of people using a simple and common data set called an origin-destination matrix (OD matrix, for short).
WHAT IS AN ORIGIN-DESTINATION MATRIX?
Trip behavior data is often collected as an OD matrix. An OD matrix reflects aggregate flows between a number of discrete locations over a period of time (Fig. 1).
Unfortunately, an OD matrix does not capture routing between locations. It's also challenging to visualize the weight and directionality of flows for larger matrices. The following modeling techniques, however, can help us recover the missing information.
For our exercise, we'll use a 2D environment created in Processing. An OD matrix is randomly generated (Fig. 2 and 3). In a simple visualization, trip counts are represented by lines connecting points. Larger counts are represented by thicker lines. Color in Fig. 3 is arbitrary but unique to each OD pair.
That's great, but now we want to tease out some additional information. In particular, we want a better understanding of directionality and routing within our OD matrix. First, we'll tackle directionality using agent-based modeling techniques. Solid lines are replaced with continuous flows of individual agents (Fig 4).
By animating the network, we have a bit better understanding of flow, particularly directionality. The total amount of agents flowing between two points at any given time is proportional to the flow defined in the OD matrix. Keep in mind, however, that time is an abstract dimension in this visualization. Motion and speed are in fact arbitrary constructs that only serve to make our OD matrix more human-readable. The quantity of agents on the screen, however, is not arbitrary. Rather, quantity of agents is the key parameter by which we communicate information from our OD matrix.
Now that we've tackled directionality, we'll try to tackle the issue of routing. You've probably noticed that OD matrices tell us nothing of how one travels from point A to point B. While visualizing the routes as straight-shot lines is easy, it is hardly realistic (except for the birds, perhaps). A pathfinding algorithm, such as Dijkstra's, can be helpful for solving this. We've randomly generated a "road" network for our model, lightly colored in gray. The agents are now programmed to take the shortest route along the network provided (Fig. 5).
Not only does our visualization show a more realistic representation of routing, it starts to reveal emergent "corridors" where traffic, and perhaps congestion, might be prevalent. However, any observations at this point are purely qualitative.
By the way, you might have noticed that our example uses a poly-voxel network (Fig. 6a and 6b). This is mostly an arbitrary preference. It might make sense, however, given some of our other projects.
Despite all my talk of minimizing the need for data, you might point out that even a routing algorithm requires data for a road network. Touché. In the example above, we've randomly generated data for a network, but you might want to model a specific network yourself. We've created a couple tools to help with that. One tool let's you manually "sketch" a valid network by drawing obstacles (Fig. 7). We imagine this technique might be useful if you lack vector data but can get ahold of a raster/satellite image.
In most cases, though, road vector data is available from various sources such as Open Street Maps. We made a utility that converts such road vector data into our poly-voxel format (Fig. 8b). There's no particular reason you couldn't use vectors directly from OSM, but we prefer the sketchy-ness of the poly-voxels to emphasize the roughness of our approximations. We tested the OSM network by modeling pedestrian movement between amenities (Fig. 8a).
Our primary implementation for this model was for visualization of call detail records (CDR) for the City of Andorra la Vella (Fig. 9a). CDRs can be converted into a type of OD matrix where origins and destinations are represented by cell phone towers. The transition from one tower to the next is captured when a user's cell phone signal switches from one tower to another. This transition is called a handoff and can be the source of a very rich OD matrix.
One issue with our visualization thus far is the use of cell phone towers as our literal origins and destinations. In reality, people do not travel to and from cell phone towers but rather to the amenities and homes within their coverage. Using a Voronoi technique, we can determine roughly which Andorran amenities are associated with each cell tower (Figure 9b). This is, of course, a very rough approximation.
The result is a more nuanced visualization with trips that begin and end at valid amenity locations (Fig. 9c).
The methodology was also adapted for a tangible interactive model of pedestrian activity of a Singaporean neighborhood (Figure 10). Users can adjust an OD matrix by configuring pieces that represent amenities.
Editor's note: The term "Agent Based Visualization" appears in other literature including, but not limited to:
- Grignard, Arnaud et al. "Agent-Based Visualization: A Simulation Tool for the Analysis of River Morphosedimentary Adjustments." Multi-Agent Based Simulation XVI15 March 2016.
- Hagen, Hans et al. "A Component and Multi Agent-based Visualization System Architecture." ICM, 2000
- Roard, Nicolas and Jones, Mark M. "Agent Based Visualization and Strategies." 30 January, 2006.