Traffic Volumes

Traffic Volume White Paper

Using mobile observation data to estimate traffic volume for travel demand and transportation planning studies.

Executive Overview 

Transportation planners need accurate and comprehensive data about traffic volumes on a statewide basis. Traditionally, this need has been met by installing permanent and portable traffic counting sensors on state roadways. This paper explores how data collected from mobile devices can supplement the installation and maintenance of traffic counting sensors.

The large installed base of mobile devices (e.g., consumer smartphones, commercial navigation devices, fleet monitoring systems), the activation of location-based services on those devices (GPS, Bluetooth, WiFi), and consumer adoption of applications and web services that use location services (mobile apps and mobile web, e.g., maps, audio, weather, device tracking) enable planners to estimate traffic flows from mobile device observations attached to Traffic Message Channel (TMC) road segments.

Urban SDK estimates traffic volumes from mobile device samples, that is, observations of GPS-based mobile devices that use location-based services.

Introduction 

Real-time volume data remains the key missing dimension in operations data that would greatly improve the accuracy of assessing transportation system performance. Although agencies have invested in fixed sensors, volume data remains relatively sparse and of varying quality on the majority of the freeway and major arterial networks. Data on spatial mobility are essential in order to build and use travel demand forecasting models, for transport planning purposes and for the appraisal of transport policies. Anticipated volume does not reflect traffic volume fluctuations during weather events, major incidents, or even normal day to day fluctuations in traffic. 

A Brief History on Mobile Data 

The use of mobile phone data to construct origin-destination (O-D) matrices in an urban region was first proposed in Italy by Bolla and Davoli (2000) and tested on a small sample by White and Wells (2002) with the aim of studying traffic on specific roads. In 2002, two simultaneous research projects attempted to extract origin-destination matrices from mobile phone network data. One of these (Akin and Sisiopiku, 2002), working with just 500 individuals in the city of Birmingham (USA), developed an algorithm that calculated origins and destinations and divided the day into periods. To compute a subject’s position during each time period, they took the zone with the largest number of connections. One of the first studies to use the whole population rather than a sample was carried out in Israel in 2007 (Bar-Gera, 2007). That research set out to estimate traffic and obtain mean speed data on a 14 km road in Israel with 10 interchanges. Calabrese et al. (2011) were the first to produce O-D matrices from a detailed dataset, for the Boston region in Massachusetts.

Our aim is to expand on this historical work to provide cities, counties, state agencies, DOTs, and metropolitan planning agencies with the greatest access to mobile data to improve planning and reduce total reliance on surveys and modeling.

Evaluation Results 

Urban SDK evaluated the accuracy of annual average daily traffic (AADT) volume estimates from Urban SDK Traffic Data using actual volume counts from Florida Department of Transportation (FDOT) traffic monitoring sites. The sites were grouped according to traffic volume levels since the magnitude of error appeared to be correlated with traffic volume (i.e., low-volume roads typically had higher estimation error).

The mean absolute percent error for the AADT estimates was 41% for all sites but ranged from 22% at high-volume sites to 79% at low-volume sites. The mean error was strongly influenced by numerous outliers in all volume categories. 

Conclusions and Recommendations 

Mobile device data can provide traffic volume estimates with useful accuracy and granularity. Some of the traffic volume estimates are within acceptable error ranges (10% to 35% absolute percent error), but other estimates on roads with 400 to 5,000 AADT are significantly outside this acceptable error range (greater than 100% absolute percent error). Lower volume roadways will have the highest margin of error due to lower mobile device sample sizes.

A multi-year backfill of all observed devices is recommended to more accurately forecast AADT. To ensure accuracy, traffic volume estimates should be calibrated locally against permanent, specifically selected traffic count sites with a minimum of 4 months of mobile observation data.

Accuracy is impacted negatively when generating traffic volume estimates on a statewide basis. Selecting and controlling the characteristics of the comparison sites could have led to lower estimation error and a better understanding of where algorithm improvements are most needed, and any future comparison should allow for this.

Additional Research 

Young, Stanley E., Kaveh Sadabadi, Przemysław Sekuła, Yi Hou, and Denise Markow. 2018. “Estimating Highway Volumes Using Vehicle Probe Data – Proof of Concept: Preprint.” Golden, CO: National Renewable Energy Laboratory. NREL/CP-5400-70938. https://www.nrel.gov/docs/fy18osti/70938.pdf

Development of Traffic Volume Estimates 

This section describes how we generate traffic volume estimates. 

Urban SDK Traffic Data 

Urban SDK provides analytics for estimating traffic volumes. Traffic volume estimation models are the intellectual property of Urban SDK and are considered confidential. Our overall approach to traffic volume estimation is as follows: 

1. Combine GPS-enabled, location-based services (LBS) and advertising data into “Traffic” data. These are distinct datasets that Urban SDK aggregates from source data providers.

2. Normalize traffic data by US Census population estimates. This provides the first scaling factor that attempts to account for the mobile device sampling.

3. Normalize traffic data by roadway Traffic Message Channel (TMC) code to account for mobile device sampling by roadway.

4. Calibrate the mobile device samples using public agency traffic volume sources. This provides the second scaling factor that attempts to account for the mobile device sampling. The public agency traffic volumes typically come from permanent traffic monitoring sites, where there is greatest confidence in the traffic volume accuracy.
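
The two scaling factors in steps 2 and 4 can be illustrated with a minimal sketch. Because the actual estimation models are confidential, the code below is not Urban SDK's method: it assumes a simple "residents per observed device" factor for each Census region and a single pooled ratio calibration against agency counts. All names (`SegmentSample`, `population_factors`, `calibration_factor`, `estimate_aadt`), the region labels, the TMC identifiers, and the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SegmentSample:
    tmc: str                             # road segment identifier (hypothetical values below)
    region: str                          # Census geography the segment falls in
    daily_devices: float                 # average observed devices per day
    agency_aadt: Optional[float] = None  # agency count, where a monitoring site exists


def population_factors(population, devices):
    """First scaling factor per region: residents per observed device (assumed form)."""
    return {region: population[region] / devices[region] for region in population}


def calibration_factor(samples, pop_factors):
    """Second scaling factor: counted volume over population-scaled devices,
    pooled across the segments that have an agency count (assumed form)."""
    counted = [s for s in samples if s.agency_aadt is not None]
    scaled = sum(s.daily_devices * pop_factors[s.region] for s in counted)
    if scaled <= 0:
        raise ValueError("no usable calibration sites")
    return sum(s.agency_aadt for s in counted) / scaled


def estimate_aadt(samples, population, devices):
    """Apply both scaling factors to every segment, counted or not."""
    pop = population_factors(population, devices)
    cal = calibration_factor(samples, pop)
    return {s.tmc: s.daily_devices * pop[s.region] * cal for s in samples}


# Made-up inputs for illustration only.
population = {"north": 1_000, "south": 4_000}
devices = {"north": 100, "south": 200}
samples = [
    SegmentSample("TMC-A", "north", 100, agency_aadt=3_000),
    SegmentSample("TMC-B", "south", 50, agency_aadt=3_000),
    SegmentSample("TMC-C", "south", 10),  # no count site: estimate only
]
estimates = estimate_aadt(samples, population, devices)
```

In this sketch the calibration factor is learned only from segments with agency counts and then applied to uncounted segments such as the third one, which mirrors the role of permanent monitoring sites described in step 4.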

Urban SDK used traffic counts from 1,077 FDOT monitored road segments to calibrate the traffic volume estimates. The locations had originally been identified for the purposes of evaluation/validation of traffic volume levels. Permanent and temporary monitoring sites with annual average daily traffic (AADT) volumes less than 400 vehicles per day are removed from any comparison due to poor prediction results. 

Urban SDK Traffic Analytics 

Figure 1.1 shows a scatterplot of Urban SDK Mobile Data AADT estimates as compared to actual FDOT AADT values. For the purposes of this analysis, the results were divided into five traffic volume level categories: 

  1. AADT values from 50 to 5,000 vehicles per day
  2. AADT values from 5,000 to 10,000 vehicles per day
  3. AADT values from 10,000 to 20,000 vehicles per day
  4. AADT values from 20,000 to 50,000 vehicles per day
  5. AADT values greater than 50,000 vehicles per day

Figure 1.1 Note: X and Y axes are log scale
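
For readers who want to reproduce the grouping, the five categories can be assigned with a small lookup. This is only a sketch: the category ranges as listed share their boundary values, so the code assumes that a boundary value (for example, exactly 5,000) falls into the higher category, and that sites below 50 vehicles per day are rejected. The function name `volume_category` is hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the first four categories, in vehicles per day.
UPPER_BOUNDS = [5_000, 10_000, 20_000, 50_000]
LABELS = [
    "50 to 5,000",
    "5,000 to 10,000",
    "10,000 to 20,000",
    "20,000 to 50,000",
    "50,000+",
]


def volume_category(aadt):
    """Return the volume level category label for an AADT value.

    Assumption: a value equal to a boundary goes to the higher category.
    """
    if aadt < 50:
        raise ValueError("AADT below the lowest category bound of 50")
    return LABELS[bisect_right(UPPER_BOUNDS, aadt)]
```

For example, `volume_category(7_500)` returns the second label, and `volume_category(5_000)` also returns the second label under the boundary assumption above.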

Equation 1

Equation 2

Comparisons and Accuracy

Table 1.1 summarizes the accuracy measures for each traffic volume level category, then for all short duration count sites combined.

Table 1.1 Accuracy measures for traffic volumes

There are several key findings regarding the comparison and resulting accuracy measures:

  • Traffic volume level is an important factor in estimation accuracy: The accuracy results were better for higher traffic volumes than for lower traffic volumes, which could be explained by larger sample sizes (and sample rates) of mobile devices on roads with higher traffic volumes. Also, there were many more comparison sites on low-volume roads, which could pull the overall average accuracy measures away from those of high-volume sites. Therefore, it is important to separately report accuracy measures for different traffic volume levels.
  • Urban SDK Data AADT estimates are biased high: The mean signed difference was positive in all traffic volume level categories, which indicates a positive bias. In other words, Urban SDK Data AADT estimates were consistently greater than FDOT AADT values. 
  • Average error statistics in Table 1.1 are strongly influenced by numerous outliers in lower volume categories: Numerous comparison outliers (with absolute percent errors exceeding 1,000%) are visible in Figure 1.1 and strongly influence the mean error statistics in Table 1.1. For example, the median absolute percent error is markedly lower than the mean for the same categories, as shown in Table 1.1.
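
The accuracy measures discussed in these findings can be computed as in the sketch below. It uses the standard definitions of mean and median absolute percent error and assumes the mean signed difference is expressed in vehicles per day, which the text does not specify. The function name `accuracy_measures` and the sample values are hypothetical and are not drawn from Table 1.1.

```python
from statistics import mean, median


def accuracy_measures(pairs):
    """Summarize estimate-versus-count error for one volume category.

    pairs: list of (estimated_aadt, counted_aadt) tuples with counted_aadt > 0.
    """
    signed_diffs = [est - obs for est, obs in pairs]
    abs_pct_errors = [100 * abs(est - obs) / obs for est, obs in pairs]
    return {
        # Positive values mean the estimates run high on average.
        "mean_signed_difference": mean(signed_diffs),
        "mean_abs_pct_error": mean(abs_pct_errors),
        "median_abs_pct_error": median(abs_pct_errors),
    }


# Made-up comparison sites; the last one is a deliberate outlier.
sites = [(1_100, 1_000), (900, 1_000), (1_200, 1_000), (6_000, 500)]
summary = accuracy_measures(sites)
```

With these made-up values, the single outlier raises the mean absolute percent error to 285% while the median stays at 15%, which illustrates why the mean and median statistics can diverge so sharply in the lower volume categories.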

Figures 1.2 through 1.6 illustrate the wide range of error values in this initial comparison. These charts illustrate that a small number of comparisons had much higher error values than the majority of comparisons. 

 

Figure 1.2 Box-and-whiskers plot for Urban SDK Data error, 50 to 5,000 AADT
Figure 1.3 Box-and-whiskers plot for Urban SDK Data error, 5,000 to 10,000 AADT

Figure 1.4 Box-and-whiskers plot for Urban SDK Data error, 10,000 to 20,000 AADT
Figure 1.5 Box-and-whiskers plot for Urban SDK Data error, 20,000 to 50,000 AADT
Figure 1.6 Box-and-whiskers plot for Urban SDK Data error, 50,000+ AADT