Integrating Household Travel Survey and Social Media Data to Improve the Quality of OD Matrix: A Comparative Case Study

Research output: Contribution to journal > Research article > Contributed > peer-review

Abstract

Collecting effective data is a fundamental step in developing transport networks and in related research. Social media have become an emerging source of data for traffic analyses. In this paper, we demonstrate that the function of a city influences the utility of social media data in travel demand models by generating models for eight US cities with different functions. Data from Twitter and Foursquare, together with other socio-demographic information, are used as independent variables in Origin-Destination trip regression models generated via a Random Forest regression technique. Model performance with and without social media data is compared via 10-fold cross-validation. The results indicate that the accuracy of the models for all eight cities improved when independent variables based on social media data were included. The improvement was largest in metropolitan areas, followed by rural and tourist areas. Based on this finding, we conclude that the city function influences the utility of social media data in travel demand models. We also create models based on trip purpose and transport mode to explore other factors that may affect the efficiency of applying social media data in transport research.
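
The comparison described in the abstract (Random Forest regression evaluated by 10-fold cross-validation, with and without social-media-based variables) can be illustrated with a minimal sketch. This is not the authors' pipeline or data: it assumes scikit-learn and NumPy are available, every row stands for one hypothetical origin-destination pair, and all feature names (`population`, `employment`, `tweets`, `checkins`), coefficients, and the synthetic trip counts are invented purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score


def make_synthetic_od_pairs(n_pairs=300, seed=0):
    """Build a purely synthetic table of OD pairs (NOT the paper's data)."""
    rng = np.random.default_rng(seed)
    # Hypothetical socio-demographic variables, scaled to [0, 1].
    population = rng.uniform(0, 1, n_pairs)
    employment = rng.uniform(0, 1, n_pairs)
    # Hypothetical social-media variables (e.g. tweet and check-in intensity).
    tweets = rng.uniform(0, 1, n_pairs)
    checkins = rng.uniform(0, 1, n_pairs)
    noise = rng.normal(0, 5, n_pairs)
    # Assumed relationship: trips depend on all four variables plus noise.
    trips = 40 * population + 25 * employment + 30 * tweets + 20 * checkins + noise
    baseline = np.column_stack([population, employment])
    social = np.column_stack([tweets, checkins])
    return baseline, social, trips


def cv_rmse(X, y, seed=0):
    """Mean RMSE of a Random Forest regressor over 10 shuffled folds."""
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    folds = KFold(n_splits=10, shuffle=True, random_state=seed)
    scores = cross_val_score(
        model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
    )
    # scikit-learn reports negated errors, so flip the sign back.
    return float(-scores.mean())


def compare_feature_sets():
    """Cross-validated RMSE without and with the social-media variables."""
    baseline, social, trips = make_synthetic_od_pairs()
    return {
        "without_social": cv_rmse(baseline, trips),
        "with_social": cv_rmse(np.hstack([baseline, social]), trips),
    }


if __name__ == "__main__":
    for name, rmse in compare_feature_sets().items():
        print(f"{name}: mean 10-fold RMSE = {rmse:.2f}")
```

Because the synthetic target is constructed to depend on the social-media variables, the model that includes them is expected to show the lower error here; that outcome reflects the assumed data-generating process and says nothing about any real city.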

Details

Original language: English
Article number: 9098034
Pages (from-to): 2628-2636
Number of pages: 9
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 21
Issue number: 6
Publication status: Published - Jun 2020
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-2939-2090/work/141543711

Keywords

  • Foursquare, multi-city model, random forest regression, travel demand estimation, Twitter