Random forest analysis of two household surveys can identify important predictors of migration in Bangladesh

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

  • Kelsea B. Best, Vanderbilt University (Author)
  • Jonathan M. Gilligan, Vanderbilt University (Author)
  • Hiba Baroud, Vanderbilt University (Author)
  • Amanda R. Carrico, University of Colorado Boulder (Author)
  • Katharine M. Donato, Georgetown University (Author)
  • Brooke A. Ackerly, Vanderbilt University (Author)
  • Bishawjit Mallick, Chair of Environmental Development and Risk Management (Author)

Abstract

The decision to migrate is complex and is often influenced by a combination of economic, social, political, and environmental pressures. Household survey instruments can capture detailed information about migration histories and their contexts, but it can be challenging to identify important predictors from large numbers of covariates with standard statistical methods, such as regression analyses. Machine learning techniques are well suited to pattern identification and can identify important covariates from large datasets. We report on the application of machine learning approaches to two large surveys collected from a total of more than 2800 households in southwestern Bangladesh. We applied random forest classification and regression models to identify significant covariates with the greatest predictive power for household migration decisions. The results show that random forest models are able to identify nuances in predictors of different types of migration and migration in different communities. Random forests also outperform logistic regression and support vector machines in predicting migration in all cases analyzed. Therefore, random forest models and other machine learning methods can be useful for improving the predictive accuracy of migration models and identifying patterns in complex social datasets. Future work should continue to explore the potential of machine learning techniques applied to questions of environmental migration.
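The core idea in the abstract, fitting a random forest and ranking covariates by how much they contribute to prediction, can be illustrated with a small sketch. This is not the authors' pipeline or data: the rows are synthetic, the five ordinal covariates are unnamed placeholders, and all function names (make_data, fit_forest, predict_forest, and so on) are hypothetical. It uses only the Python standard library, assumes a binary outcome, and builds in by construction that only covariate 0 drives the outcome, so the importance ranking can be checked against a known answer.

```python
import random

def gini(labels):
    # Gini impurity for binary 0/1 labels.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, features):
    # Best (gain, feature, threshold) among the candidate features, or None.
    parent, n, best = gini(labels), len(rows), None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, labels, depth, n_try, rng, importance):
    # A leaf is an int (majority class); an internal node is a tuple.
    majority = int(2 * sum(labels) >= len(labels))
    if depth == 0 or len(set(labels)) == 1:
        return majority
    # Random feature subset at every node: the "random" in random forest.
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features)
    if split is None or split[0] <= 0.0:
        return majority
    gain, f, t = split
    importance[f] += gain * len(rows)  # impurity decrease, weighted by node size
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left],
                       depth - 1, n_try, rng, importance),
            build_tree([r for r, _ in right], [y for _, y in right],
                       depth - 1, n_try, rng, importance))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=30, depth=3, n_try=2, seed=0):
    rng = random.Random(seed)
    importance = [0.0] * len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        trees.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                depth, n_try, rng, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]  # importances sum to 1

def predict_forest(trees, row):
    votes = sum(predict_tree(t, row) for t in trees)  # majority vote
    return int(2 * votes >= len(trees))

def make_data(n, seed):
    # Synthetic stand-in for survey rows: 5 ordinal covariates in 0..4.
    # Assumption for illustration only: covariate 0 alone drives the outcome,
    # and 10% of labels are flipped as noise.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.randrange(5) for _ in range(5)]
        y = int(row[0] >= 3)
        if rng.random() < 0.1:
            y = 1 - y
        rows.append(row)
        labels.append(y)
    return rows, labels

train_rows, train_labels = make_data(400, seed=1)
test_rows, test_labels = make_data(200, seed=2)
trees, importances = fit_forest(train_rows, train_labels)
accuracy = sum(predict_forest(trees, r) == y
               for r, y in zip(test_rows, test_labels)) / len(test_rows)
top_feature = max(range(len(importances)), key=lambda f: importances[f])
print("held-out accuracy:", accuracy)
print("normalized importances:", [round(v, 3) for v in importances])
```

In practice an analysis like the one described would more likely rely on an established library implementation, and would compare the forest against baselines such as logistic regression and support vector machines on held-out data, as the abstract reports doing.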

Details

Original language: English
Pages (from-to): 77-100
Number of pages: 24
Journal: Journal of Computational Social Science
Volume: 4
Issue number: 1
Publication status: Published - May 2021
Peer-reviewed: Yes

Keywords

  • Bangladesh, Climate change, Machine learning, Migration, Random forest