Open Access
Volume 4, issue 1, page e4

Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis

Fred Sun Lu 1
Suqin Hou 2
Kristin Baltrusaitis 3
Manan Shah 4
Jure Leskovec 5
Rok Sosic 4
Jared Hawkins 6, 7
John S. Brownstein 6, 7
Giuseppe Conidi 8
Julia Gunn 8
Josh Gray 9
Anna Zink 9
Mauricio Santillana 6, 7
Publication type: Journal Article
Publication date: 2018-01-09
SCImago: Q1
WoS: Q1
SJR: 1.289
CiteScore: 6.3
Impact factor: 3.9
ISSN: 2369-2960
Public Health, Environmental and Occupational Health
Health Informatics
Abstract
Background: Influenza outbreaks pose major challenges to public health around the world, leading to thousands of deaths a year in the United States alone. Accurate systems that track influenza activity at the city level are necessary to provide actionable information that can be used for clinical, hospital, and community outbreak preparation.
Objective: Although Internet-based real-time data sources such as Google searches and tweets have been successfully used to produce influenza activity estimates ahead of traditional health care–based systems at national and state levels, influenza tracking and forecasting at finer spatial resolutions, such as the city level, remain an open question. Our study aimed to present a precise, near real-time methodology capable of producing influenza estimates ahead of those collected and published by the Boston Public Health Commission (BPHC) for the Boston metropolitan area. This approach has great potential to be extended to other cities with access to similar data sources.
Methods: We first tested the ability of Google searches, Twitter posts, electronic health records, and a crowd-sourced influenza reporting system to detect influenza activity in the Boston metropolis separately. We then adapted a multivariate dynamic regression method named ARGO (autoregression with general online information), designed for tracking influenza at the national level, and showed that it effectively uses the above data sources to monitor and forecast influenza at the city level 1 week ahead of the current date. Finally, we presented an ensemble-based approach capable of combining information from models based on multiple data sources to more robustly nowcast as well as forecast influenza activity in the Boston metropolitan area. The performances of our models were evaluated in an out-of-sample fashion over 4 influenza seasons within 2012-2016, as well as a holdout validation period from 2016 to 2017.
Results: Our ensemble-based methods incorporating information from diverse models based on multiple data sources, including ARGO, produced the most robust and accurate results. The observed Pearson correlations between our out-of-sample flu activity estimates and those historically reported by the BPHC were 0.98 in nowcasting influenza and 0.94 in forecasting influenza 1 week ahead of the current date.
Conclusions: We show that information from Internet-based data sources, when combined using an informed, robust methodology, can be effectively used as early indicators of influenza activity at fine geographic resolutions.
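The out-of-sample evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' ARGO implementation: it swaps in a lightly ridge-penalized least-squares fit, uses a single synthetic "online" signal instead of real search, Twitter, or health record data, and every name and parameter in it (`rolling_nowcast`, `n_lags`, `window`, the noise levels) is a hypothetical choice made for illustration. It shows only the general pattern of refitting a dynamic regression on a trailing window each week, estimating the current week before the official value is known, and scoring the estimates with a Pearson correlation.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(rows, targets, penalty=1e-3):
    """Least squares with a small ridge penalty (assumption: stands in for
    the regularized fit; the penalty also touches the intercept here)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (penalty if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def features(flu, signal, t, n_lags):
    """Predictors for week t: intercept, the previous n_lags official
    values, and the real-time signal observed in week t."""
    return [1.0] + [flu[t - k] for k in range(1, n_lags + 1)] + [signal[t]]


def rolling_nowcast(flu, signal, n_lags=2, window=52):
    """Refit on the trailing window each week, then estimate week t
    without using flu[t] (out-of-sample)."""
    estimates = {}
    for t in range(window + n_lags, len(flu)):
        train_weeks = range(t - window, t)
        rows = [features(flu, signal, s, n_lags) for s in train_weeks]
        targets = [flu[s] for s in train_weeks]
        beta = fit_ridge(rows, targets)
        x = features(flu, signal, t, n_lags)
        estimates[t] = sum(b * v for b, v in zip(beta, x))
    return estimates


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic data only: a seasonal "official" series and a noisy proxy signal.
rng = random.Random(0)
weeks = 156
flu = [2.0 + math.sin(2 * math.pi * t / 52) + rng.gauss(0, 0.1)
       for t in range(weeks)]
signal = [v + rng.gauss(0, 0.2) for v in flu]

estimates = rolling_nowcast(flu, signal)
order = sorted(estimates)
r = pearson([estimates[t] for t in order], [flu[t] for t in order])
print(f"out-of-sample Pearson correlation: {r:.3f}")
```

The rolling refit is what keeps the evaluation out-of-sample: the coefficients used for week t are learned only from weeks before t, so the correlation reflects how the estimates would have performed if produced in real time.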

Top-30

Journals

Journal of Medical Internet Research
7 publications, 8.24%
JMIR Public Health and Surveillance
6 publications, 7.06%
Scientific Reports
3 publications, 3.53%
PLoS Computational Biology
3 publications, 3.53%
Science Advances
2 publications, 2.35%
IEEE Access
2 publications, 2.35%
Health Informatics
2 publications, 2.35%
JMIR Mental Health
1 publication, 1.18%
Interactive Journal of Medical Research
1 publication, 1.18%
JMIR Infodemiology
1 publication, 1.18%
Machine Learning and Knowledge Extraction
1 publication, 1.18%
Public Health Reports
1 publication, 1.18%
Frontiers in Public Health
1 publication, 1.18%
Frontiers in Artificial Intelligence
1 publication, 1.18%
Frontiers in Research Metrics and Analytics
1 publication, 1.18%
International Journal of Biometeorology
1 publication, 1.18%
Environmental Science and Pollution Research
1 publication, 1.18%
Nature Communications
1 publication, 1.18%
BMC Public Health
1 publication, 1.18%
BMC Research Notes
1 publication, 1.18%
Rheumatology International
1 publication, 1.18%
Journal of Big Data
1 publication, 1.18%
Physics Reports
1 publication, 1.18%
Vaccine: X
1 publication, 1.18%
Epidemics
1 publication, 1.18%
EBioMedicine
1 publication, 1.18%
SSRN Electronic Journal
1 publication, 1.18%
The Lancet Digital Health
1 publication, 1.18%
Computers in Biology and Medicine
1 publication, 1.18%

Publishers

JMIR Publications
25 publications, 29.41%
Springer Nature
15 publications, 17.65%
Elsevier
9 publications, 10.59%
Institute of Electrical and Electronics Engineers (IEEE)
6 publications, 7.06%
Frontiers Media S.A.
4 publications, 4.71%
Taylor & Francis
4 publications, 4.71%
Cold Spring Harbor Laboratory
4 publications, 4.71%
MDPI
3 publications, 3.53%
Public Library of Science (PLoS)
3 publications, 3.53%
American Association for the Advancement of Science (AAAS)
2 publications, 2.35%
SAGE
1 publication, 1.18%
Social Science Electronic Publishing
1 publication, 1.18%
American Chemical Society (ACS)
1 publication, 1.18%
Oxford University Press
1 publication, 1.18%
Cambridge University Press
1 publication, 1.18%
American Medical Association (AMA)
1 publication, 1.18%
Proceedings of the National Academy of Sciences (PNAS)
1 publication, 1.18%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
85
Cite this
GOST
Lu F. S. et al. Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis // JMIR Public Health and Surveillance. 2018. Vol. 4. No. 1. p. e4.
GOST (all authors)
Lu F. S., Hou S., Baltrusaitis K., Shah M., Leskovec J., Sosic R., Hawkins J., Brownstein J. S., Conidi G., Gunn J., Gray J., Zink A., Santillana M. Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis // JMIR Public Health and Surveillance. 2018. Vol. 4. No. 1. p. e4.
RIS
TY - JOUR
DO - 10.2196/publichealth.8950
UR - https://doi.org/10.2196/publichealth.8950
TI - Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis
T2 - JMIR Public Health and Surveillance
AU - Lu, Fred Sun
AU - Hou, Suqin
AU - Baltrusaitis, Kristin
AU - Shah, Manan
AU - Leskovec, Jure
AU - Sosic, Rok
AU - Hawkins, Jared
AU - Brownstein, John S.
AU - Conidi, Giuseppe
AU - Gunn, Julia
AU - Gray, Josh
AU - Zink, Anna
AU - Santillana, Mauricio
PY - 2018
DA - 2018/01/09
PB - JMIR Publications
SP - e4
IS - 1
VL - 4
PMID - 29317382
SN - 2369-2960
ER -
BibTeX
@article{2018_Lu,
author = {Fred Sun Lu and Suqin Hou and Kristin Baltrusaitis and Manan Shah and Jure Leskovec and Rok Sosic and Jared Hawkins and John S. Brownstein and Giuseppe Conidi and Julia Gunn and Josh Gray and Anna Zink and Mauricio Santillana},
title = {Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis},
journal = {JMIR Public Health and Surveillance},
year = {2018},
volume = {4},
publisher = {JMIR Publications},
month = {jan},
url = {https://doi.org/10.2196/publichealth.8950},
number = {1},
pages = {e4},
doi = {10.2196/publichealth.8950}
}
MLA
Lu, Fred Sun, et al. “Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis.” JMIR Public Health and Surveillance, vol. 4, no. 1, Jan. 2018, p. e4. https://doi.org/10.2196/publichealth.8950.