Research

Leveraging big data to improve influenza surveillance system design

Every flu season, the U.S. Centers for Disease Control and Prevention (CDC) recruits roughly 3,000 physicians across the United States to report how many of their patients appear to have flu-like symptoms. These physicians form the core of the country’s sentinel surveillance system, a data source that is used to determine the geographic spread, timing, and severity of the influenza season nationwide. While everyone acknowledges the importance of sentinel reporting, physicians have limited time and resources and are given few incentives to participate. My newest paper, recently published in PLoS Computational Biology, tackles the challenging question of how to improve targeting for sentinel physician recruitment by leveraging a high volume of aggregated medical claims data.

How can we improve sentinel site recruitment?

In contrast to the traditional sentinel surveillance system, our medical claims data include reports from over 120,000 physicians and represent roughly 20% of all visits to health care providers during our study period. We found that our estimates of influenza disease burden, and our inference about what drives the variation in its spatial distribution, were most robust when the same sentinel locations reported data every year. This means that surveillance practitioners should strive to recruit the same health care providers each flu season in order to get the most information out of the reported data. Yet even with the best sentinel recruitment design, we observed that 10-30% of county-level estimates of disease burden were poor at the level of coverage at which the CDC collects U.S. outpatient influenza surveillance data.

What did we learn about influenza epidemiology?

The statistical surveillance model that we used to evaluate sentinel surveillance design also provided valuable insights about influenza epidemiology in the United States. During our study period of flu seasons from 2002-2003 through 2008-2009, we found that mid-Atlantic states had greater relative risk for influenza disease burden, and that socio-environmental factors, local population interactions, state-level health policies, and sampling and reporting levels contributed to the spatial patterns of disease.

Read the full paper:

Lee EC, Arab A, Goldlust SM, Viboud C, Grenfell BT, Bansal S (2018) Deploying digital health data to optimize influenza surveillance at national and local scales. PLoS Comput Biol 14(3): e1006020. https://doi.org/10.1371/journal.pcbi.1006020.

Code, Research

Launching the flu severity index application

TL;DR version —

Click here to check out my new Shiny app, which displays the U.S. seasonal influenza severity index as calculated from Centers for Disease Control and Prevention ILINet data from 1997-98 to 2013-14.


 

My recent paper proposed new methods for quantifying seasonal influenza severity by looking at the relative risk of influenza-like illness between adults and children at varying points in the flu season in the United States. Don’t worry, this isn’t a repeat of my recent blog post on the paper itself.

As a proponent of open science, I had always planned to post the code I used to generate the data and figures that appear in the main manuscript. However, because the medical claims data, the primary data source in the paper, are proprietary, it was clear that we could not post any of the data itself. In these circumstances, I asked myself: why post code that wouldn’t add value beyond the findings of the paper?

As an alternative, I’m excited to announce the launch of a web application that displays the seasonal influenza severity index, as calculated with U.S. CDC’s ILINet data. These data are publicly available from CDC’s website through FluView Interactive, and I showed these results in the Supporting Material. The original analyses were conducted in Python, but I’ve developed the web application with the Shiny package in R (post to follow about that experience!).

Data from the 1997-98 to 2013-14 flu seasons are pre-loaded into the application. Users can use the drop-down menu to view two figures from a specific season: 1) adult and child ILI rates from week 40 (first week of October) to week 39 in the following year, and 2) the population-level severity index, as calculated according to the methods in the paper.
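
For readers working with the ILINet data themselves, the week ordering described above (week 40 through week 39 of the following year) can be sketched in Python. This is a minimal illustration and not the app’s code: the function name `season_position` is hypothetical, and the default of 52 weeks per year is an assumption, since some surveillance years include a week 53.

```python
def season_position(week: int, weeks_in_year: int = 52, start_week: int = 40) -> int:
    """Return the 1-based position of a calendar week within a flu season.

    Assumes the season starts at `start_week` (week 40 here) and wraps
    around the new year. `weeks_in_year` defaults to 52; pass 53 for
    years that include a week 53.
    """
    if not 1 <= week <= weeks_in_year:
        raise ValueError(f"week must be between 1 and {weeks_in_year}")
    # Modular arithmetic shifts week 40 to position 1 and week 39 to the end.
    return (week - start_week) % weeks_in_year + 1


# Hypothetical calendar weeks, sorted into flu-season order.
weeks = [1, 39, 40, 52, 20]
ordered = sorted(weeks, key=season_position)
print(ordered)
```

Sorting by season position rather than by calendar week keeps the winter peak in the middle of the plotted season instead of splitting it across the new year.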

The goal of this web application is to make the results and “intuition” derived from the paper more accessible to researchers, policymakers, and the public. I hope to add features to the application in the future (e.g., ability for users to upload their own data), so suggestions are welcome!

Check out the seasonal influenza severity index application here!


News, Research

Novel indexes for estimating population-level flu severity

I know one great way to start off the new year: check out my new paper, “Detecting signals of seasonal influenza severity through age dynamics,” in BMC Infectious Diseases!

What is this paper about?

Typically, when we think about severity in the context of epidemiology, we ask: “Of all of the people who have this condition, how many of them died or were hospitalized by its symptoms?” These measures, also known as the case-fatality or case-hospitalization risks, are standard ways of quantifying the severity of a disease.
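
As a small worked example, both risks are simple proportions: the number of people with the outcome divided by the number of people with the condition. The Python sketch below uses invented numbers and a hypothetical helper name, `case_risk`, purely for illustration.

```python
def case_risk(outcomes: int, cases: int) -> float:
    """Proportion of cases that experienced an outcome (death or hospitalization)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= outcomes <= cases:
        raise ValueError("outcomes must be between 0 and cases")
    return outcomes / cases


# Invented numbers, for illustration only.
cases = 10_000
deaths = 10
hospitalizations = 150

case_fatality_risk = case_risk(deaths, cases)
case_hospitalization_risk = case_risk(hospitalizations, cases)
print(f"case-fatality risk: {case_fatality_risk:.2%}")
print(f"case-hospitalization risk: {case_hospitalization_risk:.2%}")
```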

Unfortunately, it’s really challenging to estimate how many people get influenza every year, and only a small subset of the population gets ill enough to die or become hospitalized:

  1. At the population level, we can only observe the sick individuals who report their illness in some way (e.g., those who visit the doctor, buy drugs to combat flu, call in sick for school or work, or complain about symptoms on social media). It’s possible that all individuals with symptoms might be captured across multiple data sources, but how do you combine information from hospitals, drug companies, and Twitter in a meaningful way?
  2. Many flu cases are asymptomatic, so people may not even know that they are sick. Some immune systems might be strong enough to fight off the virus without generating symptoms, but asymptomatic individuals can still transmit the virus to others, who can end up feeling crummy.
  3. We don’t usually test for flu among individuals who go to the doctor. In most cases, identifying the specific virus that is causing your symptoms won’t change the treatment your doctor prescribes, so it’s not often useful to confirm that influenza virus is the source of illness. The doctor will prescribe general antiviral drugs and send you back home for bed rest.
  4. The elderly and young toddlers are most at risk for mortality and hospitalization. Functionally, existing flu severity metrics focus only on the outcomes of these two age groups.

How can we capture information about the severity of a flu outbreak with fewer data sources and for a greater portion of the population?

In this paper, we use routinely available flu surveillance data to identify age patterns among working-aged adults and school-aged children in “influenza-like illness cases” (unconfirmed sick cases that look like they could be flu) that are consistent across multiple flu seasons in the United States. We use these observed age patterns to create a new severity index; this index has some demonstrated capacity to detect severity early on in the flu season. We compare this new index to other quantitative severity benchmarks and examine data at the level of the entire U.S. and across different states. Public health officials may be able to use these measures to inform communication strategies during the course of an outbreak.

Bottom line: We suggest that it may be possible to use the relative risk of influenza-like illness between adults and children in imperfectly sampled data sources to estimate flu severity in the entire population.
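
To make the idea of a relative risk between adults and children concrete, here is a minimal Python sketch. It is a simplification and not the index from the paper: the counts are made up, the function names are hypothetical, and using population as the denominator for the ILI rate is an assumption.

```python
def ili_rate(ili_cases: int, population: int) -> float:
    """Influenza-like illness cases per person in an age group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return ili_cases / population


def adult_child_relative_risk(adult_cases: int, adult_pop: int,
                              child_cases: int, child_pop: int) -> float:
    """Ratio of the adult ILI rate to the child ILI rate."""
    child_rate = ili_rate(child_cases, child_pop)
    if child_rate == 0:
        raise ValueError("child ILI rate is zero; relative risk is undefined")
    return ili_rate(adult_cases, adult_pop) / child_rate


# Made-up weekly counts, for illustration only.
rr = adult_child_relative_risk(adult_cases=200, adult_pop=100_000,
                               child_cases=500, child_pop=50_000)
print(round(rr, 3))
```

A value below 1 in this toy example simply means that the child ILI rate exceeded the adult ILI rate for that week.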

Click here to read more!

We will be posting the code for these analyses on the Bansal Lab GitHub in the coming weeks. Stay tuned for details!