Now that season 10 of the hit TV show, The Walking Dead, is on hiatus until 2020, our minds turn to zombies of a different sort. A zombie statistic is a false or misleading statistic, often reanimated from studies conducted many years ago and now printed as truth without citation.
Well-constructed prevalence studies on health conditions require large populations to produce statistically reliable results. The rarer the disease, the larger the population required. The expense and effort needed to accurately assess the prevalence of a rare disease are beyond the reach of most study budgets. As a result, we often see zombie-like rare disease prevalence estimates that may not have a basis in reality. In addition, publications rarely segment prevalence rates by U.S. health insurance market (commercial, Medicaid, Medicare, individual), even though prevalence can differ significantly among covered populations.
The U.S. incents the development of drugs to treat rare diseases through the Orphan Drug Act of 1983, which defines a rare disease as one affecting fewer than 200,000 people in the United States. We used real-world data to calculate the age-adjusted prevalence rate for three diseases: rheumatoid arthritis, which is uncommon but not a rare disease by the Orphan Drug Act definition; ulcerative colitis, which could be considered a rare disease depending on which source is cited; and hemophilia, for which antihemophilic factor was approved in 2010 by the Food and Drug Administration (FDA) under the 1983 Orphan Drug Act. We then compared these results to commonly cited zombie statistics.
We queried administrative health claims databases of group commercial, individual, Medicare Advantage, and managed Medicaid populations with incurred claims during 2016 and 2017, consisting of over 926 million member months, searching for patients with ICD-10 codes describing each of the three diseases. We calculated prevalence by age and then applied these results to the U.S. population to produce an estimate of the number of patients in the United States who are currently diagnosed with each disease.
According to the Arthritis Association, approximately 1.5 million people in the United States have rheumatoid arthritis (RA). 1 Healthline.com reports 1.3 million. 2 Yet neither of these authoritative organizations names a source for its estimate. Both organizations indicate that RA often occurs later in life, yet neither provides estimates by age or specifically for the Medicare population.
A 2019 study of the incidence and prevalence of rheumatoid arthritis within members of an integrated healthcare delivery system (Aniket A. Kawatkar et al) 3 and another paper, describing the cost-effectiveness of Sarilumab monotherapy for adults with active rheumatoid arthritis (Melanie D. Whittington et al) 4, cite a Rochester, Minnesota, study published in 1999. 5 Kawatkar puts the prevalence of RA in the United States at 1.5 million and Whittington puts it at 1.8 million. The Rochester study was conducted using medical records in Rochester, Minnesota, between 1955 and 1984, which was about the time of the dawn of the personal computer and well before the use of cell phones.
The table in Figure 1 shows the age-adjusted RA prevalence rate per 100,000 people by payer as calculated from our administrative claims data. Note the considerable variance by market. Additionally, the prevalence is substantially higher than the cited numbers described above. We included a 95% confidence interval, which means that, as an example, we are 95% confident we would find between 2,579 and 2,635 (2,607 +/- 28) people with RA from a population of 100,000 Medicare beneficiaries.
Figure 1: Rheumatoid arthritis
|Market||Insured population 6||People with RA||Prevalence per 100,000 people||95% Confidence interval|
|Commercial and Tricare||159,600,000||1,500,000||940||+/- 5|
Up until December 2019, the Centers for Disease Control and Prevention (CDC) stated that ulcerative colitis (UC) affects 37 to 246 people per 100,000 of the population. The CDC’s source, published in 2004, is a study by Edward V. Loftus Jr. 7 This paper is a review of selected registries in North America, Europe, and Asia, with prevalence dates ranging from 1980 to 1997. The prevalence rate endpoints of the CDC range come from two different locations two decades apart: a northern Alberta, Canada, population in 1981 is the source of the 37/100,000 rate and an Olmsted County, Minnesota, population in 2001 is the source of the 246/100,000 rate.
The American College of Gastroenterology (ACG) Clinical Guideline for Ulcerative Colitis in Adults, 8 published in 2019, states that nearly 1 million Americans are affected with UC, though it does not cite a source.
The table in Figure 2 shows the age-adjusted UC prevalence rate per 100,000 population by payer that we calculated from our administrative claims data. The overall prevalence rate of 253 per 100,000 people is similar to the CDC’s high end of the range, with the Medicare population having the highest prevalence rate (474/100,000) and Medicaid having the lowest (121/100,000). The total prevalence does not quite reach 1 million as quoted in the 2019 clinical guideline.
Figure 2: Ulcerative Colitis
|Market||Insured population 9||People with UC||Prevalence per 100,000 people||95% Confidence interval|
|Commercial and Tricare||159,600,000||400,000||251||+/- 3|
The CDC estimates that 20,000 males in the United States suffer from hemophilia. 10 This statistic comes from a study by J.M. Soucie et al, 11 published in 1998, that used an “active surveillance system in six states.”
A recently published paper by Alfonso Iorio et al 12 reported a total prevalence rate of 20.9 per 100,000 males. In this study, the authors applied a steady state analysis to hemophilia registry data. 13
The table in Figure 3 shows the age-adjusted hemophilia prevalence rate per 100,000 male population by payer that we calculated from our administrative claims data. The overall prevalence rate is similar to the cited values described by Iorio, but we show a higher estimate of the number of males with hemophilia. Once again, there are differences by market, with Medicare and Medicaid having the highest rates among the four markets shown.
Figure 3: Hemophilia
|Market||Male insured population 14, 15, 16, 17, 18||Males with hemophilia||Prevalence per 100,000 males||95% Confidence interval|
|Commercial and Tricare||78,200,000||10,000||13||+/- 3|
Prevalence rates in rare diseases vary in accuracy, may become outdated with new advances in treatment, and are seldom broken into payer types found in the U.S. market. A large administrative claims survey of the prevalence of rare diseases may increase insight into the way individuals are treated for disease and may aid in providing funding for treatments. There is no reason to be afraid or to use old and undocumented prevalence estimates, even in rare diseases. A wealth of real-world data is waiting to be explored.
Milliman 2016 and 2017 consolidated data sets
Milliman Standard Demographic Assumptions – Commercial Insurance
The table in Figure 4 shows the portion of the U.S. population represented by members who were continuously enrolled in 2016 and 2017, by region.
Figure 4: Portion of Population Continuously Enrolled, 2016-2017, by Region
We calculated a prevalence rate for each disease, health insurance market, and age range band, and applied it to the number of people insured in each market and age range band.
Prevalence rate = numerator/denominator
Numerator is the number of people estimated to have a particular condition as described below. For hemophilia, only males were considered.
Denominator is the number of people who were continuously enrolled in 2016 and 2017, by health insurance market and age band. For hemophilia, only males were considered.
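As a rough illustration of the ratio defined above, the following Python sketch scales a numerator and denominator to a rate per 100,000 people. The function name `prevalence_per_100k` is our own label for this example. The inputs are the rounded Commercial and Tricare figures from Figure 1, used here only as an arithmetic check rather than as the underlying claims counts.

```python
def prevalence_per_100k(cases: float, population: float) -> float:
    """Return prevalence as cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Rounded Commercial and Tricare values from Figure 1 (RA).
ra_rate = prevalence_per_100k(cases=1_500_000, population=159_600_000)
print(round(ra_rate))
```

In the study itself the ratio is computed separately for each disease, health insurance market, and age band before any totals are formed.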
For each disease, health insurance market, and age range band, we found people who had one or more of the diagnosis codes shown in the table in Figure 5 within the first five diagnosis codes listed on claims records. The table in Figure 6 contains the age range bands used for each health insurance market. We removed people from the study who were not continuously enrolled during 2016 and 2017. We also removed people who did not have at least one inpatient admission, one outpatient visit (excluding radiology and lab), or two professional visits (again excluding radiology and lab) during 2016 and 2017.
Figure 5: ICD-10 Diagnosis codes
|Disease||ICD-10 Diagnosis codes|
|Rheumatoid arthritis||M05, M06, M08, M120, M45, M488X1, M488X2, M488X3, M488X4, M488X5, M488X6, M488X7, M488X8, M488X9|
|Ulcerative colitis||K5180, K51811, K51812, K51813, K51814, K51818, K51819, K5190, K51911, K51912, K51913, K51914, K51918, K51919|
|Hemophilia||D66 and D67|
Figure 6: Age range bands by health insurance market
|Market||Age range bands|
|Medicare||0-64, 65-74, 75-84, 85 and over|
|Commercial and Tricare||0-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65 and over|
|Individual||0-17, 18-34, 35-44, 45-54, 55-64, 65 and over|
|Medicaid||0-20, 21-26, 27-45, 46-64, 65 and over|
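The selection rules described above can be sketched in a few lines of Python. This is an illustration only, not the production logic: the `Member` record layout, the function names, and the sample members are hypothetical, and matching codes by prefix is an assumption on our part. The restriction to males for hemophilia is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hemophilia codes from Figure 5; prefix matching is an assumption of this sketch.
HEMOPHILIA_PREFIXES: Tuple[str, ...] = ("D66", "D67")


@dataclass
class Member:  # hypothetical record layout
    continuously_enrolled: bool   # enrolled throughout 2016 and 2017
    inpatient_admits: int = 0
    outpatient_visits: int = 0    # excluding radiology and lab
    professional_visits: int = 0  # excluding radiology and lab
    # One inner list per claim, diagnosis codes in the order listed on the claim.
    claims: List[List[str]] = field(default_factory=list)


def in_study(m: Member) -> bool:
    """Continuous enrollment plus the minimum utilization requirement."""
    used_care = (m.inpatient_admits >= 1
                 or m.outpatient_visits >= 1
                 or m.professional_visits >= 2)
    return m.continuously_enrolled and used_care


def has_condition(m: Member, prefixes: Tuple[str, ...], max_position: int = 5) -> bool:
    """True if any of the first `max_position` codes on any claim matches."""
    return any(code.startswith(prefixes)
               for claim in m.claims
               for code in claim[:max_position])


# Made-up members for illustration.
members = [
    Member(True, outpatient_visits=1, claims=[["D66", "Z79899"]]),
    Member(True, professional_visits=1, claims=[["D67"]]),   # too little utilization
    Member(False, inpatient_admits=1, claims=[["D66"]]),     # not continuously enrolled
    Member(True, inpatient_admits=1,
           claims=[["I10", "E119", "K219", "M545", "J45909", "D66"]]),  # D66 is sixth
    Member(True, professional_visits=2, claims=[["I10"]]),
]

denominator = [m for m in members if in_study(m)]
numerator = [m for m in denominator if has_condition(m, HEMOPHILIA_PREFIXES)]
print(len(numerator), len(denominator))
```

Applying the enrollment and utilization filters before the diagnosis check keeps the numerator a strict subset of the denominator.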
Population with each disease
To estimate the number of insured people with each disease, we multiplied the prevalence rate for each health insurance market and age range band by the number of people insured in that market and age band. We averaged 2016 and 2017 population counts by health insurance market type. We excluded uninsured people from this study. We applied the commercial prevalence rate to the Tricare population because our data did not allow for separate calculation of prevalence within the Tricare segment.
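A minimal sketch of this projection step follows, using the Medicare age bands from Figure 6. All rates and population counts below are invented for illustration and are not study results. The function names are our own.

```python
from typing import Dict


def average_population(pop_2016: Dict[str, float],
                       pop_2017: Dict[str, float]) -> Dict[str, float]:
    """Average the 2016 and 2017 insured counts for each age band."""
    return {band: (pop_2016[band] + pop_2017[band]) / 2 for band in pop_2016}


def estimate_cases(rate_per_100k: Dict[str, float],
                   population: Dict[str, float]) -> float:
    """Sum of band-level prevalence rates applied to band-level insured counts."""
    missing = set(population) - set(rate_per_100k)
    if missing:
        raise KeyError(f"no prevalence rate for age bands: {sorted(missing)}")
    return sum(rate_per_100k[band] / 100_000 * population[band]
               for band in population)


# Hypothetical inputs (not from the study).
rates = {"0-64": 2000, "65-74": 2500, "75-84": 3000, "85 and over": 2800}
pop_2016 = {"0-64": 900, "65-74": 3900, "75-84": 2900, "85 and over": 1900}
pop_2017 = {"0-64": 1100, "65-74": 4100, "75-84": 3100, "85 and over": 2100}

insured = average_population(pop_2016, pop_2017)
cases = estimate_cases(rates, insured)
overall_rate = cases / sum(insured.values()) * 100_000
print(round(cases), round(overall_rate))
```

Because each band's rate is weighted by the insured population in that band, the overall rate reflects the age mix of the market being estimated rather than the age mix of the claims sample.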
We relied on the diagnosis codes in the administrative claims data. Inaccurate diagnosis codes could change the prevalence rates we’ve calculated. Many rare diseases have not been assigned ICD-10 diagnosis codes, so an analysis of this type is not currently possible for them. Health insurance market populations may not represent each state in the same proportion as their insured populations. Prevalence rates can change over time as new tests and treatments are developed. Different methodologies could also produce different prevalence rates. We assumed that the distribution of people with each disease resembles a normal distribution when calculating the confidence interval.
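For readers who want to see how a normal-distribution assumption translates into a "+/-" value, the sketch below uses the standard normal approximation for a proportion. This is one common way to form such an interval and is an assumption of the example. It is not necessarily the exact calculation behind the figures above. The sample size and case count are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal distribution


def ci_half_width_per_100k(cases: int, sample_size: int, z: float = Z_95) -> float:
    """Half-width of a normal-approximation interval, per 100,000 people."""
    if sample_size <= 0 or not 0 <= cases <= sample_size:
        raise ValueError("need 0 <= cases <= sample_size and sample_size > 0")
    p = cases / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    return z * standard_error * 100_000


# Hypothetical sample: 2,607 people with the condition among 100,000 members.
half_width = ci_half_width_per_100k(cases=2_607, sample_size=100_000)
print(f"2,607 +/- {half_width:.0f} per 100,000")

# Same rate observed in a sample four times as large.
narrower = ci_half_width_per_100k(cases=4 * 2_607, sample_size=400_000)
```

The half-width shrinks with the square root of the sample size, so a sample four times as large gives an interval half as wide. This is why large claims databases can produce narrow intervals even for uncommon conditions.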