Much has changed since we published our first white paper1 analyzing key drivers of gross savings2 for accountable care organizations (ACOs) participating in the Medicare Shared Savings Program (MSSP) during 2015. Most notably, the Centers for Medicare and Medicaid Services (CMS) has made changes to many aspects of the program, including the benchmarking methodology and options for risk sharing. Additionally, participation in the program increased and ACOs have had more time to implement and refine their population health strategies. Finally, the number of ACOs in a two-sided version of the program has increased dramatically, with 150 ACOs accepting some level of risk for shared losses in 2019, compared with only three in 2015.
Considering the changes over the past four years, we wanted to analyze more recent MSSP results for ACOs to understand whether the conclusions from our original paper still apply in today’s MSSP environment. What we found is that the drivers of recent success are quite different and, in some cases, the opposite of what they were in 2015.
Our findings can be summarized into five key takeaways:
- Changes to the benchmarking methodology effective in 2017 have benefitted ACOs that have historically had low risk-adjusted costs relative to their peers.
- Having a benchmark significantly above costs in benchmark year 3 (BY3), the most recent benchmark year, is a major driver of higher gross savings, at least in the first performance year.
- ACOs that selected tracks with downside risk tended to be more successful in achieving gross savings, although there are multiple plausible explanations for why this occurred.
- ACOs with few specialists and hospitals on their participant list tended to have higher gross savings than other ACOs.
- Higher rates of primary care evaluation and management (E&M) visits and lower rates of unplanned inpatient hospital admissions were correlated with higher gross savings.
We analyzed data from the 2019 performance year for ACOs in the Pathways to Success model (Pathways), referred to by CMS as “2019A.” Our primary data source was the MSSP Public Use File (PUF), but we supplemented this data with information gathered by Milliman’s Torch Insight and Medicare Repricer product teams, such as geographic reimbursement factors and other risk-sharing contracts that ACOs are involved in. Consistent with our previous analysis, we applied a machine learning algorithm known as a Random Forest to identify which of the nearly 220 ACO characteristics were most strongly associated with gross savings. For purposes of this analysis, the biggest advantage of the Random Forest algorithm is that it can handle a large number of features, including features that are highly collinear (for instance, BY1 costs and BY2 costs). Based on the results of the Random Forest, we then dove deeper into specific aspects of the data to better understand these relationships.
The remainder of this paper will explore each of these topics in more detail. In Appendix 1, we numerically rank the top 25 ACO characteristics identified by the Random Forest. The ‘Data Sources and Methodology’ section includes further detail on our approach to the analysis. Unless otherwise stated, values in tables and graphs throughout the paper are observed values and not the values predicted by the Random Forest model.
Note for readers: Upon review of existing literature, we were unable to find any studies measuring the relationship of a comprehensive set of ACO characteristics to ACO gross savings in a performance year after the implementation of Pathways to Success (2019 or beyond). We were able to find a number of studies that focused on the correlation of select ACO characteristics or performance measures to gross savings. Additionally, we found one study that focused on a comprehensive set of ACO characteristics and their association with gross savings; however, this study was conducted prior to Pathways to Success, and as a result, we would not expect its conclusions to align with our analysis.
Changes to the benchmarking methodology effective in 2017 have benefitted ACOs that have historically had low risk-adjusted costs relative to their peers.
In June 2016, CMS finalized a rule changing the financial benchmark formula effective January 2017. A key component of this rule change involved adjusting an ACO’s financial benchmark based on its risk-adjusted average per capita cost position relative to that of its region3. This new approach fundamentally changed how CMS would measure and reward ACO performance. Prior to the rule change, ACOs were measured solely based on performance relative to their own historical expenditures. After the change, however, ACOs would also be evaluated based on their cost efficiency relative to regional expenditure levels. CMS’ stated intent for this change was to “…strengthen the incentives for ACOs to invest in infrastructure and care redesign necessary to improve quality and efficiency…”4
Our study indicates CMS was successful in meeting one of its primary policy objectives5. An ACO’s BY3 regional efficiency factor (defined as the ACO’s BY3 expenditures divided by the risk-adjusted expenditures in its region) was the most important ACO characteristic in predicting gross savings percentage in our Random Forest analysis, with more efficient ACOs generating more savings than less efficient ACOs. This is in stark contrast to our original study, which measured results under the former financial benchmark formula and found high starting risk-adjusted expenditures to be the most important characteristic. To understand this relationship between an ACO’s regional efficiency and gross savings percentage, we ran a linear regression with regional efficiency as the input variable and actual gross savings percentage as the response variable. The results confirmed the Random Forest analysis and demonstrated strong correlation, as shown in Figure 1.
Figure 1: Regional Efficiency Factor vs. Gross Savings/(Loss) Percentage
Note: A small number of ACOs had outlier values that fell outside of the range shown on this chart. These ACOs were included in the Random Forest analysis.
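The regional efficiency factor and the single-variable regression described above can be sketched in a few lines. This is a minimal illustration and not the code used for the analysis: the function names are hypothetical, and the four ACOs are made-up values chosen only to show the mechanics.

```python
def regional_efficiency_factor(aco_by3_cost, regional_risk_adj_cost):
    """ACO BY3 expenditures divided by risk-adjusted regional expenditures."""
    return aco_by3_cost / regional_risk_adj_cost


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus the correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Hypothetical ACOs: (BY3 per capita cost, regional risk-adjusted cost, gross savings %)
acos = [
    (9500.0, 10000.0, 5.1),
    (9900.0, 10000.0, 3.0),
    (10200.0, 10000.0, 1.8),
    (10800.0, 10000.0, -0.5),
]

factors = [regional_efficiency_factor(cost, region) for cost, region, _ in acos]
savings = [pct for _, _, pct in acos]
slope, intercept, r = fit_line(factors, savings)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2), "r:", round(r, 3))
```

A factor below 1.0 means the ACO's costs are lower than its region's risk-adjusted costs, so a negative slope corresponds to more efficient ACOs showing higher gross savings.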
Having a benchmark significantly above costs in BY3, the most recent benchmark year, is a major driver of higher gross savings, at least in the first performance year.
On the surface, this sounds distinctly different from regional efficiency; however, the two concepts are related in that an ACO’s trend across benchmark years directly impacts the relationship between its benchmark and expected expenditures (i.e., expected savings). Although the financial benchmark is now partially driven by regional expenditure levels (as previously mentioned), 50% to 85% of the financial benchmark is still based on an ACO’s historical expenditures. Therefore, it makes sense that an ACO reducing expenditures relative to benchmark trend levels would have an increased likelihood of savings. We have often referred to this as “benchmark tailwind” and have provided an illustration of the concept in Figure 2.
Figure 2: “Benchmark Tailwind” Illustration
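The arithmetic behind the tailwind concept can be sketched as follows. This is a deliberately simplified illustration, not the CMS benchmark formula: it ignores risk adjustment and the regional component, and the trend factors, equal benchmark-year weights, and cost figures are all hypothetical assumptions.

```python
def trended_benchmark(by_costs, trend_to_by3, weights):
    """Weighted average of benchmark-year costs after trending each year to BY3."""
    trended = [cost * trend for cost, trend in zip(by_costs, trend_to_by3)]
    return sum(c * w for c, w in zip(trended, weights)) / sum(weights)


def tailwind_pct(by_costs, trend_to_by3, weights):
    """Gap between the benchmark and BY3 costs, as a percentage of the benchmark."""
    benchmark = trended_benchmark(by_costs, trend_to_by3, weights)
    return 100.0 * (benchmark - by_costs[-1]) / benchmark


# All inputs below are hypothetical.
TREND_TO_BY3 = [1.0609, 1.03, 1.0]  # assumed 3% annual benchmark trend (BY1, BY2, BY3)
WEIGHTS = [1.0, 1.0, 1.0]           # assumed equal weights, for illustration only

# An ACO that held per capita costs flat while the benchmark trend rose.
flat_aco = [10000.0, 10000.0, 10000.0]
# An ACO whose costs grew exactly at the benchmark trend.
on_trend_aco = [10000.0 / 1.0609, 10000.0 / 1.03, 10000.0]

flat_tailwind = tailwind_pct(flat_aco, TREND_TO_BY3, WEIGHTS)
on_trend_tailwind = tailwind_pct(on_trend_aco, TREND_TO_BY3, WEIGHTS)
print("flat ACO tailwind %:", round(flat_tailwind, 2))
print("on-trend ACO tailwind %:", round(on_trend_tailwind, 2))
```

Under these assumptions, the ACO that grew more slowly than the benchmark trend starts the performance period with a benchmark above its BY3 costs, while the ACO that tracked the trend has essentially no head start.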
In our previous study, we found that BY1-to-BY3 trend ranked 25th among the more than 190 ACO characteristics we reviewed; in this most recent study, it ranked 2nd. Although “benchmark tailwind” was correlated with gross savings in the past, its importance has been magnified with the Pathways to Success rule. Considered alongside the most predictive characteristic, the BY3 regional efficiency factor, these illustrate CMS’s intention to balance incentives for both efficient and inefficient ACOs to join and/or stay in the program6.
Given the recent implementation of Pathways to Success, we were able to review only performance year 1 (PY1) results. It remains to be determined whether an ACO’s BY3 expenditures relative to the financial benchmark will continue to be as important in future performance years. Because MSSP agreement periods last five years before resetting the regionally adjusted financial benchmark, this may provide opportunities for ACOs less well positioned at the start to catch up in subsequent PYs.
ACOs that selected tracks with downside risk tended to be more successful in achieving gross savings, although there are multiple plausible explanations for why this occurred.
The Random Forest model indicated that participation in a downside risk track (as opposed to an upside-only track) was predictive of higher savings rates. This programmatic characteristic is an interesting one that deserves further exploration because there are competing explanations that might have bearing on policy discussions. On one hand, even though the tracks within Pathways are set up to move participants toward greater risk, the voluntary nature of the entire program means that it is possible for ACOs to select the track in which they expect to perform most favorably. On the other hand, the incentives faced by ACOs in two-sided risk arrangements may drive changes in care delivery that result in savings to the Medicare program, as CMS intended7.
With the introduction of Pathways to Success, the number of ACOs in risk-bearing tracks has increased dramatically. Opportunities to select tracks were not nearly as abundant during the 2015 performance period considered in our original paper. At that time, only three ACOs had chosen to enter Track 2, the only track with downside risk that year. By contrast, the number of ACOs in tracks with downside risk rose to 150 in July 2019⁸, the start of the first Pathways performance period. Despite declining overall participation in the MSSP since 2018, the number of downside risk ACOs at the beginning of 2021 had increased further to 195, almost 41% of this year’s 478 participating ACOs.
Those ACOs in downside risk tracks in the 2019A performance period tended to have higher savings rates. Figure 3 shows that participants in the Enhanced track generated gross savings rates that were more than 2.0 percentage points higher than participants in Basic Tracks A or B, and the downside risk portions of the Basic track (Levels C/D/E) also had higher gross savings than those in upside only.
Figure 3: Average Gross Savings/(Loss) Percentage9 by Pathways Track, MSSP 2019A Performance Period
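Footnote 9 defines the averages in Figures 3 through 8 as total gross savings divided by total benchmark expenditures. A minimal sketch of that calculation by group follows; the function name and the dollar amounts are hypothetical and serve only to show how the weighted average differs from a simple average of percentages.

```python
from collections import defaultdict


def weighted_gross_savings_pct(records):
    """records: iterable of (group, benchmark_dollars, gross_savings_dollars)."""
    totals = defaultdict(lambda: [0.0, 0.0])  # group -> [benchmark, savings]
    for group, benchmark, savings in records:
        totals[group][0] += benchmark
        totals[group][1] += savings
    return {group: 100.0 * s / b for group, (b, s) in totals.items()}


# Hypothetical ACOs: a small one saving 5% and a large one saving 1%.
records = [
    ("Enhanced", 100_000_000.0, 5_000_000.0),
    ("Enhanced", 300_000_000.0, 3_000_000.0),
    ("Basic A/B", 200_000_000.0, 2_000_000.0),
]

by_track = weighted_gross_savings_pct(records)
simple_average = (5.0 + 1.0) / 2  # unweighted mean of the two Enhanced percentages
print(by_track, simple_average)
```

Because the larger ACO carries more benchmark dollars, the weighted figure for the hypothetical Enhanced group sits below the simple average of the two ACO-level percentages.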
It may be that stronger incentives for Basic C/D/E vs. Basic A/B and Enhanced vs. Basic C/D/E are driving better performance. However, an alternative explanation is that, because the MSSP is voluntary, the incentives in tracks with downside risk may attract providers that have a strong prior expectation of success in the program. It could be the case that higher savings in two-sided risk tracks are a result of already high-performing providers selecting into those tracks, rather than (or perhaps in addition to) strong incentives.
One analytical approach we used to investigate the reason for the observed positive correlation between risk and gross savings was to compare ACOs that started Pathways to Success in higher-risk tracks than they were required to enter. Under the Pathways to Success rules, ACOs that are high revenue based on the MSSP definition,10 are experienced with downside risk, and/or are reentering ACO entities have different minimum track levels they may enter.11 Using the revenue status reported in the 2019A PUF and the ACO’s prior experience with downside risk under Medicare ACO programs,12 we present in Figure 4 a comparison of the savings rates for the two groups. Those that started in higher risk tracks than required had average gross savings rates that were approximately 1.7 percentage points higher than other ACOs. When this data is broken out by track, Figure 5 shows that voluntarily at-risk ACOs had somewhat higher savings than ACOs starting at their minimum required levels of risk. However, these differences are not as large as those between the risk-bearing tracks and Basic A/B.
Figure 4: Average Gross Savings/(Loss) Percentage by Whether the ACO Started in a Higher-Risk Track Than Required, MSSP 2019A Performance Period
Figure 5: Average Gross Savings/(Loss) Percentage by Pathways Track and Whether the ACO Started in a Higher-Risk Track Than Required, MSSP 2019A Performance Period
Clearly, participants in downside-risk tracks have achieved greater savings relative to benchmarks than their counterparts in Basic A and B. However, whether that is due more to strong incentives or strategic selection is an open question.
ACOs with few specialists and hospitals on their participant lists tended to have higher gross savings than other ACOs.
A somewhat unexpected finding of our analysis was that an ACO’s number of physician specialists in its participant list13 was one of the top predictors of gross savings. The results of our analysis indicated that ACOs with a very low number of specialists tended to have higher gross savings, but after a threshold of roughly five specialists per 1,000 beneficiaries, there did not appear to be a consistent relationship between the number of specialists and gross savings. Figure 6 illustrates this observation. The average gross savings for ACOs with fewer than five specialists per 1,000 was nearly 5.0%, while the average gross savings for ACOs with more specialists ranged from approximately 1.0% to 3.0%, with no clear pattern.
Figure 6: Gross Savings by Number of Specialists
Another related characteristic that we analyzed was the ACO’s “revenue status,” which we mentioned previously as a condition for minimum required track levels.2 Low-revenue ACOs tend to be comprised largely of physicians, while high-revenue ACOs generally include at least one hospital. Figure 7 compares the average gross savings rate for low-revenue and high-revenue ACOs, further stratified by the number of specialists per 1,000 beneficiaries. Darker red shading indicates higher values and darker blue shading indicates lower values. We found that, even after controlling for the number of specialists, the low-revenue ACOs performed better than the high-revenue ACOs. Notably, the highest-performing ACOs were low-revenue and had a low number of specialists, suggesting they were primarily comprised of primary care physicians (PCPs).
Figure 7: Gross Savings by Revenue Category and Number of Specialists
|Number of Specialists per 1,000 Beneficiaries|Low Revenue|High Revenue|
|---|---|---|
|Less than 5|5.2%|3.1%|
|5 to 25|2.2%|2.3%|
One possible explanation for these findings is that the financial incentives to manage utilization may be stronger for PCPs than for hospitals and specialists. Specialists tend to have far fewer patients assigned to them through the MSSP assignment algorithm, which can make it challenging to align incentives with the ACO. Hospitals’ revenue is largely driven by the services that ACOs often seek to reduce, such as inpatient admissions. Conversely, PCPs are able to more actively manage the care of their assigned beneficiaries without reducing their Medicare fee-for-service (FFS) revenue.
Higher rates of primary care evaluation and management (E&M) visits and lower rates of unplanned inpatient hospital admissions were correlated with higher gross savings.
Two other notable ACO characteristics that were determined to be strongly predictive of gross savings were primary care E&M visits and unplanned inpatient hospital admissions (measured by quality measure ACO38¹⁴). Our analysis showed that ACOs with higher utilization of PCP E&M visits and fewer unplanned admissions in the performance year had higher gross savings, with both features landing among the top 12 in terms of relative importance in the Random Forest model. Although our analysis cannot prove causality, it is reasonable to expect that unplanned hospital admissions, which are high-cost events, would drive up per capita costs for the ACO, resulting in lower gross savings. The finding regarding PCP E&M visits may suggest that higher utilization of preventive and ambulatory care can lead to lower costs. Figure 8 shows the average gross savings after grouping ACOs into buckets based on annual unplanned admissions per 1,000 and annual PCP E&M visits per 1,000. Darker red shading indicates higher values and darker blue shading indicates lower values. Note that average gross savings generally increase with fewer unplanned admissions (moving bottom to top on the chart) as well as more PCP E&M visits (moving left to right on the chart).
Figure 8: Gross Savings by Number of Unplanned Admissions and PCP E&M Visits
Columns show PCP E&M visits per 1,000; rows show unplanned admissions per 1,000.
|Unplanned Admissions per 1,000|Under 4,000|4,000 to 5,000|Over 5,000|
|---|---|---|---|
|Less than 55|3.6%|3.9%|6.5%|
|55 to 60|2.6%|1.8%|5.6%|
|60 to 65|1.5%|2.3%|N/A|
Note: Values marked as N/A were excluded due to containing fewer than 10 observations.
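The grouping behind a grid like Figure 8 can be sketched as below. This is an illustration under stated assumptions rather than the code used for the paper: the bucket boundaries are read from the table, the treatment of values exactly on a boundary and the "65 or more" row are assumptions, and the ACO records are made up. Cells with fewer than 10 ACOs are suppressed, matching the note above.

```python
ADM_EDGES = [55, 60, 65]
ADM_LABELS = ["Less than 55", "55 to 60", "60 to 65", "65 or more"]  # last label assumed
VISIT_EDGES = [4000, 5000]
VISIT_LABELS = ["Under 4,000", "4,000 to 5,000", "Over 5,000"]


def bucket(value, edges, labels):
    """Return the label of the first bucket whose upper edge exceeds value."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def savings_grid(acos, min_obs=10):
    """acos: iterable of (admissions_per_1000, visits_per_1000, benchmark, savings)."""
    cells = {}
    for admissions, visits, benchmark, savings in acos:
        key = (bucket(admissions, ADM_EDGES, ADM_LABELS),
               bucket(visits, VISIT_EDGES, VISIT_LABELS))
        cell = cells.setdefault(key, [0, 0.0, 0.0])  # count, benchmark, savings
        cell[0] += 1
        cell[1] += benchmark
        cell[2] += savings
    # None marks a suppressed cell (shown as N/A in the table).
    return {key: (None if n < min_obs else 100.0 * s / b)
            for key, (n, b, s) in cells.items()}


# Hypothetical data: ten ACOs in one cell, three in another.
acos = [(50.0, 5500.0, 100.0, 6.0)] * 10 + [(62.0, 5200.0, 100.0, 1.0)] * 3
grid = savings_grid(acos)
print(grid)
```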
While these characteristics separately had moderate predictive power, we also found the two characteristics were correlated with each other. Figure 9 shows the relationship between the two variables. As the trendline illustrates, for the ACOs in 2019A, those with a higher rate of PCP E&M visits tended to have fewer unplanned admissions.
Figure 9: Relationship Between Unplanned Admissions and PCP E&M Visits
Other ACO characteristics
There were many other characteristics included in our analysis that were not strongly associated with gross savings15, including some that were highly predictive in our previous analysis. These notable characteristics include the following groups:
- CMS region (e.g., Southeast, Mid-Atlantic, New England, etc.)
- Utilization metrics other than inpatient admissions and E&M visits (30 variables)
- Quality metrics other than ACO38 (23 variables)
- Participation in ACO programs other than MSSP (6 variables)
- Population demographics and ACO size (22 variables)
- All other characteristics outside the top 25, including various information on risk scores, cost, provider types, and prior track selections (110 variables)
With Pathways to Success, CMS endeavored to reshape the MSSP by adjusting incentives, encouraging greater accountability in ACOs, and offering options specific to each ACO’s ability to take on risk. Our analysis gives early indication that these changes are rewarding ACOs for attained efficiency levels, possibly enhancing the attractiveness of the program. Furthermore, we also see evidence of at least some correlation between tracks with downside risk and higher gross savings, supporting CMS’s case for accountability as a policy priority, though voluntary track selection may also be playing a role. Lastly, we see some indication that ACOs strongly emphasizing primary care are having greater success than their peers. We look forward to seeing how these conclusions hold up in future MSSP performance periods.
Data sources and methodology
This analysis was based primarily on data from the 2019 MSSP Public Use File (PUF). We incorporated quality metric information from the 2019 MSSP ACO Performance Results, also made publicly available by CMS. We excluded any variables that were directly related to the performance year gross savings calculation, such as shared savings amounts and performance year costs. Based on these two sources alone, we engineered more than 50 additional features. They included, but were not limited to, risk-adjusted costs, percentage of the ACO population by entitlement category, change in entitlement category mix from baseline to performance year, number of PCPs per capita, number of specialists per capita, and CMS region (based on primary state).
We also added features based on analysis of outside data sources:
- For the geographic-risk-adjusted costs, we developed ACO-specific geographic reimbursement factors. They were developed using average Medicare reimbursement levels by county, weighted based on each ACO’s mix of assigned beneficiaries by county from the 2019 Number of ACO Assigned Beneficiaries by County PUF. The factors used in this portion of the analysis reflected only differences in cost per service, not utilization.
- Regional efficiency factors were based on comparisons of ACO costs to risk-adjusted regional cost data in the MSSP rebasing public use files. Risk-adjusted regional costs for each ACO were weighted based on the ACO’s mix of beneficiaries by county.
- Information about each ACO’s other risk arrangements and affiliations was taken from Milliman’s Torch Insight database. Torch Insight collects information about ACOs and other alternative payment models and links these payment arrangements to other data on providers and geographies. This tracking effort includes ACO arrangements with private payers and employers (where such information is publicly available).
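The county weighting used for the geographic reimbursement factors and the regional costs can be sketched as a beneficiary-weighted average. The function name, county labels, and factor values below are hypothetical; the actual county-level inputs come from the sources named in the list above.

```python
def county_weighted_average(beneficiaries_by_county, value_by_county):
    """Average a county-level value using the ACO's assigned beneficiaries as weights."""
    total = sum(beneficiaries_by_county.values())
    weighted = sum(count * value_by_county[county]
                   for county, count in beneficiaries_by_county.items())
    return weighted / total


# Hypothetical ACO with beneficiaries in two counties.
assigned = {"County A": 6000, "County B": 4000}
reimbursement_factor = {"County A": 1.05, "County B": 0.95}  # made-up county factors

aco_factor = county_weighted_average(assigned, reimbursement_factor)
print(round(aco_factor, 4))
```

The same function applies to risk-adjusted regional costs: passing county-level cost amounts instead of reimbursement factors gives the ACO-specific regional cost used in the denominator of the regional efficiency factor.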
We used these features to predict gross savings percentage with a machine learning algorithm known as a Random Forest. A Random Forest averages the predictions from a large number of decision tree models developed from bootstrapped samples of the data. In our case, we used 10,000 decision trees.
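To make the bootstrap-and-average idea concrete, the sketch below grows a small number of shallow regression trees on bootstrapped samples and averages their predictions. It is a teaching illustration in plain Python, not the implementation used for the analysis: it omits the random feature subsampling that a full Random Forest applies at each split, uses far fewer and shallower trees, and runs on synthetic data. All function names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def build_tree(rows, targets, depth, max_depth=3, min_leaf=2):
    """Grow a small regression tree; returns a leaf value or a split tuple."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(targets)
    best = None  # (sse, feature, threshold, left_indices, right_indices)
    for f in range(len(rows[0])):
        for threshold in sorted(set(r[f] for r in rows))[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            sse = 0.0
            for idx in (left, right):
                m = mean([targets[i] for i in idx])
                sse += sum((targets[i] - m) ** 2 for i in idx)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left, right)
    if best is None:
        return mean(targets)
    _, f, threshold, left, right = best
    return (f, threshold,
            build_tree([rows[i] for i in left], [targets[i] for i in left],
                       depth + 1, max_depth, min_leaf),
            build_tree([rows[i] for i in right], [targets[i] for i in right],
                       depth + 1, max_depth, min_leaf))


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def fit_forest(rows, targets, n_trees=50, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap: n draws with replacement
        forest.append(build_tree([rows[i] for i in sample],
                                 [targets[i] for i in sample], 0))
    return forest


def predict_forest(forest, row):
    return mean([predict_tree(tree, row) for tree in forest])


# Synthetic data: feature 0 mimics an efficiency factor, feature 1 is pure noise.
rng = random.Random(1)
rows = [[rng.uniform(0.85, 1.15), rng.uniform(0.0, 1.0)] for _ in range(60)]
targets = [30.0 * (1.0 - r[0]) + rng.gauss(0.0, 0.3) for r in rows]

forest = fit_forest(rows, targets, n_trees=50)
low = predict_forest(forest, [0.90, 0.5])
high = predict_forest(forest, [1.10, 0.5])
print(round(low, 2), round(high, 2))
```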
Although the Random Forest does not produce coefficients or p-values, as we typically see in linear models, the Random Forest algorithm provides a useful measure of feature importance, which we utilized in this paper. The feature importance value represents the increase in error if the ACO characteristic was randomized (therefore, rendering it useless). The relative magnitude of each number is more important than the actual number itself.
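The importance measure described above, the increase in error when one characteristic is randomized, can be sketched independently of the model. In the example below, a simple hand-written function stands in for the fitted Random Forest, the data are synthetic, and the function names are hypothetical. The paper does not specify which variant was used (for example, out-of-bag versus held-out data), so this should be read as one plausible form of the calculation.

```python
import random


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in error after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for f in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[f] for r in rows]
            rng.shuffle(column)  # break the link between this feature and the target
            shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(predict, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def stand_in_model(row):
    """Placeholder for a fitted model: uses feature 0 only, ignores feature 1."""
    return 30.0 * (1.0 - row[0])


# Synthetic data: feature 0 drives the target, feature 1 is irrelevant.
rng = random.Random(2)
rows = [[rng.uniform(0.85, 1.15), rng.uniform(0.0, 1.0)] for _ in range(100)]
targets = [30.0 * (1.0 - r[0]) + rng.gauss(0.0, 0.3) for r in rows]

importance = permutation_importance(stand_in_model, rows, targets)
print([round(v, 3) for v in importance])
```

Shuffling the feature the model relies on raises the error substantially, while shuffling the ignored feature leaves it unchanged, which is why the relative size of the values matters more than the values themselves.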
Limitations and qualifications
The information in this paper is intended to identify and rank ACO characteristics most associated with gross savings under the MSSP “2019A” performance year. It may not be appropriate, and should not be used, for other purposes.
Milliman has developed certain models to estimate the values included in this paper. The intent of the models was to identify ACO characteristics associated with gross savings under the MSSP. We have reviewed the models, including their inputs, calculations, and outputs for consistency, reasonableness, and appropriateness to the intended purpose and in compliance with generally accepted actuarial practice and relevant actuarial standards of practice (ASOP). The models rely on data and information as input to the models. We have relied upon certain data and information made available by CMS and Milliman’s Torch Insight for this purpose and accepted it without audit. To the extent that the data and information provided is not accurate, or is not complete, the values provided in this correspondence may likewise be inaccurate or incomplete. The models, including all input, calculations, and output may not be appropriate for any other purpose.
Differences between our projections and actual amounts depend on the extent to which future experience conforms to the assumptions made for this analysis. It is certain that actual experience will not conform exactly to the assumptions used in this analysis. Actual amounts will differ from projected amounts to the extent that actual experience deviates from expected experience.
Guidelines issued by the American Academy of Actuaries require actuaries to include their professional qualifications in all actuarial communications. Cory Gusland, Anders Larson, and Mackenzie Egan are members of the American Academy of Actuaries and meet the qualification standards for performing the analyses presented in this report.
1Gusland, C., Herbold, J.S., & Larson, A. (September 2017). What Predictive Analytics Can Tell Us About Key Drivers of MSSP Results. Milliman White Paper. Retrieved July 13, 2021, from https://www.milliman.com/en/insight/what-predictive-analytics-can-tell-us-about-key-drivers-of-mssp-results.
3The rule change initially applied only to ACOs when starting a second agreement period. The Pathways to Success rule (effective July 1, 2019) made the regional benchmark adjustment effective for all ACOs.
4The full text of the rule is available at https://www.govinfo.gov/content/pkg/FR-2016-06-10/pdf/2016-13651.pdf.
5In the June 2016 Final Rule, CMS moved to a regional benchmarking methodology in order to provide “Strong incentives for ACOs to improve efficiency and to continue participation in the program over the long term.” Throughout the rule, CMS refers to ‘efficiency’ as costs relative to their region. As such, we inferred that a primary goal of the rule was to design a payment model that would reward ACOs with risk-adjusted per capita costs lower than regional averages.
6In CMS’s 2016 proposed rule, they acknowledged that “any proposed changes to the benchmark rebasing policies would require consideration of tradeoffs among several criteria that were initially described in the June 2015 final rule (81 FR 5828)”. These included “Strong incentives for ACOs to improve efficiency and to continue participation in the program over the long term.” and “Generating benchmarks that reflect ACOs’ actual costs in order to avoid potential selective participation by (and excessive shared payments to) ACOs with high benchmarks.”
7Verma, S. (August 9, 2018). Pathways To Success: A New Start For Medicare’s Accountable Care Organizations. Health Affairs Blog. Retrieved July 13, 2021, from https://www.healthaffairs.org/do/10.1377/hblog20180809.12285/full/.
9The values for average gross savings/(loss) percentage in figures 3-8 are calculated as total gross savings divided by total benchmark expenditures, i.e., a weighted average of gross savings/(loss) percentages across ACOs.
10If an ACO participant's total Medicare Part A and Part B FFS revenue is more than 35% of the total Medicare Parts A and B FFS expenditures for the ACO’s assigned beneficiaries, then it is “High Revenue.”
11See the Milliman paper “'Pathways to Success' MSSP Final Rule: Key Revisions to the Proposed Rule," Figure 5, available at https://www.milliman.com/en/insight/pathways-to-success-mssp-final-rule-key-revisions-to-the-proposed-rule.
12Because CMS does not release the experience determination for ACO participants, we estimated “experience” based on the MSSP tracks in which the ACO entity participated previously. This will not capture ACOs that are not reentering entities but that are deemed experienced because 40% or more of their physician participants have experience with downside risk. Our list of experienced ACOs thus might exclude a handful of ACOs that are actually experienced.
13The count of specialists in each ACO is included in the MSSP PUF. Per the PUF data dictionary, this count reflects the total number of physician specialists that reassigned billing rights to an ACO participant in the performance year, based on the ACO's certified participant list used in financial reconciliation and information in PECOS. We created a per capita version of this variable by dividing it by the ACO’s number of assigned beneficiary person-years.
15Of the nearly 220 characteristics considered in our model, nearly 200 had a relative importance in the Random Forest model of less than 0.02. By comparison, the most predictive feature, the Regional Efficiency Factor for BY3, had a relative importance of approximately 0.40.
What predictive analytics can tell us about key drivers of MSSP results: 2021 update