Distracted driving: Text-mining accident descriptions

By Philip S. Borba | 26 January 2012

The National Highway Traffic Safety Administration (NHTSA) has issued a policy statement advising drivers to resist engaging in any activity that distracts from the operation of a motor vehicle, specifically mentioning the use of cell phones. The statement also recommends that states prohibit “novice” drivers from using electronic communication devices during the learner’s and intermediate stages of a driver license program.1 Deliberations over this policy statement will certainly attract many interested parties, including manufacturers (of automobiles as well as electronic devices), driver associations, consumer safety advocates, and automobile insurers.2

Whatever direction these deliberations may eventually take, it seems clear that distractions are a contributor in motor vehicle accidents: Distraction was a “critical reason” in 8.4% of all motor vehicle accidents we studied. The form of distraction most frequently cited as a critical reason? Adjusting the radio, followed by cell phone use. These results are summarized in Figure 1.

This article will briefly identify the major considerations concerning the effect of cell phone use on automobile insurance premiums and claim adjusting.3 Cell phone usage may be very difficult to capture in the rates charged for automobile insurance coverage. Rates and rating plans typically capture the features of an automobile, such as make, model, year, and certain safety features. As with excessive speed, cell phone usage may be difficult to capture because it concerns behavior (rather than a physical feature), but it may be reflected in automobile insurance premiums through law enforcement violations. As with speeding, violations of cell phone laws may put points on a driver’s record that increase insurance premiums.

We direct most of the attention in this article to the difficulty cell phone usage poses for claim adjusting. The most commonly used reporting forms do not capture whether the driver was distracted or the nature of the distraction. We briefly describe a process we have developed to extract from text data (e.g., accident descriptions, claim adjuster notes) information that is not typically captured on data-reporting forms. Using accident descriptions from an NHTSA database, we describe the process and present results on the incidence of three types of distractions.


Definitions and information-capture for distracted driving are evolving. Currently, the NHTSA defines distracted driving to include talking on, listening to, or dialing a cell phone; adjusting climate, radio, cassette, or CD controls; eating or drinking; and smoking-related activities. Distracted driving may also include distractions from outside the vehicle, such as a street sign, previous crash, or billboard.

The NHTSA has reported that distracted driving was a factor in 20% of auto accidents where one or more persons were injured.4 In 2009, driving while distracted was reported for 11% of all drivers in fatal crashes. Among drivers cited as having been distracted, 20% were reported to have been using a cell phone.

The incidence of distracted driving is probably underreported in the NHTSA and other sources of automobile accident data. The NHTSA statistics are from its Fatality Analysis Reporting System (FARS) and General Estimates System (GES) and are based on police reports, which vary across jurisdictions; in particular, distraction is not consistently reported as a distinct field. Accordingly, statistics that rely on FARS and GES data probably understate the incidence of distracted driving. For non-fatal accidents, information on pre-accident activities is usually self-reported by the driver and occupants of the vehicle(s), who may be averse to reporting a distraction at the time of the accident. For fatal accidents, law enforcement officers and claim adjusters may need to rely on witness accounts, which may say little about the situation inside the vehicle(s) at the time of the accident.

Implications for automobile insurance

We will focus the present discussion on two areas where electronic communication devices (and particularly cell phones) will have implications for automobile insurance: rates and claim adjusting. The introduction of new equipment to automobiles has frequently caused changes to rating plans and rates. Air bags, braking systems, seat belts, and vehicle design (such as the widening of sport utility vehicles) have all warranted adjustments to rating plans and rates. In most cases, the new equipment has made the automobile safer and has been a reason for reduced rates or rate discounts.

The cell phone (and use of electronic equipment generally) introduces a new aspect into the operation of a vehicle and one which poses a challenge for setting insurance rates. Generally, cell phones are seen as distractions to the operation of a vehicle and are likely to increase the frequency and possibly the severity of accidents. Furthermore, the nature of the equipment may have differing effects on the frequency and severity of accidents. Steering-wheel and voice-activated controls for built-in cell phones may be safer than plug-in after-market equipment, while external devices (e.g., hands-free headsets) may be the least safe model.

Insurance premiums notwithstanding, more responsible use of cell phones may occur through state laws prohibiting or limiting their irresponsible use. Similar to the case with powerful vehicles that can increase speed beyond safe limits, state laws may impose some control over the irresponsible use of cell phones. Violations of state laws can carry fines, and the points on one’s license can increase the driver’s insurance premium.5

Drivers’ use of electronic communication devices will also influence claim adjusting, in particular the assignment of responsibility and liability when a distraction has occurred. However, the most commonly used data-capture reports do not enable the report-taker to record whether the driver was distracted or the nature of the distraction. As an alternative, claim adjuster notes and other text reports can provide a great deal of information on the circumstances attendant to an accident, and this information can be tapped for deciphering the activities preceding, during, and after the accident.

Using text data to understand activities occurring during vehicular accidents

The NHTSA has developed a database with detailed information and accident descriptions for a cross-section of approximately 7,000 automobile accidents.6 We have performed extensive work with text data from property/casualty claim and policy data systems. Briefly, we break the long strings of text data into discrete components—typically, 1-to-6 word phrases found in the data. For a single accident description, these 1-6 word phrases can number into the hundreds, and for a modest-sized set of accidents, these word phrases can number into the millions. We then use the word phrases to perform data analytics on outcome variable(s) of interest, such as the possibility of a third-party recovery or fraud.
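As a rough illustration of the phrase-extraction step, the following Python sketch breaks a description into every contiguous phrase of one to six words. It is a minimal sketch, not the process used for the analyses in this article: the tokenization rule, the function names, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_phrases(text, min_len=1, max_len=6):
    """Return every contiguous phrase of min_len..max_len words, shortest phrases first."""
    tokens = tokenize(text)
    phrases = []
    for n in range(min_len, max_len + 1):
        # A description of len(tokens) words has len(tokens) - n + 1 phrases of length n.
        for start in range(len(tokens) - n + 1):
            phrases.append(" ".join(tokens[start:start + n]))
    return phrases


# Hypothetical one-sentence description (not taken from the NHTSA data).
description = "The driver was talking on her cell phone and did not see the red light."
phrases = word_phrases(description)

# Counting the phrases gives the raw material for later analytics.
phrase_counts = Counter(phrases)
```

Even this 15-word sentence yields 75 phrases, which suggests how quickly the phrase count grows for descriptions that run to several hundred words.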

For the distracted-driver issue, we performed a series of analyses using the distracted-driver indicator and accident descriptions in the NHTSA database. After the initial processing of the accident descriptions, we applied a process that aggregated similar phrases into a common expression. For example, “talking on a cell phone,” “talking on her cell phone,” “talking on his cell phone,” and similar expressions where “talking” was replaced by “conversing,” “dialing,” and “texting” were aggregated into a common expression. Adjusting a radio, conversing with others in the vehicle, and use of a cell phone are frequently mentioned as distractions occurring at the time of an accident. Accordingly, we focused on these three activities in our analyses of the accident descriptions.
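The aggregation of similar phrases can be sketched with a regular expression. This is only one possible approach and an assumption on our part for illustration: the verbs come from the example above, while the optional prepositions, the extra determiners, and the `USING_CELL_PHONE` label are hypothetical choices.

```python
import re

# Hypothetical label standing in for the "common expression".
CANONICAL = "USING_CELL_PHONE"

CELL_PHONE_VARIANTS = re.compile(
    r"\b(?:talking|conversing|dialing|texting)"  # verbs named in the example above
    r"(?:\s+(?:on|with))?"                        # optional preposition (assumed)
    r"\s+(?:a|her|his|the|their)"                 # determiners (partly assumed)
    r"\s+cell\s?phone\b",                         # "cell phone" or "cellphone"
    re.IGNORECASE,
)


def aggregate_phrases(text):
    """Replace each cell-phone variant found in text with the canonical expression."""
    return CELL_PHONE_VARIANTS.sub(CANONICAL, text)


examples = [
    "talking on a cell phone",
    "talking on her cell phone",
    "texting on his cell phone",
    "dialing her cellphone",
]
aggregated = [aggregate_phrases(e) for e in examples]
```

All four example phrases collapse to the same label, while unrelated text such as "adjusting the radio" passes through unchanged. A production process would need a far broader set of patterns than this sketch covers.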

Examples of distractions

From the summary information compiled by the NHTSA, an internal distraction was identified as the critical reason for slightly more than 8% of all accidents in the National Motor Vehicle Crash Causation Survey (NMVCCS) database (see the top half of Figure 1). The bottom half of the table in Figure 1 presents different measures of accident frequencies for the three activities identified above. The first column in the bottom half of the table presents the frequency across all accidents, the second column presents the frequency across distracted-driving accidents, and the third column presents the frequency with which the activity was associated with a distracted-driving accident. These frequencies are intended to address three questions: (1) How frequently does the activity occur across all accidents? (2) How frequently does the activity occur in distracted-driver accidents? (3) When the activity is cited, what is the frequency that it is cited in a distracted-driver accident?
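The three measures can be made concrete with a short Python sketch. The record layout (a `distracted` flag plus a set of `activities` per accident), the function names, and the toy counts are assumptions for illustration only; they are not the NMVCCS data or its format.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def activity_frequencies(accidents, activity):
    """Compute the three frequencies for one activity.

    accidents: list of dicts with a boolean 'distracted' and a set 'activities'
    (an assumed layout, chosen for this example).
    """
    with_activity = [a for a in accidents if activity in a["activities"]]
    distracted = [a for a in accidents if a["distracted"]]
    both = [a for a in distracted if activity in a["activities"]]
    return {
        # (1) How often is the activity mentioned across all accidents?
        "share_of_all_accidents": safe_ratio(len(with_activity), len(accidents)),
        # (2) How often is it mentioned in distracted-driving accidents?
        "share_of_distracted_accidents": safe_ratio(len(both), len(distracted)),
        # (3) When it is mentioned, how often is the accident a distracted-driving one?
        "share_distracted_given_activity": safe_ratio(len(both), len(with_activity)),
    }


def make_records(count, distracted, activities):
    """Build `count` independent toy accident records."""
    return [{"distracted": distracted, "activities": set(activities)} for _ in range(count)]


# Toy data: 10 accidents, 4 with a distracted driver, 3 mentioning the radio.
accidents = (
    make_records(2, True, {"radio"})
    + make_records(2, True, set())
    + make_records(1, False, {"radio"})
    + make_records(5, False, set())
)
radio = activity_frequencies(accidents, "radio")
```

In the toy data the radio is mentioned in 3 of 10 accidents, in 2 of the 4 distracted-driving accidents, and 2 of the 3 radio mentions fall in distracted-driving accidents.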

From the accident descriptions, adjusting a radio occurred in less than 1% of all accidents but almost 5% of accidents involving a distracted driver. Furthermore, in almost 70% of the cases when adjusting a radio was found in the accident description, distracted driving was a critical reason for the accident. In sum, while adjusting a radio is not prevalent in many accidents, when this activity is cited it is very often cited in cases involving a distracted-driver accident. It is important to keep in mind that our present analyses should not be interpreted as establishing causal factors. For example, a driver adjusting the radio may not have been the driver at fault but the distraction may have prevented the driver from avoiding the accident. The distracted driver may not have seen the other driver running a red light. The distracted driver may have been able to avoid the accident if not for the distraction.

Continuing with Figure 1, conversing with another individual was mentioned in the accident description for 10% of all accidents and 30% of the distracted-driving accidents. Among accidents where talking with another individual was mentioned in the accident description, distracted driving was identified as a critical reason for approximately 20% of the accidents. These findings suggest that conversing with others is quite common across all types of accidents. This pattern can be contrasted with the frequencies for adjusting a radio—not frequent in distracted-driving accidents, but distracted driving is often the critical reason when adjusting a radio is mentioned.

Talking on, dialing, or texting with a cell phone was the third activity we investigated using the NMVCCS accident descriptions, and the activity receiving the most specific attention in the NHTSA policy statement. Using a cell phone was mentioned in slightly less than 2% of all accidents in the NMVCCS database but almost 8% of the accidents involving a distracted driver.7 Among accidents where talking on, dialing, or texting with a cell phone was mentioned in the accident description, distracted driving was identified as a critical reason for 37% of the accidents. The pattern in these frequencies is in contrast with the pattern for conversing with others. While using a cell phone was not found to be very prevalent among distracted-driving accidents, when cell phone use was mentioned the activity was frequently mentioned in the context of a distracted-driving accident.
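The three measures are linked by Bayes' rule: the share of activity mentions that fall in distracted-driving accidents equals the share of distracted-driving accidents mentioning the activity, multiplied by the overall share of distracted-driving accidents, divided by the overall share of accidents mentioning the activity. The Python sketch below applies that identity to the cell phone figures as a rough plausibility check. The inputs are rounded readings of "slightly less than 2%," "almost 8%," and 8.4%, and treating 8.4% as the share of distracted-driving accidents in the same set of accidents is an assumption, so only approximate agreement with the reported 37% should be expected.

```python
def implied_share_distracted_given_activity(p_activity, p_activity_given_distracted, p_distracted):
    """Apply Bayes' rule: P(distracted | activity) = P(activity | distracted) * P(distracted) / P(activity)."""
    if p_activity <= 0:
        raise ValueError("p_activity must be positive")
    return p_activity_given_distracted * p_distracted / p_activity


# Rounded inputs read off the text above (assumed, not exact figures).
cell_phone_implied = implied_share_distracted_given_activity(
    p_activity=0.02,                   # "slightly less than 2%" of all accidents
    p_activity_given_distracted=0.08,  # "almost 8%" of distracted-driver accidents
    p_distracted=0.084,                # distraction as critical reason, all accidents
)
```

With these rounded inputs the implied share is roughly one-third, in the neighborhood of the 37% reported from the accident descriptions.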

Finally, the nature of the activity may have an effect on the association between the distraction and the frequency of an accident. Adjusting a radio and handling a cell phone may be considered “active” distractions, that is, distractions in which the driver removes his or her hands from the steering wheel. By contrast, talking with another occupant may be a “passive” distraction. The active/passive nature of the distraction may be a critical distinction in the likelihood that the distraction is a contributing factor to the occurrence of the accident.

Concluding comment

While the NHTSA may have had the best of intentions in mind when it issued its policy statement, it is unrealistic to expect that drivers will be willing to give up their cell phones. Furthermore, given the multitude of cell-phone devices and the complex activities that can occur when using a cell phone, it will be difficult to design a data-reporting system in the near term that adequately captures the cell-phone-use activities that may occur at the time of the accident. The present discussion has briefly described a process we developed to access text data (e.g., claim adjuster notes, accident descriptions) to extract information on the activities occurring at the time of an accident, as a way of getting around the data limitations inherent in the distracted driving problem. Besides serving a useful business purpose, the extracted information may also be useful for policymaking purposes.


1. National Highway Traffic Safety Administration. Policy Statement and Compiled FAQs on Distracted Driving. Retrieved January 18, 2012, from http://www.nhtsa.gov/Driving+Safety/Distracted+Driving/Policy+Statement+and+Compiled+FAQs+on+Distracted+Driving.

2. An anthology of news articles and commentaries on distracted driving is available at http://topics.nytimes.com/top/news/technology/series/driven_to_distraction/index.html.

3. The present discussion focuses on the hazards and implications for personal-use driving. While there are some overlaps, there are other important considerations concerning the use of electronic communications devices for commercial drivers, but these considerations are beyond the scope of the present discussion. For example, compared to personal-use driving, much more of the commercial driving occurs between midnight and 6 a.m., and there are other forms of communication devices that are not commonly used by personal-use drivers, such as citizens’ band (CB) radios.

4. National Highway Traffic Safety Administration (September 2010). Distracted Driving 2009. Report No. DOT HS 811 379, Table 3. Traffic Safety Facts: Research Note. Retrieved January 18, 2012, from http://www.distraction.gov/research/PDF-Files/Distracted-Driving-2009.pdf.

5. A summary of state laws can be found at http://www.ncsl.org/default.aspx?TabId=17057 (retrieved January 23, 2012).

6. The accident descriptions were between approximately 70 and 1,000 words, with the mean and median lengths approximately 450 words.

7. Two considerations need to be kept in mind when reviewing the frequencies concerning cell phone use. First, the data were collected from accidents occurring during 2005 through 2007. The prevalence of cell phones is likely to be much greater today than for that period. Second, and perhaps more important, much of the data on the activities occurring at the time of the accident were from the drivers and occupants of the vehicles, who may have been reluctant to admit that a cell phone was being used at the time of the accident. Besides admitting to a distraction, in many instances use of a cell phone may have been illegal, which is not the case for the other two activities in the present analyses.