🛑

The Statistical Transparency of Policing Report (STOP) chose to use analytical methods that would be less likely to indicate disparity than previous methods.

Whether these newer methods are more accurate is debatable. The previous population-based benchmarks compared the number of people in a perceived racial/ethnic group who were stopped in a county to the total number of people in that perceived group living in the county. These benchmarks were found to be “biased or invalid” because the demographics of the residential population differ from those of the driving population. One research brief supporting this dismissal of the old benchmark was conveniently released by the STOP program itself in 2018.
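To make the old benchmark concrete, here is a minimal sketch of a population-based comparison, assuming it boils down to a group's share of stops divided by its share of county residents. The function name, group labels, and counts are all hypothetical and are not taken from the STOP Report.

```python
def benchmark_ratio(stops_by_group, residents_by_group):
    """Each group's share of stops divided by its share of residents.

    A ratio above 1 means the group is stopped more often than its
    residential share alone would suggest.
    """
    total_stops = sum(stops_by_group.values())
    total_residents = sum(residents_by_group.values())
    return {
        group: (stops_by_group[group] / total_stops)
        / (residents_by_group[group] / total_residents)
        for group in stops_by_group
    }


# Made-up counts for illustration only, not figures from the STOP Report.
stops = {"group_a": 300, "group_b": 700}
residents = {"group_a": 2000, "group_b": 8000}
ratios = benchmark_ratio(stops, residents)
print(ratios)
```

The weakness described above shows up in the denominator: if the people driving through a county differ from the people living in it, the resident counts are the wrong baseline.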

Because of this apparent invalidation of the old benchmarks, this STOP Report relies on three different analyses, each with its own issues.

The Decision to Stop (DTS) analysis assumes it is harder to perceive an individual's race/ethnicity after dark than in daylight, so the stop rates of a semi-control night group can be compared to those of the day group to check for disparity. This “Veil of Darkness” assumption is supported by research, but it still leaves a fairly large opening for confounding: how visible a driver is also depends on characteristics of the car being stopped, such as window tint, height, and other make and model characteristics.
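The day-versus-night comparison can be sketched as an odds ratio, assuming stops have already been split into daylight and darkness. This is a simplified stand-in, not the report's actual model, and the function name and counts are hypothetical.

```python
def veil_of_darkness_odds_ratio(day_group, day_other, night_group, night_other):
    """Odds that a stopped driver belongs to the group in daylight,
    divided by the same odds in darkness.

    A value above 1 means the group makes up a larger share of stops
    when race/ethnicity is assumed to be easier to perceive.
    """
    day_odds = day_group / day_other
    night_odds = night_group / night_other
    return day_odds / night_odds


# Made-up counts for illustration only, not figures from the STOP Report.
odds_ratio = veil_of_darkness_odds_ratio(
    day_group=120, day_other=880, night_group=90, night_other=910
)
print(round(odds_ratio, 3))
```

Note that nothing in this calculation knows about window tint or vehicle height, which is exactly the gap described above.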

The Stop Outcomes analysis matches stops on all recorded factors besides perceived race/ethnicity and then compares the outcomes of a stop (i.e. citation, search, arrest) between groups to look for disparity. There are countless confounding variables affecting the outcome of a stop, ranging from the officer’s mood at the time, to the driver’s charisma and mood, to even just the appearance of the driver or their car. Many of these confounding variables are impossible to realistically account for, so the question remains whether this analysis is still worth considering.
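As a rough illustration of comparing outcomes between otherwise similar stops, the sketch below uses exact stratification: stops are bucketed by their recorded factors and outcome rates are compared only within buckets that contain both groups. This is a simpler technique swapped in for illustration, not the report's matching procedure, and the field names and records are hypothetical.

```python
from collections import defaultdict


def stratified_rate_gap(stops, group_a, group_b):
    """Per-stratum difference in outcome rate (group_a minus group_b).

    stops: iterable of dicts with keys 'group', 'stratum', 'outcome' (bool).
    Strata where either group has no stops are skipped.
    """
    counts = defaultdict(lambda: {group_a: [0, 0], group_b: [0, 0]})
    for stop in stops:
        if stop["group"] in (group_a, group_b):
            cell = counts[stop["stratum"]][stop["group"]]
            cell[0] += int(stop["outcome"])  # stops with the outcome
            cell[1] += 1                     # all stops in this cell
    gaps = {}
    for stratum, cell in counts.items():
        (hits_a, n_a), (hits_b, n_b) = cell[group_a], cell[group_b]
        if n_a and n_b:
            gaps[stratum] = hits_a / n_a - hits_b / n_b
    return gaps


# Made-up records for illustration only; 'outcome' could mean "was cited".
example_stops = [
    {"group": "a", "stratum": ("speeding", "day"), "outcome": True},
    {"group": "a", "stratum": ("speeding", "day"), "outcome": False},
    {"group": "b", "stratum": ("speeding", "day"), "outcome": False},
    {"group": "b", "stratum": ("speeding", "day"), "outcome": False},
    {"group": "a", "stratum": ("equipment", "night"), "outcome": True},
]
gaps = stratified_rate_gap(example_stops, "a", "b")
print(gaps)
```

Only factors that were written down can define a stratum, so anything unrecorded (mood, demeanor, the look of the car) remains free to drive the gap.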

The Search Findings analysis compares the rate of successful searches (searches in which contraband was seized) between perceived racial/ethnic groups. A large difference in success rates between groups would indicate a disparity in which a certain group was arbitrarily chosen to be searched more often. This runs into the same kinds of confounding variables as the Stop Outcomes analysis but is narrowed in scope to only stops that ended in a search, so it is conditionally dependent on the Stop Outcomes analysis as well.
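The success-rate comparison described above can be sketched as a per-group hit rate, assuming each search record carries a perceived group and a flag for whether contraband was seized. The function name and records are hypothetical.

```python
from collections import Counter


def hit_rates(searches):
    """Fraction of searches per group in which contraband was seized.

    searches: iterable of (group, contraband_found) pairs.
    """
    searched = Counter()
    hits = Counter()
    for group, contraband_found in searches:
        searched[group] += 1
        if contraband_found:
            hits[group] += 1
    return {group: hits[group] / searched[group] for group in searched}


# Made-up records for illustration only, not figures from the STOP Report.
searches = (
    [("group_a", True)] * 8 + [("group_a", False)] * 32
    + [("group_b", True)] * 20 + [("group_b", False)] * 30
)
rates = hit_rates(searches)
print(rates)
```

Because the input contains only stops that already ended in a search, any bias in who gets searched in the first place is baked into these rates.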

The limitations of the data gathering process show in two ways. First, accurate and specific location data is lacking, leaving it unfit for use in analysis. Second, no data is recorded on either the economic characteristics of the driver or the driving population of which they are a part (e.g. morning commuter, tourist, Lane County resident). These three factors are of utmost importance to each of these analyses and are being ignored.

Location matters because it distinguishes a stop at a “typical hotspot” with frequent stops from a stop at a rarely-stopped location, where a single stop might, for example, be weighted more heavily.

The economic status of a driver affects everything about how they are perceived by a cop. Factors such as being well- or poorly-groomed, having a more or less expensive car in better or worse shape, and being more or less nervous around cops are all more than likely to have a substantial effect on the results of each of the three analyses.

The driving population of which the driver is a part matters for many confounding variables, but one key use of this information beyond controlling for confounders would be to allow for population-based benchmarks again. Given that the main issue with such benchmarks was the lack of information on the driving population versus the residential population, this data could allow further analyses to be done using the old benchmarks.

As a final cherry on top, the main data collection method is manual entry by an officer after the stop has already happened. This leaves room for misremembered or purposefully misentered data to compromise the quality of the dataset.

The STOP Report is unsuited for measuring the disparity in treatment between different perceived racial/ethnic groups because it disregards population-based benchmarks entirely, its three main methods of analysis are flawed, and its limitations and assumptions in data gathering and use are too substantial to be ignored.