- The viral video “How AI Could Reinforce Biases In The Criminal Justice System” argues algorithms replicate historic policing patterns and can create feedback loops that concentrate enforcement in certain neighborhoods.
- Independent analyses, most notably ProPublica’s 2016 study of COMPAS, found that Black defendants who did not reoffend were labeled high risk at nearly twice the rate of white defendants, highlighting racial disparities in risk-assessment tools used in sentencing.
- Academic work (Alexandra Chouldechova, 2017) shows a mathematical trade-off: you can’t simultaneously equalize all fairness metrics when underlying base rates differ, complicating simple technical fixes.
- Policy fixes in the video and among experts include mandatory algorithmic impact assessments, independent audits, public data provenance, limits on automated decisions in sentencing, and stronger community oversight.
What the video argues — three mechanisms that reinforce bias
The viral short film “How AI Could Reinforce Biases In The Criminal Justice System” lays out a compact argument: algorithms aren’t neutral. They learn from historical records — police stops, arrests, prior convictions — and those records reflect decades of unequal policing. The video breaks the problem into three mechanisms that viewers can spot quickly.
1. Biased training data
The video explains that machine learning models trained on arrest records will encode whatever patterns exist in those records. If police disproportionately patrol certain neighborhoods, the dataset will show more crime in those places even if actual crime rates are similar elsewhere. The model then predicts higher risk for residents of those neighborhoods, and officers use those predictions to focus enforcement there — a clear feedback loop.
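To make that loop concrete, here is a minimal toy simulation, my own sketch rather than anything from the video or a real deployment. It assumes two neighborhoods with identical underlying offense rates, patrols allocated according to the model's prediction (here, just the share of recorded arrests), and only patrolled offenses entering the record.

```python
# Minimal, hypothetical simulation of the feedback loop described above.
# Both neighborhoods have the SAME true offense rate; neighborhood A merely
# starts with more recorded arrests. Each year the "model" ranks areas by
# recorded arrests, the higher-ranked area receives most of the patrols, and
# only patrolled offenses become arrest records, so the recorded gap widens
# while underlying behavior never differs.

import random

random.seed(0)

TRUE_OFFENSE_RATE = 0.05                  # identical in both neighborhoods
records = {"A": 60, "B": 40}              # skewed history from past patrols
TOTAL_PATROLS = 1000

for year in range(1, 6):
    ranked = sorted(records, key=records.get, reverse=True)
    allocation = {ranked[0]: 0.8, ranked[1]: 0.2}   # patrols follow the prediction
    for area, share in allocation.items():
        patrols = int(TOTAL_PATROLS * share)
        # Offenses occur at the same rate everywhere, but only those a patrol
        # happens to observe are recorded.
        observed = sum(random.random() < TRUE_OFFENSE_RATE for _ in range(patrols))
        records[area] += observed
    share_a = records["A"] / sum(records.values())
    print(f"year {year}: neighborhood A's share of recorded arrests = {share_a:.2f}")
```

Run for a few years, neighborhood A's share of recorded arrests climbs toward whatever share of patrols the prediction sends there, even though nothing about actual behavior differs between the two areas.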
2. Proxy variables and opaque models
Next, the film points out that algorithms often use variables that act as stand-ins for race or socio‑economic status: address, arrest history, employment gaps. Even when developers remove explicit race fields, these proxies can reintroduce the same disparities. The video also calls out opaque commercial tools — like many proprietary sentencing-score systems — where neither defendants nor judges see how scores are calculated.
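A small synthetic illustration of the proxy effect, with invented groups, neighborhoods, and scores: the scoring rule below never sees the race field, yet a correlated feature carries the disparity straight through.

```python
# Toy illustration of a proxy variable. The "race" field is never given to
# the scoring rule, but a correlated feature (a made-up neighborhood code)
# reproduces the same group disparity. All numbers are invented.

import random

random.seed(1)

def make_person(group):
    # In this toy world, group membership strongly predicts neighborhood,
    # reflecting historical segregation -- that correlation is what makes
    # the neighborhood a proxy.
    north_prob = 0.8 if group == "blue" else 0.2
    neighborhood = "north" if random.random() < north_prob else "south"
    return {"group": group, "neighborhood": neighborhood}

people = [make_person("blue") for _ in range(500)] + [make_person("green") for _ in range(500)]

def risk_score(person):
    # The rule only ever looks at the neighborhood, never the group.
    return 7 if person["neighborhood"] == "north" else 3

for group in ("blue", "green"):
    members = [p for p in people if p["group"] == group]
    avg = sum(risk_score(p) for p in members) / len(members)
    print(f"average score for {group}: {avg:.1f}")
```

Dropping the explicit field changes nothing about the gap in average scores, which is exactly the point the film is making.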
3. Automation and decision authority
Finally, the piece warns about delegating too much discretionary power to algorithm outputs. When a judge or a police department treats a score as decisive rather than advisory, the algorithm’s biases translate directly into human outcomes: who gets stopped, who gets detained pretrial, who receives a harsher sentence.
How the video connects to established reporting and research
The video’s core claims sit on a firm base of reporting and scholarship. The most-cited case is ProPublica’s 2016 investigation of the COMPAS recidivism tool, led by Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner. ProPublica found that Black defendants who did not go on to reoffend were nearly twice as likely as white defendants to have been labeled high risk, a disparity the video references throughout.
On the research side, Carnegie Mellon statistician Alexandra Chouldechova published a key paper in 2017 showing that fairness metrics conflict when base rates differ across groups: you can’t have both equal false positive rates and equal predictive values unless underlying recidivism rates are equal. The video cites this tension implicitly when it warns that simple tweaks (remove race, retrain the model) won’t fix deeper structural problems.
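The arithmetic behind that trade-off is short enough to check directly. Using the standard identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), where p is a group's base rate, holding predictive value and false negative rate equal across two groups with different base rates forces their false positive rates apart. The rates below are illustrative numbers, not COMPAS figures.

```python
# Numeric illustration of the trade-off Chouldechova formalized (2017):
# if two groups share the same positive predictive value (PPV) and the same
# false negative rate (FNR) but have different base rates, their false
# positive rates cannot be equal. All rates below are made up.

def implied_false_positive_rate(base_rate, ppv, fnr):
    """False positive rate implied by a base rate, PPV, and FNR."""
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * (1 - fnr)

PPV, FNR = 0.6, 0.35            # held equal across both groups
for name, base_rate in [("group 1", 0.50), ("group 2", 0.30)]:
    fpr = implied_false_positive_rate(base_rate, PPV, FNR)
    print(f"{name}: base rate {base_rate:.0%} -> implied false positive rate {fpr:.0%}")
```

With these illustrative numbers, equal PPV and FNR leave group 1 with a false positive rate of roughly 43 percent and group 2 with roughly 19 percent; the only way to close that gap is to let one of the other metrics diverge.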
My read as a reporter: what the video gets right — and what it soft-pedals
The video does a strong job of making the feedback loop intuitive. That loop is the single clearest pathway from historical bias to automated injustice, and the film’s imagery — repeated patrol maps, stacked arrest logs — communicates it simply. It also correctly centers the ProPublica findings and the Chouldechova trade-off, which anchor the story in evidence rather than alarmism.
Where the video soft-pedals: the diversity of algorithmic tools in use and the uneven performance across jurisdictions. Not every predictive tool is equally problematic, and some vendors and researchers are experimenting with fairness-aware methods. The video treats the field as monolithic at times, which helps the narrative but flattens important technical distinctions: supervised risk scores, unsupervised hotspot maps, and anomaly-detection tools behave differently and demand different safeguards.
Another nuance the video glosses over is the law and incentives shaping adoption. Police departments often deploy predictive maps to maximize clearance rates or to satisfy grant reporting — incentives that may not align with reducing bias. Fixing tools without changing incentives risks producing prettier algorithms that preserve the same outcomes.
Concrete fixes the video and experts propose — and which actually help
The film closes with a list of reforms. Those proposals echo what many researchers and civil-rights groups have been saying for years. Here are the practical steps that will matter, ranked by my read of feasibility and impact:
- Independent audits: Regular, third-party audits of models and datasets. Audits should test for disparate false positive and false negative rates across groups and publish the results (a minimal version of that check is sketched after this list).
- Algorithmic impact assessments: Before deployment, agencies should document data provenance, intended use, and downstream effects, and those assessments should be made public.
- Limits on automated decision authority: Ban using scores as the sole basis for detention or sentencing. Require a documented human review that accounts for context the model misses.
- Data governance: Fix the inputs. Reduce bias in policing data by changing patrol priorities and improving reporting standards so that datasets reflect real prevalence, not policing intensity.
- Community oversight: Give affected neighborhoods seats on procurement and oversight boards so deployment choices face democratic scrutiny.
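The disparity check mentioned under independent audits is straightforward to operationalize. Here is a bare-bones sketch; the records and group names are placeholders, not real case data, and a real audit would pull outcomes and scores from agency files.

```python
# Bare-bones version of an audit's disparity check: compute false positive
# and false negative rates per group from a log of
# (group, actually_reoffended, flagged_high_risk) records.
# The records below are invented placeholders.

from collections import defaultdict

records = [
    # (group, reoffended, flagged_high_risk)
    ("group_a", False, True), ("group_a", True, True), ("group_a", False, False),
    ("group_a", False, True), ("group_b", False, False), ("group_b", True, True),
    ("group_b", True, False), ("group_b", False, False),
]

counts = defaultdict(lambda: {"fp": 0, "neg": 0, "fn": 0, "pos": 0})
for group, reoffended, flagged in records:
    c = counts[group]
    if reoffended:
        c["pos"] += 1
        c["fn"] += not flagged       # missed someone who did reoffend
    else:
        c["neg"] += 1
        c["fp"] += flagged           # flagged someone who did not reoffend

for group, c in sorted(counts.items()):
    fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
    fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
    print(f"{group}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

The hard part of an audit isn't this arithmetic; it's getting access to the ground-truth outcomes and the vendor's scores in the first place, which is why the transparency and data-governance items above matter.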
Comparing predictive policing and sentencing tools: an at-a-glance table
| Tool category | Primary use | Typical algorithms | Key documented bias | Notable case/study |
|---|---|---|---|---|
| Predictive policing | Where to deploy patrols or overtime | Hotspot mapping, time-series models | Concentrates stops in over-policed neighborhoods | Multiple city reports; academic audits (e.g., UCLA / 2019) |
| Sentencing/risk scores | Pretrial release, sentencing severity | Supervised classification (risk scores) | Disparate false positive rates by race | ProPublica COMPAS analysis (2016) |
| Bail decision support | Recommend pretrial detention or release | Logistic regression, boosted trees | May encode socio-economic proxies; opaque thresholds | Carnegie Mellon / Chouldechova work (2017) |
Objections and counterarguments the video raises — and how to answer them
Some defenders of algorithms make two claims: (1) automated tools are less biased than individual humans, and (2) data-driven decisions reduce discretion and thus discrimination. The video acknowledges both and pushes back effectively.
First, on the comparative bias claim: algorithms can be more consistent than humans, but consistency doesn’t equal fairness. If the pattern they’re consistent about is biased, you get baked-in injustice. Second, on the discretion argument: removing discretion can reduce idiosyncratic bias, but it also removes the ability to correct for bias case by case. The right path isn’t full automation; it’s accountability and carefully bounded use.
What to watch next — where reforms are being tested
Several jurisdictions have begun piloting stronger governance: public algorithm registries, mandatory audits, and limits on how risk scores feed into sentencing. Watch for state-level legislation that requires transparency in government AI procurement — bills like those introduced in California and New York over the past three years — and for court cases challenging the unreviewed use of proprietary scores in sentencing.
The most telling number remains the disparity ProPublica highlighted: in their COMPAS analysis, Black defendants who did not reoffend were labeled high risk at nearly twice the rate of white defendants. That gap should be a constant check on claims that an automated score is merely a neutral aid.
