When Content Moderation Fails: What Trust & Safety Teams Need to Learn from Law Enforcement Threat Assessment
Issue 001 · Grey Inflection Intelligence · Corporate Risk & Platform Safety
Reading time: approximately 7 minutes
🎙 This issue is also available as a podcast episode. Episode 001 of the Grey Inflection Intelligence Podcast goes beyond this written analysis, extending the discussion with additional context, deeper examples, and operational detail that doesn't always fit neatly into print. If you've read this issue, the episode will deepen it. If you're new here, it stands on its own. Listen on Substack.
The Problem Nobody Is Talking About
From time to time, another tech platform makes headlines for the wrong reasons.
A user who had been flagged repeatedly for harassment escalates to making credible threats against a public figure. A gig economy worker with a history of boundary violations commits a serious incident against a customer. An online community moderator misses the warning signs of a user in crisis, until it is too late.
In almost every post-incident review, the same finding surfaces: the signals were there. They just weren’t being read correctly.
This is not a technology problem. Nor is it a resourcing problem. It is a methodology problem, one with a solution that law enforcement worked out decades ago.
Trust & Safety teams are among the most dedicated professionals in the technology industry. They work under enormous pressure, processing staggering volumes of content reports, flagged accounts, and user complaints. But the frameworks most platforms use to handle these cases were designed for content moderation: identifying and removing policy-violating material at scale.
Content moderation and threat assessment are not the same discipline. Conflating them is costing platforms and the people who use them.
What Law Enforcement Learned the Hard Way
In the 1990s and early 2000s, law enforcement agencies, particularly in the United States and United Kingdom, began developing structured threat assessment methodologies in response to a disturbing pattern: targeted violence in schools, workplaces, and public spaces was almost never truly spontaneous.
Multiple studies consistently showed that individuals who carried out acts of targeted violence had followed a pathway to violence - a progression of observable behaviours that were visible to the people around them. The problem was not a lack of signals. It was that nobody had a framework for recognising what those signals meant together.
The result was the development of Behavioural Threat Assessment and Management (BTAM). It is an evidence-based, proactive process used to identify, evaluate, and mitigate potential threats of targeted violence: a structured methodology for evaluating whether a person of concern poses a genuine risk and, if so, what intervention is appropriate.
The core insight of BTAM is deceptively simple: it is not about what someone says. It is about what someone does.
A user who posts “I’m going to kill you” to another individual is, on that evidence alone, almost certainly not a credible threat. A user who posts the same message but has also researched the victim’s home address, expressed grievance repeatedly over months, or recently experienced a significant personal loss presents an entirely different risk profile, even though the language used is identical.
While content moderation catches the first case easily, it almost always misses the second.
The Three Gaps in How Platforms Currently Operate
Gap 1: Binary decision-making in a non-binary problem space
Most Trust & Safety workflows are built around a fundamental binary: content either violates policy or it does not. Remove or keep. Ban or don’t ban.
Threat assessment does not work this way, because risk exists on a spectrum. An account that represents a low-level concern today may, following a triggering event such as a relationship breakdown, job loss, or perceived public humiliation, escalate rapidly into a genuine threat. Effective threat assessment monitors the trajectory, not just the current state.
What platforms need is longitudinal case management - the ability to track a person of concern over time, note any escalating behaviours and intervene at the right point on that trajectory. Most current T&S tooling is optimised for high-volume, single-incident resolution. It is poorly suited to ongoing case monitoring.
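As a rough illustration of what tracking a trajectory rather than a single incident could mean in tooling terms, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Case` and `Observation` names, the 1-to-5 analyst severity rating, the 30-day window, and the simple comparison of average severity between two consecutive windows. It is not a validated risk model and does not come from BTAM or any published instrument.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Observation:
    when: date
    severity: int  # hypothetical analyst rating: 1 (low) to 5 (high)
    note: str


@dataclass
class Case:
    """One person of concern, followed over time (hypothetical structure)."""
    person_id: str
    observations: list[Observation] = field(default_factory=list)

    def add(self, when: date, severity: int, note: str) -> None:
        self.observations.append(Observation(when, severity, note))

    def mean_severity(self, start: date, end: date) -> float | None:
        """Mean severity for observations with start <= when < end."""
        scores = [o.severity for o in self.observations if start <= o.when < end]
        return sum(scores) / len(scores) if scores else None

    def is_escalating(self, today: date, window_days: int = 30) -> bool:
        """True if the most recent window is more severe than the one before."""
        window = timedelta(days=window_days)
        recent = self.mean_severity(today - window, today + timedelta(days=1))
        earlier = self.mean_severity(today - 2 * window, today - window)
        if recent is None:
            return False  # nothing observed recently
        # Assumption: a quiet earlier window counts as a baseline of zero,
        # so any fresh activity after silence is surfaced for review.
        return recent > (earlier or 0.0)


# Usage: the same account, viewed as a trajectory instead of four tickets.
today = date(2024, 6, 30)
case = Case("user-123")
case.add(date(2024, 5, 10), 1, "hostile reply, single report")
case.add(date(2024, 5, 20), 2, "repeated grievance posts")
case.add(date(2024, 6, 15), 3, "contact attempts from a second account")
case.add(date(2024, 6, 28), 4, "references to target's location")
print(case.is_escalating(today))  # True
```

The point of the sketch is the shape of the data, not the arithmetic: each report is attached to a person and a date, so an analyst can ask whether things are getting worse, which single-incident queues cannot answer.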
Gap 2: Context blindness
When a content moderator reviews a reported post, they typically see the post and perhaps the user’s recent history. What they rarely see is the full picture: the pattern of interactions across multiple accounts, the targets being researched, the real-world geography of the situation, or the personal circumstances that may be acting as stressors.
Law enforcement threat assessors call this the totality of the circumstances. It is based on the principle that no single behaviour or statement should be evaluated in isolation. That is to say, a threat assessment is a mosaic, not a single tile.
Building this full picture requires cross-functional collaboration that most T&S teams are not structured to provide. Legal, privacy, and data infrastructure constraints further limit what analysts can access. These are real constraints, but they need to be designed around, not accepted as permanent limitations.
Gap 3: Intervention is treated as binary
For most platforms, the available responses to a person of concern are limited: warn, restrict, suspend, or ban. These are escalating punitive measures, and they are often counterproductive when applied to individuals on a pathway to targeted violence.
Research from law enforcement and mental health fields suggests that abruptly terminating the account of a fixated individual can act as a triggering event rather than a solution. It also removes the platform’s visibility into the person’s behaviour while doing nothing to address the underlying risk.
Effective threat assessment, by contrast, draws on a much wider range of interventions, including outreach, de-escalation, referral to support services, coordination with other platforms, and, in serious cases, engagement with law enforcement. The right intervention depends on where the individual sits on the risk spectrum, not on which policy they most recently violated.
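To make the contrast with a warn-to-ban ladder concrete, the sketch below keys the response menu to an assessed risk level rather than to the most recent policy violation. The four risk levels, the `INTERVENTION_MENU` name, and above all the assignment of each option to a tier are illustrative assumptions; only the intervention names themselves come from the paragraph above. A real menu would be set by a trained threat assessment team.

```python
from __future__ import annotations

from enum import IntEnum


class RiskLevel(IntEnum):
    # Hypothetical four-point scale, not taken from any published tool.
    LOW = 1
    MODERATE = 2
    HIGH = 3
    SEVERE = 4


# Assumption for illustration: the tier at which each option becomes available.
INTERVENTION_MENU = {
    RiskLevel.LOW: ["outreach", "de-escalation"],
    RiskLevel.MODERATE: ["referral to support services"],
    RiskLevel.HIGH: ["coordination with other platforms"],
    RiskLevel.SEVERE: ["engagement with law enforcement"],
}


def available_interventions(level: RiskLevel) -> list[str]:
    """Return every option available at or below the assessed risk level."""
    options: list[str] = []
    for tier in RiskLevel:  # iterates in definition order, LOW to SEVERE
        if tier <= level:
            options.extend(INTERVENTION_MENU[tier])
    return options


print(available_interventions(RiskLevel.MODERATE))
# ['outreach', 'de-escalation', 'referral to support services']
```

The menu is cumulative by design in this sketch: a higher assessed risk widens the analyst's options and never replaces the lower-tier, non-punitive ones.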
What Good Looks Like
A handful of Trust & Safety operations globally are genuinely doing this well, and they share several characteristics that reflect threat assessment principles:
A dedicated case management function - one that is separated from volume content moderation, staffed by analysts trained in behavioural indicators and risk evaluation, with the time and tools to build full case pictures.
Structured professional judgement tools - standardised frameworks for evaluating risk level that reduce inconsistency between analysts and create a defensible, documented decision trail.
Cross-functional threat assessment teams - bringing together T&S analysts, legal, law enforcement liaison, and where appropriate, mental health professionals to evaluate serious cases collectively.
Tiered intervention menus - a range of responses calibrated to risk level, including non-punitive options designed to manage risk without triggering escalation.
Metrics that measure outcomes, not just outputs - moving beyond “cases closed” and “content removed” to tracking whether interventions actually reduced risk over time.
None of the above is beyond reach. Law enforcement and corporate security have been building and refining these capabilities for decades. The intellectual framework exists. What is missing on most platforms is the deliberate decision to apply it.
The Grey Inflection Assessment
Trust & Safety is one of the most consequential functions in modern technology, and one of the least mature in terms of methodology. The field has grown rapidly in response to regulatory pressure and public scrutiny, but much of that growth has been in scale rather than sophistication.
The next phase of T&S evolution is not about hiring more moderators or deploying better classifiers. It is about building genuine threat assessment capability: the structured, human-centred practice of understanding who poses a risk, why, and what to do about it.
Such capability exists. It lives in law enforcement agencies, corporate security departments, and behavioural threat assessment units around the world. It has simply not yet made the full journey into the technology industry in any systematic way.
That gap is both a problem and an opportunity.
For Trust & Safety professionals: the frameworks you need already exist. Behavioural Threat Assessment and Management (BTAM), WAVR-21, and the work of organisations like the Association of Threat Assessment Professionals (ATAP) represent decades of tested methodology. If your current training does not include these, it should.
For platform leadership: the question is not whether your platform will face a serious targeted violence incident. It is whether you will be prepared when it happens. A content moderation team, however skilled, is not a threat assessment function. Building one requires investment, cross-functional commitment, and a willingness to import expertise from outside the technology industry.
The signals are almost always there. The question is whether anyone is trained to read them.
Grey Inflection Intelligence is currently in its launch phase. All issues are free during this period. A tiered subscription model will be introduced in future, with advance notice given to all subscribers before any changes take effect. When the paid tier launches, free subscribers will continue to receive one issue per month, while paid subscribers will receive a minimum of two issues per month, additional briefs and articles published at editorial discretion, quarterly deep-dive reports, and access to the Grey Inflection Intelligence resource library.
Analysis is produced by a founding analyst with a background spanning law enforcement investigation, corporate threat assessment, travel risk management, and trust & safety operations, with formal academic grounding in strategic studies at the postgraduate level. Published anonymously to maintain source independence and editorial discretion. All analysis is based on open-source information.
If this issue was useful, consider forwarding it to a colleague in security, HR, or Trust & Safety.
