From Telemetry to Signals: Designing Detections with an Audience in Mind
Let us imagine the following scenario.
You spin up a lab, execute the payload, and craft a somewhat decent detection rule that flags a specific behavior as “High Severity.” Satisfied, you ship it to production, only to receive three very different reviews from users:
- User 1: “Great rule. Caught a live threat actor mid-incident.”
- User 2: “What a noisy nightmare. Are you sure you tested this before shipping it? We had to disable it immediately.”
- User 3: “The rule’s logic is solid, but unfortunately, once we put it into our engine, the whole platform came to a halt. We had to yank it.”
You might ask yourself: how can the same detection receive three such different reviews? Let me reveal the user personas; hopefully this will make things clear.
As it turns out:
- User 1 is a DFIR investigator who has been running the rule directly on logs collected from the current incident, looking for suspicious leads while ignoring legitimate matches.
- User 2 is an MSSP/MDR/SOC analyst ingesting millions of events per day, across a user base that is not baselined.
- User 3 builds detection tooling, where performance reigns supreme.
What this (hopefully) illustrates is that a rule without a clearly defined audience and context is what could be described as a Schrödinger’s Detection: simultaneously brilliant, useless, and harmful.
Now, I could end the blog here and say that it’s up to rule authors to add that context, and up to users to do their due diligence when consuming rules, but that would not be fun.
My aim in this blog is to clear this issue up by defining user and detection archetypes, and by putting it all into a framework of sorts that helps detection engineers (DEs) and users alike be more mindful of the problem, and that helps DEs ask the right questions so they can design and build detections with the right audience in mind. Let us get started.
Know your audience
Generally speaking, a detection rule is built for a specific user archetype. While others can use and adapt it, it should be obvious who the target audience is, because different users have different needs and pain points. Let’s illustrate this by exploring the three user archetypes we highlighted in the intro.
The DFIR Investigator / Threat Hunter
Typically, a DFIR investigator collects logs or artefacts from an incident and then applies a set of tools and analyses on top to look for the “bad” stuff. For the sake of argument, let’s say our investigator possesses a “Security.evtx” file and runs a bunch of detections over it, looking for suspicious keywords, strings or events.
A detection that looks for simple suspicious strings such as “mimikatz” or “payload”, specific parent/child relationships such as “powershell.exe” or “cmd.exe” spawning anything, or even a combination of event codes and usernames/IPs can be very useful for finding a thread to pull on and move the investigation forward.
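To make this concrete, here is a minimal Python sketch of that kind of broad, pivot-friendly logic. The field names (ParentImage, Image, CommandLine) are assumed, Sysmon-style keys rather than a reference to any particular product; adjust them to whatever your parsed events actually expose.

```python
# Minimal sketch of a broad, pivot-friendly hunt check over parsed events.
# Field names are assumptions; tune the keyword and parent lists to taste.
SUSPICIOUS_KEYWORDS = {"mimikatz", "payload"}
SUSPICIOUS_PARENTS = {"powershell.exe", "cmd.exe"}

def is_interesting(event: dict) -> bool:
    cmdline = (event.get("CommandLine") or "").lower()
    parent = (event.get("ParentImage") or "").lower().rsplit("\\", 1)[-1]
    # Match on suspicious keywords anywhere in the command line...
    if any(keyword in cmdline for keyword in SUSPICIOUS_KEYWORDS):
        return True
    # ...or on a "suspicious parent spawning anything" relationship.
    return parent in SUSPICIOUS_PARENTS

events = [
    {"ParentImage": r"C:\Windows\System32\cmd.exe",
     "Image": r"C:\Tools\x.exe", "CommandLine": "x.exe --payload"},
    {"ParentImage": r"C:\Windows\explorer.exe",
     "Image": r"C:\Windows\notepad.exe", "CommandLine": "notepad.exe"},
]
hits = [e for e in events if is_interesting(e)]  # only the first event matches
```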
Now, I hope we can all agree that such logic isn’t what we’d call “great”, but it “gets the job done”. During an investigation or hunting activity, when your goal is to surface “suspicious”, “interesting” or simply “uncommon” things, you might find this detection perfectly “usable”, or even call it “great” :D
In fact, this archetype prefers rules that are broad and easy to pivot from, with rich fields and plenty of logs to work with.
Typical pain points are gaps in visibility and rules that are brittle or overly specific.
MSSP/MDR/SOC Analyst
If the DFIR investigator thrives in the world of “suspicious until proven harmless,” the MSSP or SOC analyst often lives in the complete opposite reality. Their day-to-day revolves around monitoring massive event streams, millions of logs per day from dozens or even hundreds of customers or business units.
In that environment, even a slightly noisy rule can become a productivity killer. If a detection fires hundreds of times an hour because it’s tuned for investigation rather than monitoring, or worse, not tuned at all, it can drown the analyst in alerts. This is especially true in environments where baselining is very difficult, asset context is scarce, and there’s no luxury of deep local knowledge.
For this archetype, “broad and pivot-friendly” rules are often the wrong fit. They need signals that are high-confidence, context-aware, and easy to triage quickly. Metadata like severity and enrichment matter as much as the actual matching logic.
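As a rough sketch of the contrast, here is how the earlier hunt logic might be reshaped for this archetype: a narrower trigger condition, plus severity and triage metadata that travel with the rule. The parent/child pairing, field names and values are illustrative assumptions, not recommendations.

```python
# Illustrative only: a narrow, monitoring-oriented variant of the earlier idea.
# Parent/child names and field keys are assumptions to tune per environment.
RULE = {
    "title": "Office application spawning a shell",
    "severity": "high",
    "false_positives": ["Macro-heavy finance workflows occasionally spawn cmd.exe"],
    "triage": "Check the parent document's origin and the child command line "
              "before escalating.",
}

OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}
SHELL_CHILDREN = {"cmd.exe", "powershell.exe"}

def alert(event: dict) -> bool:
    parent = (event.get("ParentImage") or "").lower().rsplit("\\", 1)[-1]
    child = (event.get("Image") or "").lower().rsplit("\\", 1)[-1]
    # Fire only on a narrow, well-understood parent/child relationship.
    return parent in OFFICE_PARENTS and child in SHELL_CHILDREN
```

The point isn’t the specific pairing; it’s that the rule carries its own severity and triage guidance, so an analyst can act on a match without a deep dive.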
The typical pain points are unscoped rules that look “fine” in a test lab but explode in the wild, rules that lack clear triage guidance, and detections that require deeper investigation effort for each match, which simply isn’t scalable.
The Detection Platform Engineer
The detection “platform” / “tooling” engineer has to make sure that 4,000 or more detections run quickly, don’t bog down the system, and produce almost no false positives.
Detections that lean on heavy operations such as regex, subsearches, joins and the like are often a big no-no. FP tolerance is at an all-time low here, especially for alerting rules.
Hence this archetype tends to favor rules that are efficient and predictable, even if that sometimes comes at the cost of false negatives.
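As a toy illustration (plain Python over pre-parsed events, not any particular engine), compare a regex scan over every command line with an exact set lookup on an already-extracted field; the second trades a little recall for speed and predictability, which is exactly the trade this archetype tends to accept.

```python
import re

# Toy comparison only; tool names and field keys are assumptions.
SLOW_PATTERN = re.compile(r"(mimikatz|procdump|lsassy)", re.IGNORECASE)
KNOWN_TOOLS = {"mimikatz.exe", "procdump.exe", "lsassy.exe"}

def match_regex(event: dict) -> bool:
    # Scans the full command line of every event: flexible, but costly at scale.
    return bool(SLOW_PATTERN.search(event.get("CommandLine", "")))

def match_lookup(event: dict) -> bool:
    # Single hash lookup on a pre-parsed field: cheap and predictable,
    # but a renamed binary slips through (a false negative we accept).
    image = (event.get("Image") or "").lower().rsplit("\\", 1)[-1]
    return image in KNOWN_TOOLS
```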
Setting Expectations
By now, you can probably see why the same detection can be hailed as “the greatest thing ever” by one person and “a piece of garbage” or “harmful” by another. The truth is, every archetype approaches a detection rule with a different operational reality, success metric, and tolerance for noise or complexity.
This is why setting expectations is not optional; it’s the difference between a detection being understood and valued versus misunderstood and scrapped.
Metadata is the most common and easiest way to communicate your intention when building and shipping a detection, and to avoid mismatched expectations. At a minimum, a detection rule should include a title and a description, a confidence level or some other measure of its “quality” in terms of FP/TP rate, and a type or some tags.
You can (and probably should) go beyond this by leveraging a full detection framework to describe your detections in further detail. Something like Palantir’s Alerting and Detection Strategies (ADS) framework can be a great start.
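As a minimal sketch of that metadata floor (the field names below are my own placeholders; a full scheme like ADS goes much further):

```python
from dataclasses import dataclass, field

# A minimal, illustrative metadata floor; field names and values are assumptions.
@dataclass
class DetectionMetadata:
    title: str
    description: str
    confidence: str                  # rough TP/FP expectation, e.g. "medium"
    rule_type: str                   # e.g. "hunting" vs "alerting"
    tags: list[str] = field(default_factory=list)
    false_positives: list[str] = field(default_factory=list)

example = DetectionMetadata(
    title="Office application spawning a shell",
    description="Flags winword.exe or excel.exe launching cmd.exe or powershell.exe.",
    confidence="medium",
    rule_type="alerting",
    tags=["audience.soc", "windows.process-creation"],
    false_positives=["Macro-driven automation in finance departments"],
)
```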
Telemetry Shapes the Signal
Now that we’ve established that audience and context matter, let’s talk about the other piece of the equation: the data that we actually have to work with.
Any detection is built upon one or more telemetry sources. Not only do these sources differ by OS and tooling, their availability also depends on a variety of conditions.
Just as we did with the user archetypes, we can establish a similar idea for telemetry. Here are a couple of examples:
- An SMB might have no logging available except for the defaults.
- Advanced logging might be enabled but not completely configured: for example, the “Command Line” field might be missing from Event ID 4688, or PowerShell Module Logging might be disabled.
- EDR 1 vs EDR 2 might have different coverage, field naming conventions, or levels of enrichment. Even two instances of the same product can vary depending on policy configuration, sensor version, or integration depth with other tooling. (See the EDR telemetry project).
These are just a handful of the many possible scenarios. When designing detections, telemetry shapes what’s even possible to detect. The more aware we are of this, the better we can build detections that reflect our audience’s reality.
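To make that concrete, here is a hedged sketch of a telemetry-aware check: if the command line is absent from a 4688 event (the relevant audit setting isn’t enabled), the rule degrades to a weaker process-name match and lowers its own confidence. Treat the field names as assumptions about how your pipeline parses the event.

```python
# Hedged sketch of a telemetry-aware Event ID 4688 check; field names are
# assumptions about the parsed event, so adjust them to your pipeline.
def evaluate_4688(event: dict) -> dict | None:
    proc = (event.get("NewProcessName") or "").lower()
    cmdline = event.get("CommandLine")  # often absent unless explicitly enabled

    if not proc.endswith("powershell.exe"):
        return None
    if cmdline is None:
        # No command line available: all we can say is "PowerShell ran".
        return {"match": "powershell execution (no command line)", "confidence": "low"}
    if "-enc" in cmdline.lower():
        return {"match": "encoded PowerShell command", "confidence": "high"}
    return None
```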
A “Framework” for Designing Detections with the Audience and Telemetry in Mind
When you sit down to write a detection, decide which archetype it is meant for, and ask yourself:
Who am I writing this for?
Is it one of the archetypes we mentioned? Or someone else entirely?
What’s their operational reality?
- Incident response?
- Live threat hunting?
- Continuous monitoring?
- Are they in a multi-tenant environment?
- Do they have low visibility in terms of telemetry and logs?
- Are there any resource constraints?
What does success look like?
- Is it “catch anything remotely suspicious”?
- Is it “reduce alert fatigue”?
- Is it “maintain uptime”?
How easy is it for users to operationalize the detection?
- Am I providing the right metadata, context, and tuning guidance?
- Is it obvious where it fits in the detection pipeline?
By consistently asking and answering these questions (and more, be creative), we can stop writing these “Schrödinger’s Detections” and start producing detections and signals that are designed for the realities of the people who will actually run them.
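One lightweight way to keep these questions from being skipped (a sketch, assuming your rules are stored as structured documents that can carry extra fields) is to treat the answers as required metadata and surface the gaps at review time:

```python
# Sketch only: the field names mirror the questions above and are otherwise arbitrary.
REQUIRED_DESIGN_FIELDS = [
    "audience",              # which archetype is this written for?
    "operational_reality",   # IR, hunting, continuous monitoring, multi-tenant...
    "success_criteria",      # catch anything suspicious? reduce fatigue? stay fast?
    "telemetry_assumptions", # what logging must exist for the rule to work?
    "triage_guidance",       # how does the user operationalize a match?
]

def design_gaps(rule: dict) -> list[str]:
    """Return the design questions a rule has not answered yet."""
    return [f for f in REQUIRED_DESIGN_FIELDS if not rule.get(f)]

rule = {"audience": "SOC analyst", "success_criteria": "low-noise alerting"}
print(design_gaps(rule))
# -> ['operational_reality', 'telemetry_assumptions', 'triage_guidance']
```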
Closing Thoughts
A detection built without an audience in mind might not serve its users, or anyone, well. By making the audience explicit during the design phase, with their operational reality in mind, we can start to build detections that fit.
