Every SOC starts in the same place: too many alerts, not enough signal. The first week after onboarding a new log source almost always produces a wall of noise, and the instinct is either to tune everything down (and risk missing real attacks) or to leave it as-is (and burn out the analysts who triage it).
Detection engineering is the discipline that sits between those two failure modes. It's not a single tool or platform — it's a repeatable loop for turning raw telemetry into detections that analysts can actually trust.
The loop
The workflow I've found most effective across SentinelOne, CrowdStrike Falcon, Cortex XDR/XSIAM, Microsoft Sentinel, FortiSIEM, and QRadar boils down to four repeating stages:
1. Onboard the log source and validate field mapping before writing a single rule.
2. Baseline normal behavior: what does "business as usual" actually look like for this data source?
3. Write the detection and map it to a specific MITRE ATT&CK technique, so coverage gaps are visible at a glance.
4. Tune against real traffic, then validate with breach and attack simulation (AttackIQ is a good fit here) before calling it production-ready.
Skipping the baseline (step 2) is the most common mistake. A detection that looks airtight in a vacuum often turns into a 3 a.m. page once it meets a backup job, a vulnerability scanner, or a misconfigured service account.
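As a rough illustration of the baseline stage, the sketch below records which (account, parent process) pairs launch a process during an observation window, then flags later events whose pair was rarely or never seen. The event fields (`user`, `parent_process`), the `min_count` threshold, and the sample account names are hypothetical and not taken from any particular platform.

```python
from collections import Counter

def build_baseline(events):
    """Count how often each (user, parent_process) pair appears in the baseline window."""
    return Counter((e["user"], e["parent_process"]) for e in events)

def is_unusual(event, baseline, min_count=3):
    """Return True when the event's pair was seen fewer than min_count times."""
    return baseline[(event["user"], event["parent_process"])] < min_count

# Hypothetical baseline window: a backup service account launches PowerShell nightly.
baseline_events = [
    {"user": "svc_backup", "parent_process": "backup_agent.exe"} for _ in range(5)
]
baseline = build_baseline(baseline_events)

known = {"user": "svc_backup", "parent_process": "backup_agent.exe"}
novel = {"user": "jdoe", "parent_process": "winword.exe"}

print(is_unusual(known, baseline))  # False: part of business as usual
print(is_unusual(novel, baseline))  # True: never seen during the baseline
```

A frequency count like this is deliberately crude. Its value is that the backup job or scanner is recognized before a rule ships, rather than discovered through a page.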
A simple example: suspicious PowerShell
Here's a stripped-down example of the kind of logic that starts a detection — flagging encoded PowerShell execution, a technique mapped to ATT&CK T1059.001:
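```sql
-- Pseudocode (PostgreSQL-style): flag encoded PowerShell followed by network activity.
-- Table and column names are illustrative; map them to your platform's schema.
SELECT p.host, p."user", p.command_line, p.parent_process   -- "user" is quoted because it is a reserved word
FROM process_events AS p
WHERE p.process_name ILIKE 'powershell.exe'
  -- Abbreviated forms such as -enc are also accepted by PowerShell and need their own patterns.
  AND p.command_line ILIKE '%-encodedcommand%'
  AND EXISTS (
    SELECT 1
    FROM network_events AS n
    WHERE n.host = p.host                      -- same host as the process event
      AND n.process_name ILIKE 'powershell.exe'
      AND n.direction = 'outbound'             -- assumed column
      AND n.timestamp BETWEEN p.timestamp AND p.timestamp + INTERVAL '60 seconds'
  );
```

On its own, "PowerShell ran with -EncodedCommand" is far too noisy, since plenty of legitimate admin tooling does exactly that. The correlation with an outbound network connection from PowerShell on the same host within the following 60 seconds is what turns this from a low-value alert into something worth an analyst's attention.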
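The same correlation can be sketched outside a query language, which makes it easy to unit test before porting it to a platform. The version below is a minimal illustration under assumptions: events are plain dictionaries with hypothetical `host`, `process_name`, `command_line`, and `timestamp` fields, and the flag check accepts any prefix of `-EncodedCommand` (such as `-enc`) plus `-ec`. That matches my understanding of how `powershell.exe` resolves abbreviated parameters, but it is worth verifying against the PowerShell versions in your environment.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)

def has_encoded_flag(command_line):
    """True if any argument looks like -EncodedCommand or an abbreviation of it."""
    for token in command_line.lower().split():
        if token.startswith("-") and len(token) > 1:
            name = token[1:]
            if "encodedcommand".startswith(name) or name == "ec":
                return True
    return False

def correlate(process_events, network_events):
    """Yield encoded PowerShell launches that are followed by PowerShell
    network activity on the same host within WINDOW."""
    for p in process_events:
        if p["process_name"].lower() != "powershell.exe":
            continue
        if not has_encoded_flag(p["command_line"]):
            continue
        for n in network_events:
            if (n["host"] == p["host"]
                    and n["process_name"].lower() == "powershell.exe"
                    and p["timestamp"] <= n["timestamp"] <= p["timestamp"] + WINDOW):
                yield p
                break

# Hypothetical sample data
t0 = datetime(2024, 1, 1, 3, 0, 0)
procs = [
    {"host": "ws01", "process_name": "powershell.exe",
     "command_line": "powershell.exe -NoProfile -enc SQBFAFgA", "timestamp": t0},
    {"host": "ws02", "process_name": "powershell.exe",
     "command_line": "powershell.exe -EncodedCommand SQBFAFgA", "timestamp": t0},
]
nets = [
    {"host": "ws01", "process_name": "powershell.exe",
     "timestamp": t0 + timedelta(seconds=12)},
]
hits = list(correlate(procs, nets))
print([h["host"] for h in hits])  # ['ws01']
```

The nested loop is fine for a test fixture but not for production volumes, where the network events would be indexed by host first.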
Mapping coverage, not just writing rules
The highest-leverage part of this work isn't any individual rule — it's keeping a living map of which ATT&CK techniques have detection coverage, which have partial coverage, and which have none. A simple table goes a long way:
| Technique | Coverage | Data source | Notes |
|---|---|---|---|
| T1059.001 – PowerShell | Full | EDR process telemetry | Tuned against backup agents |
| T1003 – OS Credential Dumping | Partial | EDR + Sysmon | LSASS access alerts only |
| T1486 – Data Encrypted for Impact | Full | EDR + file integrity | Validated via AttackIQ simulation |
| T1078 – Valid Accounts | Partial | Identity provider logs | No baseline for service accounts yet |
When a compromise assessment or breach & attack simulation surfaces a gap, it goes straight into this table — which becomes the backlog for the next onboarding cycle.
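Keeping the coverage map as structured data rather than a static table makes the backlog something you can generate. The sketch below is one hypothetical way to do that: the `Coverage` fields mirror the table columns above, and the ordering rule (no coverage first, then partial) is an assumption rather than a prescribed method.

```python
from dataclasses import dataclass

@dataclass
class Coverage:
    technique: str
    name: str
    level: str        # "Full", "Partial", or "None"
    data_source: str
    notes: str = ""

# Assumed priority: lower number means more urgent.
PRIORITY = {"None": 0, "Partial": 1}

def backlog(rows):
    """Return techniques lacking full coverage, most urgent first."""
    gaps = [r for r in rows if r.level in PRIORITY]
    return sorted(gaps, key=lambda r: (PRIORITY[r.level], r.technique))

coverage_map = [
    Coverage("T1059.001", "PowerShell", "Full", "EDR process telemetry"),
    Coverage("T1003", "OS Credential Dumping", "Partial", "EDR + Sysmon"),
    Coverage("T1486", "Data Encrypted for Impact", "Full", "EDR + file integrity"),
    Coverage("T1078", "Valid Accounts", "Partial", "Identity provider logs"),
]

for row in backlog(coverage_map):
    print(row.technique, row.level)
# T1003 Partial
# T1078 Partial
```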
Why this matters beyond the SOC
The same loop applies outside of security operations: any system that generates high-volume signals — application logs, infrastructure metrics, even CI/CD pipeline events — benefits from the same discipline of baseline-first, map-to-outcome, then tune. The tooling changes; the underlying practice of turning noise into something actionable doesn't.
