Signature-Based Detection

 What is signature-based detection?

Signature-based detection matches known patterns (signatures) against observed artefacts (files, network traffic, logs). It’s the classic approach used by AV, IDS/IPS (Snort/Suricata), email gateways, and many EDR rules. Signatures can be exact matches (file hash), pattern matches (byte sequence), structural rules (YARA), or behavioral/log patterns (SIEM rules).

Common signature types

  • File hashes (exact-match signatures)

    • MD5, SHA-1, SHA-256 (and SHA-512). Used to uniquely identify a file binary or sample. Fast to compute, cheap to compare.

    • Recommendations: use SHA-256 for new work (collision resistance + wide adoption). MD5/SHA-1 are weak for cryptographic guarantees but still used as legacy identifiers.

  • Fuzzy / similarity hashes

    • ssdeep (context-triggered piecewise hashing): measures similarity between files; useful for variants (packing, minor edits).

    • TLSH (Trend Micro Locality Sensitive Hash): another similarity hash.

    • Use when exact hashes differ due to minor changes but you still want to detect family variants.

  • YARA rules (file content / structure rules)

    • Very flexible: match strings, hex patterns, file offsets, metadata, boolean logic, and modules for PE/ELF parsing. Great for malware family detection and hunting.

    • Example (very small):

      rule Suspicious_Packer
      {
          strings:
              $s1 = "UPX" nocase
              $s2 = { 4D 5A } // MZ header
          condition:
              // require the MZ header at offset 0, otherwise every PE file would match
              $s2 at 0 and $s1
      }

  • IDS/IPS rules (Snort/Suricata)

    • Signatures for network traffic: HTTP patterns, protocol anomalies, specific payload bytes, flows, port-based detections. Example: alert tcp any any -> any 80 (msg:"SQLi"; content:"UNION SELECT"; sid:1000001;)

  • Log/behavior signatures (Sigma, SIEM rules)

    • Detect sequences in logs (process spawn chains, suspicious command lines, lateral movement patterns). Sigma is a vendor-agnostic rule format that translates into Splunk/Elastic/QRadar rules.

  • IOC lists

    • Simple indicators: filenames, mutex names, registry keys, IPs, domains, URLs. Often used in blocklists or quick detection.

  • Non-cryptographic hashes for indexing

    • CRC32, MurmurHash — used internally for fast lookups (not for security).
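
To make the pattern-matching signature types above concrete, here is a minimal sketch of a content scanner in Python. The `ByteSignature` class, the `scan` function, and the two sample signatures are hypothetical names made up for illustration; real engines such as YARA or Suricata use far more efficient multi-pattern matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteSignature:
    """A named byte pattern, optionally matched case-insensitively."""
    name: str
    pattern: bytes
    nocase: bool = False


# Hypothetical signature set, for illustration only.
SIGNATURES = [
    ByteSignature("upx_marker", b"UPX", nocase=True),
    ByteSignature("sqli_union_select", b"UNION SELECT", nocase=True),
]


def scan(data: bytes, signatures=SIGNATURES) -> list[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    lowered = data.lower()
    hits = []
    for sig in signatures:
        if sig.nocase:
            found = sig.pattern.lower() in lowered
        else:
            found = sig.pattern in data
        if found:
            hits.append(sig.name)
    return hits


if __name__ == "__main__":
    print(scan(b"GET /item?id=1 union select password from users"))
```

A plain substring search like this is exactly what makes simple signatures easy to evade: any encoding or spacing change in the payload defeats it unless the input is normalized first.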

Properties of different hash families (short)

  • MD5 — fast, 128-bit; collisions are trivial to produce now. Good as an identifier but not secure against attackers.

  • SHA-1 — 160-bit; collision attacks exist. Avoid for security-sensitive use.

  • SHA-2 (SHA-256/512) — secure for current practical needs. Use SHA-256 for file identification and signing workflows.

  • ssdeep / TLSH — not cryptographically secure but similarity metrics are useful for clustering variants.

How signatures are used in practice

  • AV engine: exact-hash for known malicious binaries + YARA for families + heuristics for packed/obfuscated code.

  • IDS: network pattern matching + protocol decoding + rule thresholds (to avoid floods).

  • EDR & Hunting: YARA + fuzzy hashes + behavioral detection (abnormal process creation, suspicious command lines).

  • Threat intel sharing: publish hashes, YARA rules, domains, IPs as IOCs.

Strengths and weaknesses

Strengths:

  • Very precise for known threats (exact signatures produce very few false positives).

  • Fast and deterministic.

  • Easy to share (hash lists, YARA rules).

Weaknesses:

  • Only detects what’s known — fails against novel malware, zero-days, or significant polymorphism.

  • Evasion: trivial binary changes break exact hashes; packers, encryption, polymorphism, and runtime code generation avoid static signatures.

  • False positives/negatives: poorly written rules can match benign content or miss variants.

  • Volume / performance: large rule sets or regex-heavy signatures can tax endpoints or sensors.

Common evasion techniques

  • Changing a single byte or timestamp to alter exact hash.

  • Packing/repacking with custom packers (changing the file envelope).

  • Polymorphic/encrypted payloads; unpacking only at runtime.

  • Domain generation algorithms (DGAs) for network indicators.

  • Living-off-the-land (LoL) — using signed/legit binaries (bypass file-based detection).
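
The first technique in the list above, changing a single byte, is easy to demonstrate. The sketch below uses stand-in bytes rather than a real executable, and the `sha256_hex` helper and `blocklist` set are illustrative names, not part of any product.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes for a "known bad" sample (not a real executable).
original = b"MZ" + b"\x90" * 64
# The attacker appends one padding byte; behavior would be unchanged.
modified = original + b"\x00"

# A hash blocklist that only knows the original sample.
blocklist = {sha256_hex(original)}

print(sha256_hex(original) in blocklist)  # True
print(sha256_hex(modified) in blocklist)  # False
```

The two digests share nothing recognizable, which is why exact hashes work well for confirming a known sample and poorly for catching its variants.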

Mitigations & best practices

  • Layered detection: don’t rely only on hashes. Combine static signatures (hashes, YARA) with behavioral detection, heuristics, telemetry, and sandboxing.

  • Prefer strong hashes (SHA-256) for IOC publication. Include ssdeep or TLSH for family clustering.

  • YARA + modules: use metadata, file format checks, PE/ELF parsing to reduce false positives.

  • Triage & threat intel: validate IOCs (avoid blind blocking), add context (first seen, source reputation).

  • Update cadence: keep signature/rule feeds current; roll out safely to avoid mass false positives.

  • Testing: test signatures in a staging environment and tune thresholds.

  • Canonicalization and normalization: normalize URLs/paths before rule matching to avoid trivial evasion.
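
As a rough illustration of the canonicalization point above, the following sketch normalizes a URL with the Python standard library before it is compared against indicators. The `normalize_url` function is a hypothetical helper; it decodes percent-encoding only once and handles only http/https default ports, so treat it as a starting point rather than a complete canonicalizer.

```python
import posixpath
from urllib.parse import unquote, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Lowercase scheme/host, drop default ports, decode and collapse the path."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Decode %xx escapes once, then resolve "." and ".." segments.
    decoded = unquote(parts.path) or "/"
    path = "/" + posixpath.normpath(decoded).lstrip("/")
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, unquote(parts.query), ""))


if __name__ == "__main__":
    print(normalize_url("HTTP://Evil.EXAMPLE.com:80/a/%2e%2e/admin/./login.php"))
```

Running rules against the normalized form means one indicator such as `/admin/login.php` covers the encoded and dot-segment variants as well.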

Practical examples

Compute common hashes on Linux:

  • md5sum sample.exe

  • sha1sum sample.exe

  • sha256sum sample.exe
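
The same digests can be computed in Python with `hashlib`, which is handy when hashing is part of a larger triage script. This is a minimal sketch; the `hash_file` name and the 1 MiB chunk size are arbitrary choices.

```python
import hashlib
import sys


def hash_file(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Compute several digests of a file in one pass, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__":
    # Usage: python hash_file.py sample.exe [more files...]
    for file_path in sys.argv[1:]:
        for name, digest in hash_file(file_path).items():
            print(f"{name}  {digest}  {file_path}")
```

Reading in chunks keeps memory use flat even for large samples, and a single pass over the file feeds all three hash functions.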

Small YARA example (file detection + metadata):

rule EvilBackdoor
{
    meta:
        author = "analyst"
        description = "detects sample family X"
    strings:
        $c1 = "steal_credentials" ascii
        $s2 = { 8B FF 55 8B EC } // some byte pattern
    condition:
        // "or filesize < 1MB" would match every small file, so the size check is combined with "and"
        filesize < 1MB and any of ($c*) and $s2
}

Small ssdeep example (compare similarity):

  • Generate and save a fuzzy hash: ssdeep -b sample1.exe > sample1.ssdeep (repeat for sample2.exe)

  • Compare the saved hashes: ssdeep -k sample1.ssdeep sample2.ssdeep
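
To show the idea behind similarity scoring without installing anything, the sketch below compares two byte strings using Jaccard similarity over byte 4-grams. This is a deliberately simple stand-in and is not the ssdeep or TLSH algorithm; the `ngrams` and `similarity` helpers are hypothetical names. It only illustrates why a near-identical variant still scores high while its exact hash changes completely.

```python
import hashlib


def ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of all overlapping n-byte windows in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the n-gram sets of a and b, from 0.0 to 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Inputs shorter than n bytes: fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    sample = bytes(range(256)) * 4        # stand-in data, not malware
    variant = bytearray(sample)
    variant[500] ^= 0xFF                  # flip one byte
    variant = bytes(variant)

    same_hash = hashlib.sha256(sample).digest() == hashlib.sha256(variant).digest()
    print("exact hash match:", same_hash)
    print("similarity score:", round(similarity(sample, variant), 3))
```

Real fuzzy hashes produce a compact digest that can be stored and shared, which this sketch does not; it compares the raw inputs directly.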

Rule writing tips

  • Use anchored strings or hex patterns with offsets when possible (reduces false positives).

  • Avoid overly broad regexes across large inputs.

  • Add metadata (malware family, confidence, source) to rules.

  • Rate-limit noisy rules in network sensors (threshold options) to avoid alert storms.
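
The first tip, anchoring patterns at known offsets, can be sketched in Python for the PE format: instead of searching for "MZ" anywhere, check it at offset 0 and then follow the `e_lfanew` field at offset 0x3C to the "PE\0\0" signature. The `looks_like_pe` function name is made up for this example.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic at offset 0 and the PE signature at e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE header, stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


if __name__ == "__main__":
    # Build a minimal stand-in header (not a runnable executable).
    header = bytearray(0x80)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    header[0x40:0x44] = b"PE\x00\x00"
    print(looks_like_pe(bytes(header)))
    print(looks_like_pe(b"A text file that merely mentions MZ" + b" " * 64))
```

An unanchored two-byte pattern such as "MZ" appears by chance in many benign files, so the offset checks remove most of those false positives at almost no cost.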

When to use which signature

  • Use exact hashes for blocking confirmed, immutable malicious files (but with caution).

  • Use fuzzy hashes for hunting and clustering variants.

  • Use YARA for family detection, structural checks, and hunting in repositories.

  • Use IDS rules for network IOCs and protocol anomalies.

  • Use Sigma / SIEM rules for log-based behavioral detection.
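
For the log-based case, here is a rough sketch of how a Sigma-style selection might be evaluated against a single process-creation event. The rule content is hypothetical, the evaluator supports only the `endswith` modifier, and the field names (`ParentImage`, `Image`) are assumed to follow Sysmon-style process-creation logs.

```python
# Hypothetical rule: an Office application spawning a command shell.
RULE = {
    "title": "Office application spawning a shell (illustrative)",
    "selection": {
        "ParentImage|endswith": ("\\winword.exe", "\\excel.exe", "\\powerpnt.exe"),
        "Image|endswith": ("\\powershell.exe", "\\cmd.exe"),
    },
}


def matches(rule: dict, event: dict) -> bool:
    """Return True if every selection field in the rule matches the event."""
    for key, suffixes in rule["selection"].items():
        field, _, modifier = key.partition("|")
        if modifier != "endswith":
            return False  # only "endswith" is implemented in this sketch
        value = str(event.get(field, "")).lower()
        if not value.endswith(suffixes):
            return False
    return True


if __name__ == "__main__":
    event = {
        "ParentImage": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
        "Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
        "CommandLine": "powershell -enc AAAA",
    }
    print(matches(RULE, event))
```

In practice a Sigma rule is written in YAML and converted to the query language of the SIEM in use; this sketch only shows the matching logic such a query expresses.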

Signature Based Detection Lab - Simple Example.



This lab uses a custom msfvenom payload: a PE executable compressed into a zip archive and protected with a password. Password-protected archives are a common way for attackers to deliver malicious software, whatever their motive. Because the archive contents are encrypted, signature-based detection cannot match the payload until the file is extracted and scanned, or executed.
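
Although a scanner cannot read the encrypted contents, it can usually still tell that an archive is password-protected, which is a useful triage signal on its own. The sketch below uses Python's standard `zipfile` module and checks bit 0 of each member's general-purpose flags, which marks an encrypted entry. The `encrypted_members` name is made up for this example, and `suspicious.zip` is a placeholder path.

```python
import sys
import zipfile


def encrypted_members(path_or_file) -> list[str]:
    """Return the names of zip members that are flagged as encrypted."""
    with zipfile.ZipFile(path_or_file) as archive:
        # Bit 0 of the general-purpose flag field marks an encrypted entry.
        return [info.filename for info in archive.infolist() if info.flag_bits & 0x1]


if __name__ == "__main__":
    # Usage: python check_zip.py suspicious.zip
    for archive_path in sys.argv[1:]:
        names = encrypted_members(archive_path)
        if names:
            print(f"{archive_path}: encrypted members: {names}")
        else:
            print(f"{archive_path}: no encrypted members")
```

With standard zip encryption the member names typically remain readable, so a name such as an unexpected .exe inside a locked archive can be matched against IOC lists even when the payload itself cannot be scanned.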


