One case through the whole machine, and a likely false flag coming apart.
A regional logistics provider is hit. The malware and the infrastructure both point (loudly) at a known actor we'll call RED. This page follows every Malwarebox component handing its piece to the next, until SOLBIT notices that only the cheap evidence blames RED. Watch the attribution scoreboard fill as you scroll.
Raw sightings from collection modules: a phishing lure, a malware sample, a cluster of freshly registered domains, one multilingual ransom note.
Ties each observation to the entity it concerns, with three timestamps and a source-side confidence. Nothing is interpreted yet: this is fact, not meaning.
Evidence {
  evidence_id: "EVD-2026-0142",
  kind: "campaign_artifact",
  occurred_at: "2026-01-22T03:11Z",
  confidence: 0.95,
  iim_refs: [ "sample:9f3c…", "fqdn:…", "note:…" ]
}
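A minimal Python sketch of an Evidence record like this one, with three timestamps and a source-side confidence. Only `occurred_at` appears in the record shown; the names `observed_at` and `recorded_at`, their values, and the range check on `confidence` are assumptions made for illustration, not Malwarebox's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Evidence:
    evidence_id: str
    kind: str
    occurred_at: datetime   # when the event happened (field shown in the record)
    observed_at: datetime   # hypothetical: when a collection module saw it
    recorded_at: datetime   # hypothetical: when it was written to the store
    confidence: float       # source-side confidence, assumed to lie in [0, 1]
    iim_refs: tuple[str, ...] = ()

    def __post_init__(self):
        # Reject impossible confidences early; nothing is interpreted here.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


evidence = Evidence(
    evidence_id="EVD-2026-0142",
    kind="campaign_artifact",
    occurred_at=datetime(2026, 1, 22, 3, 11, tzinfo=timezone.utc),
    observed_at=datetime(2026, 1, 22, 4, 0, tzinfo=timezone.utc),   # illustrative
    recorded_at=datetime(2026, 1, 22, 4, 5, tzinfo=timezone.utc),   # illustrative
    confidence=0.95,
    iim_refs=("sample:9f3c…", "fqdn:…", "note:…"),  # elided in the source, kept as-is
)
```

Freezing the record mirrors the "fact, not meaning" rule: later stations read it, they don't rewrite it.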
The binary Evidence from KRAKEN.
Static + dynamic analysis. Finds a commodity loader (copyable), code reuse with RED's toolkit, and Cyrillic strings pointing at RED. Rich evidence, but every bit of it is cheap to plant.
sample_analysis {
  sha256: "9f3c4b…",
  family: "commodity_loader", // copyable
  code_reuse: [ "RED_toolkit" ],
  strings: [ "<Cyrillic strings> → RED" ]
}
The domain & hosting Evidence from KRAKEN.
Models the infrastructure as roles and relations (entry → redirector → staging → C2), and notes a registrar/TLD habit matching RED. The nodes are rentable; the pattern is the real tell.
IIM.chain {
  shape: entry → redirector → staging → c2,
  registrar: "RED-like habits",
  pattern_ref: "MB-F-0317"
}
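One way to picture "the pattern is the real tell": strip the hostnames and keep only the sequence of roles. The Python sketch below assumes a single linear chain; the function name `chain_shape` and the `.example` hostnames are hypothetical, not part of IIM.

```python
def chain_shape(roles, edges):
    """Return the role sequence of a linear chain.

    roles: {node: role}; edges: [(src, dst)] linking the nodes in order.
    """
    successor = dict(edges)
    targets = set(successor.values())
    starts = [node for node in roles if node not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one chain start")
    shape, seen, node = [], set(), starts[0]
    while node is not None:
        if node in seen:
            raise ValueError("cycle in chain")
        seen.add(node)
        shape.append(roles[node])
        node = successor.get(node)
    return tuple(shape)


# Two incidents on entirely different (rentable) hosts...
incident_a = chain_shape(
    {"a1.example": "entry", "a2.example": "redirector",
     "a3.example": "staging", "a4.example": "c2"},
    [("a1.example", "a2.example"), ("a2.example", "a3.example"),
     ("a3.example", "a4.example")],
)
incident_b = chain_shape(
    {"b1.example": "entry", "b2.example": "redirector",
     "b3.example": "staging", "b4.example": "c2"},
    [("b1.example", "b2.example"), ("b2.example", "b3.example"),
     ("b3.example", "b4.example")],
)

# ...still share one shape, which is what gets compared.
same_pattern = incident_a == incident_b
```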
Timing & behavioral Evidence, interpreted into versioned Decision Events.
Reads the operation as a staffed human enterprise: build & C2 activity cluster in UTC+7, holiday dips on a different calendar than RED's, cleanup discipline matching cluster 0zzz. Hard to fake for months under pressure.
DecisionEvent[] → Modus.profile {
  operation_hours: "UTC+7", // ≠ RED
  holiday_dips: "≠ RED calendar",
  tradecraft: "≈ 0zzz, ≠ RED"
}
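A rough Python sketch of how an `operation_hours` reading could be derived: try every UTC offset and keep the one that puts the most activity inside a working day. The 09:00 to 18:00 window, the function name `best_offset`, and the sample timestamps are assumptions for illustration, not the Modus method.

```python
from datetime import datetime, timedelta, timezone


def best_offset(events, start=9, end=18):
    """Pick the UTC offset (in hours) that places the most events in [start, end) local time."""
    def hits(offset):
        return sum(
            start <= (event + timedelta(hours=offset)).hour < end
            for event in events
        )
    return max(range(-12, 15), key=hits)


# Illustrative activity: one event per hour from 02:11Z to 10:11Z on the same day.
events = [
    datetime(2026, 1, 22, hour, 11, tzinfo=timezone.utc)
    for hour in range(2, 11)
]

offset = best_offset(events)  # 7: all nine events land between 09:00 and 18:00 local
```

A single day proves little; the signal comes from the same offset holding for months, which is what makes it expensive to fake.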
The ransom note (EN + RU + KO) and panel strings.
Source-pivot says the English is the original; the RU carries machine-translation tells. Exclusionary sieve: error pattern inconsistent with L1-Russian and L1-Korean, consistent with L1-Vietnamese, lining up with UTC+7. The note is operator-authored, so it's admissible.
Lingua.profile {
  source_pivot: "en", // not ru
  exclude: [ "≠ L1-Russian", "≠ L1-Korean" ],
  L1_distribution: { Vietnamese: 0.55, … },
  authorship: "operator_authored",
  solbit_L: {
    locale: 0.70→RED [surface],
    idiolect: 0.15→RED [deep]
  }
}
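The exclusionary sieve can be sketched as elimination rather than matching: each observed error feature rules out the first languages it is inconsistent with, and whatever survives is carried forward. In the Python sketch below the feature names, the `inconsistent_with` table, and the extra `other` candidate are placeholders invented for illustration; they make no real linguistic claim.

```python
def sieve(candidates, observed, inconsistent_with):
    """Drop every candidate L1 that some observed feature rules out.

    Returns (remaining, excluded), where excluded maps each dropped L1
    to the feature that eliminated it.
    """
    remaining = set(candidates)
    excluded = {}
    for feature in observed:
        for l1 in inconsistent_with.get(feature, ()):
            if l1 in remaining:
                remaining.discard(l1)
                excluded[l1] = feature
    return remaining, excluded


# Hypothetical table: which L1 backgrounds each error feature argues against.
inconsistent_with = {
    "feature_a": ["Russian"],
    "feature_b": ["Korean"],
}

remaining, excluded = sieve(
    candidates=["Russian", "Korean", "Vietnamese", "other"],
    observed=["feature_a", "feature_b"],
    inconsistent_with=inconsistent_with,
)
# remaining == {"Vietnamese", "other"}; a distribution is then spread over the survivors.
```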
All six dimensions: T, I (surface) and S, O, B, L-idiolect (deep). SOLBIT computes the Strategic dimension itself: the victim serves no RED-sponsor objective.
Instead of averaging, it splits the evidence by cost-to-fake and compares the two halves. Surface screams RED. Deep doesn't. That gap is not noise; it's the finding.
Surface-vs-deep divergence detector
SOLBIT.verdict {
  target: "MB-ACTOR-0yyy",
  surface_consensus: "High", // T, I → RED
  deep_consensus: "Low", // S,O,B,L.idiolect → not RED
  divergence_flag: true,
  false_flag_assessment: "probable", // a probability, not a fact
  tier: "T0",
  confidence: "Moderate",
  competing: [
    { "MB-ACTOR-0zzz": "T0 · Moderate" }, // likely hand
    { "RED": "T1 · Low, likely planted" }
  ],
  collection_gaps: [ "operational_timing", "linguistic_idiolect" ]
}
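A small Python sketch of the split-and-compare idea: score each dimension's support for RED, average the surface and deep halves separately, and flag a large gap. Only the 0.15 idiolect score comes from the Lingua profile; the other scores, the 0.4 threshold, and the function name `divergence` are assumptions for illustration, not SOLBIT's actual model.

```python
from statistics import mean

SURFACE = ("T", "I")                      # cheap to fake
DEEP = ("S", "O", "B", "L_idiolect")      # expensive to fake


def divergence(scores, gap_threshold=0.4):
    """Compare support for a candidate actor across surface and deep evidence."""
    surface = mean(scores[d] for d in SURFACE)
    deep = mean(scores[d] for d in DEEP)
    gap = surface - deep
    return {
        "surface": surface,
        "deep": deep,
        "gap": gap,
        "divergence_flag": gap >= gap_threshold,
    }


# Support for RED per dimension (hypothetical, except L_idiolect = 0.15).
red_scores = {"T": 0.9, "I": 0.8, "S": 0.1, "O": 0.2, "B": 0.2, "L_idiolect": 0.15}

result = divergence(red_scores)
naive_average = mean(red_scores.values())
```

With these numbers the surface half averages 0.85 and the deep half about 0.16, so the flag is raised. The naive average of all six lands near 0.39, a middling score that hides the conflict entirely.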
The verdict's assessed cluster: 0zzz, rather than RED.
Maps the defender's context (sector, geography, exposure) against the attributed actor. Attribution is what makes this ranking trustworthy; you can't prioritize against a ghost.
MB-RM.ranking {
  defender: { sector: "logistics", geo: "EU" },
  actor: "MB-ACTOR-0zzz", // the real one
  relevance: "High",
  rationale: "0zzz targets EU logistics; RED would not"
}
The relevant, correctly-attributed actor 0zzz.
Ranks defensive controls by actor disruption (ADV), impact reduction (IRR), cost/complexity (CC) and detection-to-decision time (DDT), against the real tradecraft, not the decoy's.
ACDP.controls (vs MB-ACTOR-0zzz) {
  rank_axes: [ ADV, IRR, CC, DDT ],
  top: [
    "kill staging-domain reuse path",
    "detect UTC+7 build cadence",
    "harden initial-access vector"
  ]
}
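One plausible way to rank on four axes is a signed weighted sum, sketched below in Python. It assumes higher ADV and IRR are better, lower CC and DDT are better, all axes are normalized to 0..1, and weights are equal. The per-control numbers are invented so the example reproduces the order shown; none of this is ACDP's real scoring.

```python
# Sign of each axis in the score: benefits count up, costs count down (assumed).
WEIGHTS = {"ADV": 1.0, "IRR": 1.0, "CC": -1.0, "DDT": -1.0}


def score(control):
    """Signed weighted sum over the four ranking axes."""
    return sum(WEIGHTS[axis] * control[axis] for axis in WEIGHTS)


def rank(controls):
    """Return control names, best first."""
    return sorted(controls, key=lambda name: score(controls[name]), reverse=True)


# Hypothetical axis values, each normalized to 0..1.
controls = {
    "harden initial-access vector":   {"ADV": 0.5, "IRR": 0.8, "CC": 0.6, "DDT": 0.4},
    "kill staging-domain reuse path": {"ADV": 0.9, "IRR": 0.7, "CC": 0.3, "DDT": 0.2},
    "detect UTC+7 build cadence":     {"ADV": 0.7, "IRR": 0.5, "CC": 0.4, "DDT": 0.1},
}

top = rank(controls)
```

The axis values only mean something when scored against the right actor's tradecraft, which is why this step sits downstream of the verdict.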
SOLBIT's collection_gaps: the evidence that would harden the verdict.
Turns gaps into concrete tasking for the next cycle: acquire more operational timing and linguistic idiolect to test the RED false-flag hypothesis and firm up the assessment. The system doesn't just answer; it asks better next time.
KRAKEN.collection_tasks {
  from: "SOLBIT.collection_gaps",
  acquire: [ "operational_timing", "linguistic_idiolect" ],
  goal: "harden 0zzz hypothesis · test the RED false-flag"
}
// → loops back to station 01 · next cycle
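The gap-to-task handoff is simple enough to sketch in a few lines of Python. The function name `tasks_from_gaps` and the one-task-per-gap shape are assumptions for illustration; the field names follow the objects shown on this page.

```python
def tasks_from_gaps(verdict, goal):
    """Turn each collection gap in a verdict into one collection task."""
    return [
        {"from": "SOLBIT.collection_gaps", "acquire": gap, "goal": goal}
        for gap in verdict["collection_gaps"]
    ]


verdict = {
    "divergence_flag": True,
    "collection_gaps": ["operational_timing", "linguistic_idiolect"],
}

tasks = tasks_from_gaps(
    verdict,
    goal="harden 0zzz hypothesis · test the RED false-flag",
)
```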
The whole point: structure beats the sum.
Every component did one job and handed a clean object to the next. The magic isn't any single tool; it's the wiring: cheap evidence and expensive evidence are kept separate, weighed by how hard they are to fake, and their disagreement is treated as a finding instead of being averaged away. Then the gaps flow back into collection, and the loop runs again.