Research: Open threads

01 // Open threads

REF: RSCH-01

AI VISIBILITY & GSO

How do you measure whether an LLM cites a brand, and why?

Buying research increasingly happens inside AI assistants rather than search results, and an assistant offers no rank to track and no console to inspect. We decompose 'AI visibility' into ~14 proprietary, formulaic metrics (embedding-alignment and token-probability style measures) so the question becomes diagnosable rather than opaque.

APPROACH: DECOMPOSE VISIBILITY INTO ~14 FORMULAIC METRICS; SEPARATE PRIOR FROM RETRIEVAL; SAMPLE FOR STABILITY

STATUS: ACTIVE FRAMEWORK + ANALYTICS BUILD

Read the thread

Productized: Semantic Signal
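
As a rough illustration of the "sample for stability" and "separate prior from retrieval" steps in RSCH-01, the sketch below asks how often a brand appears across repeated samples and how wide the uncertainty on that rate is. It is not one of the proprietary metrics: the substring match, the brand names, and the sample responses are all invented for the example, and the Wilson score interval is a standard choice swapped in here, not something the thread confirms.

```python
import math

def mention_rate(responses, brand):
    """Count sampled responses that mention the brand (case-insensitive substring match)."""
    hits = sum(brand.lower() in r.lower() for r in responses)
    return hits, len(responses)

def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

# Hypothetical samples: the same prompt asked repeatedly, without and with retrieval.
closed_book = ["Try Acme or Globex.", "Globex is popular.",
               "Acme is a common pick.", "Consider Initech."]
with_retrieval = ["Acme leads reviews.", "Acme and Globex both rank well.",
                  "Acme.", "Globex."]

for label, samples in [("prior", closed_book), ("retrieval", with_retrieval)]:
    hits, n = mention_rate(samples, "Acme")
    lo, hi = wilson_interval(hits, n)
    print(f"{label}: {hits}/{n} mentions, interval [{lo:.2f}, {hi:.2f}]")
```

With only a handful of samples the interval is wide, which is the point of sampling: a single response says very little about how reliably a brand is cited.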

REF: RSCH-02

ACOUSTIC SPEECH ANALYSIS

What does the speech signal carry that a transcript throws away?

Transcribe-then-analyze discards almost everything that makes speech speech: tremor, hesitation, the hedged word. We route audio through a Praat phonetics sidecar to extract real vocal features, then fuse them with the transcript so feedback can say not just 'you hedged' but 'you hedged, and your voice confirmed it.'

APPROACH: PHONETICS SIDECAR EXTRACTS VOCAL FEATURES; FUSE WITH TRANSCRIPT FOR CORROBORATED FEEDBACK

STATUS: ACTIVE BUILD // PHONETICS SIDECAR

Productized: VoiceReady
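
The fusion step in RSCH-02 might look something like the sketch below, which checks whether the voice was unstable during a hedged word. It does not call Praat; it assumes the sidecar has already produced timestamped frames. The `Frame.instability` score, the hedge lexicon, and the threshold are all hypothetical stand-ins for whatever features the real pipeline extracts.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds

@dataclass
class Frame:
    time: float         # seconds
    instability: float  # hypothetical per-frame vocal instability score

HEDGES = {"maybe", "perhaps", "probably"}  # illustrative lexicon only

def fuse(words, frames, threshold=0.02):
    """For each hedge word, report whether the voice was unstable while it was spoken."""
    findings = []
    for w in words:
        if w.text.lower() not in HEDGES:
            continue
        window = [f.instability for f in frames if w.start <= f.time <= w.end]
        corroborated = bool(window) and mean(window) > threshold
        findings.append((w.text, corroborated))
    return findings

words = [Word("I", 0.0, 0.2), Word("maybe", 0.2, 0.6), Word("can", 0.6, 0.8)]
frames = [Frame(0.1, 0.005), Frame(0.3, 0.03), Frame(0.5, 0.04), Frame(0.7, 0.006)]
print(fuse(words, frames))
```

Keeping the transcript and the acoustic frames as separate inputs joined only by timestamps means either side can be improved without touching the other.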

REF: RSCH-03

AUTONOMOUS BUILD SYSTEMS

Can a specification be built by a fleet of agents under verification gates?

A spec is decomposed into independently verifiable tasks; a runner dispatches agents whose work must pass a gate (format, lint, type-check, tests) before 'done' counts. Fleets run in isolated git worktrees so changes can't collide. Independent verifiability is the whole game.

APPROACH: SPEC -> TASKS -> RUNNER -> GATES; ISOLATED WORKTREES; TICK-BASED LONG-HORIZON AGENTS

STATUS: IN DAILY PRODUCTION USE

Read the thread

This is how we Build
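
A verification gate of the kind RSCH-03 describes can be sketched as an ordered list of commands that must all exit cleanly before a task counts as done. This is a minimal illustration, not the runner itself: the `run_gates` helper is hypothetical, and the gate commands are placeholders that a real project would replace with its own formatter, linter, type-checker, and test runner. The `cwd` argument is where an isolated worktree path would go.

```python
import subprocess
import sys

def run_gates(gates, cwd=None):
    """Run gates in order; return (passed, name_of_first_failing_gate)."""
    for name, cmd in gates:
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, name  # stop at the first failure
    return True, None

# Placeholder commands that simply succeed; substitute real tools here.
GATES = [
    ("format", [sys.executable, "-c", "print('format ok')"]),
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("type-check", [sys.executable, "-c", "print('types ok')"]),
    ("tests", [sys.executable, "-c", "import sys; sys.exit(0)"]),
]

passed, failed_gate = run_gates(GATES)
print("done" if passed else f"rejected at gate: {failed_gate}")
```

Because the gate only inspects exit codes, it does not need to trust anything the agent says about its own work.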

REF: RSCH-04

CLINICAL TRIAGE ARCHITECTURE

How do you let a model converse while a separate, grounded model owns the decision?

A fast voice front-end talks to a patient and gathers detail; a separate supervisor service makes the triage decision against an established clinical protocol. The voice is never trusted to make the call. It gathers; the supervisor adjudicates and owns the record.

APPROACH: SPLIT CONVERSATION FROM DECISION; GROUNDED SUPERVISOR OWNS STATE; PATTERN GENERALIZES BEYOND HEALTHCARE

STATUS: WORKING ARCHITECTURE

Read the thread

Productized: TIA
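
The split in RSCH-04 can be illustrated with two small classes: a front-end that can only submit facts, and a supervisor that owns the record and is the only component able to return a decision. Everything here is a toy under stated assumptions. The class names, the field names, and especially the rules are placeholders for illustration and are not a clinical protocol.

```python
from dataclasses import dataclass, field

@dataclass
class Supervisor:
    """Owns the record and the decision; the front-end can only submit facts."""
    record: dict = field(default_factory=dict)

    def submit(self, key, value):
        self.record[key] = value

    def decide(self):
        # Placeholder rules for illustration only; NOT a clinical protocol.
        required = ("chest_pain", "short_of_breath")
        if any(k not in self.record for k in required):
            return "NEED_MORE_INFO"
        if self.record["chest_pain"] and self.record["short_of_breath"]:
            return "ESCALATE"
        return "ROUTINE"

class VoiceFrontEnd:
    """Gathers detail and forwards it; it exposes no method that returns a decision."""
    def __init__(self, supervisor):
        self._supervisor = supervisor

    def heard(self, key, value):
        self._supervisor.submit(key, value)

sup = Supervisor()
voice = VoiceFrontEnd(sup)
voice.heard("chest_pain", True)
first = sup.decide()    # still missing a required fact
voice.heard("short_of_breath", True)
second = sup.decide()
print(first, second)
```

The same shape applies outside healthcare: any conversational layer can gather input while a separate, rule-grounded service holds state and adjudicates.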

REF: RSCH-05

MULTI-MODEL ORCHESTRATION

When does an ensemble of frontier models beat the best single model?

Routing a problem across multiple frontier models and reconciling their answers sometimes wins and sometimes just costs more. We study the production patterns that make an ensemble worth its overhead, and the cases where one strong default is the right call.

APPROACH: CROSS-MODEL FACT-CHECKING; DISAGREEMENT AS A CONFIDENCE SIGNAL; MEASURE WHEN THE ENSEMBLE EARNS ITS OVERHEAD

STATUS: PRODUCTION PATTERNS + ONGOING STUDY

Read the thread
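
One simple way to read "disagreement as a confidence signal" in RSCH-05 is to take the majority answer across models and report what share of them agreed. The sketch below assumes short, directly comparable answers; the model names and answers are invented, and a production system would need a far more careful notion of answer equivalence than lowercasing.

```python
from collections import Counter

def reconcile(answers):
    """Return (majority_answer, agreement); agreement is the share of models that gave it."""
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [a.strip().lower() for a in answers.values()]
    top, count = Counter(normalized).most_common(1)[0]
    return top, count / len(normalized)

# Hypothetical answers from three models to the same question.
answers = {"model_a": "Paris", "model_b": "paris ", "model_c": "Lyon"}
answer, agreement = reconcile(answers)
print(answer, round(agreement, 2))
```

Low agreement does not say which model is right, only that the question deserves a second look, which is where the ensemble may start to earn its overhead.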

REF: RSCH-06

RETRIEVAL OVER REGULATORY CORPORA

How do you do faithful RAG over large, unstructured regulatory documents?

Regulatory corpora are long, dense, and unforgiving. A wrong or unsourced answer is worse than none. We index multiple corpora and pursue retrieval that returns answers a reader can trace back to the passage that proves them.

APPROACH: INDEX MULTIPLE CORPORA; FAVOR FAITHFUL, PASSAGE-TRACEABLE RETRIEVAL OVER FLUENT-BUT-UNSOURCED ANSWERS

STATUS: MULTIPLE CORPORA INDEXED

Productized: FDA Intelligence
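
The preference in RSCH-06 for passage-traceable answers over fluent but unsourced ones can be sketched as a retriever that always returns passage identifiers and abstains when nothing clears a relevance bar. The token-overlap score is a deliberately crude stand-in for a real retriever, and the passage ids and texts below are invented for the example, not drawn from any regulatory corpus.

```python
import re

def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, min_overlap=0.5):
    """Return (passage_id, text) pairs whose share of question tokens clears the bar, best first."""
    q = tokens(question)
    scored = []
    for pid, text in passages.items():
        overlap = len(q & tokens(text)) / len(q) if q else 0.0
        if overlap >= min_overlap:
            scored.append((overlap, pid, text))
    scored.sort(key=lambda s: (-s[0], s[1]))  # highest overlap first, id as tie-break
    return [(pid, text) for _, pid, text in scored]

def answer(question, passages):
    """Answer only with a cited passage; abstain rather than answer unsourced."""
    hits = retrieve(question, passages)
    if not hits:
        return {"answer": None, "sources": []}
    pid, text = hits[0]
    return {"answer": text, "sources": [pid]}

# Invented passages with hypothetical ids.
passages = {
    "doc1#p4": "Labels must list the device identifier.",
    "doc2#p9": "Records are retained for two years.",
}
print(answer("How long are records retained?", passages))
print(answer("What is the fee schedule?", passages))
```

Returning `None` with an empty source list makes abstention an explicit outcome a caller has to handle, rather than something that can be papered over with generated text.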

REF: RSCH-07

LANGUAGE-EMERGENCE SIMULATION

How does shared language emerge or fragment when populations meet on a spatial lattice?

A question pursued with an academic collaborator: simulate populations on a spatial lattice and watch whether a shared language emerges, drifts, or fragments. A first version has shipped.

APPROACH: NAMING GAME x SCHELLING ON A LATTICE; LOUVAIN COMMUNITY DETECTION; DETERMINISTIC, REPRODUCIBLE RUNS

STATUS: ACADEMIC COLLABORATION // V1 SHIPPED

Read the thread
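
For a feel of the mechanics behind RSCH-07, the sketch below runs a minimal naming game on a wrapped square lattice with a seeded random generator, so the same seed reproduces the same run. It covers only the naming-game component: Schelling-style movement and Louvain community detection are left out, and the lattice size, step count, and update rule are assumptions for the example rather than the shipped model.

```python
import random

def naming_game(size=6, steps=20000, seed=0):
    """Run a minimal naming game on a size x size torus; return each agent's inventory."""
    rng = random.Random(seed)  # seeded RNG makes the run reproducible
    inv = {(x, y): set() for x in range(size) for y in range(size)}
    next_word = 0
    for _ in range(steps):
        speaker = (rng.randrange(size), rng.randrange(size))
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        hearer = ((speaker[0] + dx) % size, (speaker[1] + dy) % size)
        if not inv[speaker]:
            inv[speaker].add(next_word)  # speaker invents a new word
            next_word += 1
        # Sort before choosing so the draw never depends on set iteration order.
        word = rng.choice(sorted(inv[speaker]))
        if word in inv[hearer]:
            inv[speaker] = {word}  # success: both collapse to the agreed word
            inv[hearer] = {word}
        else:
            inv[hearer].add(word)  # failure: hearer learns the word
    return inv

def distinct_words(inv):
    """Number of different words still alive anywhere on the lattice."""
    return len(set().union(*inv.values()))

run_a = naming_game(size=4, steps=2000, seed=1)
run_b = naming_game(size=4, steps=2000, seed=1)
print(run_a == run_b, distinct_words(run_a))
```

Tracking `distinct_words` over time is one crude way to see whether the population is converging on a shared vocabulary or holding several in parallel.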

REF: RSCH-08

REAL-TIME COLLABORATIVE SIMULATION

What is the right substrate for multiplayer, GPU-rendered simulation?

Many participants, shared state, GPU-rendered, in real time: the substrate question underneath collaborative simulation. We have an architecture prototype and a list of the constraints that actually bind.

APPROACH: SEPARATE THE THREE CONCERNS (ENTITY STATE, CONFLICT-FREE SYNC, GPU RENDER) AND FIND THE BINDING CONSTRAINT

STATUS: ARCHITECTURE PROTOTYPE
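
Of the three concerns in RSCH-08, conflict-free sync is the easiest to show in a few lines. The sketch below merges two replicas of entity state with a last-writer-wins rule, one simple CRDT-style approach chosen for illustration; the thread does not say which sync scheme the prototype uses. The entry layout `(timestamp, replica_id, value)` and the replica contents are hypothetical.

```python
def merge(a, b):
    """Merge two replicas of {entity_id: (timestamp, replica_id, value)}; newest write wins."""
    out = dict(a)
    for key, entry in b.items():
        # Compare (timestamp, replica_id) so equal timestamps tie-break deterministically.
        if key not in out or entry[:2] > out[key][:2]:
            out[key] = entry
    return out

# Two hypothetical replicas that have diverged.
r1 = {"e1": (3, "A", {"x": 1.0}), "e2": (1, "A", {"x": 5.0})}
r2 = {"e1": (2, "B", {"x": 9.0}), "e3": (4, "B", {"x": 0.0})}

merged = merge(r1, r2)
print(merged["e1"][2], sorted(merged))
print(merge(r1, r2) == merge(r2, r1))  # merge order does not matter
```

Because the merge is commutative and idempotent, replicas can exchange state in any order and still converge, which leaves entity state and GPU rendering free to be reasoned about separately.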

03 // Why a lab invests in research

Research is how the next engagement gets cheaper, safer, and better.

Most shops bill the hour and move on. A lab keeps the question. The difference compounds: a method we prove once (independent verifiability, a phonetics sidecar, a supervisor that owns the decision) pays out on every engagement that follows, instead of being re-discovered at each client's expense.

It is also how credibility is earned rather than claimed. We do not argue that we understand frontier systems; we show the thread, the measurements, and what we now know that we did not before. And it keeps us standing at the frontier on purpose, close enough to the open problems that when a client's hard problem arrives, we have already been living next to it.

See what it produced

How we turn a thread into a build

REF: WHY-01

COMPOUNDING

A method proven once is reused on every engagement after. The cost of the answer is paid down across all the work it touches.

REF: WHY-02

CREDIBILITY

We show the thread and the measurements, not a claim. Understanding is demonstrated, not asserted.

REF: WHY-03

STANDING AT THE FRONTIER

We stay next to the open problems on purpose, so a client's hard problem is one we have already been living beside.

A lab earns the name.