Research pipeline

Nine stages from topic to published report.

01 Plan

Build a structured research workplan from the user's topic, depth, and instructions.

LLM call

Function
suggestResearchPlanWithLlm
Model
planner
Params
temp 0.15, 6k tokens, 180s provider / 300s outer
Schema
structured_research_plan
System prompt
You are the research planner for an agent-first research runtime.
Produce the full semantic plan for the run as strict JSON matching the schema exactly.
Infer the right unit of collection, search families, discovery channels,
extraction targets, and coverage lanes from the request itself.
Honor explicit caller constraints for depth, source policy, output schemas, and limits.
Coverage lanes should describe what evidence the final answer must contain.
Query plan items must be concrete search-engine queries under 300 characters.
For long customer briefs, preserve the requested output contract in successCriteria
and extractionTargets — every explicit deliverable should appear in the plan checklist.
User payload
{ topic, requestedDepth, requestedSourcePolicy, requestedOutputSchemas,
  requestedLimits, instructions: compactPlannerInstructions(instructions) }

Output

research_plan_v2 — objective, collection unit, query plan, coverage lanes, source priorities, discovery channels, extraction targets, success criteria, follow-up rules, stop conditions.

Gate

No LLM plan, empty query plan, missing lanes/schemas/success criteria → fail run.
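
The gate above can be sketched as a pure validation function. This is a minimal illustration, not the engine's code: the type `PlanSketch` and its field names are assumptions inferred from the prompt text, and the real research_plan_v2 shape may differ.

```typescript
// Hypothetical subset of a plan; field names are assumptions, not the real schema.
interface PlanSketch {
  queryPlan: string[];
  coverageLanes: string[];
  outputSchemas: string[];
  successCriteria: string[];
}

// Returns the list of gate failures; an empty list means the run may proceed.
function checkPlanGate(plan: PlanSketch | null): string[] {
  if (plan === null) return ["no LLM plan"];
  const failures: string[] = [];
  if (plan.queryPlan.length === 0) failures.push("empty query plan");
  if (plan.coverageLanes.length === 0) failures.push("missing coverage lanes");
  if (plan.outputSchemas.length === 0) failures.push("missing output schemas");
  if (plan.successCriteria.length === 0) failures.push("missing success criteria");
  return failures;
}
```

Any non-empty result would correspond to "fail run" in the gate.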

Also

generateRunTitle via planner model — best-effort, keeps default on failure.

engine/planner.ts engine/planner-request.ts engine/providers/openrouter/planning.ts

02 Recall

Load customer library context and prior research to inform acquisition.

What happens

  • Fetches same-customer library context as a routing signal (not factual proof)
  • Deduplicates known fresh URLs out of new acquisition
  • Builds research contract binding the run to plan + customer intent
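
The URL deduplication step might look roughly like the sketch below. The freshness window (`maxAgeMs`), the `KnownSourceSketch` shape, and the normalization rules are all assumptions; the source does not say how "fresh" is defined.

```typescript
// Hypothetical record of a URL already held in the customer library.
interface KnownSourceSketch {
  url: string;
  fetchedAtMs: number; // epoch milliseconds of the last successful fetch
}

// Light normalization so a trailing slash or fragment does not defeat the comparison.
function normalizeUrl(url: string): string {
  return url.trim().replace(/#.*$/, "").replace(/\/+$/, "");
}

// Drops candidates whose URL is already in the library and still fresh.
// maxAgeMs is an assumed freshness window.
function dropKnownFreshUrls(
  candidates: string[],
  known: KnownSourceSketch[],
  nowMs: number,
  maxAgeMs: number,
): string[] {
  const fresh = new Set<string>();
  for (const entry of known) {
    if (nowMs - entry.fetchedAtMs <= maxAgeMs) fresh.add(normalizeUrl(entry.url));
  }
  return candidates.filter((url) => !fresh.has(normalizeUrl(url)));
}
```

Stale library entries are deliberately left in the candidate list so they can be re-acquired.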

Tools

Provider
Postgres — customer library tables
Query
Same-customer packs, sources, extractions by topic similarity

Output

library_recall_v1, research_contract_v1

engine/research/pipeline-planning.ts

03 Acquire

Search the web, scrape pages, and extract video transcripts to build the source corpus.

Search

Primary
Exa — semantic search with topic, instructions, source policy, coverage lanes, query plan
Fallback
Firecrawl search
Budget
Planned queries + reserved refinement budget by depth

LLM — Candidate triage

Function
triageCandidatesWithLlm
Model
triage
Params
temp 0.12, 3k tokens, 90s
Schema
candidate_triage
Failure
Skip; deterministic selection continues
Triage system prompt
Triage cheap web-search candidates for an agent-first research system.
Use only URL, title, snippet, domain, rank, source class, and planner context.
Prioritize candidates that fill required coverage lanes.
Prefer primary, authoritative, specific, and information-dense sources.
Reject junk, thin pages, login walls, duplicates, discovery pages, off-topic results.
Suggest follow-up queries only when important source types or subtopics are missing.
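
The "Skip; deterministic selection continues" failure mode can be illustrated as follows. This is a sketch under stated assumptions: `triage` stands in for the LLM call, the rank-based baseline is hypothetical (the real deterministic rule is not documented here), and the call is synchronous only for brevity.

```typescript
// Hypothetical candidate shape; only the fields this sketch needs.
interface CandidateSketch {
  url: string;
  rank: number; // position in the search results, 1 = best
}

// Assumed deterministic baseline: best search rank first.
function selectByRank(candidates: CandidateSketch[], limit: number): CandidateSketch[] {
  return [...candidates].sort((a, b) => a.rank - b.rank).slice(0, limit);
}

// triage stands in for the LLM call: it returns the URLs worth keeping, or throws.
function selectCandidates(
  candidates: CandidateSketch[],
  limit: number,
  triage: (input: CandidateSketch[]) => string[],
): CandidateSketch[] {
  try {
    const preferred = new Set(triage(candidates));
    const picked = selectByRank(candidates.filter((c) => preferred.has(c.url)), limit);
    if (picked.length > 0) return picked;
  } catch {
    // Triage failed or timed out: skip it and fall through to the baseline.
  }
  return selectByRank(candidates, limit);
}
```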

Scrape

Provider
Firecrawl
Output
Normalized SourceDocument — markdown, URL, domain, media type, metadata, hash, quality score
Dedup
Content hash + near-duplicate demotion
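
One way to read "content hash + near-duplicate demotion" is sketched below. The FNV-1a hash (computed over UTF-16 code units), the word-shingle Jaccard similarity, the 0.8 threshold, and the 0.5 score penalty are all illustrative assumptions, not the pipeline's actual choices.

```typescript
// FNV-1a 32-bit hash; a stand-in for whatever content hash the pipeline uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Set of overlapping word n-grams used to compare two documents.
function shingles(text: string, size: number = 3): Set<string> {
  const words = text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const out = new Set<string>();
  for (let i = 0; i + size <= words.length; i++) out.add(words.slice(i, i + size).join(" "));
  return out;
}

// Jaccard similarity: shared shingles divided by all distinct shingles.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((s) => { if (b.has(s)) shared++; });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

interface DocSketch { url: string; markdown: string; qualityScore: number; }

// Drops exact duplicates by hash and demotes near-duplicates of an earlier document.
function dedupeDocuments(docs: DocSketch[], threshold: number = 0.8, penalty: number = 0.5): DocSketch[] {
  const seenHashes = new Set<string>();
  const kept: { doc: DocSketch; grams: Set<string> }[] = [];
  for (const doc of docs) {
    const hash = fnv1a(doc.markdown);
    if (seenHashes.has(hash)) continue; // exact duplicate: drop
    seenHashes.add(hash);
    const grams = shingles(doc.markdown);
    const near = kept.some((k) => jaccard(k.grams, grams) >= threshold);
    kept.push({ doc: near ? { ...doc, qualityScore: doc.qualityScore * penalty } : doc, grams });
  }
  return kept.map((k) => k.doc);
}
```

Demotion rather than removal keeps a near-duplicate available in case it carries details the first copy lacks.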

Video transcripts

Provider
Supadata API
Routing
YouTube URLs split from scrape candidates via canTranscribe()
Modes
native | auto | generate
Metadata
oEmbed — title, channel, thumbnail, duration
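
The routing step could be sketched as below. `canTranscribeSketch` is a stand-in written for illustration; the real canTranscribe() may accept other hosts or URL shapes, and the regular expression here is an assumption.

```typescript
// Matches youtube.com/watch?v=..., youtube.com/shorts/..., and youtu.be/... links
// with an 11-character video id. The accepted URL shapes are an assumption.
const YOUTUBE_URL =
  /^https?:\/\/(?:www\.|m\.)?(?:youtube\.com\/(?:watch\?(?:.*&)?v=|shorts\/)|youtu\.be\/)[\w-]{11}/i;

// Stand-in for canTranscribe(): true when the URL looks like a YouTube video.
function canTranscribeSketch(url: string): boolean {
  return YOUTUBE_URL.test(url);
}

// Splits candidates into transcript-bound and scrape-bound URLs.
function splitCandidates(urls: string[]): { transcribe: string[]; scrape: string[] } {
  const transcribe: string[] = [];
  const scrape: string[] = [];
  for (const url of urls) {
    if (canTranscribeSketch(url)) {
      transcribe.push(url);
    } else {
      scrape.push(url);
    }
  }
  return { transcribe, scrape };
}
```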

LLM — Speaker labeling

Function
restructureTranscript
Model
analyst
Method
36-turn chunks → infer speaker labels (Host, Guest, named)
Gate
55% labeled turns required; graceful fallback
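
The chunking and the labeling gate can be sketched as follows. The 36-turn chunk size and the 55% ratio come from the table above; the `TurnSketch` shape and the choice to fall back to the original unlabeled turns are assumptions about what "graceful fallback" means.

```typescript
// Hypothetical transcript turn; speaker is absent until labeling succeeds.
interface TurnSketch {
  text: string;
  speaker?: string;
}

const TURNS_PER_CHUNK = 36;
const MIN_LABELED_RATIO = 0.55;

// Splits turns into consecutive chunks of at most `size` items.
function chunkTurns<T>(turns: T[], size: number = TURNS_PER_CHUNK): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < turns.length; i += size) chunks.push(turns.slice(i, i + size));
  return chunks;
}

// True when enough turns carry a non-empty speaker label.
function passesLabelGate(turns: TurnSketch[]): boolean {
  if (turns.length === 0) return false;
  const labeled = turns.filter((t) => t.speaker !== undefined && t.speaker !== "").length;
  return labeled / turns.length >= MIN_LABELED_RATIO;
}

// Assumed fallback: keep the original turns when the labeled version fails the gate.
function chooseTranscript(original: TurnSketch[], labeled: TurnSketch[]): TurnSketch[] {
  return passesLabelGate(labeled) ? labeled : original;
}
```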

Video artifacts

  • video_transcript_v1 — segments with speaker labels, timestamps, confidence
  • video_quotes_v1 — high-signal quotes, concept inference, entity mentions
  • video_metadata_v1 — title, channel, thumbnail, duration
  • 30–90s readable transcript windows for synthesis

engine/research/pipeline-initial-acquisition.ts engine/research/acquisition.ts engine/research/acquisition-scrape-stage.ts engine/research/acquisition-transcripts.ts engine/supadata.ts engine/research/transcript-restructure.ts engine/research/media-artifacts.ts

04 Extract

Extract structured evidence from every acquired source. Each source gets its own LLM call that decides accept, ignore, or fail.

No pre-filtering

Every acquired source goes directly to per-source extraction. There is no portfolio selection or deterministic ranking step — the extraction LLM sees the full content of each source and makes its own relevance judgment. Sources that are junk, off-topic, or too thin are marked ignored by the extraction LLM and excluded from the corpus. This eliminates the single-point-of-failure risk of a large batch triage call and ensures no source is silently discarded by topic-blind heuristics.

LLM — Source extraction

Function
extractSourceWithLlm
Model
extraction
Params
temp 0.1, 9k tokens, 120s
Schema
source_extraction
Sections
48 for transcripts, 18 for web
Reuse
Same-customer extractions by content hash or canonical URL
Extraction system prompt
Extract source-level research records for downstream AI agents.
Use only the provided source sections.
Classify the source itself — distinguish primary pages from roundups,
community threads, videos, discovery pages, and reference pages.
For video transcripts, summarize substantive points and preserve
representative quoted evidence with timestamps.
Use the analyst workplan and coverage lanes to judge relevance.
Extract dated events, laws, organizations, quantitative facts, policy claims,
stakeholder positions, limitations, and caveats.
Mark generic background as ignored when it does not help the specific topic.
If the page is irrelevant or thin, return status ignored with empty arrays.
Extraction prompt input
buildSourceExtractionPromptInput():
  topic, instructions, outputSchemas, mediaScope, collectionUnit,
  successCriteria, extractionTargets, coverageLanes, queryPlan,
  topicAnchors, sourceUrls, source metadata, markdown sections
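
The section caps and the reuse rule from the table above might be applied as in this sketch. The caps (48 and 18) are from the source; truncating to the first N sections, the lookup order (content hash before canonical URL), and all type names are assumptions.

```typescript
type MediaKindSketch = "transcript" | "web";

// Section caps from the table above: 48 for transcripts, 18 for web pages.
const SECTION_CAP: Record<MediaKindSketch, number> = { transcript: 48, web: 18 };

// Assumed truncation rule: keep the first N sections.
function capSections(sections: string[], kind: MediaKindSketch): string[] {
  return sections.slice(0, SECTION_CAP[kind]);
}

// Hypothetical stored extraction for the same customer.
interface PriorExtractionSketch {
  contentHash: string;
  canonicalUrl: string;
  payload: unknown;
}

// Looks for a reusable extraction by content hash first, then by canonical URL.
function findReusable(
  prior: PriorExtractionSketch[],
  contentHash: string,
  canonicalUrl: string,
): PriorExtractionSketch | undefined {
  return (
    prior.find((p) => p.contentHash === contentHash) ??
    prior.find((p) => p.canonicalUrl === canonicalUrl)
  );
}
```

A hit would let the run skip the extraction call for that source; a miss proceeds to extractSourceWithLlm.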

LLM — Extraction audit

Function
auditSourceExtractionWithLlm
Model
extraction
When
Per accepted extraction
Failure
Keep original if audit unavailable

Output

source_extraction_v1 — claims, facts, dates, evidence, classification.
source_analysis_record_v1 — durable per-source record for corpus metasynthesis.

Gate

No structured extraction → failed extraction persisted. Accepted records below threshold → fail run.
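
A minimal sketch of the run-level gate, assuming each source ends in one of three statuses. `minAccepted` is a hypothetical parameter; the source does not state the actual threshold.

```typescript
type ExtractionStatusSketch = "accepted" | "ignored" | "failed";

// Counts how many sources ended in each status.
function tallyStatuses(statuses: ExtractionStatusSketch[]): Record<ExtractionStatusSketch, number> {
  const counts: Record<ExtractionStatusSketch, number> = { accepted: 0, ignored: 0, failed: 0 };
  for (const status of statuses) counts[status]++;
  return counts;
}

// minAccepted is a hypothetical threshold; below it the run would fail.
function passesExtractionGate(statuses: ExtractionStatusSketch[], minAccepted: number): boolean {
  return tallyStatuses(statuses).accepted >= minAccepted;
}
```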

engine/research/extraction-portfolio.ts engine/research/extraction.ts engine/research/extraction-source.ts engine/source-processing/extraction-prompt.ts engine/providers/openrouter/extraction.ts

05 Review

Evaluate evidence coverage against lanes and trigger targeted follow-up.

LLM — Evidence coverage

Function
reviewEvidenceCoverageWithLlm
Model
coverage
Params
temp 0.1, 4k tokens, 120s provider / 150s outer
Schema
evidence_coverage_review
Failure
Deterministic coverage review baseline
Coverage system prompt
Review whether extracted source evidence satisfies the required coverage lanes.
Use only provided source ids, summaries, evidence counts, and excerpts.
Prefer semantic judgment over raw counts: a single strong primary source
can satisfy a lane, while several generic sources may not.
Compare extracted evidence against the planner's lane goals and query plan.
Return precise gaps and follow-up queries only for evidence that is actually
missing or weak.
Follow-up queries can include date constraints — the runtime applies date
range filtering automatically.

Follow-up loop

If lanes are weak → targeted follow-up queries → reacquire + extract while loop budget remains (1 / 3 / 6 by depth).
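
The loop can be sketched as below. The budgets 1 / 3 / 6 are from the source, but the depth names (`quick`, `standard`, `deep`) are hypothetical, and `review` and `reacquire` are stand-ins for the coverage review and the acquire-plus-extract steps. The sketch is synchronous for brevity.

```typescript
// Depth names are hypothetical; the source only gives the budgets 1 / 3 / 6.
type DepthSketch = "quick" | "standard" | "deep";
const FOLLOW_UP_LOOPS: Record<DepthSketch, number> = { quick: 1, standard: 3, deep: 6 };

interface CoverageReviewSketch {
  weakLanes: string[];
  followUpQueries: string[];
}

// Runs follow-up rounds until no lane is weak or the loop budget is spent.
// Returns the number of follow-up rounds that ran.
function runFollowUpLoop(
  depth: DepthSketch,
  review: () => CoverageReviewSketch,
  reacquire: (queries: string[]) => void,
): number {
  let loops = 0;
  let current = review();
  while (current.weakLanes.length > 0 && loops < FOLLOW_UP_LOOPS[depth]) {
    reacquire(current.followUpQueries);
    loops++;
    current = review(); // re-check coverage after the new evidence lands
  }
  return loops;
}
```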

Output

research_state_v1 — loop, phase, source counts, gap counts, next queries, history.

engine/research/pipeline-evidence-refinement.ts engine/research/pipeline-evidence-refinement-loop.ts engine/research/pipeline-review.ts engine/providers/openrouter/coverage.ts

06 Merge

Turn the collected evidence into knowledge and insight through a series of focused, decomposed passes.

Stage 0 — Foundation

Deterministic aggregation of raw evidence from all extractions into structured candidate lists (claims, chronology, source rankings). This is data wrangling that seeds the LLM passes; it is never published as final output.

Stage 1 — Pattern Identification (4 parallel LLM passes)

1A: Claim consolidation
Semantic dedup + theme/pattern detection across all claims
1B: Chronology
Causal sequencing — what led to what, not just date sorting
1C: Comparisons
Analytical comparison tables that reveal insight
1D: Media concepts
Transcript runs only — named ideas, frameworks, models
Model
corpus
Schemas
consolidated_claims, enriched_chronology, comparison_tables

Stage 2 — Insight Synthesis (1 sequential LLM pass)

Input
Claim patterns, chronology, comparisons, deliverables, coverage lanes — NOT raw source data
Output
Competing theses with evidence threads, deliverable mappings, tensions, coverage gaps
Schema
insight_synthesis
Failure
Critical — retries 3 times, fails merge if exhausted
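
The retry behavior for a critical pass might look like this sketch. It reads "retries 3 times" as one initial attempt plus up to three retries, which is an assumption; the helper name is hypothetical, and a real LLM call would be asynchronous.

```typescript
// Runs a pass and retries on failure. With maxRetries = 3 the pass is
// attempted at most 4 times (assumed reading of "retries 3 times").
function runWithRetries<T>(pass: () => T, maxRetries: number = 3): T {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return pass();
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw new Error("critical pass failed after " + maxRetries + " retries: " + String(lastError));
}
```

The final throw is what would surface as a failed merge once the retries are exhausted.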

Stage 3 — Evidence Mapping (1 sequential LLM pass)

Input
Theses, claim patterns, source metadata
Output
Best sources mapped to theses, contradictions as analytical tensions
Schema
evidence_mapping
Failure
Non-critical — omitted on failure

Output

corpus_analysis_v1 — competing theses, consolidated claims with evidence threads, causal chronology, analytical comparisons, contradictions as tensions, best sources mapped to theses, coverage gaps, media concepts.

Gate

Claim consolidation or insight synthesis failed after retries → fail merge explicitly. Non-critical passes omitted on failure, not filled with deterministic text.
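
The critical versus non-critical split can be illustrated as follows. The pass names are hypothetical labels for the passes described above, a `null` result stands for a pass that failed, and treating every pass other than claim consolidation and insight synthesis as non-critical is this sketch's reading of the gate.

```typescript
// Hypothetical pass names for the stages described above.
type PassNameSketch =
  | "claims"
  | "chronology"
  | "comparisons"
  | "mediaConcepts"
  | "insightSynthesis"
  | "evidenceMapping";

const CRITICAL_PASSES: PassNameSketch[] = ["claims", "insightSynthesis"];

// Builds the merged corpus from pass outputs; null or missing means the pass failed.
// Critical failures throw; non-critical failures are left out, not backfilled.
function assembleCorpus(results: Partial<Record<PassNameSketch, unknown>>): Record<string, unknown> {
  for (const name of CRITICAL_PASSES) {
    if (results[name] == null) {
      throw new Error("merge failed: critical pass " + name + " produced no output");
    }
  }
  const corpus: Record<string, unknown> = {};
  for (const key of Object.keys(results) as PassNameSketch[]) {
    const value = results[key];
    if (value != null) corpus[key] = value;
  }
  return corpus;
}
```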

engine/research/corpus.ts engine/research/corpus-stage-merge.ts engine/providers/openrouter/corpus-passes.ts engine/source-processing/corpus-prompt-input.ts

07 Evaluate

Judge whether the corpus is complete enough for synthesis — may trigger more loops.

LLM — Research quality

Function
reviewResearchQualityWithLlm
Model
quality
Params
temp 0.1, 4k tokens, 120s provider / 150s outer
Schema
research_quality_review
Failure
Merges with deterministic baseline
Quality system prompt
Judge whether the run is complete enough for downstream AI agents.
Treat successCriteria and extractionTargets as a checklist.
Verify ranked tables, confidence tags, named contrarian perspectives,
quantitative support, and requested memo shape when asked.
Mark needs_followup when critical criteria remain unsupported.

Checks: original instructions, planner objective, success criteria,
extraction targets, source priorities, coverage lanes, evidence review,
source mix, corpus analysis.

Quality loop

High-severity issues → targeted acquisition → extraction → evidence review → corpus merge → re-evaluation. Stops on budget exhaustion, source cap, synthesis reserve depletion, or stagnation.
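
The stop conditions can be sketched as a single decision function. Every field name here is an assumption, as are the definitions of "synthesis reserve depletion" (remaining spend at or below the amount held back for synthesis) and "stagnation" (a completed round that added no new sources).

```typescript
// Hypothetical loop state; field names and units are assumptions.
interface QualityLoopStateSketch {
  highSeverityIssues: number;
  roundsRemaining: number;
  sourceCount: number;
  sourceCap: number;
  spendRemaining: number;
  synthesisReserve: number;
  newSourcesLastRound: number | null; // null before the first round has run
}

type QualityLoopDecisionSketch =
  | "continue"
  | "done"
  | "budget_exhausted"
  | "source_cap"
  | "synthesis_reserve_depleted"
  | "stagnation";

// Decides whether another quality round should run, and if not, why.
function decideQualityLoop(state: QualityLoopStateSketch): QualityLoopDecisionSketch {
  if (state.highSeverityIssues === 0) return "done";
  if (state.roundsRemaining <= 0) return "budget_exhausted";
  if (state.sourceCount >= state.sourceCap) return "source_cap";
  if (state.spendRemaining <= state.synthesisReserve) return "synthesis_reserve_depleted";
  if (state.newSourcesLastRound === 0) return "stagnation";
  return "continue";
}
```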

Output

research_quality_review_v1

engine/research/pipeline-quality-initial.ts engine/research/pipeline-quality-loop.ts engine/research/pipeline-quality-round.ts engine/providers/openrouter/coverage.ts

08 Write

Structured synthesis, markdown report, contract review, and optional rewrite.

LLM — Structured synthesis

Function
synthesizePackWithLlm
Model
synthesis
Params
temp 0.1, 7k tokens, 180s provider / 210s outer
Schema
research_pack_synthesis
Synthesis system prompt
Synthesize completed research runs for downstream AI agents.
Use only the provided source ids and sampled source text.
Quality outranks source volume. Use corpus.bestSources, sourceRole,
evidenceStrength, primaryFor, and citationTarget to decide citation weight.
Use the analyst workplan as the target answer shape.
For media/video runs, preserve corpus.mediaConcepts in keyClaims and the brief.
Produce a compact markdown brief plus structured action surfaces.
Keep agentBriefMarkdown under 12,000 characters.

LLM — Report writer

Function
writeResearchReportWithLlm
Model
synthesis, fallback gpt-4.1
Params
temp 0.2, 24k tokens, 300s
Format
Markdown text
Report system prompt
Write a plain-English markdown research report.
customerInstructions, plannerObjective, successCriteria, extractionTargets,
and requestedOutputContract are binding.
Lead with the answer in the first paragraph.
Cite source IDs inline like [source-id] for each important claim.
For investment memos: ranking, thesis table, deep dives, why now, why overlooked,
evidence, risks, confidence, and investor fit.
For media reports: video intelligence report giving the user the alpha
from watching without spending the time. Use mediaEvidence.quoteTable directly.
Default: prose paragraphs, short bullets. Tables only for genuinely tabular data.
Never write extraction status reports or pipeline mechanics.
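
The inline citation convention lends itself to a simple check, sketched below. The allowed id characters and both helper names are assumptions; nothing in the source says the pipeline validates citations this way.

```typescript
// Finds inline citations of the form [source-id], skipping markdown links
// such as [text](url). The id character set is an assumption.
function extractCitations(markdown: string): string[] {
  const pattern = /\[([A-Za-z0-9][A-Za-z0-9_-]*)\](?!\()/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    if (ids.indexOf(match[1]) === -1) ids.push(match[1]);
  }
  return ids;
}

// Returns cited ids that do not correspond to any known source.
function findUnknownCitations(markdown: string, knownIds: string[]): string[] {
  const known = new Set(knownIds);
  return extractCitations(markdown).filter((id) => !known.has(id));
}
```

A non-empty result from `findUnknownCitations` would indicate a citation the corpus cannot back.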

Optional — Deep Research

writeResearchReportWithDeepResearch via OPENAI_DEEP_RESEARCH_MODEL, 930s timeout. Runs only when that model is configured. Falls through to the normal writer on failure.

LLM — Report contract review

Function
reviewResearchReportWithLlm
Model
synthesis
Params
temp 0.05, 4k tokens, 150s
Schema
report_contract_review
Review system prompt
Review the final markdown report against the user's requested output contract.
Treat customerInstructions, plannerObjective, successCriteria, extractionTargets,
and sourcePriorities as the checklist.
Pass only if the report substantially follows the requested shape and answers
the critical criteria.
Do not ask for more acquisition when the issue is writing — require a rewrite.
Mark evidenceNeeded=true only when the corpus itself cannot support the point.
Return precise rewrite instructions a report writer can apply directly.

Rewrite flow

Medium/high issues → one rewrite (240s) → revised review (180s). Must pass: no high issues, score ≥ 0.75.
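
The rewrite flow and its pass criteria can be sketched as below. The 0.75 score floor and the "no high issues" rule are from the source; the review shape, the severity names, and the decision to publish directly when the first review has only low issues are assumptions.

```typescript
type SeveritySketch = "low" | "medium" | "high";

// Hypothetical shape of a contract review result.
interface ReportReviewSketch {
  score: number;
  issues: { severity: SeveritySketch }[];
}

const MIN_PASSING_SCORE = 0.75;

// Medium or high issues trigger the single rewrite.
function needsRewrite(review: ReportReviewSketch): boolean {
  return review.issues.some((issue) => issue.severity !== "low");
}

// Pass criteria for the revised review: no high issues and score >= 0.75.
function passesContract(review: ReportReviewSketch): boolean {
  const hasHigh = review.issues.some((issue) => issue.severity === "high");
  return !hasHigh && review.score >= MIN_PASSING_SCORE;
}

// revised is null when the rewrite or its review did not complete.
function decideReportOutcome(
  first: ReportReviewSketch,
  revised: ReportReviewSketch | null,
): "publish" | "fail" {
  if (!needsRewrite(first)) return "publish";
  if (revised === null) return "fail";
  return passesContract(revised) ? "publish" : "fail";
}
```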

Output

research_report_markdown_v1, agent_brief_markdown_v1, claim_evidence_map_v1

Gate

Missing synthesis, missing report, missing review, failed rewrite, or revised score < 0.75 → fail run.

engine/research/pipeline-finish.ts engine/research/synthesis-runner.ts engine/research/synthesis-report-writer.ts engine/providers/openrouter/synthesis.ts engine/providers/openrouter/report.ts engine/providers/openrouter/report-review.ts

09 Publish

Generate indexes and an optional persona, persist all artifacts, and settle billing.

Steps

  1. Build cluster index + record index for Run Analyst retrieval
  2. Generate action manifest from synthesis
  3. Persist all artifacts to Postgres
  4. Settle billing — close MPP budget session, record final usage
  5. Mark job completed

LLM — Persona

Function
generateRunPersona
Model
analyst
Failure
Best-effort; ignored

Run Analyst

Function
answerRunAnalystQuestion
Model
analyst, fallback extraction
Scope
Run-scoped retrieval only; reports gaps instead of falling back to generic web results

Final artifacts

cluster_index_v1, record_index_v1, action_manifest_v1