Research pipeline
Nine stages from topic to published report.
01 Plan
Build a structured research workplan from the user's topic, depth, and instructions.
LLM call
- Function
suggestResearchPlanWithLlm
- Model
planner
- Params
- temp 0.15, 6k tokens, 180s provider / 300s outer
- Schema
structured_research_plan
System prompt
You are the research planner for an agent-first research runtime. Produce the full semantic plan for the run as strict JSON matching the schema exactly. Infer the right unit of collection, search families, discovery channels, extraction targets, and coverage lanes from the request itself. Honor explicit caller constraints for depth, source policy, output schemas, and limits. Coverage lanes should describe what evidence the final answer must contain. Query plan items must be concrete search-engine queries under 300 characters. For long customer briefs, preserve the requested output contract in successCriteria and extractionTargets — every explicit deliverable should appear in the plan checklist.
User payload
{ topic, requestedDepth, requestedSourcePolicy, requestedOutputSchemas,
requestedLimits, instructions: compactPlannerInstructions(instructions) }
Output
research_plan_v2 — objective, collection unit, query plan, coverage lanes, source priorities, discovery channels, extraction targets, success criteria, follow-up rules, stop conditions.
Gate
No LLM plan, empty query plan, missing lanes/schemas/success criteria → fail run.
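The gate above could be expressed as a small validation function. This is a minimal sketch, not the runtime's code: the `ResearchPlan` shape and the field names `queryPlan`, `coverageLanes`, and `outputSchemas` are assumptions inferred from the plan description, and `checkPlanGate` is a hypothetical helper.

```typescript
// Hypothetical plan shape; field names are assumptions, not the real research_plan_v2 schema.
interface ResearchPlan {
  queryPlan: string[];
  coverageLanes: string[];
  outputSchemas: string[];
  successCriteria: string[];
}

// Returns the reasons the plan fails the gate; an empty array means the run may continue.
function checkPlanGate(plan: ResearchPlan | null): string[] {
  if (plan === null) return ["no LLM plan"];
  const reasons: string[] = [];
  if (plan.queryPlan.length === 0) reasons.push("empty query plan");
  if (plan.coverageLanes.length === 0) reasons.push("missing coverage lanes");
  if (plan.outputSchemas.length === 0) reasons.push("missing output schemas");
  if (plan.successCriteria.length === 0) reasons.push("missing success criteria");
  return reasons;
}
```

Returning every reason at once, rather than stopping at the first, makes the failed-run record easier to diagnose.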
Also
generateRunTitle via planner model — best-effort, keeps default on failure.
02 Recall
Load customer library context and prior research to inform acquisition.
What happens
- Fetches same-customer library context as routing signal (not factual proof)
- Deduplicates known fresh URLs out of new acquisition
- Builds research contract binding the run to plan + customer intent
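The URL deduplication step listed above might look roughly like this. It is a sketch under stated assumptions: `KnownSource`, `dropKnownFreshUrls`, and the `maxAgeMs` freshness window are hypothetical, since the source does not say how freshness is defined.

```typescript
// Hypothetical record of a URL already held in the customer library.
interface KnownSource {
  url: string;
  fetchedAtMs: number; // epoch milliseconds of the last successful fetch
}

// Drops candidates whose URL is already in the library and still fresh.
// maxAgeMs is an assumed freshness window; the source does not state one.
function dropKnownFreshUrls(
  candidates: string[],
  known: KnownSource[],
  nowMs: number,
  maxAgeMs: number
): string[] {
  const fresh = new Set<string>();
  known.forEach((k) => {
    if (nowMs - k.fetchedAtMs <= maxAgeMs) fresh.add(k.url);
  });
  return candidates.filter((url) => !fresh.has(url));
}
```

Stale library entries stay in the candidate list on purpose, so they can be fetched again.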
Tools
- Provider
- Postgres — customer library tables
- Query
- Same-customer packs, sources, extractions by topic similarity
Output
library_recall_v1, research_contract_v1
03 Acquire
Search the web, scrape pages, and extract video transcripts to build the source corpus.
Search
- Primary
- Exa — semantic search with topic, instructions, source policy, coverage lanes, query plan
- Fallback
- Firecrawl search
- Budget
- Planned queries + reserved refinement budget by depth
LLM — Candidate triage
- Function
triageCandidatesWithLlm
- Model
triage
- Params
- temp 0.12, 3k tokens, 90s
- Schema
candidate_triage
- Failure
- Skip; deterministic selection continues
Triage system prompt
Triage cheap web-search candidates for an agent-first research system. Use only URL, title, snippet, domain, rank, source class, and planner context. Prioritize candidates that fill required coverage lanes. Prefer primary, authoritative, specific, and information-dense sources. Reject junk, thin pages, login walls, duplicates, discovery pages, off-topic results. Suggest follow-up queries only when important source types or subtopics are missing.
Scrape
- Provider
- Firecrawl
- Output
- Normalized SourceDocument: markdown, URL, domain, media type, metadata, hash, quality score
- Dedup
- Content hash + near-duplicate demotion
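One way to combine exact-hash deduplication with near-duplicate demotion is sketched below. The source does not say how near-duplicates are detected; Jaccard similarity over word 3-gram shingles is a swapped-in technique, and the 0.8 threshold, the halved score, and all names (`ScrapedDoc`, `dedupe`) are illustrative assumptions.

```typescript
// Hypothetical subset of the normalized source document.
interface ScrapedDoc {
  url: string;
  markdown: string;
  contentHash: string;
  qualityScore: number;
}

// Word 3-gram shingles for a cheap similarity estimate.
function shingles(text: string, size: number = 3): Set<string> {
  const words = text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const out = new Set<string>();
  for (let i = 0; i + size <= words.length; i++) {
    out.add(words.slice(i, i + size).join(" "));
  }
  return out;
}

// Jaccard similarity: shared shingles divided by the size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((s) => {
    if (b.has(s)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Drops exact duplicates by content hash, then demotes (does not drop) near-duplicates.
function dedupe(docs: ScrapedDoc[], threshold: number = 0.8): ScrapedDoc[] {
  const seenHashes = new Set<string>();
  const kept: { doc: ScrapedDoc; grams: Set<string> }[] = [];
  docs.forEach((doc) => {
    if (seenHashes.has(doc.contentHash)) return; // exact duplicate
    seenHashes.add(doc.contentHash);
    const grams = shingles(doc.markdown);
    const isNearDup = kept.some((k) => jaccard(k.grams, grams) >= threshold);
    const scored = isNearDup ? { ...doc, qualityScore: doc.qualityScore * 0.5 } : doc;
    kept.push({ doc: scored, grams });
  });
  return kept.map((k) => k.doc);
}
```

Demoting instead of dropping keeps a near-duplicate available if the first copy later turns out to be thin or fails extraction.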
Video transcripts
- Provider
- Supadata API
- Routing
- YouTube URLs split from scrape candidates via canTranscribe()
- Modes
native|auto|generate
- Metadata
- oEmbed — title, channel, thumbnail, duration
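The routing split described above can be sketched as follows. `canTranscribeSketch` is a stand-in for the real `canTranscribe()`, whose rules are not shown in the source; the host list and `splitCandidates` are assumptions.

```typescript
// Assumed host list; the real canTranscribe() may recognize other hosts or URL shapes.
const YOUTUBE_HOSTS: string[] = ["youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"];

// Returns true when the URL parses and its host is a known YouTube host.
function canTranscribeSketch(rawUrl: string): boolean {
  try {
    const host = new URL(rawUrl).hostname.toLowerCase();
    return YOUTUBE_HOSTS.indexOf(host) !== -1;
  } catch {
    return false; // not a parseable URL
  }
}

// Sends transcribable URLs to the video path and everything else to the scraper.
function splitCandidates(urls: string[]): { video: string[]; scrape: string[] } {
  const video: string[] = [];
  const scrape: string[] = [];
  urls.forEach((u) => {
    if (canTranscribeSketch(u)) video.push(u);
    else scrape.push(u);
  });
  return { video, scrape };
}
```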
LLM — Speaker labeling
- Function
restructureTranscript
- Model
analyst
- Method
- 36-turn chunks → infer speaker labels (Host, Guest, named)
- Gate
- 55% labeled turns required; graceful fallback
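The chunking and gate described above can be illustrated with two small helpers. This is a sketch only: the `Turn` shape and both function names are hypothetical, and whether the 55% gate applies per chunk or to the whole transcript is not stated in the source (the helper works on whatever list of turns it is given).

```typescript
// Hypothetical transcript turn; speaker is null when no label could be inferred.
interface Turn {
  text: string;
  speaker: string | null;
}

// Splits turns into fixed-size chunks (36 per the description above).
function chunkTurns<T>(turns: T[], size: number = 36): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < turns.length; i += size) {
    chunks.push(turns.slice(i, i + size));
  }
  return chunks;
}

// Applies the gate: labels are kept only if enough turns carry a speaker label.
function passesLabelGate(turns: Turn[], minRatio: number = 0.55): boolean {
  if (turns.length === 0) return false;
  const labeled = turns.filter((t) => t.speaker !== null).length;
  return labeled / turns.length >= minRatio;
}
```

When the gate fails, a caller would fall back to the unlabeled transcript rather than publish partial labels.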
Video artifacts
- video_transcript_v1: segments with speaker labels, timestamps, confidence
- video_quotes_v1: high-signal quotes, concept inference, entity mentions
- video_metadata_v1: title, channel, thumbnail, duration
- 30–90s readable transcript windows for synthesis
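The 30–90s transcript windows could be built by grouping consecutive segments, as in this sketch. The grouping rule is an assumption: a window closes at sentence-ending punctuation once it spans at least 30 seconds, and is force-closed before it would exceed 90 seconds. `Segment`, `TranscriptWindow`, and `buildWindows` are hypothetical names.

```typescript
interface Segment {
  startSec: number;
  endSec: number;
  text: string;
}

interface TranscriptWindow {
  startSec: number;
  endSec: number;
  text: string;
}

function buildWindows(segments: Segment[], minSec: number = 30, maxSec: number = 90): TranscriptWindow[] {
  const windows: TranscriptWindow[] = [];
  let current: Segment[] = [];

  // Emits the segments collected so far as one window.
  const flush = (): void => {
    if (current.length === 0) return;
    windows.push({
      startSec: current[0].startSec,
      endSec: current[current.length - 1].endSec,
      text: current.map((s) => s.text).join(" "),
    });
    current = [];
  };

  segments.forEach((seg) => {
    // Close the window first if adding this segment would exceed the maximum length.
    if (current.length > 0 && seg.endSec - current[0].startSec > maxSec) flush();
    current.push(seg);
    // Close at a sentence boundary once the window is long enough to read well.
    const span = seg.endSec - current[0].startSec;
    if (span >= minSec && /[.!?]$/.test(seg.text.trim())) flush();
  });
  flush(); // the trailing window may be shorter than minSec
  return windows;
}
```

A single segment longer than the maximum still becomes its own window, so no transcript text is lost.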
04 Extract
Extract structured evidence from every acquired source. Each source gets its own LLM call that decides accept, ignore, or fail.
No pre-filtering
Every acquired source goes directly to per-source extraction. There is no portfolio selection or deterministic ranking step — the extraction LLM sees the full content of each source and makes its own relevance judgment. Sources that are junk, off-topic, or too thin are marked ignored by the extraction LLM and excluded from the corpus. This eliminates the single-point-of-failure risk of a large batch triage call and ensures no source is silently discarded by topic-blind heuristics.
LLM — Source extraction
- Function
extractSourceWithLlm
- Model
extraction
- Params
- temp 0.1, 9k tokens, 120s
- Schema
source_extraction
- Sections
- 48 for transcripts, 18 for web
- Reuse
- Same-customer extractions by content hash or canonical URL
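The section caps and the reuse lookup listed above can be sketched as below. The caps (48 and 18) come from the list; everything else is an assumption, including the `SourceKind` type, taking sections in document order, and the `StoredExtraction` shape.

```typescript
type SourceKind = "transcript" | "web";

// Caps from the description above: 48 sections for transcripts, 18 for web pages.
const SECTION_CAPS: Record<SourceKind, number> = { transcript: 48, web: 18 };

// Keeps the first N sections; taking them in document order is an assumption.
function capSections(sections: string[], kind: SourceKind): string[] {
  return sections.slice(0, SECTION_CAPS[kind]);
}

// Hypothetical stored extraction from an earlier same-customer run.
interface StoredExtraction {
  contentHash: string;
  canonicalUrl: string;
}

// Finds a reusable extraction by content hash or canonical URL, whichever matches first.
function findReusable(
  stored: StoredExtraction[],
  contentHash: string,
  canonicalUrl: string
): StoredExtraction | undefined {
  for (const s of stored) {
    if (s.contentHash === contentHash || s.canonicalUrl === canonicalUrl) return s;
  }
  return undefined;
}
```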
Extraction system prompt
Extract source-level research records for downstream AI agents. Use only the provided source sections. Classify the source itself — distinguish primary pages from roundups, community threads, videos, discovery pages, and reference pages. For video transcripts, summarize substantive points and preserve representative quoted evidence with timestamps. Use the analyst workplan and coverage lanes to judge relevance. Extract dated events, laws, organizations, quantitative facts, policy claims, stakeholder positions, limitations, and caveats. Mark generic background as ignored when it does not help the specific topic. If the page is irrelevant or thin, return status ignored with empty arrays.
Extraction prompt input
buildSourceExtractionPromptInput(): topic, instructions, outputSchemas, mediaScope, collectionUnit, successCriteria, extractionTargets, coverageLanes, queryPlan, topicAnchors, sourceUrls, source metadata, markdown sections
LLM — Extraction audit
- Function
auditSourceExtractionWithLlm
- Model
extraction
- When
- Per accepted extraction
- Failure
- Keep original if audit unavailable
Output
source_extraction_v1: claims, facts, dates, evidence, classification.
source_analysis_record_v1: durable per-source record for corpus metasynthesis.
Gate
No structured extraction → failed extraction persisted. Accepted records below threshold → fail run.
05 Review
Evaluate evidence coverage against lanes and trigger targeted follow-up.
LLM — Evidence coverage
- Function
reviewEvidenceCoverageWithLlm
- Model
coverage
- Params
- temp 0.1, 4k tokens, 120s provider / 150s outer
- Schema
evidence_coverage_review
- Failure
- Deterministic coverage review baseline
Coverage system prompt
Review whether extracted source evidence satisfies the required coverage lanes. Use only provided source ids, summaries, evidence counts, and excerpts. Prefer semantic judgment over raw counts: a single strong primary source can satisfy a lane, while several generic sources may not. Compare extracted evidence against the planner's lane goals and query plan. Return precise gaps and follow-up queries only for evidence that is actually missing or weak. Follow-up queries can include date constraints — the runtime applies date range filtering automatically.
Follow-up loop
If lanes are weak → targeted follow-up queries → reacquire + extract while loop budget remains (1 / 3 / 6 by depth).
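The follow-up loop above can be sketched as a budgeted loop. The budgets 1 / 3 / 6 come from the text; the depth names (`quick`, `standard`, `deep`), the `CoverageReview` shape, and the injected `review` and `reacquire` callbacks are assumptions made to keep the sketch self-contained.

```typescript
// Depth names are assumptions; the source only gives the budgets 1 / 3 / 6.
type Depth = "quick" | "standard" | "deep";
const LOOP_BUDGET: Record<Depth, number> = { quick: 1, standard: 3, deep: 6 };

// Hypothetical summary of a coverage review.
interface CoverageReview {
  weakLanes: string[];
  followUpQueries: string[];
}

// Runs review, then reacquire and extract, until lanes are covered or the budget is spent.
// Returns the number of follow-up loops used.
function runFollowUpLoops(
  depth: Depth,
  review: () => CoverageReview,
  reacquire: (queries: string[]) => void
): number {
  let loopsUsed = 0;
  while (loopsUsed < LOOP_BUDGET[depth]) {
    const result = review();
    if (result.weakLanes.length === 0) break; // coverage is sufficient
    reacquire(result.followUpQueries);
    loopsUsed++;
  }
  return loopsUsed;
}
```

The real stages are presumably asynchronous; synchronous callbacks are used here only to keep the control flow easy to read.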
Output
research_state_v1 — loop, phase, source counts, gap counts, next queries, history.
06 Merge
Extract knowledge and insight from collected evidence through decomposed focused passes.
Stage 0 — Foundation
Deterministic aggregation of raw evidence from all extractions into structured candidate lists (claims, chronology, source rankings). Data wrangling that seeds LLM passes, never published as final output.
Stage 1 — Pattern Identification (4 parallel LLM passes)
- 1A: Claim consolidation
- Semantic dedup + theme/pattern detection across all claims
- 1B: Chronology
- Causal sequencing — what led to what, not just date sorting
- 1C: Comparisons
- Analytical comparison tables that reveal insight
- 1D: Media concepts
- Transcript runs only — named ideas, frameworks, models
- Model
corpus
- Schemas
consolidated_claims, enriched_chronology, comparison_tables
Stage 2 — Insight Synthesis (1 sequential LLM pass)
- Input
- Claim patterns, chronology, comparisons, deliverables, coverage lanes — NOT raw source data
- Output
- Competing theses with evidence threads, deliverable mappings, tensions, coverage gaps
- Schema
insight_synthesis
- Failure
- Critical — retries 3 times, fails merge if exhausted
Stage 3 — Evidence Mapping (1 sequential LLM pass)
- Input
- Theses, claim patterns, source metadata
- Output
- Best sources mapped to theses, contradictions as analytical tensions
- Schema
evidence_mapping
- Failure
- Non-critical — omitted on failure
Output
corpus_analysis_v1 — competing theses, consolidated claims with evidence threads, causal chronology, analytical comparisons, contradictions as tensions, best sources mapped to theses, coverage gaps, media concepts.
Gate
Claim consolidation or insight synthesis failed after retries → fail merge explicitly. Non-critical passes omitted on failure, not filled with deterministic text.
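The difference between critical and non-critical passes might be handled with two wrappers like these. This is a sketch, not the merge implementation: `runCritical` and `runOptional` are hypothetical names, "retries 3 times" is read here as three attempts in total, and the passes are synchronous for brevity where the real ones are LLM calls.

```typescript
// Runs a critical pass up to maxAttempts times; throws if every attempt fails.
function runCritical<T>(passName: string, pass: () => T, maxAttempts: number = 3): T {
  let lastError: unknown = new Error(`${passName} did not run`);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return pass();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw new Error(`merge failed: ${passName} exhausted ${maxAttempts} attempts (${String(lastError)})`);
}

// Runs a non-critical pass once; on failure the result is omitted, not replaced with filler.
function runOptional<T>(pass: () => T): T | undefined {
  try {
    return pass();
  } catch {
    return undefined;
  }
}
```

Under this reading, claim consolidation and insight synthesis would go through `runCritical`, and evidence mapping through `runOptional`.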
07 Evaluate
Judge whether the corpus is complete enough for synthesis — may trigger more loops.
LLM — Research quality
- Function
reviewResearchQualityWithLlm
- Model
quality
- Params
- temp 0.1, 4k tokens, 120s provider / 150s outer
- Schema
research_quality_review
- Failure
- Merges with deterministic baseline
Quality system prompt
Judge whether the run is complete enough for downstream AI agents. Treat successCriteria and extractionTargets as a checklist. Verify ranked tables, confidence tags, named contrarian perspectives, quantitative support, and requested memo shape when asked. Mark needs_followup when critical criteria remain unsupported. Checks: original instructions, planner objective, success criteria, extraction targets, source priorities, coverage lanes, evidence review, source mix, corpus analysis.
Quality loop
High-severity issues → targeted acquisition → extraction → evidence review → corpus merge → re-evaluation. Stops on budget exhaustion, source cap, synthesis reserve depletion, or stagnation.
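The four stop conditions above could be checked in one place, as in this sketch. All field names, the check order, and the definitions are assumptions. In particular, "stagnation" is modeled as the latest loop failing to reduce the high-severity issue count, and "synthesis reserve depletion" as remaining budget falling to the reserved amount.

```typescript
// Hypothetical loop state; none of these field names come from the runtime.
interface LoopState {
  budgetRemaining: number;    // spend left for the run
  synthesisReserve: number;   // spend that must be kept for the Write stage
  sourceCount: number;
  sourceCap: number;
  highIssueCounts: number[];  // high-severity issue count after each loop
}

type StopReason = "budget_exhausted" | "source_cap" | "synthesis_reserve" | "stagnation";

// Returns why the quality loop should stop, or null if another loop is allowed.
function stopReason(state: LoopState): StopReason | null {
  if (state.budgetRemaining <= 0) return "budget_exhausted";
  if (state.sourceCount >= state.sourceCap) return "source_cap";
  if (state.budgetRemaining <= state.synthesisReserve) return "synthesis_reserve";
  const counts = state.highIssueCounts;
  if (counts.length >= 2 && counts[counts.length - 1] >= counts[counts.length - 2]) {
    return "stagnation"; // the last loop did not reduce high-severity issues
  }
  return null;
}
```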
Output
research_quality_review_v1
08 Write
Structured synthesis, markdown report, contract review, and optional rewrite.
LLM — Structured synthesis
- Function
synthesizePackWithLlm
- Model
synthesis
- Params
- temp 0.1, 7k tokens, 180s provider / 210s outer
- Schema
research_pack_synthesis
Synthesis system prompt
Synthesize completed research runs for downstream AI agents. Use only the provided source ids and sampled source text. Quality outranks source volume. Use corpus.bestSources, sourceRole, evidenceStrength, primaryFor, and citationTarget to decide citation weight. Use the analyst workplan as the target answer shape. For media/video runs, preserve corpus.mediaConcepts in keyClaims and the brief. Produce a compact markdown brief plus structured action surfaces. Keep agentBriefMarkdown under 12,000 characters.
LLM — Report writer
- Function
writeResearchReportWithLlm
- Model
synthesis, fallback gpt-4.1
- Params
- temp 0.2, 24k tokens, 300s
- Format
- Markdown text
Report system prompt
Write a plain-English markdown research report. customerInstructions, plannerObjective, successCriteria, extractionTargets, and requestedOutputContract are binding. Lead with the answer in the first paragraph. Cite source IDs inline like [source-id] for each important claim. For investment memos: ranking, thesis table, deep dives, why now, why overlooked, evidence, risks, confidence, and investor fit. For media reports: video intelligence report giving the user the alpha from watching without spending the time. Use mediaEvidence.quoteTable directly. Default: prose paragraphs, short bullets. Tables only for genuinely tabular data. Never write extraction status reports or pipeline mechanics.
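Because the writer is asked to cite source IDs inline like [source-id], a downstream check could verify that every cited ID exists in the corpus. The sketch below is not described in the source: the helper names are hypothetical, and the ID format (letters, digits, underscores, hyphens) is an assumption.

```typescript
// Collects bracketed IDs such as [src-12], skipping markdown links like [text](url).
// The ID character set is an assumption about how source IDs look.
function citedSourceIds(markdown: string): string[] {
  const pattern = /\[([A-Za-z0-9_-]+)\](?!\()/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    if (ids.indexOf(match[1]) === -1) ids.push(match[1]);
  }
  return ids;
}

// Returns cited IDs that do not correspond to any known source.
function unknownCitations(markdown: string, knownIds: string[]): string[] {
  return citedSourceIds(markdown).filter((id) => knownIds.indexOf(id) === -1);
}
```

For example, `unknownCitations(report, ids)` returning a non-empty list would indicate a citation the corpus cannot back.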
Optional — Deep Research
writeResearchReportWithDeepResearch via OPENAI_DEEP_RESEARCH_MODEL, 930s. Only when configured. Falls through to normal writer on failure.
LLM — Report contract review
- Function
reviewResearchReportWithLlm
- Model
synthesis
- Params
- temp 0.05, 4k tokens, 150s
- Schema
report_contract_review
Review system prompt
Review the final markdown report against the user's requested output contract. Treat customerInstructions, plannerObjective, successCriteria, extractionTargets, and sourcePriorities as the checklist. Pass only if the report substantially follows the requested shape and answers the critical criteria. Do not ask for more acquisition when the issue is writing — require a rewrite. Mark evidenceNeeded=true only when the corpus itself cannot support the point. Return precise rewrite instructions a report writer can apply directly.
Rewrite flow
Medium/high issues → one rewrite (240s) → revised review (180s). Must pass: no high issues, score ≥ 0.75.
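The rewrite flow above reduces to two predicates and one decision, sketched here. The 0.75 threshold and the "no high issues" rule come from the text; the `ContractReview` shape, the function names, and treating a missing revised review as a failure are assumptions.

```typescript
type Severity = "low" | "medium" | "high";

// Hypothetical shape of a report contract review.
interface ContractReview {
  score: number;
  issues: { severity: Severity }[];
}

const PASS_SCORE = 0.75;

// A rewrite is triggered by any medium or high issue.
function needsRewrite(review: ContractReview): boolean {
  return review.issues.some((i) => i.severity === "medium" || i.severity === "high");
}

// Passing requires no high issues and a score of at least 0.75.
function passesReview(review: ContractReview): boolean {
  const hasHigh = review.issues.some((i) => i.severity === "high");
  return !hasHigh && review.score >= PASS_SCORE;
}

// Allows at most one rewrite; reviewAfterRewrite returns null when the rewrite or its review fails.
function finalVerdict(
  first: ContractReview,
  reviewAfterRewrite: () => ContractReview | null
): "pass" | "fail" {
  if (!needsRewrite(first)) return passesReview(first) ? "pass" : "fail";
  const revised = reviewAfterRewrite();
  return revised !== null && passesReview(revised) ? "pass" : "fail";
}
```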
Output
research_report_markdown_v1, agent_brief_markdown_v1, claim_evidence_map_v1
Gate
Missing synthesis, missing report, missing review, failed rewrite, or revised score < 0.75 → fail run.
09 Publish
Generate indexes, optional persona, persist all artifacts, and settle billing.
Steps
- Build cluster index + record index for Run Analyst retrieval
- Generate action manifest from synthesis
- Persist all artifacts to Postgres
- Settle billing — close MPP budget session, record final usage
- Mark job completed
LLM — Persona
- Function
generateRunPersona
- Model
analyst
- Failure
- Best-effort; ignored
Run Analyst
- Function
answerRunAnalystQuestion
- Model
analyst, fallback extraction
- Scope
- Run-scoped retrieval only — returns gaps instead of generic web
Final artifacts
cluster_index_v1, record_index_v1, action_manifest_v1