
Query coverage

64 realistic queries generated from the substrate schema alone, mapped to the API surface. The pressure test that surfaced three gaps and locked three new specs.

Method

A subagent was given the migration files and ingester specs — and explicitly forbidden from reading any of the API/MCP specs. Its instructions: "you are a senior policy analyst evaluating a new federal-policy database; generate 60+ realistic questions you might ask this data, across 8 personas (journalist, lobbyist, Hill staffer, federal contractor, academic, State AG, issue-advocacy nonprofit, constituent). Phrase the questions as users would."

The agent returned 64 queries across 10 intent categories. Each query was then mapped by hand to the right query flow and API surface. That mapping pass produced the tables below, surfaced two real gaps (aggregation, dossiers), and validated one prior design recommendation (entity resolution as a separate tool).

When to re-run this

Any time a new data source is added; any time a search-shape spec changes meaningfully; before flipping any of the surface specs to verified. The exercise should take ~2 hours: spawn the subagent fresh (no API knowledge), let it generate, map by hand, diff against the prior version.

Headline

Supported today · 50 · Map cleanly to existing specs (search, resource-endpoints, MCP Class A/B/C).
Closed by new specs · 22 · entity-resolution (4), aggregations (11), dossiers (11); overlapping totals, 22 distinct queries.
Partial / orchestration · 8 · Cross-source joins beyond the curated dossiers; workable via agent orchestration, though some are 4+ tool calls.
Genuinely deferred · 3 · Citation-graph traversal (Q44, Q48, Q51); requires inter-document citation extraction, a future ingester problem.

Per-query rows below carry a single status tag: ✓ supported by the existing surface, + new spec closing the gap, ◐ partial needing agent orchestration, or ⚠ deferred for a future spec. Tool annotations are the MCP class + tool name.

Coverage by category

1 · Specific record lookup

8 queries · Flow 1 · MCP Class A or C

Full support
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Hill staffer | Status and latest action on HR 1 (119th) | bills | get_bill | |
| 2 | Constituent | Full text of Public Law 118-42 | public-laws | fetch / get_public_law | |
| 3 | Federal contractor | Open docket EPA-HQ-OAR-2024-0123 with comments metadata | regulations-dot-gov-dockets | fetch | |
| 4 | Journalist | Pull up CRS report R48481 | crs-reports | fetch / get_crs_report | |
| 5 | Lobbyist | Most recent LDA filing for Pfizer × Brownstein Hyatt | lda-filings | fetch + filter | |
| 6 | Academic | Roll-call vote record for House Vote 119-2025-87 | roll-call-votes | fetch / get_roll_call | |
| 7 | Hill staffer | SAP text for HR 4567 | statements-of-administration-policy | fetch | |
| 8 | State AG | GAO report GAO-25-106789 in full | gao-reports | fetch / get_gao_report | |

2 · Keyword scan within a fixed document

8 queries · Flow 2 with bill_id / FR doc / CFR part filter · MCP Class B (lexical_search)

Full support
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 9 | Hill staffer | Find 'carried interest' inside HR 1 (119th) | bills | lexical_search + bill_id filter | |
| 10 | Federal contractor | Locate 'Buy American' in IRA public law text | public-laws | lexical_search + id filter | |
| 11 | Lobbyist | Search 2024-11-14 Senate Finance transcript for 'pharmacy benefit manager' | hearing-transcripts | lexical_search + hearing_id filter | |
| 12 | Academic | Every 'climate' in CREC 2025-09-12 | congressional-record | lexical_search + date filter | |
| 13 | Federal contractor | 'autonomous vehicle' in 49 CFR Part 571 | ecfr-and-cfr | lexical_search + title/part filter | |
| 14 | Journalist | 'whistleblower' inside CRS report R47999 | crs-reports | lexical_search + id filter | |
| 15 | Nonprofit | Every 'shall' clause in EPA PFAS final rule | federal-register | lexical_search + FR doc filter | |
| 16 | State AG | 'preemption' inside committee report for HR 2 (119th) | committee-reports | lexical_search + bill_id filter | |

3 · Conceptual / paraphrased search

8 queries · Flows 3 / 4 · MCP Class B (semantic_search or search)

Full support
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 17 | Lobbyist | Bills that would weaken patent protections for biologics | bills | semantic_search | |
| 18 | Academic | CREC speeches arguing the filibuster is undemocratic | congressional-record | semantic_search | |
| 19 | Federal contractor | Regs that effectively ban PFAS in firefighting foam | ecfr-and-cfr, federal-register | search (hybrid, multi-source) | |
| 20 | Journalist | GAO reports critical of DoD shipbuilding cost overruns | gao-reports | semantic_search | |
| 21 | Hill staffer | CRS reports on the legal theory behind nationwide injunctions | crs-reports | semantic_search | |
| 22 | State AG | Hearing testimony where Big Tech execs minimized child-safety concerns | hearing-transcripts | semantic_search | |
| 23 | Nonprofit | Bills proposing means-tested student debt cancellation | bills | semantic_search | |
| 24 | Constituent | Statutes allowing exec to freeze foreign assets w/o judicial review | us-code | semantic_search | |

4 · Filtered / scoped search

6 queries · Filter params on list + search endpoints · MCP Class A or B with filters

+ rest-api-resource-endpoints (filter params)
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 25 | Journalist | Every NPRM from EPA between 2025-01-20 and 2025-04-30 | federal-register | GET /federal-register?agency=EPA&doc_type=NPRM&since=&until= | + |
| 26 | Hill staffer | Bills introduced by Senate Republicans in 119th on 'border security' | bills, legislators | lexical_search + sponsor_party + sponsor_chamber + congress | + |
| 27 | Lobbyist | LDA filings on 'Section 230' in Q2 2025 above $50K | lda-filings | GET /lda-filings?issue=&year=2025&quarter=2&min_spend=50000 | + |
| 28 | Academic | Dems from swing-state House districts on Ukraine, 2024-01 to 2024-11 | congressional-record, legislators | lexical_search + party + chamber + state + date filter | + |
| 29 | Federal contractor | CBO cost estimates for energy bills in 2025 with 10-year cost > $5B | cbo-cost-estimates, bills | GET /cbo-cost-estimates?since=&until= + filter on cost (needs structured cost field) | |
| 30 | Nonprofit | House hearings on 'voting rights' in committees chaired by Republicans in 118th | hearings, committees, legislators | orchestrate: list committees (chair filter) → hearings per committee → keyword filter | |
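
To make the filter-parameter shape in rows 25 and 27 concrete, here is a minimal sketch that builds those query strings with the Python standard library. The parameter names are taken from the table; the helper name `build_list_url` and the rule that unset filters are dropped are assumptions made for illustration, not something the specs are known to define.

```python
from urllib.parse import urlencode


def build_list_url(resource: str, **filters) -> str:
    """Build a list-endpoint path with filter params (hypothetical helper).

    Filters whose value is None are omitted from the query string.
    """
    params = {k: v for k, v in filters.items() if v is not None}
    query = urlencode(params)
    return f"/{resource}?{query}" if query else f"/{resource}"


# Q25: every NPRM from EPA in a date window.
q25 = build_list_url(
    "federal-register",
    agency="EPA",
    doc_type="NPRM",
    since="2025-01-20",
    until="2025-04-30",
)

# Q27: LDA filings on an issue in one quarter, above a spend floor.
q27 = build_list_url(
    "lda-filings", issue="Section 230", year=2025, quarter=2, min_spend=50000
)

print(q25)
print(q27)
```

Rows 29 and 30 do not reduce to a single URL of this kind, which is why they carry no + tag: one needs a structured cost field, the other needs a chain of calls.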

5 · Find-by-name / entity resolution

4 queries · Fuzzy name match · MCP Class D (resolve_entity)

+ rest-api-entity-resolution
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 31 | Constituent | "Sen. Markey" from Massachusetts | legislators | resolve_entity(query='Markey', type='legislator', filters={state:'MA'}) | + |
| 32 | Journalist | Hill staffer "Katie O'Brian" / "Katherine O'Brien" on Energy & Commerce | staff-directories | resolve_entity(query='Katie O Brian', type='staff-directories', filters={committee:'HSEN'}) | + |
| 33 | Lobbyist | Lobbyist "Marc Lampkin" across LDA filings | lda-filings | resolve_entity(query='Marc Lampkin', type='lda-filings') | + |
| 34 | Hill staffer | "House Ag": appropriations sub or full committee? | committees | resolve_entity(query='House Ag', type='committee') | + |
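
The idea behind these rows (noisy name in, ranked canonical candidates out) can be sketched with nothing more than `difflib` from the standard library. This is a toy illustration, not the matching algorithm of the rest-api-entity-resolution spec: the scoring method, the `min_score` threshold, the record shape, and the sample staff records are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and strip everything except letters, digits, and spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def resolve_entity(query, candidates, filters=None, min_score=0.6):
    """Return (score, record) pairs ranked by name similarity (toy sketch)."""
    filters = filters or {}
    q = normalize(query)
    scored = []
    for rec in candidates:
        # Exact-match filters narrow the candidate pool before fuzzy scoring.
        if any(rec.get(k) != v for k, v in filters.items()):
            continue
        score = SequenceMatcher(None, q, normalize(rec["name"])).ratio()
        if score >= min_score:
            scored.append((round(score, 3), rec))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical directory records; the names echo Q32, the data is made up.
staff = [
    {"name": "Katherine O'Brien", "committee": "HSEN"},
    {"name": "Kevin O'Neill", "committee": "HSEN"},
    {"name": "Katherine O'Brien", "committee": "OTHER"},
]

matches = resolve_entity("Katie O Brian", staff, filters={"committee": "HSEN"})
for score, rec in matches:
    print(score, rec["name"], rec["committee"])
```

Returning scored candidates rather than a single record leaves the caller room to handle ambiguous inputs such as Q34, where "House Ag" could plausibly resolve to more than one committee.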

6 · Structured aggregation / counting

7 queries · Aggregate params on list endpoints · MCP Class E (count, sum)

+ rest-api-aggregations
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 35 | Academic | Bill count per freshman House member in 119th, ranked | bills, legislators | count(bills, group_by=sponsor_bioguide_id, filters={congress:119, term_class:'freshman'}) | + |
| 36 | Journalist | % party-line named roll-call votes in 118th Senate | roll-call-votes | count + ratio calc (needs derived field or post-process) | |
| 37 | Lobbyist | Top 20 LDA registrants by 2025 disclosed spending | lda-filings | sum(lda-filings, sum_field=spend_amount, group_by=registrant_name, top=20, filters={year:2025}) | + |
| 38 | Hill staffer | Bills in 119th with a committee markup, count | bills | count(bills, filters={congress:119, has_markup:true}) | + |
| 39 | Nonprofit | Open GAO recommendations from 2020-2024 | gao-reports | count(gao-reports, filters={recommendation_status:'open', since:2020, until:2025}) | + |
| 40 | Federal contractor | Top 10 federal agencies by FR rule count in 2025 | federal-register | count(federal-register, group_by=agency, top=10, filters={year:2025, doc_type:'rule'}) | + |
| 41 | Academic | Biden-admin veto-threat SAPs by congress | statements-of-administration-policy | count(saps, group_by=congress, filters={president_bioguide:'B...', stance:'veto_threat'}) | + |
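
The `count` and `sum` calls above share one shape: filter, optionally group, optionally keep the top N. A minimal in-memory sketch of that shape follows. The argument names mirror the table; the function `sum_by` (renamed to avoid shadowing Python's built-in `sum`), the equality-only filters, and the sample filings are assumptions, and a real implementation would presumably push this work into the database.

```python
from collections import Counter, defaultdict


def matches(record, filters):
    """True when every filter key equals the record's value (equality only)."""
    return all(record.get(k) == v for k, v in filters.items())


def count(records, group_by=None, top=None, filters=None):
    """Count records, optionally grouped and truncated to the top N groups."""
    rows = [r for r in records if matches(r, filters or {})]
    if group_by is None:
        return len(rows)
    return Counter(r[group_by] for r in rows).most_common(top)


def sum_by(records, sum_field, group_by, top=None, filters=None):
    """Sum a numeric field per group, ranked descending, optionally top N."""
    totals = defaultdict(float)
    for r in records:
        if matches(r, filters or {}):
            totals[r[group_by]] += r[sum_field]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]


# Made-up filings, shaped like Q37 (top registrants by disclosed spend).
filings = [
    {"registrant_name": "Registrant A", "year": 2025, "spend_amount": 100.0},
    {"registrant_name": "Registrant B", "year": 2025, "spend_amount": 120.0},
    {"registrant_name": "Registrant A", "year": 2025, "spend_amount": 200.0},
    {"registrant_name": "Registrant C", "year": 2025, "spend_amount": 50.0},
    {"registrant_name": "Registrant B", "year": 2024, "spend_amount": 900.0},
]

top_spenders = sum_by(
    filings, "spend_amount", "registrant_name", top=2, filters={"year": 2025}
)
print(top_spenders)
```

Q36 falls outside this shape because a party-line percentage is a ratio of two counts over a derived attribute, which matches the note on that row.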

7 · Cross-source joins

11 queries · Dossiers OR agent orchestration · MCP Class F + chained calls

+ rest-api-dossiers (4 closed) · partial for 7
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 42 | Journalist | Bills with veto-threat SAPs in 118th × final roll-call vote | saps, bills, votes | list saps (filter stance) → get_bill_dossier per bill | + |
| 43 | Lobbyist | Bills with CRS reports within 30 days of intro × sponsor + committee | crs, bills, committees, legislators | orchestrate: list CRS with related_bill → get_bill_dossier | |
| 44 | State AG | GAO recommendations to HHS where a later FR rule cites the report | gao-reports, federal-register | Citation graph not extracted; deferred | |
| 45 | Academic | Banking Committee senators in 119th × CRA votes × LDA filings naming them from finsvcs clients | committees, members, votes, bills, lda | get_committee_dossier → list_member_votes per senator → list_lda_filings filtered | |
| 46 | Hill staffer | HR 1 (119th): bill text + CBO + committee reports + hearings + SAP | bills + 4 more | get_bill_dossier('hr:119:1') | + |
| 47 | Journalist | Lobbyists on SAFE Banking Act who were former staff to a sponsoring member | lda, staff, bills, members | orchestrate: bill_dossier → cosponsors → former staff per member → lda filings cross-ref | |
| 48 | Nonprofit | For every PL in 118th, CFR sections amended + implementing FR rules | public-laws, us-code, cfr, fr | get_public_law_dossier per PL; CFR/FR cascade covered, precision depends on citation extraction | + |
| 49 | Federal contractor | DoD hypersonics hearings × bills by same committee within 60 days w/ overlap | hearings, transcripts, committees, bills | orchestrate: semantic_search transcripts → list bills by committee + date → similarity check | |
| 50 | Academic | 'climate change' topic: bills/CRS/GAO/FR counts per quarter since 2020 | topic-taxonomy + 4 sources | time_series per source, filter by topic (needs topic FKs on each source) | |
| 51 | State AG | US Code sections cited in DOJ FR rules where the section was enacted by a PL in last 5 years | us-code, fr, public-laws | Citation graph not extracted; deferred | |
| 52 | Hill staffer | Senators yes on NDAA whose CREC speech that week criticized provisions | votes, crec, bills, legislators | orchestrate: list votes (yes) → CREC for each member that week → semantic similarity | |
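
The two-step chain in row 42 (list SAPs by stance, then one dossier call per bill) can be sketched with in-memory stand-ins. Only the tool name `get_bill_dossier` and the `'hr:119:1'`-style ID format come from the table; the stub bodies, the record fields, and the sample data are hypothetical and say nothing about what the real dossier payload contains.

```python
def list_saps(saps, stance, congress):
    """Stand-in for a filtered SAP list call."""
    return [s for s in saps if s["stance"] == stance and s["congress"] == congress]


def get_bill_dossier(bill_id, bills, votes):
    """Stand-in for a dossier call: one bill ID in, related records out."""
    return {
        "bill": bills[bill_id],
        "roll_call_votes": [v for v in votes if v["bill_id"] == bill_id],
    }


def veto_threat_outcomes(saps, bills, votes, congress):
    """Pair each veto-threatened bill with the result of its latest vote."""
    outcomes = []
    for sap in list_saps(saps, "veto_threat", congress):
        dossier = get_bill_dossier(sap["bill_id"], bills, votes)
        # ISO date strings sort chronologically, so max() finds the final vote.
        final = max(dossier["roll_call_votes"], key=lambda v: v["date"], default=None)
        outcomes.append((sap["bill_id"], final["result"] if final else None))
    return outcomes


# Made-up records for illustration only.
saps = [
    {"bill_id": "hr:118:100", "stance": "veto_threat", "congress": 118},
    {"bill_id": "hr:118:200", "stance": "support", "congress": 118},
    {"bill_id": "s:118:300", "stance": "veto_threat", "congress": 118},
]
bills = {
    "hr:118:100": {"title": "Hypothetical Act A"},
    "hr:118:200": {"title": "Hypothetical Act B"},
    "s:118:300": {"title": "Hypothetical Act C"},
}
votes = [
    {"bill_id": "hr:118:100", "date": "2023-05-01", "result": "passed"},
    {"bill_id": "hr:118:100", "date": "2023-09-12", "result": "failed"},
]

print(veto_threat_outcomes(saps, bills, votes, congress=118))
```

The rows marked "orchestrate" extend this pattern to three or more dependent calls, where each step's output feeds the next step's filter.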

8 · Time-series / change-over-time

4 queries · Time-bucket aggregations · MCP Class E (time_series)

+ rest-api-aggregations
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 53 | Academic | Monthly bill volume mentioning 'AI' 2018–present | bills | time_series(bills, time_field=introduced_date, interval=month, filter=q:'AI') | + |
| 54 | Lobbyist | Disclosed crypto lobbying spend, QoQ since 2021 | lda-filings | time_series + sum(spend_amount, interval=quarter, filters={issue:crypto}) | + |
| 55 | Federal contractor | FR page count per administration since 2000 | federal-register | time_series + sum(page_count, interval=year); needs page_count field | |
| 56 | Nonprofit | GAO cybersecurity reports as % of output since 2015 | gao-reports | two time_series calls + ratio; agent does the math | + |
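
Row 56 describes two `time_series` calls followed by a ratio computed on the agent side. A minimal sketch of that arithmetic, assuming records carry a date field and a topic label (both field names, the `predicate` argument, and the sample reports are made up for illustration):

```python
from collections import Counter
from datetime import date


def time_series(records, time_field, interval="year", predicate=None):
    """Bucket records by year or quarter and count them (illustrative only)."""
    buckets = Counter()
    for r in records:
        if predicate is not None and not predicate(r):
            continue
        d = r[time_field]
        if interval == "year":
            key = str(d.year)
        elif interval == "quarter":
            key = f"{d.year}-Q{(d.month - 1) // 3 + 1}"
        else:
            raise ValueError(f"unsupported interval: {interval}")
        buckets[key] += 1
    return dict(sorted(buckets.items()))


def share(numerator, denominator):
    """Per-bucket ratio; buckets missing from the numerator count as 0."""
    return {k: numerator.get(k, 0) / total for k, total in denominator.items() if total}


# Made-up reports for illustration only.
reports = [
    {"published": date(2015, 3, 1), "topic": "cybersecurity"},
    {"published": date(2015, 9, 1), "topic": "health"},
    {"published": date(2016, 1, 15), "topic": "cybersecurity"},
    {"published": date(2016, 4, 2), "topic": "defense"},
    {"published": date(2016, 7, 9), "topic": "health"},
    {"published": date(2016, 11, 30), "topic": "energy"},
]

all_reports = time_series(reports, "published")
cyber = time_series(
    reports, "published", predicate=lambda r: r["topic"] == "cybersecurity"
)
cyber_share = share(all_reports, all_reports) and share(cyber, all_reports)
print(cyber_share)
```

Keeping the division outside `time_series` matches the row's note that the agent does the math: the aggregation surface only has to return counts per bucket.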

9 · Discovery / "show me what's new"

4 queries · List with date filter · MCP Class A list endpoints

Full support
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 57 | Hill staffer | New bills this week on rural broadband, opioids, VA care | bills | 3 × search calls with since= and topic; agent merges | |
| 58 | Federal contractor | New rules / proposed rules today from DOT, EPA, GSA | federal-register | list /federal-register?agency=DOT,EPA,GSA&since=today | |
| 59 | Lobbyist | New LDA filings this week from competing registrants | lda-filings | list /lda-filings?registrant_name=&since=... | |
| 60 | Journalist | Hearings scheduled next 14 days w/ witnesses if posted | hearings | list /hearings?since=&until=&sort=date | |

10 · Provenance / citation verification

4 queries · Fetch + scan · same pattern as Category 2

Full support
| # | Persona | Question | Sources | Tool | Status |
| --- | --- | --- | --- | --- | --- |
| 61 | Journalist | Verify "HR 4321 eliminates Section 174 R&D amortization" claim | bills | get_bill_text + lexical_search('section 174') | |
| 62 | State AG | Verify advocacy report's claim about GAO-23-105432 recommendations | gao-reports | fetch + lexical_search('family detention') | |
| 63 | Hill staffer | Source rule + quoted $14B compliance cost from EPA rule | federal-register | search + fetch + scan | |
| 64 | Academic | Verify "5 USC §552a(b)(7)" Privacy Act exception text | us-code | fetch / get_us_code_section | |

Gap → spec map

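The 22 queries that don't map cleanly to the existing surface cluster into three coherent gaps, each mapped to one new spec. A fourth row covers richer per-resource filters, which are handled by updating an existing spec rather than adding a new one: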

| Gap | Queries affected | New spec | Tool class added |
| --- | --- | --- | --- |
| Fuzzy entity resolution: "Find X by noisy name, return canonical record" | 4 (Q31–34) · Category 5 | rest-api-entity-resolution | MCP Class D · resolve_entity |
| Aggregation / counting: "Top-N, group-by, time-series, total" | 11 (Q35–41, Q53–56) · Categories 6 + 8 | rest-api-aggregations | MCP Class E · count, sum, time_series |
| Cross-source fan-out: "One ID, everything related" | 4 closed (Q42, Q46, Q48, partial Q45) · the other 7 in Category 7 still need orchestration | rest-api-dossiers | MCP Class F · get_bill_dossier, get_legislator_dossier, get_committee_dossier, get_public_law_dossier |
| Richer per-resource filters: "Party, state, agency, committee chair, etc." | 4 (Q25–28) · Category 4 | Update to rest-api-resource-endpoints (FilterPlan registry) | Class A typed tools accept the new filters |

What's deferred

Three queries (Q44, Q48, Q51) need citation-graph traversal — extracting and storing inter-document citations like "this FR rule cites GAO-23-105432" or "this CFR section was enacted by Public Law 118-42." Today the substrate carries citations as bibliographic metadata per record (source_url, citation_string), not as a graph between records.

The citation graph would be its own ingester problem: parse each document's body for citation patterns, write into a citations join table linking (citing_record, citing_section, cited_record, cited_section). Worth a dedicated spec at v1.x. Not blocking v1, but the three deferred queries are exactly the kind of investigative work this substrate exists to support, so it's first on the v1.x list.
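
As a rough sketch of what that ingester step could look like, the example below scans section text for two citation patterns and writes matches into a `citations` table with the four columns named above. The regexes, the record ID `fr:hypothetical-rule-1`, the sample text, and the choice to store the raw citation string in `cited_record` are all assumptions; a real ingester would need far broader pattern coverage and a step that resolves each string to a canonical record.

```python
import re
import sqlite3

# Two illustrative patterns only; real citation formats are far more varied.
CITATION_PATTERNS = [
    re.compile(r"\bGAO-\d{2}-\d{5,6}\b"),
    re.compile(r"\bPublic Law \d{2,3}-\d{1,3}\b"),
]


def extract_citations(citing_record, sections):
    """Yield one row per citation found in each section's text."""
    rows = []
    for section_id, text in sections.items():
        for pattern in CITATION_PATTERNS:
            for match in pattern.finditer(text):
                # cited_section is unknown at this stage, so it stays NULL.
                rows.append((citing_record, section_id, match.group(0), None))
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE citations (
        citing_record  TEXT NOT NULL,
        citing_section TEXT,
        cited_record   TEXT NOT NULL,  -- raw citation string in this sketch
        cited_section  TEXT
    )
    """
)

# Made-up rule text for illustration only.
rule_sections = {
    "preamble": "The agency acted on recommendations in GAO-23-105432.",
    "sec-2": "Issued under Public Law 118-42.",
}
conn.executemany(
    "INSERT INTO citations VALUES (?, ?, ?, ?)",
    extract_citations("fr:hypothetical-rule-1", rule_sections),
)

# The reverse lookup a query like Q44 needs: which records cite this GAO report?
citing = conn.execute(
    "SELECT citing_record, citing_section FROM citations WHERE cited_record = ?",
    ("GAO-23-105432",),
).fetchall()
print(citing)
```

Once citations live in a table like this, the reverse lookup becomes a single indexed query instead of a full-text scan over every candidate document.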

Eight queries (in Category 7) are workable via agent orchestration but cross 4+ tools each — the cases where the LLM is most likely to drop a step mid-chain. Worth tracking against the dossier surface: if a recurring 5-tool pattern shows up in the routing eval, that's a candidate for a new dossier shape or a new join helper.

How to reproduce this exercise

Use a single subagent prompt with two rules: (1) read only the schema (migration files + ingester specs); (2) do not read any API/MCP spec. Ask for 60+ queries across 8 personas and 10 intent categories, each tagged with the sources it touches. Then map by hand: there's no automated mapper at v1, and the manual pass is what surfaces the gaps.

The exercise repeats well: re-run when adding a new source, when changing the search-shape spec, or before flipping any surface spec to verified. Diff against the prior 64 queries — new categories that emerge are signal, missing coverage is signal, and queries that move from "supported" to "partial" are red flags.
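
The diff step can be as simple as comparing two status mappings. The sketch below assumes each run has been reduced by hand to a `{query_key: status}` dictionary with stable keys, which is itself an assumption: a fresh subagent will phrase queries differently, so matching new queries to prior ones remains a manual judgment. The status strings and sample data are illustrative.

```python
def diff_runs(prior, current):
    """Compare two {query_key: status} mappings from successive runs."""
    shared = set(prior) & set(current)
    return {
        # Queries that appear only in the new run.
        "added": sorted(set(current) - set(prior)),
        # Queries from the prior run with no counterpart in the new one.
        "dropped": sorted(set(prior) - set(current)),
        # The red-flag case: coverage that got worse between runs.
        "regressions": sorted(
            q for q in shared
            if prior[q] == "supported" and current[q] == "partial"
        ),
    }


# Made-up run summaries for illustration only.
prior_run = {"Q1": "supported", "Q2": "supported", "Q3": "partial"}
current_run = {"Q1": "supported", "Q2": "partial", "Q4": "supported"}

report = diff_runs(prior_run, current_run)
print(report)
```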