Regulatory Corpus

Corpus Accuracy & Coverage Report

The Insurance Professor grounds every response in verified public-domain regulatory content. This page documents the current state of that corpus, the methodology that maintains it, and the verification mechanisms that govern how corpus content reaches users.

Q2 2026 Report · Published May 1, 2026 · Updated quarterly

03

Florida deep-dive

Florida is the platform's most exposure-relevant state. Hurricane season opens June 1, and the state's HB 837 tort reform (2023), assignment-of-benefits (AOB) reforms (2022), and post-Ian regulatory environment have made Florida P&C insurance one of the most consumer-relevant regulatory landscapes in the country. As a result, the platform has invested most deeply in Florida coverage.

The Florida statutory corpus contains 222 chunks across 18 distinct canonical citations. Each citation traces to an official Florida Statutes section published at flsenate.gov.

What the Florida corpus does not yet contain. The state's flood-insurance regime falls under the federal NFIP and is not in scope for state-statute scraping. The Citizens Property Insurance enabling statute (§627.351(6)) is identified as the next FL corpus expansion target. The actual public-adjuster apprenticeship and adjuster-disciplinary statutes (likely §626.611 and adjacent sections) are tracked as remaining gaps.

04

Verification mechanisms

The corpus's existence is necessary but not sufficient. The platform must also ensure that responses delivered to users cite only content the corpus actually contains. Three mechanisms govern this.

Citation verification layer (deployed Q2 2026)
Every response passes through a verification layer before delivery. The layer extracts every statutory section number cited in the response and matches each one against the canonical citation field of the corpus chunks retrieved during generation. If a section number cannot be matched, or if the matched chunk is older than 365 days, the response is modified before delivery: the unverifiable citation is removed, the response's confidence tier is downgraded, and a footer is appended directing the user to their state DOI for confirmation.
Voice compliance gate
Every response is scanned against a set of forbidden phrases that constitute Unauthorized Practice of Insurance or Law risk: directives to buy, accept, or reject specific policies; characterizations of legal positions as strong or weak; guarantees of outcomes. Responses containing absolute-forbidden phrases are blocked before delivery.
Three-tier confidence display
Every response carries one of three confidence indicators tied to the strength of corpus retrieval: HIGH (regulatory source verified, full statutory citation available), MEDIUM (based on general state guidance, claims qualified), or LOW (limited regulatory data, user directed to state DOI). The LOW display is the most consequential element. It exists so the platform never substitutes generic model inference for missing corpus content.
What these mechanisms do not catch.
The verification layer catches fabricated citations and stale citations. It does not catch citations where the section number is real and current but the response's interpretation of the statute is wrong. That failure mode is mitigated by the corpus quality work below, by the voice compliance system, and by the platform's consistent direction of users to their state DOI for authoritative interpretation. It is not eliminated.
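The citation verification step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's actual implementation: the `Chunk` and `Response` shapes, the `verify` function, the section-number pattern, the `[citation removed]` placeholder, the footer wording, and the one-step tier downgrade are all hypothetical. Only the 365-day threshold, the three tier names, and the rule that chunks without a timestamp count as stale come from this report.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(days=365)      # staleness threshold stated in the report
TIERS = ["HIGH", "MEDIUM", "LOW"]  # tier names from the report
# Assumed footer wording; the report does not quote the actual text.
DOI_FOOTER = "Please confirm this information with your state DOI."
# Assumed pattern for Florida-style section numbers such as 627.70121.
SECTION_RE = re.compile(r"\b\d{3}\.\d{2,6}\b")


@dataclass
class Chunk:  # hypothetical shape of a retrieved corpus chunk
    canonical_citation: str
    scraped_at: Optional[datetime]  # None when provenance is missing


@dataclass
class Response:  # hypothetical shape of a generated response
    text: str
    tier: str


def verify(response: Response, retrieved: list[Chunk], now: datetime) -> Response:
    """Strip citations that are unmatched or backed only by stale chunks."""
    verified = {
        c.canonical_citation
        for c in retrieved
        # A chunk without a timestamp is treated as stale.
        if c.scraped_at is not None and now - c.scraped_at <= MAX_AGE
    }
    removed = 0

    def check(match: re.Match) -> str:
        nonlocal removed
        if match.group(0) in verified:
            return match.group(0)
        removed += 1
        return "[citation removed]"

    text = SECTION_RE.sub(check, response.text)
    if removed == 0:
        return response  # every citation verified; deliver unchanged
    # Assumption: downgrade by one tier, never below LOW.
    lower = TIERS[min(TIERS.index(response.tier) + 1, len(TIERS) - 1)]
    return Response(text=f"{text}\n\n{DOI_FOOTER}", tier=lower)


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
chunks = [
    Chunk("627.70121", datetime(2026, 3, 1, tzinfo=timezone.utc)),  # fresh
    Chunk("626.856", datetime(2024, 1, 1, tzinfo=timezone.utc)),    # stale
]
result = verify(Response("See 627.70121 and 626.856.", "HIGH"), chunks, now)
```

In this sketch the fresh citation survives, the stale one is replaced, the tier drops from HIGH to MEDIUM, and the footer is appended. As noted above, a check of this kind says nothing about whether the response interprets the statute correctly.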
05

Q2 2026 audit and corrections

This section documents content-change events processed during the period. The platform commits to publishing this section in every quarterly report.

Citation mislabel corrections
Statute · Previous label (incorrect) · Corrected label
FL Statute 627.70121 · Flood Insurance · Payment of Claims for Dual Interest Property
FL Statute 768.0427 · Comparative Fault · Medical Expense Evidence; Letters of Protection (HB 837)
FL Statute 627.7153 · Property Insurance Repair Practices · Restriction on Assignment of Post-Loss Benefits
FL Statute 626.856 · Public Adjuster Apprentice · Company Employee Adjuster Defined
FL Statute 626.8584 · Adjusters Disciplinary Actions · Nonresident All-Lines Adjuster Defined

Legacy-source pollution removal

140 chunks were removed: 95 chunks from leg.state.fl.us (pre-migration URLs left over from before the move to flsenate.gov) and 45 chapter-level chunks from flsenate.gov/Laws/Statutes/2024/Chapter627/All that lacked the section-level precision needed for citation verification.
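A filter for the two removal categories above might look like the sketch below. It assumes each chunk records its source URL with a scheme, that a `www.` prefix may or may not be present, and that chapter-level pages are the ones whose path ends in `/All`. The function name `removal_reason`, the reason labels, and the example URLs are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

LEGACY_HOST = "leg.state.fl.us"  # pre-migration host named in the report
CANONICAL_HOST = "flsenate.gov"  # canonical host named in the report


def removal_reason(source_url: str) -> Optional[str]:
    """Return why a chunk should be removed, or None if it can stay."""
    parsed = urlparse(source_url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    if host == LEGACY_HOST:
        return "legacy-host"
    # Assumption: chapter-level pages end in /All and lack section precision.
    if host == CANONICAL_HOST and parsed.path.rstrip("/").endswith("/All"):
        return "chapter-level"
    return None


# Illustrative URLs only; the exact paths are assumptions.
urls = [
    "http://www.leg.state.fl.us/statutes/example",
    "https://www.flsenate.gov/Laws/Statutes/2024/Chapter627/All",
    "https://www.flsenate.gov/Laws/Statutes/2024/627.70121",
]
kept = [u for u in urls if removal_reason(u) is None]
```

Keeping the reason alongside each removed chunk would make it straightforward to report the per-category counts that this section publishes.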

After corrections and deletions, the Florida statutory corpus contains 222 chunks, all of which trace to canonical flsenate.gov source URLs and all of which carry a single, accurate canonical citation reference.

Infrastructure shipped

The citation verification layer described in section 04 was implemented and deployed during Q2 2026. Prior to this work, a placeholder verification function existed in the platform code but its implementation was incapable of catching mislabel-type errors. The five mislabel corrections above are exactly the failure mode the new layer is designed to prevent in production.

06

Limitations and open questions

The corpus does not yet cover many states comprehensively. Of the fifty states plus territories, 27 have active scrapers. State-by-state expansion is paced by scraper development, not by user demand alone, because a low-quality scraper that pulls inaccurate content would be worse than no scraper at all.

Some chunks lack a scraped_at timestamp. 62 chunks (0.6% of corpus) do not have provenance timestamps. The verification layer treats these chunks as stale and removes any citation that depends on them. Backfilling these timestamps is in the operational backlog.
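A timestamp-coverage figure of this kind can be computed with a simple count. The sketch below assumes the `scraped_at` values are available as a list in which missing timestamps appear as `None`; the function name and the sample data are hypothetical and do not reproduce the corpus figures.

```python
from datetime import datetime
from typing import Optional, Sequence, Tuple


def missing_timestamp_share(
    scraped_at_values: Sequence[Optional[datetime]],
) -> Tuple[int, float]:
    """Return (count, fraction) of chunks that have no scraped_at value."""
    total = len(scraped_at_values)
    missing = sum(1 for value in scraped_at_values if value is None)
    share = missing / total if total else 0.0
    return missing, share


# Toy data: three timestamped chunks and one without provenance.
sample = [datetime(2026, 1, 1)] * 3 + [None]
missing, share = missing_timestamp_share(sample)
print(f"{missing} chunks ({share:.1%}) lack a scraped_at timestamp")
```

Rerunning the same count after each backfill pass would show how quickly the gap is closing.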

The verification layer catches fabrication and staleness, not misinterpretation. A response may correctly cite a real, current statute while mischaracterizing what that statute requires. This failure mode is mitigated but not eliminated.

Some state DOI websites block automated access. Where alternative-source content is insufficient, the relevant state's coverage is shallower than it would otherwise be.

07

Process commitments

Quarterly publication.
This report will be updated and republished at the start of each calendar quarter. The next report (Q3 2026) is scheduled for July 2026.
Audit transparency.
Every quarterly report will include a content-change events section documenting corrections, deletions, and infrastructure changes that affected corpus accuracy during the period.
Real-time provenance.
The verification layer ensures that every citation delivered to a user traces to a specific corpus chunk with verifiable provenance. This is enforced at runtime, not only at audit time.
Open methodology.
This page is the methodology, applied. Anyone can read what the corpus contains, where it came from, and how it is verified.
Direct DOI engagement.
The platform welcomes review of its use of state regulatory materials. State DOI staff who identify any inaccuracy are invited to contact [email protected].
The Insurance Professor is an educational service. Not licensed insurance advice. Not legal advice. Not financial advice. This page describes the corpus that grounds the platform's educational responses and the mechanisms that govern how corpus content reaches users.
The Insurance Professor · insurance-professor.com