Methodology

Turning an opinion into a rating: methodology, evidence, and a public scored record

A grade becomes a rating when it carries information about real outcomes — and when the firm publishes its own forecasts, scored in the open. Here is how Brierly validates RuleScore: a reproducible methodology, a sourced dispute record, and a live, Brier-scored forward log.

Brierly Research Team · Jun 14, 2026 · 5 minute read · sourced & dated (verification-log discipline)


RuleScore is a deterministic, public opinion about contract language. To call it a rating, the grades have to predict something real: disputes, resolution delays, rule changes, refunds. We are building that evidence three ways, and we are being explicit about the limits of each.

First, a retrospective calibration: score each coded dispute case on the rules text as it was listed, blind to the outcome where possible, and report dispute and delay rates by grade band. Second, and this is the part that disciplines the claim, a control sample of non-disputed high-volume markets: the denominator without which a 'disputes-only' table proves nothing. We will publish no 'N times more likely' figure until that control set exists. Third, a forward log: live grades are hash-stamped and timestamped on every scan, so in a few months there will be a clean, pre-registered out-of-sample test that needs no historical reconstruction. That log is already accruing.
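A minimal sketch of what a hash-stamped forward-log entry could look like. The field names, the sample market ID, and the chaining of each entry to the previous one are assumptions made for illustration; the note only states that live grades are hash-stamped with timestamps on every scan.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same body always hashes the same.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def stamp_grade(market_id: str, grade: str, rules_text: str,
                scanned_at: str, prev_hash: str = "") -> dict:
    """Build one log entry (hypothetical schema) covering grade, rules text, and timestamp."""
    body = {
        "market_id": market_id,
        "grade": grade,
        "rules_sha256": hashlib.sha256(rules_text.encode("utf-8")).hexdigest(),
        "scanned_at": scanned_at,
        "prev_hash": prev_hash,  # assumption: entries are chained to make edits detectable
    }
    return {**body, "entry_hash": _digest(body)}


def verify_chain(entries: list[dict]) -> bool:
    """Recompute every hash and check that each entry points at the one before it."""
    prev = ""
    for entry in entries:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or _digest(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


# Hypothetical scans of one market on consecutive days.
first = stamp_grade("MKT-1", "B", "rules text v1", "2026-06-14T00:00:00Z")
second = stamp_grade("MKT-1", "C", "rules text v2", "2026-06-15T00:00:00Z",
                     prev_hash=first["entry_hash"])
print(verify_chain([first, second]))
```

Because each entry's hash covers the grade, the rules-text digest, and the timestamp, changing a grade after the fact breaks verification. That is the property an out-of-sample test relies on.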

The honest constraint is that a clean historical backtest needs point-in-time rules text that the firm largely does not have, because the point-in-time recorder was scoped down for data-rights reasons. We say so rather than imply a backtest we cannot run. The calibration will be reported with exact N, dates, method, and caveats once the control set is coded. It is published only if the relationship holds; if it does not, that is a model-improvement signal. The discipline is the credibility: a rating that cannot be checked is just an opinion stated with confidence.
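To illustrate why the control set matters for that readout, here is a small sketch that tabulates N and the disputed share per grade band. The grade bands and every count below are invented for illustration; none of them are Brierly data.

```python
from collections import Counter


def band_readout(samples):
    """samples: iterable of (grade_band, was_disputed) pairs.

    Returns, per band, the sample size, the disputed count, and the disputed share.
    """
    totals, disputed = Counter(), Counter()
    for band, was_disputed in samples:
        totals[band] += 1
        if was_disputed:
            disputed[band] += 1
    return {
        band: {
            "n": totals[band],
            "disputed": disputed[band],
            "share": disputed[band] / totals[band],
        }
        for band in sorted(totals)
    }


# Hypothetical coded cases: disputed markets plus non-disputed controls.
with_controls = (
    [("A", False)] * 8 + [("A", True)] * 2 +
    [("D", True)] * 3 + [("D", False)] * 1
)
# The same disputes with the controls dropped.
disputes_only = [(band, d) for band, d in with_controls if d]

print(band_readout(with_controls))
print(band_readout(disputes_only))
```

With the controls dropped, every band shows a share of 1.0, so the table cannot separate one grade from another. Note also that the shares in a mixed sample depend on how many controls were drawn, so they are comparisons across bands, not population dispute rates.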

Meanwhile, the parts that already exist do the work: the methodology is fully public and reproducible, the dispute evidence is sourced and dated, the forward grade log runs daily, and the firm's own forecasts are registered and Brier-scored in the open, with the first resolving at the June 17 FOMC meeting. The rating is being earned in public and on the record, not asserted.
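For readers unfamiliar with Brier scoring: for binary events, the score is the mean squared difference between each forecast probability and the outcome (1 if the event happened, 0 if not). Lower is better, and always forecasting 0.5 scores 0.25. The sketch below uses made-up forecasts, not the firm's registered ones.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical forecasts and how the events resolved.
example_forecasts = [0.9, 0.2, 0.5]
example_outcomes = [1, 0, 1]
print(round(brier_score(example_forecasts, example_outcomes), 4))  # 0.1

# A forecaster who always says 50% scores 0.25 regardless of outcomes.
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```

Scoring only makes sense for forecasts registered before the event resolves, which is why the note pairs Brier scoring with a public, dated record.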

Cite this note

News / wire style
Brierly Research, "Turning an opinion into a rating: methodology, evidence, and a public scored record," The Brierly Desk, Jun 14, 2026. brierlyresearch.com/note/opinion-to-rating.
APA style
Brierly Research. (2026). Turning an opinion into a rating: methodology, evidence, and a public scored record. The Brierly Desk. https://brierlyresearch.com/note/opinion-to-rating

Research is © 2026 Brierly; quote with attribution. Every claim is sourced and dated — see the citations above.

Informational research only; never investment, legal, or tax advice.