Methodology
How Keystone Court Data's research and intelligence reports are built, and what their known limitations are.
1. Data origin
Keystone Court Data sources property-related court filings directly from county court records, which are public information. Filings are collected on a daily schedule in our active coverage area (Indiana, North Carolina, Pennsylvania, Connecticut, New Jersey).
Before any filing enters the dataset that powers these reports, it passes through a verification stage that confirms the current owner-of-record and screens out filings tied to entity owners (LLCs, trusts, banks, corporations) and to properties where the named party is not the current owner. The resulting dataset represents the owner-occupied, individual-owned, current-record subset of court filings. It is not the raw scrape count, and percentages in the reports are computed against this verified subset only.
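The effect of computing percentages against the verified subset rather than the raw scrape can be illustrated with a minimal sketch. Everything here is hypothetical: the `Filing` fields, the owner-type labels, and the `is_verified` rule are assumptions for illustration only and do not describe Keystone's actual verification pipeline, which is proprietary.

```python
from dataclasses import dataclass

# Hypothetical labels for the entity-owner categories named above.
ENTITY_OWNER_TYPES = {"llc", "trust", "bank", "corporation"}


@dataclass(frozen=True)
class Filing:
    county: str
    owner_type: str                     # assumed label, e.g. "individual" or "llc"
    named_party_is_current_owner: bool  # assumed result of an owner-of-record check


def is_verified(filing: Filing) -> bool:
    """Keep individual-owned filings where the named party is the current owner."""
    return (
        filing.owner_type not in ENTITY_OWNER_TYPES
        and filing.named_party_is_current_owner
    )


def county_share(filings: list[Filing], county: str) -> float:
    """Share of verified filings in a county; the denominator is the verified subset."""
    verified = [f for f in filings if is_verified(f)]
    if not verified:
        return 0.0
    return sum(f.county == county for f in verified) / len(verified)


# Toy data, not real filings.
raw = [
    Filing("A", "individual", True),
    Filing("A", "individual", True),
    Filing("A", "llc", True),          # screened out: entity owner
    Filing("B", "individual", True),
    Filing("B", "individual", False),  # screened out: named party is not current owner
]
share_a = county_share(raw, "A")  # 2 of 3 verified filings, not 3 of 5 raw filings
```

In this toy data, county A holds two of the three verified filings, whereas a raw-count denominator would give three of five. That difference is why report percentages should not be compared against raw scrape totals.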
2. Volume thresholds and small-N policy
Reports are published only when sample sizes support meaningful claims:
- County reports require at least 75 verified filings in the dataset for the county.
- State reports require at least 100 verified filings statewide.
- Individual breakouts (e.g. a single monthly bucket, value bucket, or ZIP) with fewer than 5 filings are shown for transparency but not used as the basis for headline claims.
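The thresholds above amount to a small set of rules. The sketch below encodes them as plain functions; the function names and the `scope` strings are illustrative assumptions, not part of any published Keystone tooling.

```python
# Thresholds as stated in the policy above.
COUNTY_MIN_FILINGS = 75
STATE_MIN_FILINGS = 100
BREAKOUT_MIN_FILINGS = 5


def can_publish(scope: str, verified_filings: int) -> bool:
    """Return True if a report of the given scope meets its minimum sample size."""
    minimums = {"county": COUNTY_MIN_FILINGS, "state": STATE_MIN_FILINGS}
    return verified_filings >= minimums[scope]


def headline_eligible(breakouts: dict[str, int]) -> dict[str, bool]:
    """Flag which breakouts are large enough to support a headline claim.

    Breakouts below the minimum are still displayed; they are only
    excluded from headline claims.
    """
    return {name: count >= BREAKOUT_MIN_FILINGS for name, count in breakouts.items()}


# Hypothetical bucket names and counts for illustration.
flags = headline_eligible({"zip_a": 4, "zip_b": 5, "zip_c": 31})
```

Here `zip_a` would still appear in a report table, but only `zip_b` and `zip_c` would be eligible as the basis for a headline claim.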
3. Known limitations
- Time window. Daily scrape coverage began in early 2026 in most counties, so year-over-year comparisons are not yet available. Trend claims use month-over-month comparisons within the available window.
- Coverage depth varies by county. Not every county we cover has the same scrape start date or operational depth, so "Top counties by filing volume" tables reflect both real underlying activity and our coverage depth.
- Property value and equity enrichment is populated for only a subset of filings. Aggregate claims about value are computed on that subset, and the coverage rate is disclosed in each report.
- Case lifecycle tracking (filing-to-served days, settlement rates, etc.) is currently limited to a subset of cases with active docket monitoring. When lifecycle stats appear in a report, the sample they're based on is disclosed.
- Pre-2026 historical data is not available.
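Because trend claims rely on month-over-month comparisons, a minimal sketch of that calculation may help. It assumes monthly counts keyed by "YYYY-MM" strings for consecutive months; the function name and the sample counts are illustrative, not taken from any report.

```python
from typing import Optional


def month_over_month(counts: dict[str, int]) -> dict[str, Optional[float]]:
    """Fractional change in filings from each month to the next.

    Keys are "YYYY-MM" strings, which sort chronologically. Months are
    assumed to be consecutive. A month whose predecessor has zero
    filings gets None, since the change is undefined.
    """
    months = sorted(counts)
    changes: dict[str, Optional[float]] = {}
    for prev, curr in zip(months, months[1:]):
        base = counts[prev]
        changes[curr] = None if base == 0 else (counts[curr] - base) / base
    return changes


# Toy counts, not real data.
trend = month_over_month({"2026-02": 40, "2026-03": 50, "2026-04": 45})
```

With these toy counts, March shows a 25% increase over February and April a 10% decrease from March. The first month in the window has no prior month and so gets no value.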
3a. How our counts compare to national foreclosure-data aggregators
Our numbers will differ from the totals published by national foreclosure-data aggregators (e.g. ATTOM Data Solutions, the most-cited industry source). The two measure different things, and both can be accurate.
What aggregators count. National aggregators combine every stage of the foreclosure process across all 50 states from multiple data feeds: pre-foreclosure default notices, notices of trustee sale, scheduled auctions, sheriff sale results, and bank-owned (REO) transfers. Their coverage is statewide.
What we count. We measure one specific stage: the foreclosure complaint (lis pendens) filed in court. We scrape it directly from county dockets on the day it is filed. Coverage is limited to the counties we actively monitor; it is not statewide.
Why both are useful. Court complaints are the leading indicator: they mark the moment a case enters the legal process, before any default notice mailing and before counsel is retained. Aggregator totals are the broadest measurement of foreclosure activity across all stages. Investors looking for the earliest signal prefer court records; researchers measuring market-wide foreclosure exposure prefer aggregated feeds.
Expected divergence. A given county may show several times more filings on aggregator feeds than on our reports for the same time window. That gap is the methodology difference, not a data error. A single property in foreclosure typically gets counted multiple times by aggregators (once per stage as it moves through default notice → auction → REO); we count it once at the complaint stage.
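The counting difference can be made concrete with a toy example. The stage labels, property identifiers, and event list below are hypothetical and exist only to show the arithmetic; they are not drawn from any aggregator's schema or from Keystone's dataset.

```python
# Stages that an aggregator-style feed counts, following the list described above.
AGGREGATOR_STAGES = {
    "default_notice",
    "trustee_sale_notice",
    "auction",
    "sheriff_sale",
    "reo",
}

# Toy event log: (property id, stage). Not real data.
events = [
    ("property_1", "default_notice"),
    ("property_1", "complaint"),
    ("property_1", "auction"),
    ("property_1", "reo"),
    ("property_2", "default_notice"),
    ("property_2", "complaint"),
]


def aggregator_style_count(events: list[tuple[str, str]]) -> int:
    """Count one record per aggregator-tracked stage event."""
    return sum(1 for _, stage in events if stage in AGGREGATOR_STAGES)


def complaint_count(events: list[tuple[str, str]]) -> int:
    """Count each property once, at the complaint stage only."""
    return len({prop for prop, stage in events if stage == "complaint"})


stage_total = aggregator_style_count(events)  # 4 stage events
complaint_total = complaint_count(events)     # 2 properties
```

Two properties produce four stage events in this sketch, so the stage-based total is double the complaint-based total even though both describe the same two foreclosures.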
4. What we don't publish
- Individual case data. No case numbers, defendant names, or exact property addresses. Court records are public information, but Keystone aggregates rather than republishes them. Reports show counts, percentages, and distributions only.
- Our business metrics. Subscriber counts, revenue, conversion rates. Reports cover filing activity in the underlying market, not Keystone's commercial performance.
- Internal pipeline details. The specific vendors, scoring models, and verification techniques used to build the dataset are proprietary and not described here. The general principle (court-direct collection, ownership verification, aggregate-only publication) is documented; the implementation is not.
5. Update cadence
Reports are regenerated from the live dataset on the 3rd of each month. Each page carries its generation date. Research notes carry their publication date and are updated only when the underlying audit is re-run.
6. Citation and corrections
Each report carries a suggested citation line at the bottom of the page. The general form is:
Keystone Court Data, "<Report Title>," <generation date>, <URL>
Corrections or questions about a specific number on any report: carson@keystonecourtdata.com.
7. Methodology applied to specific reports
Examples of this methodology applied:
- Behind Indiana's #1 Foreclosure Rate · methodology applied to a news-cycle hook
- The Pre-Foreclosure Visibility Gap · methodology applied to cross-source comparison
- Most Active Foreclosure Plaintiffs · methodology applied to lender activity ranking
- Indiana state intelligence report · methodology applied to a full state report
- Lake County intelligence report · methodology applied to a county report
- Lead-type strategic fit framework · methodology applied to investor decision-making
- Court-record lead-types catalog · methodology applied to type taxonomy
- PA Act 6 + Act 91 explained · methodology applied to state-specific statutory analysis
- NC unified eCourts edge · methodology applied to structural-comparison research
Updated 2026-06-04