Methodology

How Keystone Court Data's research and intelligence reports are built, and what their known limitations are.

1. Data origin

Keystone Court Data sources property-related court filings directly from county court records, which are public information. Filings are collected on a daily schedule in our active coverage area (Indiana, North Carolina, Pennsylvania, Connecticut, New Jersey).

Before any filing enters the dataset that powers these reports, it passes through a verification stage that confirms the current owner-of-record and screens out filings tied to entity owners (LLCs, trusts, banks, corporations) and to properties where the named party is not the current owner. The resulting dataset represents the owner-occupied, individual-owned, current-record subset of court filings. It is not the raw scrape count, and percentages in the reports are computed against this verified subset only.
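As a rough illustration of the verification stage described above, the following Python sketch filters raw filings down to a verified subset and computes a percentage against that subset only. The record layout, the field names (`named_party`, `owner_of_record`, `owner_type`, `owner_occupied`), and the name-matching rule are all hypothetical. They are assumptions made for the example, not Keystone Court Data's actual schema or pipeline.

```python
from dataclasses import dataclass

# Hypothetical record layout; the real schema is not documented on this page.
@dataclass(frozen=True)
class Filing:
    case_id: str
    county: str
    named_party: str
    owner_of_record: str
    owner_type: str        # assumed values: "individual", "llc", "trust", "bank", "corporation"
    owner_occupied: bool   # assumed flag; how occupancy is determined is not stated

ENTITY_TYPES = {"llc", "trust", "bank", "corporation"}

def is_verified(f: Filing) -> bool:
    """Keep individual-owned, owner-occupied filings whose named party is the current owner."""
    if f.owner_type in ENTITY_TYPES:
        return False
    # Assumed matching rule: case-insensitive exact name comparison.
    if f.named_party.strip().lower() != f.owner_of_record.strip().lower():
        return False
    return f.owner_occupied

def verified_subset(filings):
    """Return only the filings that pass every screen."""
    return [f for f in filings if is_verified(f)]

def share_of_verified(verified, predicate) -> float:
    """Percentage computed against the verified subset, not the raw scrape count."""
    if not verified:
        return 0.0
    return 100.0 * sum(1 for f in verified if predicate(f)) / len(verified)
```

In this sketch the denominator is always `len(verified)`, which mirrors the point that report percentages are not computed against the raw scrape count.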

2. Volume thresholds and small-N policy

Reports are published only when sample sizes support meaningful claims:

3. Known limitations

3a. How our counts compare to national foreclosure-data aggregators

If you compare our numbers to the totals published by national foreclosure-data aggregators (e.g. ATTOM Data Solutions, the most-cited industry source), the figures will differ. The two measure different things, and both can be accurate.

What aggregators count. National aggregators combine every stage of the foreclosure process across all 50 states from multiple data feeds: pre-foreclosure default notices, notices of trustee sale, scheduled auctions, sheriff sale results, and bank-owned (REO) transfers. Their coverage is statewide.

What we count. We measure one specific stage: the foreclosure complaint (lis pendens) filed in court. We scrape this directly from county dockets on the day it is filed. Coverage is limited to the counties we actively monitor; it is not statewide.

Why both are useful. Court complaints are the leading indicator: they mark the moment a case enters the legal process, before any default notice is mailed and before counsel is retained. Aggregator totals are the broadest measure of foreclosure activity across all stages. Investors looking for the earliest signal prefer court records. Researchers measuring market-wide foreclosure exposure prefer aggregated feeds.

Expected divergence. A given county may show several times more filings on aggregator feeds than in our reports for the same time window. That gap reflects the difference in methodology, not a data error. A single property in foreclosure is typically counted multiple times by aggregators (once per stage as it moves through default notice → auction → REO); we count it once, at the complaint stage.
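The divergence can be made concrete with a small Python sketch. The event feed below is invented for illustration, and the stage labels and function names are hypothetical. It simply contrasts counting every stage event with counting each property once at the complaint stage.

```python
# Hypothetical combined feed: (property_id, stage) pairs for one county and time window.
events = [
    ("P1", "complaint"),
    ("P1", "default_notice"),
    ("P1", "auction"),
    ("P1", "reo"),
    ("P2", "complaint"),
    ("P2", "auction"),
    ("P3", "default_notice"),
]

def all_stage_count(events):
    """Aggregator-style tally: every stage event counts, so one property can appear several times."""
    return len(events)

def complaint_stage_count(events):
    """Complaint-only tally: each property counts once, and only if a complaint was filed."""
    return len({prop for prop, stage in events if stage == "complaint"})

print(all_stage_count(events))        # 7
print(complaint_stage_count(events))  # 2
```

Here three properties produce seven stage events but only two complaints, so the all-stage tally is several times the complaint-only tally even though neither count is wrong.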

4. What we don't publish

5. Update cadence

Reports are regenerated from the live dataset on the 3rd of each month. Each page carries the date it was generated. Research notes carry their publication date and are updated only when the underlying audit is re-run.
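For readers who want to work out which generation run a report should reflect on a given day, here is a minimal Python sketch. The function name is hypothetical, and it assumes the run always happens on the 3rd with no adjustment for weekends or holidays, which this page does not specify.

```python
from datetime import date

GENERATION_DAY = 3  # from the text: reports regenerate on the 3rd of each month

def latest_generation_date(today: date) -> date:
    """Most recent scheduled regeneration date on or before `today`."""
    if today.day >= GENERATION_DAY:
        return today.replace(day=GENERATION_DAY)
    # Before the 3rd: the latest run was in the previous month.
    if today.month == 1:
        return date(today.year - 1, 12, GENERATION_DAY)
    return date(today.year, today.month - 1, GENERATION_DAY)
```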

6. Citation and corrections

Each report carries a suggested citation line at the bottom of the page. The general form is:

Keystone Court Data, "<Report Title>," <generation date>, <URL>
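For anyone generating citations programmatically, the general form above can be filled in with a short helper. This is only a sketch: the function name is hypothetical, and rendering the generation date in ISO format (YYYY-MM-DD) is an assumption, since the page does not state a date format. The title and URL in the usage line are placeholders, not a real report.

```python
from datetime import date

def suggested_citation(title: str, generated: date, url: str) -> str:
    """Fill in the general citation form; ISO date formatting is an assumption."""
    return f'Keystone Court Data, "{title}," {generated.isoformat()}, {url}'

# Placeholder values for illustration only.
print(suggested_citation("Example Report", date(2026, 6, 3), "https://example.com/report"))
```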

Send corrections or questions about a specific number on any report to carson@keystonecourtdata.com.

7. Apply this methodology to specific reports

Examples of this methodology applied:

Updated 2026-06-04