In October 2021, then-Attorney General Merrick Garland announced the Combating Redlining Initiative (“CRI”). Shortly thereafter, federal banking regulators began making a record-breaking number of referrals to the Department of Justice for alleged “redlining.” A key factor in all of those referrals was the use of “statistical significance” to identify situations in which a lender’s mortgage lending activity in majority-minority census tracts (“MMCTs”), expressed as a percentage of its mortgage activity in the entire area, is so far below the average of other “peer” lenders that it is likely due not to random chance but to some other cause. As a tool for identifying potential discrimination, statistical significance can be very helpful. By themselves, however, statistically significant findings do not prove discrimination. In fact, if not used carefully, statistical significance can produce misleading results. It is important to recognize the situations in which statistically significant findings can lead in the wrong direction. Examining the assumptions embedded in statistical significance models can expose those weaknesses. In this article we explore one of those assumptions: how regulators define a market for redlining analysis.
Regulators and Department of Justice attorneys under the CRI use the concept of a “REMA” (reasonably expected market area) to determine the appropriate market boundaries for an institution under examination. Unfortunately, at the same time the CRI was launched, regulators announced a radical change that significantly expanded the definition of a REMA. The reasons for the change have never been explained, but the effect of broadly expanded REMAs was an unrealistic delineation of the markets used for analysis. A serious consequence of that approach is that it undermines the validity of statistical significance findings in certain situations.
An example of how an overly expanded REMA can lead to misleading statistical analysis appears below.
Table 1: Statistical significance of Bank A’s lending in REMA MMCTs

The table above shows a bank (referred to here as Bank A) with a statistically significant low penetration rate in the REMA MMCTs. This does not “prove” that the bank was redlining, but it does suggest that the result is not accidental and that further analysis is appropriate.
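For readers who want to see the arithmetic behind a finding of this kind, the following is a minimal sketch of one common way to test whether a lender’s MMCT share differs from its peers’: a pooled two-proportion z-test. The article does not say which test regulators actually apply, so the choice of test, the function name `two_proportion_z`, and every loan count below are assumptions made for illustration only; none of the figures comes from Bank A’s data.

```python
import math


def two_proportion_z(x_bank, n_bank, x_peer, n_peer):
    """Pooled two-proportion z-test (an assumed, illustrative method).

    x_bank / n_bank: the bank's MMCT loans and total loans.
    x_peer / n_peer: the peers' MMCT loans and total loans.
    Returns the z statistic and a two-sided p-value.
    """
    p_bank = x_bank / n_bank
    p_peer = x_peer / n_peer
    # Pooled share under the null hypothesis that both rates are equal.
    p_pool = (x_bank + x_peer) / (n_bank + n_peer)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_bank + 1 / n_peer))
    z = (p_bank - p_peer) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 10 of the bank's 200 loans are in MMCTs,
# versus 1,500 of 10,000 peer loans.
z, p = two_proportion_z(10, 200, 1500, 10000)
print(f"z = {z:.2f}, two-sided p = {p:.5f}")
```

Under the conventional 5% threshold, a z statistic beyond roughly ±1.96 is labeled “statistically significant.” The test says nothing about why the shares differ, which is the point developed in the rest of this article.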
Maps of the REMA show that it is a three-county market and that 65% of the MMCTs are located in County 3, with another 28% in County 2 and only 7% in County 1.
Figure 1: Distribution of MMCTs by county
Another map shows the geographic dispersion of HMDA mortgage market originations.
Figure 2: HMDA market distribution by county
The map in Figure 2 shows that 48.5% of the reported HMDA mortgage lending is concentrated in County 3 and only 19.1% is in County 1.
A third map depicts competitor branches by county. Notably, 48.9% of competitor branches are in County 3, where most of the mortgage market activity and MMCTs are concentrated.
Figure 3: Distribution of all bank branches in the REMA
The foregoing factors depict a geographic market made up of three very different counties, characterized by (1) distinctly different racial demographics, evidenced by the location and distribution of MMCTs; (2) very different competitive structures, reflected in the distribution of bank branches; and (3) a segmented mortgage market, shown by the concentration of mortgage lending in County 3.
These market context factors stand in sharp contrast to the distribution of Bank A’s branch network and the geographic distribution of Bank A’s mortgage originations.
Figure 4: Distribution of Bank A’s branches
The bank is headquartered in County 1 and maintains 60% of its branches in that county and one-third of its branches in County 2. As would be expected, Bank A originated 70.7% of its HMDA mortgages in its “home community,” where it maintains its headquarters and the base of a market that has gradually expanded from County 1 to County 2 and now into County 3.
Figure 5: Geographic dispersion of Bank A’s lending
The maps reveal a market whose structure does not match Bank A’s resources. To be meaningful rather than misleading, statistical analysis should be performed in markets that are compatible with a bank’s resources, so that there is a “level playing field” on which performance comparisons rest on comparable facts.
All the foregoing facts suggest that statistical analysis based on data disaggregated for different markets will be a more reliable indicator of potential discrimination. So, what might that analysis indicate?
The tables below show statistical significance results when the REMA is disaggregated by county.

When the markets within the REMA are evaluated separately, the results contradict the aggregate finding. In none of the three counties is the difference between Bank A’s performance and the market’s statistically significant. In fact, in two of the counties Bank A’s penetration rates in the MMCTs actually exceed the market’s. In the single county where Bank A underperforms the market, the shortfall is not statistically significant.
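The reversal between aggregate and county-level results can be reproduced with a few lines of arithmetic. The sketch below uses a pooled two-proportion z statistic, which is an assumed test and not necessarily the one regulators apply. All loan counts are hypothetical: they were chosen so that the county shares of total lending (70.7% of the bank’s loans in County 1; 19.1% and 48.5% of peer loans in Counties 1 and 3) echo the percentages quoted in this article, but the MMCT counts are invented and are not Bank A’s actual data. The names `z_stat` and `counties` are likewise illustrative.

```python
import math


def z_stat(x_bank, n_bank, x_peer, n_peer):
    """Pooled two-proportion z statistic for bank vs. peer MMCT share."""
    pooled = (x_bank + x_peer) / (n_bank + n_peer)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_bank + 1 / n_peer))
    return (x_bank / n_bank - x_peer / n_peer) / se


# HYPOTHETICAL counts per county:
# (bank MMCT loans, bank total loans, peer MMCT loans, peer total loans)
counties = {
    "County 1": (8, 212, 57, 1910),
    "County 2": (10, 60, 486, 3240),
    "County 3": (7, 28, 1455, 4850),
}

# Test each county on its own.
by_county = {name: z_stat(*counts) for name, counts in counties.items()}

# Sum each column across counties, then test the combined three-county area.
totals = [sum(column) for column in zip(*counties.values())]
aggregate = z_stat(*totals)

for name, z in by_county.items():
    print(f"{name}: z = {z:+.2f}")
print(f"Three-county REMA: z = {aggregate:+.2f}")
```

With these made-up counts, each county’s z statistic falls well inside the conventional ±1.96 band, and the bank’s MMCT share is above the peers’ in Counties 1 and 2, yet the combined statistic is roughly -5. Nothing about the bank’s behavior within any county produces the aggregate result; it comes from the fact that the bank makes most of its loans in the county with the fewest MMCTs while its peers make most of theirs in the county with the most.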
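The phenomenon in which disaggregated results contradict the aggregated analysis is known as “Simpson’s Paradox.” It arises when there are “lurking variables” or “confounding factors.” Here, the lurking variable is the county: both the concentration of MMCTs and the location of Bank A’s lending differ sharply from one county to the next.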
Bank A’s example shows that blind acceptance of statistically significant results is inappropriate. The appropriate response is to scrutinize the facts underlying the comparisons to determine whether there is an embedded bias that may invalidate findings based on aggregated results.