A reanalysis

When a slope looks like a step

A widening political gap between small and large towns can pass for a policy effect. A reanalysis of Cremaschi, Rettl, Cappelluti and De Vries (AJPS, 2025).

Cremaschi et al. (2025) study a 2010 Italian reform that required municipalities below 5,000 inhabitants (3,000 in mountain communes) to jointly manage public services with neighbouring communes. Their identification strategy compares municipalities below that population cutoff (treated) with those above it (controls), before and after the reform, using difference-in-differences. They conclude that the reform raised far-right vote share by roughly 1.5 percentage points in the 2013 and 2018 general elections, relative to 2001–2008.

The graph below shows how that estimate is built. Every municipality is sorted into a size bin (about 80 municipalities each); for each bin I plot the change in far-right vote share from before to after 2010. The red line is the average among treated towns (below 5,000), the blue line the average among controls (above it). The gap between them is the published estimate.
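The construction just described can be sketched in a few lines of Python. The data here are made up for illustration (a hypothetical cross-section of towns with a smooth size gradient); the names `towns`, `BIN_SIZE` and `gap` are assumptions of this sketch, not taken from the replication package.

```python
import math
import random
from statistics import mean

random.seed(0)
CUTOFF = 5_000    # the reform's population threshold
BIN_SIZE = 80     # towns per size bin, as in the chart

# Hypothetical towns: (population, post-minus-pre change in far-right share, pp).
towns = []
for _ in range(4_000):
    pop = math.exp(random.uniform(math.log(500), math.log(100_000)))
    change = 6.0 - 0.5 * math.log(pop) + random.gauss(0, 1.0)
    towns.append((pop, change))

# Sort by size and cut into consecutive bins of 80 towns; each bin is one plotted point.
towns.sort()
bins = [towns[i:i + BIN_SIZE] for i in range(0, len(towns), BIN_SIZE)]
bin_points = [(mean(p for p, _ in b), mean(c for _, c in b)) for b in bins]

# The two horizontal lines: average change among treated and among control towns.
treated_avg = mean(c for p, c in towns if p < CUTOFF)
control_avg = mean(c for p, c in towns if p >= CUTOFF)

# The difference-in-differences estimate is the gap between those two averages.
gap = treated_avg - control_avg
print(f"{len(bins)} bins, treated {treated_avg:+.2f}, control {control_avg:+.2f}, gap {gap:+.2f} pp")
```

The bins only serve the plot; the estimate itself depends on nothing but the two group averages.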

Chart: Change in far-right vote share by town size. Red line: treated average (below 5,000). Blue line: control average (above 5,000). Gap: +1.54 pp.

The problem is immediately obvious from the graph. What Cremaschi et al. read as an effect of the 5,000 threshold is really a size gradient: far-right support has been climbing faster in small towns than in large ones since the early 2000s. Because the cutoff is defined on population, every treated municipality lies below 5,000 and every control above it, so the design compares not just reform against no reform but also small towns against big ones.

The size gradient is already there before 2010. Break the same towns out by election, 2001 to 2022: the downward slope (smaller towns further right) shows up in every one, and steepens most in 2018 and 2022, years after the reform.
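A short derivation shows how a gradient turns into an apparent step. It rests on a purely illustrative assumption: that the post-minus-pre change in far-right share, written Δy_i, is linear in log population, with an error term that averages to zero in both groups. The symbols α, β, τ and D_i are introduced here for the illustration and are not the paper's notation.

```latex
\begin{aligned}
\Delta y_i &= \alpha + \beta \log(\mathrm{pop}_i) + \tau D_i + \varepsilon_i,
\qquad D_i = \mathbf{1}[\mathrm{pop}_i < 5000], \\
\widehat{\tau}_{\mathrm{DiD}}
&= \mathbb{E}[\Delta y_i \mid D_i = 1] - \mathbb{E}[\Delta y_i \mid D_i = 0] \\
&= \tau + \beta \left( \mathbb{E}[\log(\mathrm{pop}_i) \mid D_i = 1]
   - \mathbb{E}[\log(\mathrm{pop}_i) \mid D_i = 0] \right).
\end{aligned}
```

Treated towns are smaller by construction, so the term in parentheses is negative. If smaller towns are moving right faster (β < 0), the second term is positive, and the estimate is above zero even when the true reform effect τ is exactly zero.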

Four anomalies

I set this out in a paper (Frederik, under review), documenting four anomalies in Cremaschi et al.'s difference-in-differences result.

1. The effect predates the reform, and keeps growing. Run the same specification with a fake "reform" in 2001 and you already find 0.5 pp; in 2006, 0.9 pp; at the real 2010 break, 1.5 pp; with a 2014 break, 2.1 pp. The estimate climbs in roughly equal steps whichever year you call the break.

Chart: Anomaly 1: DID estimate by (placebo) break year. Annotation on the chart: already there before 2010.

The paper's TWFE specification run with alternative break years. The pre-reform placebos at 2001 and 2006 are already significant, and the estimate rises in even steps through and past 2010. The real reform year (red) does not stand out.
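The placebo-break logic can be illustrated on simulated data. The sketch below is not the paper's TWFE specification: it uses a hypothetical balanced panel with no reform at all, hypothetical election years, and a plain 2x2 difference of group means (which, on a balanced panel with a single treated-by-post term, gives the same number as a two-way fixed effects regression). All names and parameter values are assumptions.

```python
import math
import random
from statistics import mean

random.seed(1)
YEARS = [2001, 2006, 2008, 2013, 2018]   # hypothetical election years
CUTOFF = 5_000

# Hypothetical balanced panel with NO reform: the small-town premium in
# far-right share simply widens a little every year.
panel = []  # rows of (population, year, vote share in pp)
for _ in range(2_000):
    pop = math.exp(random.uniform(math.log(500), math.log(100_000)))
    for year in YEARS:
        gradient = -0.03 * (year - 2000)   # pp per log point, steepening over time
        share = 20.0 + gradient * math.log(pop) + random.gauss(0, 1.0)
        panel.append((pop, year, share))

def did_at_break(panel, break_year, cutoff=CUTOFF):
    """2x2 difference-in-differences of group means around a chosen break year."""
    def cell(treated, post):
        return mean(s for p, y, s in panel
                    if (p < cutoff) == treated and (y >= break_year) == post)
    return (cell(True, True) - cell(True, False)) - (cell(False, True) - cell(False, False))

# Every one of these break years is a placebo: nothing happens in the simulated data.
placebo = {b: did_at_break(panel, b) for b in (2004, 2007, 2010, 2015)}
for b, est in placebo.items():
    print(f"break {b}: {est:+.2f} pp")
```

Because the gap between small and large towns widens steadily, any break year splits the panel into an earlier period with a smaller gap and a later period with a larger one, so each placebo returns a positive estimate.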

2. The effect vanishes at the threshold. If a real jump sat at 5,000, it would be sharpest where you compare the largest treated towns with the smallest controls. Instead the estimate fades as the comparison window narrows: about 1.5 pp in the full sample, drifting toward zero as you close in on the cutoff.

Interactive chart: the difference-in-differences, with a slider to tighten the comparison window. Readout as captured: +1.54 pp.

Anomaly 2: the estimate by comparison window

The estimate at every window width at once (line + 95% CI ribbon). The red dot marks the window you have set above. Far from the cutoff it holds near 1.5 pp; as the window narrows toward the threshold, where a true discontinuity should show clearest, it collapses into noise around zero.
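A minimal simulation shows why a smooth gradient behaves this way. The data are hypothetical: the change in far-right share falls smoothly with log population and there is no jump at 5,000. The function name `did_in_window` and the window widths are assumptions of this sketch.

```python
import math
import random
from statistics import mean

random.seed(2)
CUTOFF = 5_000

# Hypothetical towns: (population, change in far-right share in pp), smooth in size.
towns = []
for _ in range(20_000):
    pop = math.exp(random.uniform(math.log(500), math.log(100_000)))
    change = 6.0 - 0.5 * math.log(pop) + random.gauss(0, 1.0)
    towns.append((pop, change))

def did_in_window(towns, half_width, cutoff=CUTOFF):
    """Treated-minus-control gap using only towns within half_width of the cutoff."""
    treated = [c for p, c in towns if cutoff - half_width <= p < cutoff]
    control = [c for p, c in towns if cutoff <= p <= cutoff + half_width]
    return mean(treated) - mean(control)

# math.inf reproduces the full-sample comparison; the rest close in on the threshold.
by_window = {w: did_in_window(towns, w) for w in (math.inf, 3_000, 1_000, 500)}
for w, est in by_window.items():
    print(f"window ±{w}: {est:+.2f} pp")
```

With no discontinuity in the data, towns just below and just above the cutoff are nearly the same size, so the gap between them shrinks toward zero as the window tightens. A genuine jump at the threshold would survive the narrowing.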

3. Placebo thresholds work just as well. Move the cutoff to 2,000, or 10,000, or 20,000, none of which marks any reform. Every one of them yields a large, highly significant "effect". At 10,000, a wholly arbitrary line, the estimate is bigger than at the real threshold.

Interactive chart: the difference-in-differences, with a slider to move the cutoff. Readout as captured: +1.54 pp.

Anomaly 3: every cutoff produces an "effect"

The estimate at every cutoff from 1,000 to 50,000 (line + 95% CI ribbon). The red dot marks the cutoff you have set above. The real 5,000 threshold sits on the same smooth rise, nothing out of the ordinary, so the estimate tracks the size gap, not any reform.

You can put this more starkly still. Remove every municipality below 5,000, the ones the reform actually treated, and run the same difference-in-differences on the towns that are left. None of them was treated, yet a placebo cutoff among them still manufactures an effect: split the survivors at 20,000 and the smaller ones (5,000 to 20,000) sit +1.1 pp above the larger ones.

Chart: Anomaly 3, pushed further: every town below 5,000 removed. Red line: treated average (5,000 to 20,000). Blue line: control average (above 20,000). Gap: +1.1 pp.

The same difference-in-differences on municipalities of 5,000 and up only, with a placebo cutoff at 20,000. None of these towns was treated by the 2010 reform, yet the smaller ones (red) still sit +1.1 pp above the larger ones (blue).
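Both versions of the placebo-cutoff check can be sketched on simulated data. As before, the towns are hypothetical and follow a smooth size gradient with no reform effect anywhere; `did_at_cutoff` is an illustrative helper, not a function from the replication code.

```python
import math
import random
from statistics import mean

random.seed(3)

# Hypothetical towns: (population, change in far-right share in pp), no jump anywhere.
towns = []
for _ in range(4_000):
    pop = math.exp(random.uniform(math.log(500), math.log(100_000)))
    change = 6.0 - 0.5 * math.log(pop) + random.gauss(0, 1.0)
    towns.append((pop, change))

def did_at_cutoff(towns, cutoff):
    """Mean change below the cutoff minus mean change at or above it."""
    below = [c for p, c in towns if p < cutoff]
    above = [c for p, c in towns if p >= cutoff]
    return mean(below) - mean(above)

# Step 1: slide the cutoff. None of these lines marks a reform in the simulated data.
by_cutoff = {k: did_at_cutoff(towns, k) for k in (2_000, 5_000, 10_000, 20_000)}
for k, est in by_cutoff.items():
    print(f"cutoff {k:>6,}: {est:+.2f} pp")

# Step 2: drop every town below 5,000, then split the remainder at 20,000.
untreated = [(p, c) for p, c in towns if p >= 5_000]
placebo_untreated = did_at_cutoff(untreated, 20_000)
print(f"towns of 5,000 and up, cutoff 20,000: {placebo_untreated:+.2f} pp")
```

Every cutoff separates smaller towns from larger ones, so every cutoff picks up the gradient, including one drawn among towns that were never treated.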

4. The same pattern appears where there was no reform. Run the identical design on France (34,710 communes) and Switzerland, where no population-based service reform applies at the 5,000 cutoff. France still "produces" a difference-in-differences estimate of 3.9 percentage points, larger than the Italian headline; Switzerland gives +2.2 pp.

Chart: Anomaly 4: France & Switzerland, where no such reform exists. Red line: treated average (below 5,000). Blue line: control average (above 5,000). Gap shown: +3.9 pp (France).

The same difference-in-differences at a 5,000 cutoff that marks no reform in either country: +3.9 pp in France, +2.2 pp in Switzerland. As in Italy, the treated average (red) sits above the control average (blue) for the simple reason that treated municipalities are smaller.

A unifying explanation

These anomalies all point in the same direction: it is the size gradient, not the reform, that drives the estimate. The simplest test is to add a flexible size control (log population interacted with year) to the paper's difference-in-differences specification and see whether the headline estimate survives. It does not.
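The effect of a size control can be sketched on simulated data. This is a simplified stand-in, not the paper's specification: it collapses the panel to one post-minus-pre change per town, in which case controlling for log(population) in the regression of changes plays the role that a log(population) × year interaction plays in a two-period panel. The data are hypothetical with a true reform effect of zero, and the helpers `slope` and `residualize` are illustrative.

```python
import math
import random
from statistics import mean

random.seed(4)
CUTOFF = 5_000

# Hypothetical towns: smooth size gradient in the change, true reform effect of zero.
log_pop, treated, change = [], [], []
for _ in range(4_000):
    lp = random.uniform(math.log(500), math.log(100_000))
    log_pop.append(lp)
    treated.append(1.0 if lp < math.log(CUTOFF) else 0.0)
    change.append(6.0 - 0.5 * lp + random.gauss(0, 1.0))

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(y, x):
    """Residuals of y after regressing it on x (with an intercept)."""
    b, mx, my = slope(x, y), mean(x), mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

# Without the control: the slope on a 0/1 dummy is the treated-minus-control gap.
naive = slope(treated, change)

# With the control (Frisch-Waugh): partial log(pop) out of both sides, then regress.
controlled = slope(residualize(treated, log_pop), residualize(change, log_pop))

print(f"no size control:   {naive:+.2f} pp")
print(f"with log(pop):     {controlled:+.2f} pp")
```

In the simulation the uncontrolled gap is clearly positive, while the controlled coefficient sits near zero, which is what one expects when the gap comes from the gradient alone. A real discontinuity at 5,000 would remain after the control is added.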

Chart: Adding a log(population) × year control. Red line: treated average (below 5,000). Blue line: control average (above 5,000). Readout as captured: +1.54 pp (the uncontrolled headline estimate).

To sustain a causal interpretation, one would have to explain why the result vanishes the moment a flexible size control is added, why it attenuates to zero in narrow bands around the threshold, why arbitrary cutoffs all produce highly significant effects, and why the same specification yields even larger effects in France and Switzerland, where no such reform exists.

Try it yourself

The chart below lets you check the comparison yourself. Pick a country and an estimator, then drag the population threshold and the bandwidth. The red and blue lines mark the treated and control averages; the gap between them is the estimate. Slide the threshold across its whole range and the gap stays large at every cutoff, because it is reading the size gradient, not a reform.

Controls: Country (Italy, France, Switzerland); Estimator; Population threshold; Bandwidth (RD-style); Size trend control (log(pop) × year).

Sources & method. Italian municipal panel from Cremaschi et al. (2025) (replication package, Harvard Dataverse), extended to the 2022 general election. Full reanalysis and discussion: Frederik (under review). Code: size-gradient-explorer.

References

Cremaschi, S., Rettl, P., Cappelluti, M., & De Vries, C. E. (2025). Geographies of discontent: Public service deprivation and the rise of the far right in Italy. American Journal of Political Science, 69(4), 1581–1599. https://doi.org/10.1111/ajps.12936
Frederik, J. (under review). Geographies of discontent: A reanalysis and discussion. https://github.com/jessefrederik/geographies-of-discontent-reanalysis/blob/main/size_gradient_report.pdf