Erdos Minimum Overlap: Improving a 2023 Lower Bound
The project produced a conservative verified bound of mu >= 0.379544, a stronger working Phase 5 frontier around mu >= 0.3801218, and a concrete view into where AI-assisted mathematical research helps and where proof boundaries still matter.

Can AI systems do real mathematical research, or are they limited to summarizing work humans have already done? I have been testing that question in the most unforgiving way I know: by pointing a multi-agent research loop at a live numerical frontier in mathematics and asking it to make progress that survives verification.
The test case is the Erdos minimum overlap problem. Informally, the problem asks how much overlap is unavoidable when certain measurable sets are translated against each other. Technically, it is a problem about bounding a constant, usually denoted mu, whose known lower and upper bounds are separated by a narrow numerical gap. That makes it a useful benchmark for AI-assisted research: vague insight is not enough. A result has to move a number, preserve a proof structure, and survive adversarial checking.
The project produced a conservative verified improvement:
mu >= 0.379544, improving Ethan White's 2023 lower bound of mu >= 0.379005 by 5.4e-4.
It also pushed a stronger working frontier:
the current Phase 5 stack reports mu >= 0.3801218 after margin, with a pre-margin headline near 0.3801279.
I would not present those two claims as the same kind of result. The first is the clean conservative contribution: a Bochner-PSD strengthening of White's convex program that has been independently re-encoded and checked against the existing proof path. The second is the active research frontier: a stronger computational certificate from a larger stack of constraints and cover refinement that still deserves the usual care before becoming a formal paper claim.
That distinction is the point. The experiment was not to see whether an AI system could write impressive mathematical prose. It was to see whether an AI-assisted workflow could propose constraints, run experiments, reject bad leads, recover from overclaims, and leave behind a result that a serious technical reader can inspect.
The Mathematical Setup
White's 2023 lower bound uses a convex program over Fourier coefficients, plus an ellipse-covering argument that turns local optimization values into a global lower bound. My approach was deliberately conservative: reproduce White's program first, then add constraint families that are mathematically valid for the same object.
The most important early addition was a Bochner moment-matrix constraint. For a nonnegative 2-periodic function f, the Hermitian Toeplitz matrix built from Fourier coefficients must be positive semidefinite. The same condition applies to 1 - f. These constraints tighten the feasible region without changing the underlying proof target: the program can no longer exploit Fourier coefficient patterns that do not correspond to a valid nonnegative function.
At modest scale, this mattered immediately. With Bochner_n=20, row-level SDP values improved by roughly 1.5e-3 to 2.1e-3, materially more than earlier tested inequalities. The binding row also shifted: instead of the baseline being controlled by White's row6 behavior, the Bochner-augmented program made row4 the critical case across the measured scales.
The conservative result from this path is:
| Quantity | Value |
|---|---|
| White 2023 lower bound | 0.379005 |
| Conservative Bochner-PSD lower bound | 0.379544 |
| Improvement | +0.000539 |
| Together Computer upper bound | 0.380871 |
| Remaining gap after conservative result | about 0.001327 |
In ordinary product language, an improvement in the fourth decimal place sounds microscopic. In this problem, that is the terrain. The known gap is small, the proof obligations are tight, and progress is measured by whether a numerical certificate can survive the full chain from convex program to covering argument.
The AI Research Loop
The project used AI agents less like a chatbot and more like a research team with assigned roles. One agent would translate or modify the convex program. Another would run row sweeps across White's seven ellipse centers. Another would independently re-encode the same constraint to check for implementation mistakes. Others handled dual extraction, Path B ellipse extension, solver-gap review, and red-team critique.
That structure mattered because this kind of computational mathematics fails in boring ways. A constraint can be plausible but invalid. A solver can report a value that is numerically useful but not rigorous enough. A row that looks non-binding at one scale can become binding later. A result can improve the objective locally and still fail the global coverage argument.
The AI loop was strongest where the work decomposed cleanly:
- reproduce White's Section 5 convex program;
- add Bochner/Toeplitz PSD constraints for f and 1 - f;
- sweep all seven Table-3 centers rather than cherry-picking one row;
- extract dual lower bounds and solver-gap information;
- rebuild the ellipse-extension argument through independent code paths;
- test tempting alternatives and record negative results instead of burying them.
It was weakest where a single proof obligation required subtle analysis. That showed up in the Lasserre-style experiments: the early numerical improvement looked promising, but the implementation truncated a polynomial moment expansion without a rigorous tail bound. That made the constraint heuristic, not a valid inequality. The correct response was to retract that part and keep the Bochner-PSD contribution separate.
This is exactly how I want AI research systems to behave. Not omniscient. Not theatrical. Useful, parallel, skeptical, and forced to show their work.
Beyond the Conservative Bound
After the Bochner result, the project continued into stronger constraint families and cover iteration. The later stack explored polynomial-moment constraints, even-Hankel PSD constraints, CDE-discovered centers, T5-prime variants, M-side Bochner relaxations, Lasserre-level ideas, and residual saturation analysis.
The strongest working progression is:
| Stage | Result |
|---|---|
| Bochner-PSD conservative verified bound | mu >= 0.379544 |
| Phase 3 polynomial-moment constraints | mu >= 0.380067 |
| Phase 4A, poly_moment k_max=20 | mu >= 0.3801147 |
| Phase 4B, even-Hankel PSD added | mu >= 0.3801199 |
| Phase 5, combined stack with cover iteration | mu >= 0.3801218 after margin |
Crossing 0.380 was meaningful. The Together Computer upper bound is 0.380871, so the Phase 5 working result closes a large fraction of the gap that existed after White's 2023 lower bound. The remaining open interval is roughly:
mu in [0.3801218, 0.380871]
The later experiments also identified which levers were not worth much. The M-side Bochner SOC relaxation appeared empirically inactive. Hankel-PSD helped at modest scale but had limited marginal effect once the scalar polynomial-moment constraints were already present. Cover iteration continued to improve the result, but by Phase 5 the gains were down to a few units in the sixth decimal place.
That kind of negative information is valuable. It turns "try more math" into a specific research roadmap.
What Survived Review
The most important discipline in this project was separating verified contribution from working certificate from future experiment.
The conservative Bochner-PSD improvement survived independent re-encoding and remains the clean public claim. The stronger Phase 5 result is a serious working artifact with detailed logs, but I still treat it as active computational research rather than a finished theorem announcement.
Two corrections were especially important:
First, the Lasserre-level-2 extension that appeared to produce mu >= 0.379828 was retracted because it relied on a truncated moment expansion without a rigorous tail bound. The idea remains promising, but the original implementation was not enough.
Second, a later framework-ceiling claim around 0.380553 was also retracted. The corrected residual analysis showed that the saturation theorem becomes non-vacuous only at larger N: approximately N >= 12,750 for cosine alone, N >= 16,378 for cosine plus sine, and around N = 20,000 for the projected full stack.
That correction changed the conclusion in a useful way. The project did not prove that the framework had hit a theoretical ceiling. It found a tractable next computational threshold.
Why This Matters
I care about this project because it is a more honest benchmark for AI research than most demos. A model can sound brilliant on a whiteboard and still fail the moment a proof obligation depends on a sign, a tail bound, or a solver certificate. The Erdos minimum overlap problem gives very little room for hand-waving.
The experiment suggests a practical division of labor:
- AI agents are already useful for parallel exploration, implementation, replication, and adversarial review.
- They can accelerate the search for valid mathematical levers.
- They can maintain a better audit trail than a single overloaded human working from memory.
- They still need proof boundaries, independent checks, and humans willing to reject attractive but invalid claims.
That last point is not a limitation to hide. It is the design principle. The future of AI-assisted mathematics is not "the model says the theorem is true." It is a research environment where models generate candidates, run experiments, build certificates, challenge each other, and make it easier for humans to decide what is actually true.
Current Research Direction
The next useful experiment is not vague. The corrected analysis points toward larger N threshold runs, especially around N = 16,000 to 20,000, where the cosine-plus-sine saturation bound becomes non-vacuous against Together's upper bound. That is computationally heavier, but not exotic: the projected memory requirement is in the single-digit gigabyte range, and the wall time looks like an overnight run rather than a research program reset.
If that run improves the empirical lower bound, it may narrow the remaining gap further. If it does not, it still gives a sharper answer about the limits of this constraint stack and where new mathematics is needed.
Either outcome is progress. The broader result is that a local, agent-assisted research workflow produced a verified improvement to a live mathematical bound, found a stronger working frontier, corrected its own overclaims, and left a concrete path for the next experiment. That is the version of AI solving math that I find credible: not magic, not autocomplete, but disciplined acceleration of the research loop.