Why galaxies, web networks, optimization landscapes — and perhaps even chess — form clusters, and what those clusters reveal about the structure of the underlying system

Clumping looks universal.
Galaxies condense out of nearly uniform early-universe matter.
PageRank concentrates probability on a handful of influential webpages.
Combinatorial optimization problems produce dense pockets of near-solutions.
Even chess positions seem to fall into plateaus and pits where evaluation changes slowly or chaotically.
The similarity is tempting — but misleading.
Across physics, networks, complexity theory, and even games, clumping is not a mechanism.
It is a diagnostic: the visible footprint of something deeper.
The geometry of the low-eigenvalue modes of the operator governing a system determines where its clumps form, and what those clumps mean.
Some systems have a handful of smooth, dominant modes (gravity).
Some have intermediate spectral bottlenecks (graphs).
Some have dense, ungapped spectra (NP-hard optimization).
Each produces clumps — but for radically different reasons.
Understanding that spectrum tells us how predictable a system is, how compressible it is, how learnable it is — and how hard.
1. Why low modes are the unifying principle
Every system considered here has three ingredients:
A state space
Density fields, directed graphs, bitstrings, chess positions.
A functional
Gravitational potential; random-walk operator; Hamiltonian or cost function; value function of a game.
A flow rule
Physical dynamics; Markov chain convergence; local search; neural evaluation.
Clumping occurs where this flow slows, accumulates, or fails to escape.
Across all these systems, such regions are controlled by small eigenvalues:
- directions where the functional changes least,
- nearly invariant subspaces under dynamics,
- flat or marginal directions of the Hessian,
- low-conductance sets in a graph,
- rugged basins formed by many near-degenerate minima.
That is why low modes unify gravity, PageRank, spin glasses, and evaluation landscapes:
they determine the shape, scale, and meaning of clumps.
2. Gravity: clumps from smooth, low-dimensional instabilities
(Jeans 1902; Binney & Tremaine)
Gravity is the canonical structured landscape.
A small density fluctuation in a fluid of density and sound speed satisfies the linear Jeans equation:
For long wavelengths such that , the frequency becomes imaginary and perturbations grow exponentially in time, signaling gravitational instability.
Worked example
Let and . Then
A 0.1% perturbation grows tenfold in under one Hubble time. Large-scale overdensities collapse into galaxies.
Interpretation
Gravity has very few dominant modes.
Structure formation is governed by long-wavelength instabilities.
The clumps are smooth, coherent, and predictable.
The system is highly compressible.
3. Web networks: clumps from spectral bottlenecks
(Brin & Page 1998; Chung 1997; Cheeger 1970)
PageRank computes the stationary distribution v of the Google matrix:
PageRank does not use the graph Laplacian explicitly — but slow-mixing regions of the random walk correspond to:
- nearly invariant subspaces of P,
- which correspond to low-conductance sets,
- which correspond to small Laplacian eigenvalues (via Cheeger’s inequality).
Thus clumping remains spectral, tied to bottlenecks in the graph.
Worked example
Construct two triangles connected by a single edge.
Random walks mix rapidly within each triangle but leak slowly between them.
The Laplacian’s second eigenvalue is small.
PageRank assigns disproportionate mass to whichever cluster has stronger internal connectivity.
Interpretation
Clumps reveal topology, not physics.
There are more modes than in gravity, fewer than in NP-hard landscapes.
Compressibility is intermediate.
4. NP-hard optimization: clumps from rugged structure
(Sherrington & Kirkpatrick 1975; Mézard, Parisi & Virasoro 1987)
Take subset-sum:
Plot this objective over the hypercube .
You obtain a landscape analogous to a spin glass:
- exponentially many local minima,
- barriers growing with dimension,
- flat directions interspersed with sharp cliffs,
- a dense spectrum of near-zero eigenvalues.
Worked example
Let and be random integers.
Evaluating all configurations reveals:
- many distinct local minima,
- no dominant basin,
- no coarse structure persisting across scales.
Interpretation
Clumping arises from too many competing minima.
The system is maximally incompressible.
Low modes are dense and uninformative.
This is the opposite of gravity.
5. The compressibility spectrum
These systems lie along a single axis determined by their low-eigenvalue structure:
| System | Operator | Low-mode structure | Basin geometry | Compressibility |
|---|---|---|---|---|
| Gravity | Poisson / Jeans | Few, smooth | Large coherent wells | High |
| Web graphs | Random walk | Moderate, topological | Community clusters | Medium |
| NP-hard | Discrete Hamiltonian | Dense, ungapped | Fragmented minima | Low |
Principle
- Few low modes → structured clumps (predictable)
- Several low modes → spectral clumps (clusterable)
- Many low modes → rugged clumps (hard)
6. Edge cases and transitions
Protein folding
Smooth funnels mixed with glassy regions — a hybrid spectrum.
Hierarchical networks
Successive spectral gaps → layered clumps.
Turbulence
Energy cascades generate multi-scale spectral structure.
Phase transitions
In spin glasses and constraint-satisfaction problems, the low-mode spectrum densifies abruptly.
7. Why this matters: prediction, learning, hardness
Predictability
Gravity is predictable at large scales; NP-hard landscapes are not.
Learnability
Neural networks readily learn spectral structure; they struggle with rugged landscapes.
Computational hardness
Smooth → polynomial approximations possible.
Spectral → clustering helps.
Rugged → exponential barriers dominate.
Clump structure indicates what kinds of inference are fundamentally possible.
8. Chess: a system on the boundary
Chess appears to occupy a hybrid regime.
AlphaZero
Rapid spectral decay in value networks (Silver et al., 2018).
Leela Zero
Strong compression in CNN representations.
Stockfish NNUE
Thousands of parameters suffice, indicating inherent compressibility.
Measurement is feasible
Sampling ∼106 positions and extracting leading eigenvalues via randomized SVD is practical.
Hypothesis (testable)
Chess lies mid-spectrum: globally compressible, locally rugged in tactical regions.
A sharp spectral gap implies structural solvability.
A dense near-zero spectrum implies inherent NP-like complexity.
Either result is meaningful.
9. Bottom line
Clumping is ubiquitous — but not universal in cause.
- Gravity: smooth physical instabilities
- Networks: spectral bottlenecks
- NP-hard systems: competing minima
Across all cases:
Clumps reflect the geometry of the low-eigenvalue spectrum — the determinant of predictability, learnability, and complexity.
Clumping is not the phenomenon.
It is the footprint of the geometry underneath.
Formal timestamp:
The Chess Eigenspectrum Hypothesis was published at Zenodo:
https://doi.org/10.5281/zenodo.17845086
https://thinkinginstructure.substack.com/p/the-hidden-geometry-of-clumping
Leave a Reply