Thinking In Structure

Category: Programming

Invariant Selection and the Problem of Novelty
Why good work disappears in stable systems — and when systems quietly outlive their legitimacy

If you publish a good Substack, write a strong novel, or ship a thoughtful indie game, the dominant experience is rarely rejection. More often, it is non-selection. The work does not fail. It simply never enters the flow.

This is usually explained away psychologically: bad timing, weak marketing, the wrong audience. But that explanation is unsatisfying, because the same pattern repeats across domains. Writing, games, research, startups — different surfaces, same outcome.

The deeper reason is not cultural.
It is dynamical.

The hidden rule of modern ranking systems

Most large-scale discovery systems — search engines, recommendation feeds, citation graphs, storefronts — are not designed to find what is new. They are designed to identify what is stable.

They rank according to invariant structure: patterns of attention that persist under repeated mixing.

This family of effects is well known. Preferential attachment, Matthew effects, popularity bias, exposure concentration — these have been documented repeatedly in networks ranging from scientific citations to music streaming (Barabási & Albert, 1999; Merton, 1968; Salganik et al., 2006).

The claim here is not novelty of diagnosis, but precision of mechanism: many systems do not merely reward popularity; they reward self-reproducing patterns of flow.

What is being selected is not “what many people liked once,” but “what keeps being encountered after the system updates itself.”

Not just “rich get richer”

This distinction matters because many popular things do not persist.

Most viral content decays rapidly. In citation networks, the median paper receives the majority of its citations within 2–5 years and then effectively disappears from the flow. In app stores, industry analyses routinely show that well over 90% of indie releases receive negligible long-term visibility.

Popularity spikes are common.
Persistence is rare.

What systems converge on is not raw popularity, but configurations that survive repeated redistribution of attention.

The mathematical core (as approximation, not dogma)

To capture this idea cleanly, it helps to use a simplified model.

Let the discovery process be represented by a linear operator $P$, describing how attention, citations, or visibility move from one node to another.

Invariant ranking means finding a vector $v^*$ such that: $P v^* = v^*$

This says: once attention settles into this pattern, the system’s own dynamics keep it there.

Any component not aligned with $v^*$ decays under repeated application of $P$.

So:

Novelty is structurally transient.
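To make the decay concrete, here is a minimal sketch in Python. The three-node operator is entirely hypothetical (the numbers are chosen only so that each column sums to 1), and the helper names `step`, `iterate`, and `distance` are illustrative, not taken from any real ranking system. The sketch also assumes the well-behaved case in which every eigenvalue of $P$ other than 1 has magnitude below 1; that condition is what makes the non-invariant components shrink.

```python
def step(P, x):
    """Apply the operator once: new_x[i] = sum over j of P[i][j] * x[j]."""
    n = len(x)
    return [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]


def iterate(P, x, steps):
    """Apply the operator `steps` times."""
    for _ in range(steps):
        x = step(P, x)
    return x


def distance(a, b):
    """Total absolute difference between two attention patterns."""
    return sum(abs(u - v) for u, v in zip(a, b))


# Hypothetical 3-node operator. Column j says how node j's attention is
# redistributed, so every column sums to 1 and total attention is conserved.
P = [
    [0.8, 0.3, 0.2],
    [0.1, 0.6, 0.2],
    [0.1, 0.1, 0.6],
]

# Numerical estimate of the invariant pattern v* (P v* = v*).
v_star = iterate(P, [1 / 3, 1 / 3, 1 / 3], 200)

# Start with all attention on the smallest node ("novelty") and measure how
# far the pattern is from v* after 0, 5, 10 and 20 applications of P.
novel_start = [0.0, 0.0, 1.0]
gaps = [distance(iterate(P, novel_start, t), v_star) for t in (0, 5, 10, 20)]
```

For this toy operator the invariant pattern works out to (0.56, 0.24, 0.20), and the entries of `gaps` shrink towards zero: wherever the attention starts, the operator pulls it back to the same shape.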

This model is deliberately reductive. Real systems are not purely linear. They include nonlinear feedback, external shocks, human editorial interventions, and rule changes. But over long horizons — and between shocks — linear flow models often describe the dominant tendency of attention remarkably well.

Think of this not as a law of nature, but as a local approximation, like frictionless planes in physics: wrong in detail, useful in structure.

Why platform “fixes” only partially work

Platforms know invariance is a problem. They add freshness boosts, exploration noise, personalisation, and decay of old signals.

These interventions matter. They create eddies and side currents.

But they rarely change the shape of the riverbed.

Once the perturbation fades, attention flows back into the same channels.

Local exploration does not rewrite global invariants.
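A small sketch of that claim, under loudly stated assumptions: the operator below is a hypothetical three-node toy, and the "freshness boost" is modelled (purely for illustration) as forcing a fixed fraction of all attention onto one node for a limited window. Real boosts are more elaborate; the point is only the shape of the outcome.

```python
def step(P, x):
    """Apply the operator once: new_x[i] = sum over j of P[i][j] * x[j]."""
    n = len(x)
    return [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]


def boosted_step(P, x, node, boost):
    """One step in which a fraction `boost` of all attention is forced onto `node`."""
    y = [(1.0 - boost) * value for value in step(P, x)]
    y[node] += boost
    return y


# Hypothetical operator; each column sums to 1.
P = [
    [0.8, 0.3, 0.2],
    [0.1, 0.6, 0.2],
    [0.1, 0.1, 0.6],
]

x = [0.56, 0.24, 0.20]  # the invariant pattern of this particular P

# Boost window: node 2 receives 30% of all attention by fiat for 10 steps.
for _ in range(10):
    x = boosted_step(P, x, node=2, boost=0.3)
share_during_boost = x[2]

# Boost switched off: the unmodified operator runs for 50 steps.
for _ in range(50):
    x = step(P, x)
share_after_boost = x[2]
```

While the boost is on, node 2 holds well over half of the attention. Once it is removed, its share drifts back to the 0.20 it had before, because the boost altered the input and left the operator untouched.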

TikTok: novelty through instability

TikTok is often cited as a counterexample — and rightly so.

It differs in two key ways:
1. The operator is local and conditional
  The For You Page is not one global ranking, but millions of short-horizon, behaviour-conditioned ones.
2. The time constant is short
  Signals decay aggressively. What worked last week may vanish tomorrow.
The result is not the absence of invariants, but rapid cycling between them.

TikTok surfaces novelty — at the cost of persistence. Volatility replaces obscurity; burnout replaces invisibility.

This confirms the trade-off rather than escaping it:
stability suppresses novelty, novelty requires instability.

Why invariant selection is not a bug

Invariant selection often serves users well.

Stable ranking systems:
- reduce cognitive load
- surface vetted material
- suppress spam and adversarial gaming
- converge quickly to “good enough” outcomes
The cost is conservatism, not inefficiency.

The problem is not that invariant systems exist.
It is that they increasingly dominate every discovery context.

Regime exhaustion: when the river keeps flowing but no longer convinces

Here is the crucial transition:

Some systems continue to function long after they have lost legitimacy.

This is regime exhaustion.

The rankings still converge. The metrics still update. The pipelines still run. But users feel that outcomes no longer reflect quality, relevance, or fairness.

At that point, the problem is no longer optimisation.

It is operator replacement — changing the rules by which attention flows at all.

Operator replacement at scale (made concrete)

Operator replacement rarely looks like collapse. More often it looks like attention routing around the official channels.

Academic publishing is a clean example.

Citation networks preserve canonical work extremely well, but integrate novelty poorly. Over time, legitimacy leaked elsewhere:
- preprints (arXiv)
- conferences overtaking journals in CS
- blogs, talks, and open-source code becoming reputation carriers
The old system continued to function.
It simply stopped being where meaning accumulated.

That is operator replacement.

K-pop, briefly, as circulation physics

K-pop illustrates the same structure in culture.

Its success rests on an engineered circulation system: training pipelines, synchronised releases, fan mobilisation, platform-native artefacts.

Attention recirculates efficiently. That efficiency is the strength — and the limit.

Saturation occurs when the system becomes too good at reproducing itself. Novelty survives mainly as surface variation.

The river flows.
Surprise dries up.

Local rewiring: Japanese indie devs and graph shaping

At smaller scales, creators sometimes intervene directly.

Japanese indie developers on Twitter/X form dense clusters of mutual review, retweeting, and visible interaction. This increases internal connectivity, creating a slow-mixing subgraph where attention lingers before leaking out.

They are not changing the algorithm.
They are reshaping the plumbing the algorithm operates on.

This is not marketing.
It is structural.
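One rough way to put a number on "attention lingers" is sketched below. Everything here is an assumption made for illustration: the graphs are invented, attention is modelled as a random walker that follows a uniformly chosen outgoing link, and every member of the cluster is given the same share of internal links so that a single stay-probability `q` describes the whole cluster.

```python
def retention(links, cluster):
    """Share of links leaving cluster members that land back inside the cluster."""
    outgoing = [(src, dst) for src, dst in links if src in cluster]
    inside = [(src, dst) for src, dst in outgoing if dst in cluster]
    return len(inside) / len(outgoing)


def expected_dwell(q):
    """Mean number of steps until the walker exits, if each step stays inside
    with probability q (a geometric waiting time, counting the exit step)."""
    return 1.0 / (1.0 - q)


cluster = {"a", "b", "c"}

# Loosely connected: each member links to one peer and three outsiders.
sparse = [("a", "b"), ("b", "c"), ("c", "a")] + [
    (m, f"out{m}{k}") for m in "abc" for k in range(3)
]

# Densely connected: each member links to both peers and one outsider.
dense = [(m, p) for m in "abc" for p in "abc" if m != p] + [
    (m, f"out{m}") for m in "abc"
]

sparse_q, dense_q = retention(sparse, cluster), retention(dense, cluster)
sparse_dwell, dense_dwell = expected_dwell(sparse_q), expected_dwell(dense_q)
```

In this toy setup, raising the internal share of links from a quarter to two thirds takes the expected dwell time from about 1.3 steps to about 3. The ranking rule never changed; only the graph it walks on did.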

Beyond individual levers: systemic alternatives

The earlier “three levers” (legibility, local recirculation, graph shaping) describe individual agency. They matter — but they are not the whole story.

Systemic responses also exist:
- decentralised networks (e.g. federated social media) that weaken global invariants
- public-interest discovery systems that privilege diversity over convergence
- regulatory pressure on monopolistic ranking power
None of these are panaceas. Each introduces new trade-offs. But they recognise the same underlying issue: when one operator governs too much of cultural flow, novelty suffocates.

Closing

Ranking systems based on invariant flow are not wrong. They are incomplete by design.

They explain where attention stays, not where it should go. They preserve what already works, not what might work under different conditions.

Understanding this does not guarantee success.
It does something quieter and more honest:

It tells you when the problem is you —
and when it is the riverbed.

And when a river keeps flowing long after it has stopped nourishing the land, the question is no longer how to swim better.

It is whether the course itself needs to change.

Ironically, as this essay itself predicts, its visibility may depend on whether it manages to route around the very invariants it describes.

https://thinkinginstructure.substack.com/p/invariant-selection-and-the-problem
December 15, 2025
The Hidden NP-Complete Problem Sitting in Your Accounting Department
Why matching payments to invoices sometimes defeats software — and what that reveals about modern work.

Everyone learns about NP-complete problems in computer science.
Almost nobody realises that one of them is hidden in the most routine corner of business life:

applying a customer payment to a list of open invoices.

This isn’t a metaphor.
It is literally the subset-sum problem — formally catalogued by Garey & Johnson (Computers and Intractability, 1979) — and explicitly discussed in accounting-reconciliation research such as Pettersson & Strömberg (2007), who identify multi-item invoice matching as a computationally hard variant of subset selection.

But the important point is not that the equivalence exists.
It’s that everyday business practice routinely generates worst-case instances of a famous computational barrier — and accountants are the ones who run into it.

A Worked Example That Shows the Entire Problem

Take a payment of £4,215.

The customer has nine open items:
- £600
- £615
- £700
- £1,200
- £1,300
- £1,415
- £2,000
- £2,015
- (£300) credit note
Try the obvious strategies:
- Greedy (largest-first) → fails
- Date proximity → fails
- Similar-amount grouping → fails
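A correct match?

£2,015 + £1,300 + £1,200 − £300 (credit note) = £4,215.

It is not the only one: £1,415 + £1,300 + £1,200 + £600 − £300 also comes to £4,215, so even an exact hit can leave the match ambiguous.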

This kind of combination is common in real accounts — especially when customers drip payments or credit notes distort the pattern.

And the combinatorics behind the scenes are brutal.
With 1,000 open invoices, the search space is 2¹⁰⁰⁰ subsets (roughly 10³⁰¹), vastly more than the roughly 10⁸⁰ atoms usually estimated for the observable universe.
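The worked example is small enough to check by brute force. The sketch below is illustrative only: the function names are made up, amounts are whole pounds, and the credit note is entered as −300 because it reduces what the customer owes. It runs a largest-first greedy pass and then enumerates all 2⁹ = 512 subsets of the nine open items.

```python
from itertools import combinations

# Open items from the worked example, in pounds; the credit note is negative.
OPEN_ITEMS = [600, 615, 700, 1200, 1300, 1415, 2000, 2015, -300]
PAYMENT = 4215


def greedy_largest_first(items, target):
    """Repeatedly take the largest invoice that still fits.

    Returns the invoices picked and the amount left unallocated.
    """
    remaining, picked = target, []
    for amount in sorted((a for a in items if a > 0), reverse=True):
        if amount <= remaining:
            picked.append(amount)
            remaining -= amount
    return picked, remaining


def exact_matches(items, target):
    """Enumerate every non-empty subset and keep those that hit the target."""
    hits = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            if sum(combo) == target:
                hits.append(combo)
    return hits


picked, shortfall = greedy_largest_first(OPEN_ITEMS, PAYMENT)
matches = exact_matches(OPEN_ITEMS, PAYMENT)

# Size of the search space for 1,000 open items, as a count of decimal digits.
digits_in_search_space = len(str(2 ** 1000))
```

Greedy takes £2,015 and £2,000 and is then stuck £200 short. The exhaustive search finds five subsets that sum to £4,215 exactly, every one of which uses the credit note. Enumeration is trivial at nine items; at 1,000 items the subset count is a 302-digit number, and this approach is no longer an option.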

This is what ERP systems quietly face.

Why This Isn’t Just a Trivia Fact

A few operations-research papers note the connection between reconciliation and subset-sum, but very little writing explains why real-world accounting systems produce the hardest instance types:
1. Repeated invoice amounts
  Creates dense clusters → many candidate subsets.
2. Staggered and partial payments
  Three small payments → exponential branching across ten invoices.
3. Credit notes and adjustments (negative numbers)
  Multiply the space of feasible combinations.
4. Long account histories
  5–15 years of open items is normal in large ERPs.
5. Exact-to-the-penny matching
  No numerical tolerance → no approximate shortcuts.
In other words:
ordinary bookkeeping practices routinely generate pathological subset-sum instances.
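Point 5 has a practical corollary that is easy to show in code: if matching must be exact to the penny, amounts are better held as integers than as binary floating-point numbers. The helper `to_pence` below is a hypothetical name for a sketch that uses Python's standard `decimal` module; it is not drawn from any ERP.

```python
from decimal import Decimal


def to_pence(amount: str) -> int:
    """Convert a decimal string such as '1415.10' to an integer number of pence."""
    pence = Decimal(amount) * 100
    if pence != pence.to_integral_value():
        raise ValueError(f"{amount!r} has a fraction of a penny")
    return int(pence)


# Binary floats cannot represent most decimal fractions exactly,
# so this comparison is False even though the arithmetic "should" work.
float_match = (0.10 + 0.20 == 0.30)

# Integer pence compare exactly, with no tolerance to tune.
pence_match = (to_pence("0.10") + to_pence("0.20") == to_pence("0.30"))
```

With floats, a matcher needs a tolerance, and a tolerance quietly changes the question being asked. With integer pence the test is a plain equality, which keeps the problem honest and also keeps it hard.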

ERP Systems Know This — They Just Don’t Say It

When an ERP displays:

“Unable to automatically apply payment.”

the real meaning is:

“You have asked me to solve an instance of an NP-complete problem, for which no guaranteed fast method is known. Please be the algorithm.”

And this is not speculation.
Real ERP documentation says exactly this — but in more diplomatic language.
- SAP Note 310597 (“Automatic Clearing: Limitations and Manual Intervention”) explicitly acknowledges that SAP’s F.13 auto-clearing fails for “complex multi-item combinations” or when credit memos create ambiguous matches, and must be resolved manually.
- NetSuite’s Help Center — “Applying Payments to Multiple Invoices” states that automatic application may not complete when invoice/credit memo combinations “require user judgment.”
- Oracle Receivables User Guide — “Automatic Receipt Processing Limitations” similarly lists cases where auto-apply halts because “multiple plausible matches exist.”
All three systems — along with Microsoft Dynamics — converge on the same truth:

The software stalls exactly where the mathematics becomes hostile.

Meanwhile, credit controllers perform live combinatorial optimisation.

What Matching Engines Actually Do

Commercial reconciliation tools survive by using layered heuristics:
- date proximity
- behavioural priors (typical ways a customer pays)
- amount clustering
- machine-learned likelihood scoring
- ILP solvers for isolated subproblems
- manual review for anything ambiguous
These handle most cases.
But substantial manual effort persists across large organisations, even after decades of automation attempts — because the bottleneck isn’t a missing feature, it’s a mathematical limit.
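The layering can be sketched in a few lines. This is a deliberately simplified illustration, not a description of how any commercial engine works: `match_payment` and its `max_subset_size` cap are hypothetical, amounts are assumed to be integer pence, and the date, behavioural, and machine-learned layers are left out entirely.

```python
from itertools import combinations


def match_payment(payment, open_items, max_subset_size=4):
    """Layered matcher: cheap rule first, bounded search next, else a human.

    Returns ("auto", items) for an unambiguous match, or
    ("manual", candidate_matches) when a person has to decide.
    """
    # Layer 1: a single open item with exactly the paid amount.
    for item in open_items:
        if item == payment:
            return ("auto", (item,))

    # Layer 2: exhaustive search, but only over small subsets.
    # Larger combinations are never examined, which is the price of the cap.
    hits = []
    for size in range(2, max_subset_size + 1):
        for combo in combinations(open_items, size):
            if sum(combo) == payment:
                hits.append(combo)
    if len(hits) == 1:
        return ("auto", hits[0])

    # Layer 3: no hit, or several plausible ones. Hand over to manual review.
    return ("manual", tuple(hits))


single = match_payment(120000, [60000, 120000])
pair = match_payment(181500, [60000, 61500, 120000])
ambiguous = match_payment(100000, [40000, 60000, 30000, 70000])
unmatched = match_payment(5, [1, 2])
```

The cap on subset size is what keeps the search affordable, and it is also why the last layer never goes away: anything beyond the cap, or anything with more than one exact answer, lands on someone's desk.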

AI doesn’t escape this.
Machine-learning tools don’t “solve” the problem; they learn better heuristics for navigating an NP-complete search space.
Manual review remains essential because the hardness is structural, not technological.

And once you accept that, the deeper point comes into view.

The Larger, More Interesting Point

This isn’t really about accounting or ERP failures.
It’s about a much broader phenomenon:

Many workflows in modern organisations look trivial on the surface yet sit directly on top of computationally hard problems.

Invoice matching is just the clearest example.
Other cases include:
- multi-leg cash application
- FX netting across global entities
- portfolio allocation under constraints
- warehouse picking optimisation
- shift scheduling
- bundled-product revenue recognition
- supply-chain backorder allocation
The “clerical” layer often conceals a theoretical limit — and a persistent research opportunity.

Research implication:
Domain-specific versions of subset-sum may admit specialised algorithms far more efficient than generic formulations. This is an underexplored intersection of computer science, accounting, and operations research.
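One textbook illustration of how structure can help, offered as a sketch and not as a proposal from the research literature: when amounts are integer pence, the classic dynamic-programming approach tracks distinct reachable totals instead of distinct subsets. The number of distinct totals can never exceed the range of possible sums, which for a single customer account may be far smaller than 2ⁿ. The function name is hypothetical and a non-zero payment is assumed.

```python
def subset_sum_dp(amounts, target):
    """Return one subset of `amounts` (integer pence) that sums to `target`,
    or None if no subset does. Assumes `target` is non-zero.
    """
    # Maps each reachable total to the indices of one subset producing it.
    reachable = {0: []}
    for index, amount in enumerate(amounts):
        # Iterate over a snapshot so each item is used at most once.
        for total, picks in list(reachable.items()):
            new_total = total + amount
            if new_total not in reachable:
                reachable[new_total] = picks + [index]
    if target not in reachable:
        return None
    return [amounts[i] for i in reachable[target]]


# Amounts from the worked example, in pounds for readability;
# the same code works unchanged on integer pence.
items = [600, 615, 700, 1200, 1300, 1415, 2000, 2015, -300]
found = subset_sum_dp(items, 4215)
missing = subset_sum_dp([600, 700], 1000)
```

This does not defeat the hardness result. The table can still grow exponentially when totals are spread widely, and it returns only one match, not all of them. It does show the kind of leverage a domain constraint (bounded, penny-denominated amounts) can provide.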

The next time an ERP system refuses to apply a payment automatically, don’t assume incompetence.
Sometimes it’s telling you the truth:

Some tasks in modern business are small on the surface — and NP-complete underneath.

Subset Sum Solver
December 14, 2025

Optimized with proper epsilon handling and performance limits

February 14, 2012