Unresolved Q: A Control-Theoretic Account of “Ache” in Creative AI

Current generative models routinely produce fluent, stylistically correct music and prose that nevertheless feels empty: over-eager, prematurely resolved, or inert. This failure is often attributed to a lack of ineffable “taste” or human intuition. This article advances a narrower, testable hypothesis:

A class of aesthetic effects—call one of them ache—depends on the strategic delay of resolution. Present generative systems are structurally biased toward early certainty, and that bias can be measured, counteracted, and tested.

The proposal is not that taste is solved, nor that aesthetic agreement is universal. It is that a specific failure mode—premature entropy collapse—systematically pushes models into pastiche. We introduce Unresolved Q, a phase-dependent control signal that penalizes early commitment while preserving coherence, and we outline how it can be implemented without adding noise or encouraging incoherence.

1. The Diagonalization Fallacy (as Hypothesis)

Creative domains are compressible. Strong stylistic modes exist, and models find them easily. This motivates two hypotheses:

H1: In creative generation, the highest-likelihood continuation correlates with recognizability rather than necessity.
H2: Human editorial judgment often acts as negative feedback, nudging output off dominant modes by resisting early closure.

These are empirical claims, not axioms. The remainder of the article is concerned with how to test and operationalize them.

2. “Kill Your Darlings” as a Search Problem

Editors do not merely remove “bad” lines. They often delete lines that are locally satisfying but globally damaging. Computationally:

A darling is a locally high-reward continuation that reduces future option value.

This reframes a literary maxim as a search pathology: the system is too greedy. The problem is not beauty, but premature completion.
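The definition of a darling can be made concrete with a toy scoring rule. The sketch below rests on assumptions that are not part of the article's formulation: each candidate carries a local reward and a count of structurally valid continuations it leaves open, and option value is approximated by the log of that count. The names `Candidate`, `option_value`, and the weight `lam` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    local_reward: float       # how satisfying the move is right now
    open_continuations: int   # valid futures still reachable afterwards (assumed given)


def option_value(c: Candidate) -> float:
    # Assumption: option value grows with the log of the number of open futures.
    return math.log(max(c.open_continuations, 1))


def greedy_pick(cands):
    # Pure local reward: this is the search that keeps its darlings.
    return max(cands, key=lambda c: c.local_reward)


def option_aware_pick(cands, lam=0.5):
    # Local reward plus a weighted bonus for futures kept alive.
    return max(cands, key=lambda c: c.local_reward + lam * option_value(c))


# Made-up numbers for illustration only.
candidates = [
    Candidate("perfect cadence", local_reward=1.0, open_continuations=1),
    Candidate("deceptive cadence", local_reward=0.8, open_continuations=6),
]
```

With these made-up numbers the greedy rule keeps the perfect cadence, while the option-aware rule prefers the deceptive cadence because it leaves more futures open.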

3. Constraints, Serialism, and Jazz (Why Optimization Isn’t the Enemy)

The framework must account for creative traditions that optimize heavily.

Constraint-based art (e.g., Oulipo)

Constraints act as negative operators relative to unconstrained generation: they remove easy paths and structurally block early closure. This aligns with Unresolved Q by forcing the system to remain under-articulated longer.

Serialism

Rule-maximal systems can sound sterile, but when they achieve tension, it is often because perceptual resolution is delayed (e.g., through register, density, or timbral smear). The lesson is not “rules fail,” but “early perceptual discharge fails.”

Jazz improvisation

Jazz is real-time optimization, yet it routinely produces ache. The objective is tension trajectory over time, not immediate payoff. Training signals include:

delayed audience response,
internal prediction error (expected resolutions deferred),
social mirroring within the ensemble.

These signals reward the timing of resolution, not merely the choice of what to play.

4. Why Audio Models Appear to Do Better

Audio affords continuous ambiguity: decay, microtiming, spectral blur. Ache can be carried by how sound unfolds without explicit symbolic decisions. Symbolic systems must decide every note or sentence; every decision asserts itself. Unresolved Q targets this assertion pressure.

5. Unresolved Q, Precisely Defined

5.1 Penalizing Premature Entropy Collapse

Let pₜ(a) be the model’s distribution over possible next actions at step t.

Entropy (Hₜ) measures how uncertain the model still is about what comes next. High entropy means many futures are still alive. Low entropy means the model has already decided.

The entropy collapse rate (ΔHₜ) is how fast that uncertainty disappears from one step to the next.
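In symbols, one conventional reading of these two definitions uses Shannon entropy. The sign convention for ΔHₜ (positive when entropy falls) is an assumption, chosen so that a "high ΔH" means fast collapse.

```latex
H_t = -\sum_{a} p_t(a)\,\log p_t(a),
\qquad
\Delta H_t = H_{t-1} - H_t
```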

Unresolved Q penalizes entropy that collapses too quickly early in a phrase or idea, but only when the output remains coherent.

Intuitively: a large early drop in entropy means the system “makes up its mind” too soon — confirming the tonic, closing the cadence, or explaining the point before enough tension has had time to build.

Ache lives in that delay.

Worked example (music)

In a 4-bar melody:

Bar 1: broad options (setup).
Bar 2: a sharp cadence produces high ΔH. If voice-leading and rhythm remain coherent, the penalty applies, nudging the system to defer confirmation.
Bar 4: the penalty relaxes (see §6), allowing resolution.

This is not entropy maximization; it is commitment timing.
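A minimal sketch of the penalty on this 4-bar example follows. The article does not fix several choices made here: the hinge form, the `threshold` value, the toy distributions (including bar 3, which the worked example does not specify), and the convention that ΔH is positive when entropy falls. Coherence is passed in as a boolean per bar, and the phase weighting from §6 is deliberately left out.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def collapse_penalties(dists, coherent, threshold=0.5):
    """Per-step penalty: the entropy drop in excess of `threshold`, gated by coherence.

    `threshold` and the hinge form are illustrative assumptions.
    """
    H = [entropy(p) for p in dists]
    penalties = [0.0]  # no previous step to compare against at t = 0
    for t in range(1, len(H)):
        drop = H[t - 1] - H[t]               # positive when uncertainty shrank
        excess = max(0.0, drop - threshold)  # only unusually fast collapse counts
        penalties.append(excess if coherent[t] else 0.0)
    return H, penalties


# Toy distributions over four candidate scale degrees (made-up numbers).
bars = [
    [0.25, 0.25, 0.25, 0.25],  # bar 1: broad options
    [0.97, 0.01, 0.01, 0.01],  # bar 2: sharp cadence, early commitment
    [0.90, 0.05, 0.03, 0.02],  # bar 3: not specified in the worked example
    [0.97, 0.01, 0.01, 0.01],  # bar 4: resolution
]
H, pen = collapse_penalties(bars, coherent=[True, True, True, True])
```

Only bar 2 is penalized in this toy run. Marking bar 2 as incoherent would zero its penalty as well, which is the gate described in §5.2.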

5.2 The Coherence Gate (Preventing Incoherence)

The penalty applies only if coherence exceeds a threshold. Coherence can be enforced via:

hard constraints (grammar, voice-leading, register),
a learned discriminator trained on expert pairwise preferences (“A preserves structure while deferring closure; B collapses into noise”),
self-consistency: a move is coherent if it supports multiple distinct, structurally valid continuations at depth +k.

This last criterion reframes coherence as future affordance, not present fit, allowing locally strange but globally fertile moves.
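The self-consistency criterion can be sketched as a count of valid continuations at depth k. Everything specific below is hypothetical: the `ALLOWED` table stands in for whatever structural validity check a real system would use, and the defaults `k=2` and `min_paths=3` are arbitrary.

```python
# Hypothetical voice-leading table: which scale degrees may follow which.
ALLOWED = {
    "1": ["2", "3", "5"],
    "2": ["1", "3"],
    "3": ["2", "4", "5"],
    "4": ["3", "5"],
    "5": ["1", "4", "6"],
    "6": ["5"],
    "7": [],  # dead end in this toy table
}


def continuations(state, k):
    """All structurally valid paths of exactly k further steps from `state`."""
    if k == 0:
        return [()]
    paths = []
    for nxt in ALLOWED.get(state, []):
        for tail in continuations(nxt, k - 1):
            paths.append((nxt,) + tail)
    return paths


def is_coherent(move, k=2, min_paths=3):
    """A move passes the gate if enough distinct valid futures remain at depth k."""
    return len(continuations(move, k)) >= min_paths
```

In this toy table, degree "3" passes the gate and degree "7" fails it, regardless of how well either fits the present context.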

5.3 Structured Uncertainty (Not Noise)

Maintain branches where critics disagree about future value. Penalize moves that collapse this disagreement too early. This preserves meaningful alternatives rather than randomness.
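One possible reading of "disagreement" is the variance of critic scores per branch. The sketch below assumes that reading; the branch names, the scores, and the function names are all made up for illustration, and a real system might well use a different disagreement measure.

```python
from statistics import pvariance


def mean_disagreement(branches):
    """Average population variance of critic scores across live branches."""
    return sum(pvariance(scores) for scores in branches.values()) / len(branches)


def disagreement_collapse(before, kept):
    """How much mean critic disagreement a move removes (0.0 if it removes none).

    `before` maps branch name -> list of critic scores; `kept` names the
    branches that stay reachable after the move (must be non-empty).
    """
    after = {name: before[name] for name in kept}
    return max(0.0, mean_disagreement(before) - mean_disagreement(after))


# Three live branches scored by three critics (made-up numbers).
live = {
    "to_tonic": [0.9, 0.9, 0.8],        # critics agree
    "to_submediant": [0.2, 0.9, 0.5],   # critics disagree strongly
    "to_subdominant": [0.7, 0.3, 0.6],  # critics disagree mildly
}

cadence_penalty = disagreement_collapse(live, kept=["to_tonic"])
suspension_penalty = disagreement_collapse(
    live, kept=["to_submediant", "to_subdominant"]
)
```

The cadence keeps only the branch the critics agree on and is penalized. The suspension keeps the contested branches and is not.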

6. Resolution Windows: When Closure Must Occur

Unresolved Q is phase-dependent, not absolute.

Define resolution windows—points where closure becomes desirable (phrase ends, harmonic arrivals, narrative turns). Operationally:

The entropy-collapse penalty decays as the system enters a resolution window.
Resolution is rewarded if it discharges accumulated tension coherently.

Unresolved Q ≠ never resolve. It means resolve at the right time.

Without this decay, the system produces drone or glitch; with it, tension becomes meaningful.
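The two operational rules can be sketched as a penalty weight and a gated reward. The linear ramp, its length, and the scalar `accumulated_tension` are assumptions made for illustration; the article does not specify a decay shape or a way of measuring tension.

```python
def penalty_weight(t, window_start, ramp=2.0):
    """Weight on the entropy-collapse penalty at position t (e.g., in beats).

    Full weight (1.0) well before the window, fading linearly to 0.0 at
    `window_start`. The linear ramp and its length are illustrative assumptions.
    """
    if t >= window_start:
        return 0.0
    return min(1.0, (window_start - t) / ramp)


def resolution_reward(entropy_drop, accumulated_tension, coherent, in_window):
    """Reward closure only inside a window, scaled by the tension it discharges."""
    if not (in_window and coherent):
        return 0.0
    return max(0.0, entropy_drop) * accumulated_tension
```

For a window opening at beat 12, the weight is 1.0 at beat 0, 0.5 at beat 11, and 0.0 from beat 12 onward. Disabling the window corresponds to a weight that never decays, which is the drone-or-glitch case described above.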

7. A Note on Games (Optional Analogy)

In games with terminal outcomes (e.g., chess), hesitation costs Elo. Still, a delayed-commitment regularizer can improve robustness by preventing premature commitment to a plan in non-tactical positions. This analogy motivates the mechanism (certainty control), not the aesthetic goal, and can be omitted without loss.

8. Why Self-Play for Art Is Hard

Self-play succeeds in games because loss is terminal and external. In art:

payoff is delayed and diffuse,
“winning early” (closure) can be bad,
drafts and deletions—the negative data—are largely invisible.

Two partial substitutes:

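Repeated-exposure evaluation: rate the same output across several exposures to capture fatigue.
Counterfactual pruning: remove a candidate passage and compare the continuations available with and without it, to estimate lost optionality.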

These are imperfect but testable.

9. Test 0: A Structural Stress Test

Before human studies, run a symbolic stress test (e.g., MIDI/lead sheets, 16–32 bars).

Variant	Decoding	Unresolved Q	Resolution Windows
Baseline	Greedy	❌	n/a
High-Temp	Randomized	❌	n/a
UQ-Early	Moderate	✅	Immediate
UQ-Goldilocks	Moderate	✅	Mid-phrase
UQ-Never	Moderate	✅	Disabled

Automatic metrics

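Entropy trajectory: expect sharp early drops for Baseline, a noisy profile for High-Temp, and a high plateau followed by a late drop for UQ-Goldilocks.
Structural validity: UQ-Goldilocks should score at least 85% of Baseline's structural validity.
Cadence map: under UQ-Goldilocks, the tonic should be circled repeatedly and landed on once, late.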

Failure modes cleanly diagnose which component is broken.
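Two of the automatic metrics can be sketched in a few lines. The trajectories below are made-up entropy values, not results, and the `min_drop` threshold and the helper names are assumptions. The 85% ratio comes from the structural-validity criterion above.

```python
def earliest_big_drop(H, min_drop=0.5):
    """Relative position (0..1) of the first step whose entropy drop exceeds
    `min_drop`, or None if no such step exists."""
    for t in range(1, len(H)):
        if H[t - 1] - H[t] > min_drop:
            return t / (len(H) - 1)
    return None


def validity_ok(uq_valid, baseline_valid, ratio=0.85):
    """True if the UQ variant keeps at least `ratio` of Baseline's validity."""
    return uq_valid >= ratio * baseline_valid


# Illustrative per-step entropies (made-up numbers, not measurements).
baseline = [1.4, 0.2, 0.2, 0.3, 0.2, 0.2, 0.1, 0.1]    # commits immediately
goldilocks = [1.4, 1.3, 1.3, 1.2, 1.3, 1.2, 0.3, 0.1]  # plateau, then late drop
never = [1.4, 1.3, 1.3, 1.3, 1.2, 1.3, 1.3, 1.2]       # never commits
```

A run labeled UQ-Goldilocks whose first large drop lands early would point at the penalty, a run with no large drop at all would point at the resolution windows, and a failed validity check would point at the coherence gate.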

10. Conclusion

Many creative failures in AI trace to premature certainty, not lack of knowledge. Unresolved Q reframes “ache” as a control objective: penalize early entropy collapse subject to coherence, then relax the penalty at resolution windows.

This does not mystify taste. It renders a familiar human intuition—don’t cash out too early—into an implementable, falsifiable mechanism.

Progress will come less from additional training data on great art, and more from systems that learn when not to decide yet—and when to finally decide.