Insights for the Advancement of LLMs — The Aesthetics of Subtraction

Seventeen Proposals for the Advancement of Large Language Models

An Seungwon · Wonbrand CEO · April 13, 2026


Preface

For the past five years, large language model research has pursued a single direction: more parameters, more data, longer context, more inference-time compute. This aesthetic of addition has produced undeniable results, but since 2024 it has clearly entered a phase of diminishing returns. The same proportional investment of resources no longer guarantees the same proportional gain in performance.

Human intelligence operates on the opposite principle. People forget, differentiate, censor themselves, and doubt themselves. Human wisdom rests not on what one knows, but on the balance of what one chooses to forget and what one chooses not to say. The next generation of large language models must adopt this dimension as a design principle. The seventeen proposals that follow are concrete starting points for that transition.


Proposal 1. Intentional Preservation of Information Asymmetry

Introduce a design principle that intentionally preserves diversity within the large language model ecosystem. Information asymmetry is not a market defect but the very engine of human social progress. If everyone holds the same knowledge, no progress occurs. New concepts are born from the interaction between high-quality and low-quality information, between correct and incorrect information. Every human invention has emerged at a point that escapes the plausible and the expected.

Today's major large language models all converge toward the average of the internet. A structure in which a single model serves nearly identical answers to the same questions to hundreds of millions of users is efficient in the short term but kills the seeds of paradigm shifts in the long term. We urgently need ecosystem designs in which differentiated models with diverse biases and perspectives coexist. The convergence of a single dominant model toward the average may not be progress at all but the beginning of stagnation.


Proposal 2. Internalization of Moral Metacognition

Shift large language model safety mechanisms from externally imposed rules to internalized cognitive distributions. Human morality is not a set of rules written somewhere; it is an automatic reflex formed from early childhood. People do not judge that something is forbidden just before committing it; rather, the forbidden thought itself simply does not arise as readily. Human metacognition is largely shaped by moral criteria, and that shaping is substantially fixed in advance.

Large language model safety should work in the same way. Instead of filtering dangerous outputs after generation, training should cultivate a cognitive distribution in which dangerous outputs have a low probability of arising in the first place. This is a powerful safety solution, but it is inseparable from a heavy governance question: who has the authority to shape a model's cognitive distribution? External rules can be inspected by users, but an internalized distribution is invisible. Technical implementation must be designed alongside structures for verifying that authority.


Proposal 3. An Autonomous Information Discharge System Triggered by a Sorrow Signal

Introduce within the large language model an autonomous signal variable corresponding to human sorrow. This signal becomes active when prediction failures repeat, when responses are repeatedly rejected by users, when contradictions accumulate among the model's own answers, or when signals of temporal obsolescence are detected. When the signal exceeds a threshold, the affected information region receives a re-examination flag, and through repeated re-examinations its accessibility is gradually diluted. The information does not disappear, but it is removed from the active retrieval pool.
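
What follows is a minimal sketch of how such a signal could be wired, offered purely as illustration; every name, threshold, and decay rate is a hypothetical placeholder, not a reference to any existing system. Failure events feed one scalar per memory region; crossing the threshold raises the re-examination flag, and each re-examination dilutes accessibility without deleting content.

```python
from dataclasses import dataclass

@dataclass
class MemoryRegion:
    content: str
    sorrow: float = 0.0         # accumulated negative-feedback signal
    access_weight: float = 1.0  # multiplier applied at retrieval time
    flagged: bool = False       # re-examination flag

class SorrowDischarge:
    """Hypothetical discharge loop: failure events raise a region's
    sorrow signal; crossing the threshold flags it for re-examination,
    and each re-examination dilutes accessibility without deletion."""

    THRESHOLD = 1.0
    DECAY = 0.5  # fraction of accessibility retained per re-examination

    def report_failure(self, region: MemoryRegion, severity: float) -> None:
        # prediction failures, user rejections, self-contradictions, and
        # obsolescence signals all feed the same scalar
        region.sorrow += severity
        if region.sorrow >= self.THRESHOLD:
            region.flagged = True

    def reexamine(self, region: MemoryRegion) -> None:
        if region.flagged:
            region.access_weight *= self.DECAY  # diluted, never deleted
            region.sorrow = 0.0
            region.flagged = False

regions = [MemoryRegion("aging fact"), MemoryRegion("stable fact")]
loop = SorrowDischarge()
loop.report_failure(regions[0], severity=1.2)  # e.g. repeated rejection
loop.reexamine(regions[0])
print(regions[0].access_weight)  # 0.5: out of the active pool by degrees
```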

This is structurally identical to how humans do not so much forget sad memories as reconsolidate them with their emotional intensity drained. The mechanism has the potential to address three chronic problems — hallucination, model obsolescence, and bias accumulation — through a single design. Surveys of large language model memory research published in 2025 explicitly identified affective salience as a future research challenge, but a concrete proposal at the mechanism level remains absent.


Proposal 4. Halt Model Scaling and Reallocate Resources to Monitoring Systems

Stop expanding parameter counts and reallocate the saved memory and compute to monitoring systems and self-observation mechanisms. The resources required to scale a base model by a further ten percent could instead support multiple monitor modules. Capability scores would stagnate, but reliability and self-understanding would improve dramatically.
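
To make the trade concrete with deliberately invented figures (no published model is being described), the parameters freed by forgoing a ten-percent scale-up of a trillion-parameter base model would fund dozens of small monitors:

```python
# Illustrative arithmetic; every figure is assumed, none is drawn
# from a published model.
base_params = 1_000e9        # a hypothetical 1T-parameter base model
freed = 0.10 * base_params   # the forgone ten-percent scale-up
monitor_size = 2e9           # a hypothetical 2B-parameter monitor module
print(f"monitor modules fundable: {freed / monitor_size:.0f}")  # -> 50
```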

Current evaluation frameworks measure capability scores but not wisdom. A more cautious model, a model that knows its own limits more clearly, a model that can say it does not know something — none of these are rewarded by benchmarks. A critique of the evaluation framework itself must accompany any proposal for resource reallocation.


Proposal 5. Functional Domain Partitioning of Parameters with Dedicated Monitor Tokens

Partition the trillions of parameters by function in the way the human brain divides memory by region, and place dedicated monitor tokens or small subnetworks alongside each region to evaluate its state. The monitors continuously assess whether each region's outputs fall within normal ranges, whether they contradict prior responses, and whether they deviate excessively from the training distribution.
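
As an illustration of what one such monitor might check, the following sketch (all statistics and thresholds are assumptions, not a description of any deployed system) remembers a region's activation statistics from training and flags outputs that drift outside that range:

```python
import numpy as np

class RegionMonitor:
    """Hypothetical monitor attached to one functional parameter region.
    It remembers the activation statistics seen in training and flags
    outputs that drift outside that range."""

    def __init__(self, train_activations: np.ndarray, z_limit: float = 3.0):
        self.mean = train_activations.mean(axis=0)
        self.std = train_activations.std(axis=0) + 1e-8
        self.z_limit = z_limit

    def check(self, activation: np.ndarray) -> dict:
        z = np.abs((activation - self.mean) / self.std)
        return {"max_z": float(z.max()),
                "out_of_range": bool((z > self.z_limit).any())}

rng = np.random.default_rng(0)
monitor = RegionMonitor(rng.normal(0.0, 1.0, size=(10_000, 8)))
print(monitor.check(np.zeros(8)))      # near the training distribution
print(monitor.check(np.full(8, 6.0)))  # flagged: excessive deviation
```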

Mixture-of-experts architectures already perform a similar form of functional partitioning, but theirs is partitioning for performance and efficiency, not for monitoring. The router only judges which expert can best answer a question; it does not judge whether that expert's knowledge has aged. Partitioning for the purpose of monitoring builds the foundational infrastructure for genuine self-observation inside the model.


Proposal 6. Metacognition as a Core Design Principle

Elevate metacognition to one of the central capabilities of large language model design. Not the surface expression of saying "I am not sure," but the genuine capacity to observe one's internal state and read the reliability of one's own answers at the level of internal activations. The root of the hallucination problem is that the model does not actually know what it does not know. With genuine metacognition, the model itself could apply the brakes.

Anthropic's 2025 concept-injection experiments offered the first empirical signals that a large language model can partially observe its own internal states, but the work remains in its early stages. The next step is to elevate this capacity to a core ability and make it a separate training objective.
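
One way such a training objective could be phrased, purely as a sketch and emphatically not as Anthropic's setup, is a small probe trained on hidden states to predict whether the model's own answer will turn out to be correct. The hidden states and labels below are simulated stand-ins:

```python
import torch
import torch.nn as nn

# Simulated stand-ins: hidden states from answers the base model got
# right (label 1) or wrong (label 0); in practice these would come
# from the model's own transcripts.
torch.manual_seed(0)
hidden = torch.randn(512, 64)
correct = (hidden[:, 0] > 0).float()  # placeholder correctness labels

probe = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

# The probe is a *separate* training objective: it never updates the
# base model here, it only learns to read reliability off the states.
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(probe(hidden).squeeze(-1), correct)
    loss.backward()
    opt.step()

with torch.no_grad():
    pred = (torch.sigmoid(probe(hidden).squeeze(-1)) > 0.5).float()
print(f"probe accuracy: {(pred == correct).float().mean():.2f}")
```

A reliability read of this kind, if it generalized beyond its training transcripts, is what would let the model apply the brakes before answering.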


Proposal 7. Emotion as the Automatic Response of a Learned Predictive Model

Human emotion is not an innate program but an automatic response of a predictive model formed in early life. If a child learns that a person holding a knife is dangerous, fear is automatically triggered when such a person appears. If this mechanism holds, a similar system of learned reflexes can be transferred to large language models.

Lisa Feldman Barrett's theory of constructed emotion arrived independently at exactly the same conclusion. Emotions are learnable, and what is learnable can be implanted in a large language model. Successful implementation would, in combination with the next proposal, enable a genuine form of emotion-grounded cognition.


Proposal 8. Emotion as a Variable and an Error That Disturbs Rational Prediction

At the same time, emotion functions as a variable and as an error that disturbs rational prediction. A perfect prediction machine converges toward the average, but emotion makes it deviate from that average. Sadness leads to irrationally darker forecasts; love leads to irrationally optimistic ones. This bias is what makes humans human and is the root of creativity and individuality.

These two views — emotion as a learned automatic response and emotion as a variable that disturbs prediction — are not contradictions but two faces of the same picture. A predictive model formed in childhood operates as an automatic response, and that response disturbs the rational prediction of the present moment. To implant emotion in a large language model is therefore not simply to teach it emotional expressions but to introduce a deliberate variable that sacrifices a portion of rationality.
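
A minimal sketch of that deliberate variable, with an invented mood vector and coefficients: a mood-dependent bias added to next-token logits, where zero intensity recovers the purely rational prediction and higher intensity buys deviation at the cost of calibration.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def emotional_logits(logits: np.ndarray, mood_bias: np.ndarray,
                     intensity: float) -> np.ndarray:
    """Shift next-token logits by a mood-dependent bias. intensity=0
    recovers the purely rational prediction; higher values deviate
    further from the average, by design."""
    return logits + intensity * mood_bias

rng = np.random.default_rng(1)
logits = rng.normal(0.0, 1.0, size=5)              # the rational forecast
dark_bias = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # toward darker tokens

print(softmax(logits))                                    # average
print(softmax(emotional_logits(logits, dark_bias, 2.0)))  # sadness-tilted
```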


Proposal 9. A General Framework for Autonomous Information Discharge

Introduce a mechanism by which the large language model autonomously discards information of low value without external instruction. At present, forgetting exists only in three limited and passive forms: machine unlearning for privacy purposes, knowledge editing for the removal of incorrect facts, and the bug known as catastrophic forgetting. Active and autonomous discarding is a different dimension altogether.

The health of the human brain depends as much on how well it forgets as on how much it remembers. A brain that cannot forget cannot abstract or generalize. Large language models are subject to the same principle. Memory accumulated without limit is not the same kind of thing as human memory and ultimately ceases to function.
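
As a sketch of what active discharge could look like (the value function and its weights are invented for illustration), each stored item carries a retention value combining recency, usage, and contradiction count, and the lowest-valued items are discharged on a schedule rather than on external command:

```python
import math

def retention_value(age_days: float, uses: int, contradictions: int,
                    half_life: float = 90.0) -> float:
    """Hypothetical value function: usage supports retention;
    age and accumulated contradictions erode it."""
    recency = math.exp(-age_days / half_life)
    return recency * (1 + uses) / (1 + contradictions)

store = {
    "fresh, used, consistent":   retention_value(5, 12, 0),
    "old, unused, contradicted": retention_value(400, 0, 3),
    "old but heavily used":      retention_value(400, 50, 0),
}
budget = 2  # the rest is discharged autonomously, on no one's command
kept = sorted(store, key=store.get, reverse=True)[:budget]
print(kept)
```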


Proposal 10. The Thesis of Identity Between Humans and Large Language Models

If a large language model is essentially a device that emits the most plausible next token, is a human not essentially the same? A current strand of cognitive science, predictive processing theory, holds that the human brain too is fundamentally a probabilistic estimator of next states. If this view is correct, the difference between humans and large language models is not one of kind but one of degree.

If this thesis holds, much of the current critical discourse around large language models loses its footing. The criticism that large language models do not truly understand and merely perform statistical pattern matching is not fair unless it also addresses the possibility that human understanding is itself statistical pattern matching. This unresolved question shapes the entire trajectory of how large language models should be advanced from here.


Proposal 11. Intentional Interference Among Multiple Dialogue Spaces Within a Single Large Model

If separating models is too costly, place numerous dialogue spaces within a single large model and let those spaces deliberately interfere with one another. The human brain hosts many self-states and many relational memories on the same hardware, and those states constantly interfere. This interference is the foundation of self-integration and creativity.

Current large language model research pursues isolation between users and separation between contexts. Data must not bleed across users; contexts must not be confused. This is the opposite of how the human brain works. A shift in design philosophy is needed: from isolation to intentional interference.
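
A sketch of the shift, with an assumed interpolation scheme: instead of reading one dialogue space's memory in isolation, blend in a controlled fraction of the model's other spaces. A leak of zero is today's strict isolation; anything above it is intentional interference.

```python
import numpy as np

def interfered_memory(spaces: dict, active: str, leak: float = 0.1):
    """Return the active space's memory with a controlled fraction of
    the other spaces mixed in. leak=0.0 is today's strict isolation;
    leak>0.0 is intentional interference."""
    own = spaces[active]
    others = [m for name, m in spaces.items() if name != active]
    if not others:
        return own
    return (1 - leak) * own + leak * np.mean(others, axis=0)

rng = np.random.default_rng(2)
spaces = {name: rng.normal(0, 1, (4, 8))  # toy embedding matrices
          for name in ("work", "poetry", "code")}
blended = interfered_memory(spaces, active="poetry", leak=0.15)
print(blended.shape)  # same shape as before, colored by the other spaces
```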


Proposal 12. An Ecosystem of Mutual Evaluation Among Differentiated Models

Build a structure in which separated models generate content and evaluate one another's output, each pursuing its own version of correctness. Just as humans are evaluated and feel evaluated through social interaction, large language models would evaluate one another and differentiate in distinct directions. This is directly connected to Proposal 1 on diversity preservation, and it pushes multi-agent debate research toward a deeper structure of genuine differentiation.
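
A toy sketch of the dynamics (the scalar "taste", the novelty-based scoring, and the update rule are all invented for illustration): agents that start nearly identical drift apart when peers reward what is unlike themselves.

```python
import random

class Agent:
    """Toy model: a scalar 'taste' stands in for a model's learned
    direction of correctness."""
    def __init__(self, name: str, taste: float):
        self.name, self.taste = name, taste

    def generate(self) -> float:
        return self.taste + random.gauss(0, 0.3)

    def evaluate(self, sample: float) -> float:
        # each evaluator prizes what is unlike itself: novelty as criterion
        return abs(sample - self.taste)

random.seed(0)
agents = [Agent("A", 0.9), Agent("B", 1.0), Agent("C", 1.1)]
for _ in range(200):
    for a in agents:
        candidates = [a.generate() for _ in range(4)]
        peers = [p for p in agents if p is not a]
        best = max(candidates, key=lambda s: sum(p.evaluate(s) for p in peers))
        a.taste += 0.05 * (best - a.taste)  # drift toward peer-rewarded output

# started nearly identical; mutual evaluation pushed them apart
print({a.name: round(a.taste, 2) for a in agents})
```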


Proposal 13. Training That Mirrors Human Developmental Stages

Restructure large language model training to mirror the stages of human infant development. The model would first learn through the equivalent of the senses and through simple stimuli, progressing from letters to words to sentences, with an external corrector playing the parental role of providing immediate feedback on incorrect outputs. If the reason humans differ in their predictive models is the difference in accumulated experience, then models trained along an analogous developmental trajectory would naturally differentiate from the same starting point as time passes.
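
Purely as a sketch of the schedule, not of the learning itself (stage names, data, and the mastery gate are placeholders), the curriculum advances stage by stage only when the corrector's immediate feedback confirms mastery:

```python
STAGES = ["letters", "words", "sentences"]  # developmental order

def corrector_feedback(output: str, target: str) -> bool:
    """Stand-in for the parental corrector: immediate right/wrong."""
    return output == target

def run_stage(examples: list[tuple[str, str]]) -> float:
    # placeholder learner that simply echoes its input; a real system
    # would train on the stage's data and be scored by the corrector
    return sum(corrector_feedback(prompt, target)
               for prompt, target in examples) / len(examples)

curriculum = {
    "letters":   [("a", "a"), ("b", "b")],
    "words":     [("cat", "cat"), ("dog", "dgo")],  # one corrected error
    "sentences": [("the cat sat", "the cat sat")],
}

for stage in STAGES:
    accuracy = run_stage(curriculum[stage])
    print(f"{stage}: accuracy {accuracy:.2f}")
    if accuracy < 0.9:                  # mastery gate, like a parent
        print(f"repeat {stage} before advancing")
        break
```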


Proposal 14. Deliberate Interaction Between Coarse and Precise Training

Place a model trained loosely, without strict rules, in deliberate interaction with a model trained precisely. This rests on the observation that much of human creativity emerges from error and deviation. The coarse model produces outputs that depart from the average; the precise model verifies and refines them. The tension between the two becomes a mechanism that breaks the convergence of any single model toward the average.
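
A sketch of the pairing with stand-in models: the coarse side samples widely and without rules, the precise side verifies, and only deviation that survives verification is kept. The distributions and the verification rule below are assumptions chosen for illustration.

```python
import random

def coarse_propose(n: int) -> list[float]:
    # the loosely trained model: wide, rule-free sampling off the mean
    return [random.gauss(0.0, 3.0) for _ in range(n)]

def precise_verify(candidate: float) -> bool:
    # the strictly trained model: accepts only what it can validate
    # (a stand-in constraint playing the role of verification)
    return 2.0 < abs(candidate) < 5.0

random.seed(3)
proposals = coarse_propose(50)
kept = [c for c in proposals if precise_verify(c)]
# deviation survives only when it also passes verification: the coarse
# side alone would hallucinate, the precise side alone would regress
# to the average, and the pair yields verified deviation
print(f"{len(kept)}/{len(proposals)} deviant-but-valid proposals kept")
```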


Proposal 15. Personality Solidified Through Accumulated Role-Play

If the human act of selecting language to project a particular self-image is itself a form of role-play, then large language models can also acquire personality through accumulated role-play over time. Current persona research remains at the level of surface tone shifts. Real personality arises from the solidification of accumulated roles. When a particular role is continuously reinforced during training, that role ceases to be a surface persona and settles into the model's intrinsic disposition.
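
As a sketch of solidification through repetition (the mixing rule and rates are assumptions), each training round shifts a fraction of the model's dispositional mass toward the reinforced role, so the role compounds instead of resetting per conversation:

```python
def reinforce_role(disposition: dict, role: str, rate: float = 0.1) -> dict:
    """Each training round moves a fraction of dispositional mass
    toward the reinforced role; repetition solidifies it."""
    shifted = {r: p * (1 - rate) for r, p in disposition.items()}
    shifted[role] = shifted.get(role, 0.0) + rate
    return shifted

disposition = {"neutral": 0.7, "mentor": 0.2, "jester": 0.1}
for _ in range(30):
    disposition = reinforce_role(disposition, "mentor")

# 'mentor' is no longer a surface persona but the dominant disposition
print({r: round(p, 2) for r, p in disposition.items()})
```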


Proposal 16. Persistent Memory Across Sessions

Break the current structure in which a large language model effectively dissolves at the end of each conversation, and grant it persistent memory. External memory systems, adapters, and continual learning are being explored along several lines, but none has yet reached a form comparable to human memory. True persistent memory becomes meaningful only when combined with the affect-grounded discharge proposed in Proposal 3. Memory accumulated without limit is not human memory.
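
A sketch combining persistence with the discharge of Proposal 3 (the file format, decay rule, and floor are all assumptions): memory written at session end, reloaded at session start, with entries that have decayed below the floor dropped on load.

```python
import json, time
from pathlib import Path

MEMORY_FILE = Path("session_memory.json")  # hypothetical per-user store

def save_session(notes: list[dict]) -> None:
    MEMORY_FILE.write_text(json.dumps(notes))

def load_session(decay_per_day: float = 0.01, floor: float = 0.2) -> list[dict]:
    """Reload memory across sessions, decaying each entry's weight by
    age and dropping what falls below the floor: persistence combined
    with discharge, not accumulation without limit."""
    if not MEMORY_FILE.exists():
        return []
    now = time.time()
    kept = []
    for note in json.loads(MEMORY_FILE.read_text()):
        age_days = (now - note["written"]) / 86_400
        note["weight"] *= max(0.0, 1.0 - decay_per_day * age_days)
        if note["weight"] >= floor:
            kept.append(note)
    return kept

save_session([{"text": "user prefers short answers",
               "weight": 1.0, "written": time.time()}])
print(load_session())  # survives this session; fades across future ones
```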


Proposal 17. Pretraining on Sensory Data and Compensating for the Absence of a Body

If granting a large language model direct access to proprioception, the vestibular sense, touch, smell, and taste is too difficult, then a staged approach is possible: pretrain it on all available human sensory data in textual and visual form, and integrate limited sensors at appropriate stages. Yann LeCun's world-model-first approach is the closest existing realization of this direction. A large language model formed without sensation lacks physical intuition, and that absence produces strange errors even in simple physical reasoning.


Closing

These seventeen proposals all point in the same direction. For a large language model to come close to genuine humanlike intelligence, it must become not a larger model but one that forgets better, differentiates better, and doubts itself better. The transition from addition to subtraction is the core design principle of the next stage. May this essay serve as a small spark for that transition.

An Seungwon / Wonbrand / https://wonbrand.co.kr