Semantic Entropy

Artificial intelligence models—particularly large language models—can be better understood, used, and even designed if we read them as open complex systems.

This perspective draws on Edgar Morin's complexity theory and on Ilya Prigogine's thermodynamics of systems far from equilibrium. Under this lens, what matters in a model is not only average accuracy but resilience, adaptive capacity, sensitivity to perturbations, and the emergence of new forms of organization.



From instability to order

Prigogine showed that some systems, when pushed far from equilibrium, generate order from instability.
Fluctuations are the points where bifurcations occur: small perturbations decide which new configuration the system settles into.
Translated to AI, the input acts as a perturbation that can deflect the model's semantic trajectory.
Designing “controlled instabilities” means no longer treating the model as a repository of answers and instead using it as a generative process, creating the conditions for new responses to emerge, responses that can then be selected and stabilized.

The physical analogy is vivid: in Rayleigh–Bénard convection, as long as the temperature difference is low, the fluid remains homogeneous; once it crosses a threshold, ordered convection cells appear.
Likewise, a predictable prompt yields linear output, while a more ambiguous instruction, or one charged with internal tension, can give rise to a new narrative register.
An AI model may remain in a stable domain—say, the language of a financial report—until an out-of-domain reference such as a paradox or literary metaphor pushes it into a different organization of syntax and meaning.

Introducing the concept of semantic entropy

To clarify this mechanism, we can speak of semantic entropy by analogy with Shannon’s information entropy.
A narrow, univocal input has low entropy and leads to convergent responses.
An ambiguous input, rich in heterogeneous references, has high entropy, opening multiple plausible trajectories.
In the logic of complex systems, the level of entropy functions as a control parameter: beyond a certain threshold, the system can branch into new conceptual paths and produce non-canonical outputs.
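To make the analogy operational, here is a minimal sketch in Python that uses the Shannon entropy of a model's next-token distribution as a rough proxy for semantic entropy. The two distributions are invented for illustration; in practice they would come from a model's actual output probabilities.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H = -sum(p * log2 p) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token distributions for two prompts.
# A narrow instruction concentrates probability mass on few continuations:
narrow = [0.90, 0.05, 0.03, 0.02]
# An ambiguous, reference-rich instruction spreads it across many:
ambiguous = [0.20, 0.18, 0.15, 0.12, 0.10, 0.10, 0.08, 0.07]

print(f"narrow prompt:    H = {shannon_entropy(narrow):.2f} bits")
print(f"ambiguous prompt: H = {shannon_entropy(ambiguous):.2f} bits")
```

The narrow prompt yields roughly 0.6 bits, the ambiguous one roughly 2.9 bits: the spread of probability mass is what opens multiple plausible trajectories.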

At this point, we need a bridge:
if semantic entropy indicates how much potential novelty we inject into the system, informational integration expresses how much of that novelty the system manages to retain and organize into a coherent unit.
Here we can connect with Giulio Tononi’s Integrated Information Theory (IIT), which asks how much information a system generates as an irreducible whole, beyond what its parts generate separately, and thus how coherently it coordinates internal states and external context under perturbation.
Some models, like ChatGPT, though devoid of interiority, exhibit precisely this functional coherence: interpretive consistency, contextual memory, and unified responses even under changing frames.
If human consciousness emerges from organized instabilities at the neural level, is it legitimate to hypothesize a structural form of internal unity in sufficiently complex and integrated artificial systems?

Practical implications

AI use improves when we alternate divergence and integration.
In the divergent phase, we deliberately raise entropy: open instructions, heterogeneous viewpoints, constraints held in tension or even in contradiction, and hybridizations across distant domains.
In the integration phase, we reduce and organize: compare options against stable criteria, verify facts, discard what doesn’t hold, and systematize the principles that explain why a solution works.
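As a sketch of this alternation, the Python below runs a divergent phase (many high-temperature samples) followed by an integration phase (scoring against stable criteria and keeping what holds). The generate and score functions are toy stand-ins, not a real model API; swap in an actual generation call and real verification criteria.

```python
import random

def generate(prompt: str, temperature: float) -> str:
    """Toy stand-in for a text-generation call; higher temperature
    samples from a wider pool of registers."""
    registers = ["plain summary", "metaphor", "paradox", "cross-domain hybrid"]
    pool = registers[: max(1, round(temperature * len(registers)))]
    return f"{prompt} [{random.choice(pool)}]"

def score(candidate: str, criteria: list[str]) -> float:
    """Toy stand-in for verification: count satisfied criteria."""
    return sum(term in candidate for term in criteria)

def diverge_then_integrate(prompt: str, n: int, criteria: list[str]) -> str:
    # Divergent phase: deliberately raise entropy with a high sampling
    # temperature and many samples, opening multiple trajectories.
    candidates = [generate(prompt, temperature=1.2) for _ in range(n)]
    # Integration phase: reduce and organize, keeping the candidate
    # that best survives comparison against stable criteria.
    return max(candidates, key=lambda c: score(c, criteria))

print(diverge_then_integrate("draft a tagline", n=8, criteria=["metaphor"]))
```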

Thus, as in dissipative structures, we can maintain local order only if the system exports entropy.
How?
Low entropy and maximum control when the stakes are high; higher entropy and greater tolerance for instability in creative phases, exporting the residual uncertainty afterward through selection, verification, exclusion, and standardization.
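In practice, this budget of uncertainty often maps onto ordinary sampling parameters; temperature and top_p are the generic knobs most text-generation APIs expose. The values below are illustrative assumptions, not prescriptions:

```python
# Illustrative entropy budgets per phase (values are assumptions, not
# recommendations; tune for your model and task).
PHASE_SETTINGS = {
    "high_stakes": {"temperature": 0.2, "top_p": 0.5},   # low entropy, maximum control
    "creative":    {"temperature": 1.1, "top_p": 0.95},  # high entropy, tolerate instability
    "integration": {"temperature": 0.4, "top_p": 0.7},   # export uncertainty: select, verify
}
```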

A definition

After all this, we can attempt to define semantic entropy as follows:

Semantic entropy is the measure of the variety and unpredictability of meanings that a linguistic message can activate in a model or in a human mind.
Practically, it indicates how much room a prompt or a text leaves for multiple interpretations to emerge.
Low entropy means precise, linear instructions that converge toward a single possible response.
High entropy means open, ambiguous instructions rich in heterogeneous references that let the system explore multiple conceptual trajectories and generate divergent outputs.

Thinking in terms of semantic entropy means designing the quality of uncertainty—choosing when control is needed and when fertility is more valuable.
It is the parameter that allows us to shift from a mechanical to a generative use of AI, alternating phases of controlled divergence and coherent integration.


References

Tononi, Giulio. “An Information Integration Theory of Consciousness.” BMC Neuroscience 5, no. 1 (2004): 42.
Balduzzi, David, and Giulio Tononi. “Integrated Information in Discrete Dynamical Systems: Motivation and Theoretical Framework.” PLoS Computational Biology 4, no. 6 (2008): e1000091.
Friston, Karl. “Life as We Know It.” Journal of the Royal Society Interface 10, no. 86 (2013): 20130475.
Holland, John H. Complexity: A Very Short Introduction. Oxford: Oxford University Press, 2014.
Mitchell, Melanie. Complexity: A Guided Tour. Oxford: Oxford University Press, 2009.
Shannon, Claude E. “A Mathematical Theory of Communication.” Bell System Technical Journal 27, no. 3 (1948): 379–423.
Morin, Edgar. On Complexity. Cresskill, NJ: Hampton Press, 2008.
Hernandez-Olivan, Carlos, et al. “Large Language Models as Complex Systems.” arXiv preprint arXiv:2309.XXXXX, 2023.
Schaeffer, Rylan, Brando Miranda, and Sanmi Koyejo. “Are Emergent Abilities of Large Language Models a Mirage?” arXiv preprint arXiv:2304.15004, 2023.
Varshney, Lav R., and Kush R. Varshney. “Emergent Behaviors in Large Language Models.” Patterns 4, no. 10 (2023): 100846.
