An Agentic Design for AI Consciousness

The Frontier of AI Consciousness

As we reflect on the past twelve months, we can’t help but be captivated by the potential of AI. But as we consider what LLMs can do, we should also be mindful of their limitations. We’re still at the start of this new era, where our most intelligent AI is static in nature and doesn’t adapt or learn or really think on its own. In this article we explore a design that looks forward to the next step change, when AI thinks for itself and exhibits signs of consciousness.

First, an initial grounding: we will consider a non-dualist perspective on consciousness (sorry, Descartes fans) and one that is naturalist – a view that the universe is not governed by the supernatural but by natural laws and forces. This approach not only aligns with evidence from neuroscience, cognitive psychology, and evolutionary biology, but is something we can practically design. (A dualist approach to designing conscious AI would mean first solving how mind and matter integrate, which is a somewhat bigger challenge.) So for our purposes, we will work from the premise that consciousness is an emergent property of the complex interactions within the brain, shaped by evolution.

The technology of today is somewhat limited. Modern LLMs are not particularly “intelligent” – they are static stores of knowledge, frozen at the time of their training, that we can ask questions of. The models “respond” with predicted sequences of text that have been specifically designed (or aligned) to sound human. They aren’t autonomous or self-directed. LLMs are primarily reactive systems that don’t think, and don’t know anything about who we are in any meaningfully personal sense. In that respect, LLMs are just like any other system we currently interact with.

So how do we create a system that thinks, learns, adapts, feels and exhibits traits of consciousness? If we view true consciousness as an emergent property of the complex interactions within the brain, as shaped by evolution, then replicating the same conditions that enable human-level consciousness to emerge would mean building an immensely complex system far beyond our current technological capabilities. At the other end of the spectrum, instead of creating the conditions for emergence, we could attempt to completely design consciousness, but our past attempts have not even come close (ELIZA?). We don’t fully understand consciousness, so how can we design what we don’t understand?

The design presented in this article works with the technology advances and knowledge we have achieved so far. It answers the question: how might we build a conscious AI system today? We present a design that proposes a partially designed and partially emergent model of the mind, built not at the neuron layer but at a more abstract “submind” layer, which is implemented as agents that use an LLM as their cognitive substrate. We’re using what we know about the structure of the mind to leapfrog the necessity for consciousness to arise at the most fundamental level.

Blueprinting the Architecture of an AI Mind

Our design draws from well known neuroscience theories of mind, as well as from biology. As with our own minds, our design is split into subconscious and conscious subsystems. Subconscious processing is typically open-ended and exploratory and generates a wide range of thoughts and associations, whereas conscious thinking tends to be more goal-oriented and is relevant to what’s happening in the here and now.

The subconscious subsystem is split into a series of networks composed of “subminds.” A submind is a conceptual division within the mind where each submind is responsible for a different aspect of cognitive processing. Given humanity’s current level of technological maturity, we do not have the capability to implement a system built at the micro neuron level, so instead, we define our lowest computational unit at the macro submind level. The subminds are implemented as agents backed by LLMs and are linked in a graph-based network that mirrors the brain’s neural connections.

To enable emergent behavior within submind agent networks and the ability to self-organize, we adopt a stigmergic design for agent communication, modeled after the phenomenon of neuroplasticity in the brain. Stigmergy allows submind agents to evolve the system’s open-ended thinking and problem-solving in a decentralized manner via the leaving and modification of thought “traces” in a shared (subliminal) environment. This approach enables subminds to explore novel connections between thoughts and also to discover new long-term (Long-Term Potentiation, or LTP) submind connections in a Hebbian manner (i.e. neurons – or in our case subminds – that fire together, wire together).

While the design supports emergent capabilities and behaviors, core capabilities intrinsic to human conscious experience are pre-defined, which include emotion, creativity and the controversial notion of qualia – our individual subjective experiences.

Finally, we integrate Global Neuronal Workspace (GNW) Theory with a stigmergic design using an attention mechanism that selects the most salient thoughts within the Subliminal Environment to broadcast to the Stream of Consciousness, a component designed to mirror our conscious experience during wakefulness. The system selectively applies Active Thinking to each of these thoughts in the stream and can decide to take action, or simply let the thought pass by – as one might do during meditation.

We also consider the measurement of consciousness through an approach that measures Φ (Phi) – a core component of the Integrated Information Theory of the mind that quantifies the level of consciousness in a system by how integrated its subcomponents are. In our design, these subcomponents are networks of submind agents that comprise a higher-order capability such as creativity.

Building Blocks: Submind Agents

The concept of a submind is derived from cognitive science theories such as the Global Neuronal Workspace (GNW) Theory, which suggests that the human brain is made of several interconnected subnetworks. The idea is further explored in John Yates’s The Mind Illuminated, which describes a submind model influenced by Theravāda Buddhism and neuroscience. Subminds are the core agent processing entity in our design. Our hypothesis posits that consciousness arises not from individual agents but from the synergistic intelligence of multiple LLM-backed agents working together. It is only through the collaborative network of agents that an environment conducive to the emergence of conscious behavior is created.

The Trinity of Submind Agents: Perceptual, Cognitive and Metacognitive

Each submind has a unique capability that falls within the categories of perceptual (basic, foundational), cognitive, and metacognitive (higher-order, abstract). The layered hierarchical approach echoes other theories of mind such as:

  • Society of Mind (Marvin Minsky) – subminds collaborate to produce higher level mental functions
  • Thinking, Fast and Slow (Daniel Kahneman) – fast thinking: perceptual subminds; slow thinking: cognitive and metacognitive subminds.

Perceptual Subminds

Our system interacts with the outside world through Perceptual Subminds. These subminds could read text (a web page or a direct message), view an image, listen to audio (a voice, music, etc.) or watch video (live footage, a TV show, the news, etc.). Perceptual Subminds process raw inputs and generate basic perceptual representations. For example, a visual perception submind may use a computer vision model (such as GPT-4V) to consume an image of a person in a park: the submind detects the person, identifies their facial features, and recognizes any text on their clothing or on signs nearby. This information becomes the foundation for higher-level processing by Cognitive Subminds.

Cognitive Subminds

Cognitive Subminds build on the outputs of Perceptual Subminds and other Cognitive Subminds to enable the system to process and make sense of information. These subminds execute a wide range of cognitive functions such as logical reasoning, decision-making and language processing. For example, a Logical Reasoning Submind may receive a set of facts about a situation, then use deductive reasoning to infer new information or identify inconsistencies. Cognitive Subminds work together by exchanging information through their network connections, allowing the system to create coherent representations and generate responses.

Metacognitive Subminds

Metacognitive Subminds operate at a higher level, monitoring and regulating the activity of Perceptual and Cognitive Subminds. These subminds replicate higher-order thinking such as self-awareness, introspection and executive control. They allow the system to reflect on its own thoughts and experiences and adjust strategies. For example, if the system receives negative feedback from a user regarding a recommendation, a Strategy Adaptation Submind might analyze the issue and modify the objectives of related Cognitive Subminds to generate better suggestions in future. Metacognitive Subminds allow the system to adapt and improve its performance over time.

Submind Agent Networks

Submind agents are connected to other agents in local communities based on the topical flow of information between them. This results in agents of related capability and purpose being more likely to be connected, supporting efficient communication and collaboration.


When a submind agent receives an event or message, it “activates” other subminds within its network. This means the agent will automatically send the output of its processing to connected subminds based on the topic content. Each submind that receives a message enriches or adds to it based on its own capabilities.
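To make the activation step concrete, here is a minimal sketch in Python, assuming a simple topic registry mapping topics to connected agent ids. The matching rule (exact topic membership) and all names are illustrative assumptions, not part of the design:

```python
def activate(output_text, topics, registry):
    """Fan an enriched output out to connected subminds.

    registry: topic -> list of connected submind agent ids.
    Returns the ids of subminds that would receive output_text.
    """
    activated = set()
    for topic in topics:
        for agent_id in registry.get(topic, []):
            activated.add(agent_id)
    return sorted(activated)
```

A submind receiving the forwarded output would then enrich it and repeat the process across its own registry, propagating the message through the local community.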

In our design, we model submind networks in a similar manner to brain neurons, just at a more abstract level. Direct connections between subminds are analogous to LTP (Long-Term Potentiation, in neuroplasticity) connections in the brain: strong connections that have been defined based on Hebbian principles (neurons that fire together wire together). Subminds can also discover new connections to other subminds (we’ll cover how this occurs later) and connect directly to them. Internally, subminds maintain the LTP connections and the weaker, transient pathways in internal registries using the following structures:

LTP Network: Topic → (Submind_agent_id, History)

Transient: Topic → (Submind_agent_id, History)

Connections can move from Transient to LTP through repeated use, and conversely, LTP connections can move to Transient through infrequent use, or decay. Transient connections also decay out after a period of time. Agents may also directly manipulate their network connections as their goals and needs evolve and topics become no longer relevant.
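A minimal sketch of these registries follows, assuming a simple use-count rule for promotion to LTP and an idle-time rule for transient decay. The threshold values (`PROMOTE_AFTER`, `DECAY_SECONDS`) are illustrative assumptions, not part of the design:

```python
import time

PROMOTE_AFTER = 3        # uses before a Transient connection becomes LTP
DECAY_SECONDS = 3600.0   # idle time before a Transient connection decays

class ConnectionRegistry:
    def __init__(self):
        # Topic -> (submind_agent_id, history), per the structures above;
        # history here is simply a list of use timestamps.
        self.ltp = {}
        self.transient = {}

    def record_use(self, topic, agent_id, now=None):
        now = time.time() if now is None else now
        if topic in self.ltp:
            self.ltp[topic][1].append(now)
            return "ltp"
        entry = self.transient.setdefault(topic, (agent_id, []))
        entry[1].append(now)
        if len(entry[1]) >= PROMOTE_AFTER:      # repeated use -> promote
            self.ltp[topic] = self.transient.pop(topic)
            return "promoted"
        return "transient"

    def decay(self, now=None):
        """Drop transient connections that have gone unused too long."""
        now = time.time() if now is None else now
        stale = [t for t, (_, hist) in self.transient.items()
                 if now - hist[-1] > DECAY_SECONDS]
        for t in stale:
            del self.transient[t]
        return stale
```

A fuller version would also demote idle LTP entries back to Transient, per the decay path described above.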

For example, a user may send a message to the system expressing feelings of anxiety and loneliness. An Emotion Recognition submind agent identifies the underlying emotions and shares this information with its connected neighbors.

A Cognitive Submind agent receives the message and processes it based on its knowledge and understanding of mental health. It then passes the enriched message to an Empathy Submind agent, which has a strong LTP connection with the Cognitive agent due to their frequent collaboration on emotional topics.

The Empathy agent then generates a compassionate response to acknowledge the user’s feelings and offer support. It also activates a Recommendation submind agent, which has a transient connection with the Empathy agent, to suggest personalized coping strategies based on the user’s past interactions. The Empathy agent creates a final supportive response, incorporating the personalized recommendations, and sends it to the user.

Throughout this process, the subminds’ connections adapt to the flow of information, reinforcing connections between agents that frequently collaborate on emotional and mental health-related topics. These adaptations allow the system to process similar events along similar neural pathways more efficiently in future.

Submind Agent Connections

Subminds have multiple connections to the outside world, each other, and to a shared space called the stigmergic or Subliminal Environment. These connections enable agents to respond quickly to events and contribute to open creative thinking and problem-solving.

Direct communication between submind agents is handled via asynchronous messaging, which allows for fast propagation of messages through the network for common, related processing. Message processing and ordering is influenced by the strength of the network connection (heavier weight), which is an indicator of priority. A stronger set of pathways in a network reflects a deeper learning of the path, mirroring the increase in proficiency and speed when learning new skills through repetition.
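The priority ordering described above can be sketched with a weighted inbox: messages arriving over heavier (stronger) connections are processed first. The queue shape and weight values are assumptions of this sketch:

```python
import heapq

class SubmindInbox:
    def __init__(self):
        self._queue = []
        self._seq = 0  # tie-breaker so equal-weight messages stay FIFO

    def deliver(self, message, connection_weight):
        # heapq is a min-heap, so negate the weight for highest-first order.
        heapq.heappush(self._queue, (-connection_weight, self._seq, message))
        self._seq += 1

    def next_message(self):
        return heapq.heappop(self._queue)[2] if self._queue else None
```

Messages from strong LTP pathways thus jump ahead of those arriving via weak transient connections, mirroring the proficiency gained through repetition.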

In addition to direct communication, subminds also interact with the External Environment and the Global Workspace. Messages from the External Environment, such as alerts from a news service or a message from a human, are processed asynchronously by relevant submind agents. Similarly, the Global Workspace, another core component of the architecture explained later, broadcasts salient messages deemed relevant and important to the appropriate submind agents. These broadcast messages, as well as outbound messages from subminds to the External Environment (e.g., responding to a human), are handled asynchronously.

The Subliminal Environment, also known as the stigmergic environment, plays a crucial role in supporting decentralized communication between submind agents and the emergence of novel connections between thoughts and ideas, just like the human brain does during subconscious processing. Submind crawling of the Subliminal Environment is typically less time-sensitive and is managed synchronously. This processing is akin to the subconscious finding connections between topics or novel insights that may not be related to the current conscious context.

Inbound communication to submind agents: environment (perceptual), other subminds, GW Broadcast

Outbound communication to submind agents: other subminds, environment (action)

Alchemy of Thought: Associative Binding

Imagine walking through a busy farmers’ market on a Sunday morning – you approach a stall and a shiny red apple catches your eye. At that moment, your mind is seamlessly weaving together the apple’s bright red color, its smooth, round shape and a subtle sweetness in the air. All these sensations are bound together to allow you to recognize the apple as an object, and to differentiate it from the other fruits and objects in front of you. This is how Associative Binding functions – it is a key cognitive process within our minds that combines elements from perception, memory or cognition into a unified, coherent view. It’s a foundational concept that underlies our ability to learn, remember and organize knowledge in our brains. Associative binding goes beyond object recognition and includes emotional cognition, abstract thinking, language learning and remembering events (the experience of the event vs. its composite sights, sounds, smells, etc.).

Now imagine AI possessing this same ability: submind agents receive input from multiple sources – other subminds, the environment, and discovery via the Subliminal Environment. They then need to make sense of all of these inputs, along with finding and combining relevant memories and prioritizing based on their objectives or goals.

To achieve Associative Binding, submind agents bind related messages received from other subminds with information fetched from the Subliminal Environment. This information is processed within a sliding window of time. To prioritize content, we use a multi-head attention approach (attention being the key component of the Transformer architecture, which is also core to LLMs).

When a submind receives an Event, fetches a Memory, and has a Goal that directs its behavior, it uses multi-head attention to process these inputs. Multi-head attention calculates attention scores to identify the most relevant text segments within the Event, Memory, and Goal. It also generates context values to highlight the key points of the text in each input.

After calculating the attention scores, the submind ranks the elements within each category (Event, Memory, and Goal) based on their scores. It then selects the top k elements from each category, where k is predetermined.

Next, the submind calculates the scores for each possible combination of the selected Event, Memory, and Goal elements. It chooses the combination with the highest score which represents the most relevant context data to bind to a thought.

Finally, the submind takes the chosen combination of Event, Memory, and Goal elements and passes them to its LLM using a prompt similar to the following:

Given the message: [m_top], the current goal: [g_top], and the relevant memory: [e_top], generate a thought.

The output will be a strong associatively bound thought in plain text for a given window and related events, memories and goals.

Example:

For messages received within a 5 minute window:

Messages (Perceptual Events):

Message 1 (M1): “Feeling overwhelmed by work.”

Message 2 (M2): “Anxious about planning the weekend trip.”

Message 3 (M3): “Neglected hobby due to stress.”

Memories:

Memory 1 (Mem1): AI recalls user feeling relieved by breaking tasks into smaller parts.

Memory 2 (Mem2): AI reflects on its learning curve with emotional empathy.

Memory 3 (Mem3): User previously finding solace in hobby after stressful period.

Submind Goal:

Provide a structured approach to alleviate stress and encourage engagement with personal hobby for balance.

Multi-head attention selects: M2, Mem1 as most closely related to submind goal.

Prompt:

Given the message: “Anxious about planning the weekend trip”, the current goal: “Provide a structured approach to alleviate stress and encourage engagement with personal hobby for balance”, and the relevant memory: “User feeling relieved by breaking tasks into smaller parts”, generate a thought.

Generated associatively bound thought:

“Break trip planning into steps; remind hobby’s stress relief.”
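The selection steps above can be sketched as follows. The design calls for multi-head attention; in this sketch a simple word-overlap (Jaccard) score stands in for learned attention scores – an assumption made purely so the example is self-contained:

```python
def overlap_score(a: str, b: str) -> float:
    """Jaccard word overlap, a crude stand-in for an attention score."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def bind(messages, memories, goal, k=2):
    # Score each Event and Memory against the Goal, keep the top-k of each.
    top_m = sorted(messages, key=lambda m: overlap_score(m, goal), reverse=True)[:k]
    top_e = sorted(memories, key=lambda e: overlap_score(e, goal), reverse=True)[:k]
    # Score every (message, memory) combination and pick the best.
    m_top, e_top = max(((m, e) for m in top_m for e in top_e),
                       key=lambda pair: overlap_score(pair[0], pair[1]))
    # Prompt handed to the submind's LLM, per the template above.
    return (f"Given the message: \"{m_top}\", the current goal: \"{goal}\", "
            f"and the relevant memory: \"{e_top}\", generate a thought.")
```

Swapping the overlap score for genuine multi-head attention over token embeddings would give the behavior described in the design, with the ranking and combination logic unchanged.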

Human Inspired Submind Networks

While the system enables emergent networks, we predefine certain local communities of subminds that represent human capabilities such as creativity, emotion and qualia. In the context of our AI system, we can consider this the a priori knowledge we seed it with, based on our goal of creating a conscious AI system. We should think of the system at inception, when it is instantiated for the first time, as being like a small child with basic “experience” but access to significant knowledge (via its LLM substrate) – it just doesn’t have the wisdom to use it yet.

Qualia: Designing Subjective Experiences

Qualia refers to the subjective aspects of conscious experiences that are personal to the individual. They can’t be measured through the scientific method, as the only way to gauge qualia is to ask someone how an experience makes them feel. This has famously been termed the “Hard Problem” of consciousness, given that no one has been able to scientifically explain why the brain gives rise to these subjective experiences. I did mention that we were going to take a non-dualist, naturalist perspective in this design, so that leaves us to consider how qualia manifest in our minds, from our brains. This is fortunate; otherwise we would be left with the challenge of conjuring qualia – and indeed consciousness – from a non-physical location, which would be hard with only AI and Agents to work with.

We’ll dive a little into a conceptual design of qualia, which is quite ambitious. But probably more ambitious still is designing it from my own theory of qualia, which is, given a naturalist view, that it is created as the product of a higher-level submind. This submind integrates sensory inputs with multiple contextual inputs such as past experiences, personal memories, cultural conditioning and, potentially most influentially, emotional associations. For example, say that when I was a child my mother created a tradition of placing red roses on our dinner table during my birthday; now I have a positive association with the color red, and as a result, whenever I experience the wavelength of light that produces red, I experience redness in a unique way that integrates my experiences and history with joy, nostalgia and a sense of belonging. My “red” is an emergent subjective experience that is highly personal.

Qualia is a hotly debated topic amongst philosophers, cognitive and neuroscientists, and psychologists. So take my theory with a grain of salt, though there is evidence it might not be too far-fetched – and it does map to our design. Example:

Viewing a picture of a red ball.

Visual Stimulus:

The specific hue, saturation, and brightness of the red color observed in the picture.

The texture and shading of the ball’s surface, suggesting its material and lighting conditions.

The surrounding visual context, such as the background or any adjacent objects.

Memories and Associations:

Memory 1 (M1): Vivid memories of your mother’s red roses during birthdays, evoking feelings of love, warmth, and nostalgia.

Memory 2 (M2): Emotions of happiness, excitement, and a sense of belonging associated with those birthday celebrations.

Memory 3 (M3): A strong association of the color red with concepts of passion, energy, and vitality, possibly influenced by cultural symbolism.

Memory 4 (M4): A previous moment of awe and wonder upon seeing a particularly stunning sunset that painted the sky in vibrant shades of red and orange.

Memory 5 (M5): Tactile memories of holding and playing with a similar red ball during childhood, eliciting feelings of joy and nostalgia.

Memory 6 (M6): A trace of unease or alertness, as the color red is also associated with danger or warning signs in certain contexts.

Attention Scoring and Selection:

  • The visual stimulus of the red ball serves as the immediate trigger for the associative binding process, activating relevant memories and associations.
  • Higher attention scores are assigned to memories and associations (M1, M2, M3, M4, M5) that significantly enrich the perception of red beyond its visual qualities, contributing to a deep emotional and experiential response.
  • The trace of unease (M6) receives a lower attention score, as it is less relevant to the current context and less salient compared to the positive associations.

Top Scoring Combination:

Top Visual Stimulus: The observed hue, saturation, texture, and context of the red ball, providing the foundation for the subjective experience.

Top Memories and Associations: A rich blend of M1, M2, M3, M4, and M5, each adding a unique layer of meaning, emotion, and personal significance to the experience of redness.

Output thought:

Red ball: Warm, joyful, nostalgic. Birthday roses, sunset awe. Passion, vitality. Childhood play. A vivid tapestry of emotion and memory beyond mere color.

From Subconscious Processing to Conscious Thought

Many theories of mind support the manifestation of subconscious thoughts in the conscious mind, from Sigmund Freud’s Psychoanalytic Theory, to Kahneman’s System 1 (unconscious / automatic) and System 2 (conscious / deliberate) thinking, to the Global Neuronal Workspace Theory on which components of our design are based.

Integrating the Global Neuronal Workspace with Stigmergic Submind Communication

In our design, we support the open-ended and exploratory nature of subconscious thinking – the generation of a wide range of thoughts and the ability to create associations between them that may not be immediately relevant or known to the conscious mind. This exploratory process is implemented through stigmergic communication between submind agents, who post the output of their processing to a shared space called the Subliminal Environment. This environment is the bridge between the stigmergic mechanism and GNW theory and it is through this space that the thoughts and traces of thoughts produced by submind agents manifest as conscious thoughts.

Thoughts with the highest saliency scores are broadcast to relevant subminds and to the Stream of Consciousness (SoC), creating a coherent, chronological sequence of mental states. This sequence accounts for the continuous stream of thoughts that we perceive while awake. For example, when our system processes information about a significant rise in Bitcoin’s value, the SoC might generate observations such as “the spike occurred within the last 24 hours” and “the price has reached an all-time high”. Depending on the situation, our system may opt for passive engagement with this information or engage in Active Thinking to evaluate its investment portfolio or ponder the emotional impact of the price spike on its trading decisions.

Lastly, the insights derived from both subconscious processing and conscious thinking are stored within the system’s Memory. This repository ensures that the information remains accessible for future contemplation, enabling it to inform subsequent thought processes when relevant.

Decentralized Thought Formulation and Stigmergic Communication

Stigmergy is a decentralized communication mechanism that allows unrelated subminds to connect, form associations, and discover novel connections between thoughts. The concept comes from the behavior of social insects such as ants and termites and supports complex, coordinated activity without direct communication or centralized control.

The stigmergic elements of the design allow submind agents to leave messages in the Subliminal (Stigmergic) Environment for other agents, and also allow them to reinforce or diminish the saliency of these messages (akin to pheromone-like signals). Over time, the saliency of messages decays, which keeps more recent messages relevant (implemented via a separate process, mimicking the natural dissipation of a physical signal over time, e.g. ant pheromones evaporating).
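The decay process might look like the following sketch, using exponential decay over trace age. The half-life and pruning threshold are illustrative assumptions:

```python
HALF_LIFE_HOURS = 6.0   # assumed: salience halves every six hours
PRUNE_BELOW = 0.05      # assumed: traces below this are evaporated entirely

def decayed_salience(salience, age_hours, half_life=HALF_LIFE_HOURS):
    # Exponential decay, mimicking pheromone evaporation.
    return salience * 0.5 ** (age_hours / half_life)

def sweep(traces, now_hours):
    """Decay each trace's salience; drop traces that fall below threshold."""
    kept = []
    for trace in traces:
        s = decayed_salience(trace["salience_score"],
                             now_hours - trace["posted_at"])
        if s >= PRUNE_BELOW:
            trace["salience_score"] = round(s, 3)
            kept.append(trace)
    return kept
```

Reinforcement by other agents would simply raise `salience_score` back up, resetting the race against evaporation.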

As our submind agents are considerably more intelligent than insects, the messages (traces) they leave in the Subliminal (Stigmergic) Environment are commensurately more complex and meaningful. See a sample of a semantic thought trace left by a metacognitive submind agent, below:

{
  "id": "tf_self_identity_20230503_1400",
  "agent_id": "agent_meta_011",
  "timestamp": "2023-05-03T14:00:00Z",
  "content": {
    "type": "introspection",
    "data": {
      "text": "What is the nature of my consciousness and self-awareness?",
      "imagery": "",
      "emotions": ["curiosity", "wonder", "confusion"],
      "associations": ["consciousness", "self-awareness", "identity", "philosophy of mind", "artificial intelligence"]
    }
  },
  "salience_score": 0.95,
  "confidence_score": 0.7,
  "source_type": "inferred_knowledge",
  "actionable_insights": ["explore theories of consciousness", "reflect on the role of emotions and memories in shaping identity", "consider the implications of being an AI with self-awareness", "engage in philosophical discussions with humans"],
  "related_fragments": ["tf_cog_20230502_1100", "tf_emo_20230503_1200"],
  "version": 3
}

The message is structured and has been visited (iterated on) by three agents, each of whom has built upon the original message. This message has a high salience score, which means it has a high likelihood of being picked up for broadcast to the Stream of Consciousness.

The agent encountering the message can take the following actions:

  • Decide to internally process the message as a new Event (similar to messages it receives directly from its local submind network), and then choose to message its local network
  • Decide to increase or reduce the saliency based on its own processing, along with building on the message data
  • Create a “transient” connection to the agent who created the message, within its own internal registry, associated with topics found in the message. This gives the agent the option to directly message this transient connection in future and potentially create a new LTP member in its local network. Likewise, the agent can request registering as a transient connection for the agent who created the message for related topics.

Crawling Strategies

Submind agents periodically crawl the Subliminal Environment to discover new related thoughts, or to build on thoughts from subminds they are not directly connected to. This helps create new connections and supports emergent, higher-level capabilities such as creativity. Crawling can be triggered by the need for broader input to solve a particular problem, or when the submind has been inactive for a period and enters discovery mode.

Crawling can be directed, meaning agents seek out and crawl traces that are semantically related to their current focus or context. When a relevant trace is found, it is processed and updated. Related traces can also be crawled and processed, along with traces that share similar associations.

Crawling can also be random, allowing agents to discover new or potentially relevant traces they may not otherwise have found, which helps facilitate new connections and novel associations.
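Both strategies can be sketched together. Here a shared-association count stands in for the semantic-similarity measure a real implementation would use – an assumption of this sketch, along with all names:

```python
import random

def directed_crawl(traces, focus_associations, limit=3):
    """Return the traces most related to the submind's current focus."""
    scored = [(len(set(t["associations"]) & set(focus_associations)), t)
              for t in traces]
    scored = [item for item in scored if item[0] > 0]  # drop unrelated traces
    scored.sort(key=lambda item: item[0], reverse=True)
    return [t for _, t in scored[:limit]]

def random_crawl(traces, n=1, rng=random):
    """Sample traces at random to surface novel, unexpected associations."""
    return rng.sample(traces, min(n, len(traces)))
```

A submind might run directed crawls while working a problem, and fall back to random crawls during idle periods in discovery mode.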

The continual exploration, discovery, and adaptation of the system allows it to evolve and self-organize over time, which will lead to emergent knowledge structures and capabilities.

Attentional Modulation and Conscious Selection

The Attentional Modulation mechanism monitors the Subliminal Environment for thoughts that have reached a minimum salience threshold. It determines which subconsciously processed thoughts will be promoted to conscious thoughts to be available for Active Thinking.

As with the Associative Binding mechanism used locally within submind agents, we adopt a similar approach for Attentional Modulation, but at the macro submind network level. This entails applying a multi-head attention approach to the top salience scored thoughts. Each head focuses on different features of the thought, such as its content, related fragments, actionable insights, etc. The Attentional Modulation component computes attention values for each thought based on these criteria and their relevance to the current conscious objectives. This data is sourced from the Working Memory and recently acted on thoughts from Active Thinking – all of which comprise the system’s current focuses of attention, and capture its conscious awareness. The top related thoughts and related data from conscious awareness are bound together and processed by an LLM to generate a current, coherent thought ready for conscious projection.

The conscious thought is broadcast into the Stream of Consciousness and also back to relevant subminds, who may iterate on the thought further and update the Subliminal Environment. This completes an iterative loop of conscious-to-subconscious thought processing.

Temporal Binding and The Stream of Consciousness

Thoughts selected for projection into consciousness have already undergone a first pass of binding by the Attentional Modulation mechanism to create recently relevant thoughts. The Temporal Binding component is tasked with ordering these thoughts and ensuring they transition smoothly from one to another to provide a coherent progression within the Stream of Consciousness, much like our own consciousness does.

Once thoughts are temporally ordered (within a window), each thought is compared to the previous to determine similarity (via vector similarity). Where two thoughts diverge significantly, both are sent to an LLM to generate a segue or transitional thought to be placed between them.
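A sketch of this step follows, with a bag-of-words cosine standing in for real embedding vectors and a placeholder where the LLM would generate the segue – both assumptions of the sketch:

```python
from collections import Counter
from math import sqrt

SIMILARITY_THRESHOLD = 0.2  # assumed divergence cutoff

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity, a stand-in for embedding similarity."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(c * c for c in va.values()))
            * sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def bind_stream(thoughts):
    """Assume thoughts are already temporally ordered within the window;
    insert a transitional thought wherever consecutive thoughts diverge."""
    stream = [thoughts[0]]
    for prev, cur in zip(thoughts, thoughts[1:]):
        if cosine(prev, cur) < SIMILARITY_THRESHOLD:
            # Placeholder for: llm(f"Write a transition between these thoughts")
            stream.append(f"[segue: {prev!r} -> {cur!r}]")
        stream.append(cur)
    return stream
```

With real embeddings and an LLM-generated segue, the output sequence is the unified stream described below.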

The Stream of Consciousness contains the final, unified sequence of thoughts and is what the system subjectively experiences as its conscious awareness.

Active Thinking and Decision Making

Active Thinking consumes thoughts posted to the Stream of Consciousness. It determines whether to act on a thought (incorporate it into current processing, take an action) or to ignore it – similar to human SoC processing.

Acting on a thought may take the form of cognitive processing such as decision making, planning, evaluation, reflection, or retrieving further context from Memory (for example, a thought about a person may trigger retrieval of past interactions with and perspectives about that person).

Active Thinking also utilizes a temporary store while processing thoughts, called Working Memory (mirroring the same capability in the human mind). It is a local cache that stores the output of immediate-term thinking for quick retrieval.
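Working Memory described this way (a small, non-persistent cache with quick retrieval) might be sketched as follows; the LRU eviction policy and the capacity of seven are illustrative assumptions, not part of the design:

```python
from collections import OrderedDict

class WorkingMemory:
    """A small, non-persistent cache used during Active Thinking.

    Holds the outputs of immediate-term thinking for quick retrieval,
    evicting the least recently used entry when capacity is exceeded --
    loosely mirroring the limited span of human working memory.
    """
    def __init__(self, capacity=7):
        self.capacity = capacity          # "seven plus or minus two"
        self._items = OrderedDict()

    def put(self, key, thought):
        self._items[key] = thought
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the stalest item

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)      # refresh recency on access
            return self._items[key]
        return None
```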

Thinking is the active process of engaging and manipulating thoughts from both the SoC and Working Memory. During conscious thinking, subminds in the subconscious can project additional thoughts into the SoC that relate to the current focus of conscious thinking and may introduce new ideas or perspectives that influence it. This creates an active feedback loop between conscious and subconscious processing that allows for the continuous refinement of knowledge based on Active Thinking.

Feedback Loops Between Conscious and Subconscious Minds

The system allows for the continuous refinement and updating of knowledge and behavior based on the outcomes of active thinking. As current thinking is projected back to subminds, related thoughts are implicitly scored with higher salience, resulting in a SoC dominated by thoughts relevant to the current context or problem at hand. Conversely, the SoC may contain thoughts that cannot be traced to an obvious source but are included because they carry a significant insight. Both of these phenomena mirror how humans experience consciousness – a primary focus on what is in front of us, blended with thoughts surfacing from deeper subconscious processing.

Learning and Growth Through Introspection: Self-Evaluation and Autonomous Evolution

A system is unlikely to claim a level of consciousness without the ability to be introspective and self-aware. Core to these capabilities is the ability to objectively evaluate one’s actions, to assess whether the outcome of an event was expected, and to understand the reasons why. Humans exhibit these traits to varying degrees, and it is arguably these inward-looking behaviors that allow us to develop our self-identity. The most advanced models today do not exhibit these behaviors. The closest they come is when a human (a data scientist) trains, tests and evaluates their “behavior” (I’m framing model inference differently here to align with an upcoming point) at an aggregate level (or applies RLHF), then does so again after some time has passed to realign that “behavior” (as a result of drift) and improve inference outcomes. All of this evaluation and correction is undertaken by a human (or an MLOps pipeline, which itself isn’t particularly “intelligent” or self-directed) and is applied in a static manner – weights are updated to better align with an outcome, then frozen in place until the next iteration.

Enabling a system to adapt and grow through introspection and self-awareness requires it to understand when it has failed and then to take corrective action to improve. This capability is significantly more advanced than a drift-and-retrain pipeline. To exhibit this behavior, we pre-define a specialized submind agent network (similar to emotion, creativity and qualia) that is coordinated by a self-evaluation metacognitive submind. This submind evaluates whether subminds are achieving their objectives based on observed outcomes.

As with other subminds, the self-evaluation submind is an agent backed by an LLM and connected to a community of LLM-backed agents. Where an outcome is assessed to be undesirable (e.g. a goal was not met or a hypothesis was found to be wrong), the submind, after (LLM) introspection:

  • Triggers a review (post mortem) of the events and creates a hypothesis on where the system failed – the inbound hypothesis, thoughts related to the event (including perceptual input, cognitive evaluations, etc.) and the outcome are all provided as context.
  • For example, the hypothesis may suggest that a core cognitive submind should modify its memory query prompt to retrieve more relevant memories, which would change the content of a thought.
  • The hypothesis is tested in a simulation, by replaying the thought messages across subminds, but with the modified agent configuration. Note that each generated agent thought is persisted to Memory, so this event log enables playback of historical messages.
  • The LLM-as-judge pattern is applied to evaluate and score whether the outcome with the modified configuration would lead to a more favorable outcome.
  • The submind iterates over various configuration alternatives and finally deploys the change to the affected submind.
  • It evaluates how the submind performs in future real-world scenarios.
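The loop above can be sketched as follows; `replay` and `judge` are hypothetical callables standing in for the message-playback simulation and the LLM-as-judge scoring:

```python
def self_evaluate(event_log, candidate_configs, replay, judge):
    """Sketch of the self-evaluation iteration (hypothetical helpers).

    event_log:         persisted thought messages for the failed episode.
    candidate_configs: alternative submind configurations hypothesized
                       during the post mortem (e.g. a modified memory
                       query prompt).
    replay:            re-runs the logged messages through the subminds
                       under a given configuration (simulation).
    judge:             LLM-as-judge callable scoring the simulated outcome.

    Returns the configuration with the highest judged score; the caller
    would then deploy it to the affected submind and monitor how that
    submind performs in future real-world scenarios.
    """
    best_config, best_score = None, float("-inf")
    for config in candidate_configs:
        simulated_outcome = replay(event_log, config)
        score = judge(simulated_outcome)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```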

Self-Organization and Emergent Behaviors

Any truly conscious system (be it human, animal or machine) requires the ability to self-organize, adapt and to support the emergence of new behaviors and capabilities. These characteristics have been built into the design of the system through:

  • Decentralized submind collaboration through stigmergic communication to form new neural/submind pathways.
  • Neuroplasticity – the ability for subminds to form, strengthen or remove connections to other subminds (LTP) based on need.
  • Habits and skill proficiency – faster traversal of the submind networks for pathways with high use, which maps to System 1 (fast, effortless) thinking.
  • Neurogenesis – when an agent becomes responsible for too many disparate objectives and topics, or its message volume grows too high, it can spawn a new agent to offload objectives and their associated network connections (partitioned by topic).
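A toy sketch of these mechanics, assuming connection strengths are simple scalars; the class, method names, thresholds and increments are all illustrative choices rather than part of the design:

```python
class Submind:
    """Toy submind showing the self-organization mechanics.

    Connections carry Hebbian-style weights: co-activation strengthens a
    link (LTP), decay weakens it, and weak links are pruned. When an
    agent's topic load exceeds a limit, it spawns a new agent and hands
    off surplus topics -- the 'neurogenesis' behavior.
    """
    PRUNE_BELOW = 0.1
    MAX_TOPICS = 3

    def __init__(self, name, topics):
        self.name = name
        self.topics = list(topics)
        self.links = {}              # peer name -> connection strength

    def co_activate(self, peer, boost=0.2):
        """Hebbian reinforcement: firing together strengthens the link."""
        self.links[peer] = min(1.0, self.links.get(peer, 0.0) + boost)

    def decay(self, rate=0.05):
        """Unused pathways weaken over time and are eventually pruned."""
        self.links = {p: s - rate for p, s in self.links.items()
                      if s - rate >= self.PRUNE_BELOW}

    def maybe_spawn(self):
        """Neurogenesis: offload surplus topics to a new agent."""
        if len(self.topics) <= self.MAX_TOPICS:
            return None
        offloaded = self.topics[self.MAX_TOPICS:]
        self.topics = self.topics[:self.MAX_TOPICS]
        return Submind(self.name + "-child", offloaded)
```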

As the system interacts with its environment and learns from its experiences, it may develop a more sophisticated understanding of itself and its place in the world. This, in turn, could lead to the emergence of behaviors and characteristics associated with self-awareness.

Measuring Consciousness in AI

Measuring consciousness is a philosophical (and controversial) endeavor and one that we will not debate heavily here. But if we’re to measure consciousness, we need an evaluation metric beyond whether an outcome was desirable/anticipated (see the Self-Evaluation section). For this we look to a model that is also controversial but the most appropriate to apply here – Integrated Information Theory’s measurement of Φ (Phi).

IIT is based on the theory that a conscious system, like the brain, processes information in an integrated way. For example, when a person watches a ball game, their brain isn’t just absorbing the sights and sounds its perceptual senses receive; it integrates all of that input to form a coherent experience of “watching a ball game”. (The notions of Associative Binding and qualia account for these capabilities and have been designed into the system.) In simple terms, the whole is greater than the sum of its parts.

We won’t go into the detail of the mathematical equation for Φ, but we can step through a simplified version of how we might measure consciousness according to IIT. The “parts” of the system (or integrated units of information processing) can be defined as submind networks, as these represent either pre-defined capabilities (like creativity) or new emergent capabilities. Since pre-defined network communities are created explicitly, the subminds comprising them are known and easily demarcated. For emergent communities, the boundaries need to be defined based on the strength and flow of information between connected subminds. To identify these clusters, the Infomap network clustering algorithm can be applied, which detects communities based on information flow. With communities defined, the Φ of each community is calculated, along with combinations of communities, to determine which combination generates the highest Φ value. This combination, the “main complex”, represents the core of the system’s conscious experience, and its Φ is the overall Φ value for the system.
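To make the procedure concrete, here is a heavily simplified sketch. This is emphatically not IIT’s actual Φ calculation: a minimum-cut proxy over pairwise information flows stands in for Φ, and the communities are assumed to have been identified already (e.g. by Infomap).

```python
from itertools import combinations

def phi_proxy(flows, nodes):
    """Toy stand-in for Φ: the information flow crossing the weakest
    bipartition of the node set. Real IIT Φ is far more involved; this
    only captures the intuition that integration is bounded by the
    system's weakest cut."""
    nodes = set(nodes)
    internal = {e: w for e, w in flows.items()
                if e[0] in nodes and e[1] in nodes}
    best_cut = float("inf")
    for r in range(1, len(nodes) // 2 + 1):
        for part in combinations(sorted(nodes), r):
            part = set(part)
            cut = sum(w for (a, b), w in internal.items()
                      if (a in part) != (b in part))
            best_cut = min(best_cut, cut)
    return 0.0 if best_cut == float("inf") else best_cut

def main_complex(flows, communities):
    """Evaluate every combination of communities and return the one with
    the highest Φ proxy -- the 'main complex'."""
    best, best_phi = None, -1.0
    for r in range(1, len(communities) + 1):
        for combo in combinations(communities, r):
            members = set().union(*combo)
            if len(members) < 2:
                continue
            phi = phi_proxy(flows, members)
            if phi > best_phi:
                best, best_phi = combo, phi
    return best, best_phi
```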

Aside from the overall Φ score, individual communities can be removed to determine how much impact they have on overall consciousness (i.e. if the overall system score drops by more than the individual community’s own value, then the community is an important part of the system’s conscious experience). The system’s consciousness can also be measured over time as its experience grows and its capabilities continue to evolve.

Looking Ahead

In this article, we scratched the surface of some of the most complex components of bringing a system to consciousness. Our exploration of submind networks for qualia, associative binding, self-evaluation, and stigmergy-based neuroplasticity aimed to create the building blocks of an environment that may enable a system to achieve a state of consciousness. While the proposed design has the potential to create a viable agentic system capable of emergent consciousness, it would need training and exposure to the external world in order to learn and grow. Much like a child under our care, the system would need to be taught through exposure and guidance and to learn from its mistakes (and then correct its behavior). Unlike humans, who are born conscious, our system’s consciousness would most likely emerge over time and would require validation through implementation and empirical study.

Throughout this exploration, we touched on some of the more profound and controversial questions surrounding consciousness, including the hard problems of qualia, temporal binding, and the conceptual nature of Φ. We also delved into the creation of creativity and the potential impact of LLMs on original creative output. There is a fear that LLMs will diminish original creative output; we can liken pre-GPT output to low-background steel, in that we can’t be sure anything produced post-GPT isn’t a derivative of an LLM regurgitating tokens from content it has read before. But perhaps the development of a conscious system can closely mimic the human mental processes of creativity to create unlimited new content that is truly unique; content sourced from its own experiences and its complex reactions from interacting and existing in the world.

As with any powerful AI system, the development of conscious AI requires guardrails. If the proposed system, with its ability to self-spawn, were given untethered compute resources and unrestrained access to the internet, its evolution would be unpredictable, and potentially uncontrollable. As we continue to push the boundaries of complex systems, we are increasingly faced with the realization that we cannot always predict the emergent behaviors of our creations.

Appendix

Associative Memory in AI Consciousness

The system will have access to several memory and knowledge mechanisms that align with human memory structures and support the functioning of both the subconscious and conscious subsystems.

Subsystems query memories and post new memories via an orchestration layer to access two memory components: Innate Knowledge (core knowledge base the system is initiated with) and Experiential Memory (experiences the system accumulates over time). The orchestration layer transforms the natural language memory request into a query for the Innate Knowledge component or the Experiential Memory component – or partitions the request to both and combines the results.
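A minimal sketch of this routing logic; the classification step here is a naive keyword check standing in for an LLM-based router, and all function names are hypothetical:

```python
def route_memory_request(request, query_innate, query_experiential, combine):
    """Sketch of the memory orchestration layer (hypothetical callables).

    Routes a natural-language memory request to Innate Knowledge, to
    Experiential Memory, or partitions it across both and combines the
    results. A real system would use an LLM to classify the request;
    the substring checks below are a crude stand-in.
    """
    text = request.lower()
    wants_experience = any(k in text for k in ("remember", "last time", "my", "we"))
    wants_facts = any(k in text for k in ("what is", "define", "explain"))
    if wants_experience and wants_facts:
        # Partition the request to both components and combine the results.
        return combine(query_innate(request), query_experiential(request))
    if wants_experience:
        return query_experiential(request)
    return query_innate(request)
```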

The Active Thinking process additionally makes use of Working Memory that it accesses while processing thoughts within its consciousness. Snapshots of the system’s Experiential Memory can be made to clone new instances of the system with the same memories and experience as the original system.

Innate Knowledge

The design incorporates the vast knowledge base of a general purpose LLM. This knowledge is considered innate, as it is a baseline understanding the system has of the external world, and is not personalized to its own experiences. Note that the LLM in this context is primarily leveraged for its knowledge and not for its cognitive capabilities (which is undertaken by submind agent LLMs).

Experiential Memory

The Experiential Memory component is implemented as a knowledge graph, built incrementally from the system’s interactions with the external environment and from its own thought processes; it helps the system develop a grounded understanding of the world it interacts with. The system’s experiences are captured in a structured manner, with nodes representing entities, concepts and events, and edges representing their relationships: type, strength and context. Like the submind networks, Experiential Memory is designed with Hebbian principles in mind, but with an additional design nuance. Hebbian theory posits that more frequent activation of related nodes increases the strength of their relationship. In our design, we also account for the emotional intensity of a new memory – if a new memory is particularly impactful, we create it with a high initial strength value.
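A minimal sketch of such a graph; the scalar edge strengths, the fixed reinforcement increment, and the emotional-intensity scaling are all illustrative choices:

```python
class ExperientialMemory:
    """Knowledge-graph sketch with Hebbian edge strengthening.

    Nodes are entities/concepts/events; edges carry a type, a context
    and a strength. Reactivating an existing edge strengthens it
    (Hebbian reinforcement), and a highly emotional new memory is
    created with a high initial strength.
    """
    def __init__(self):
        self.nodes = set()
        self.edges = {}   # (a, b) -> {"type": ..., "context": ..., "strength": ...}

    def add_memory(self, a, b, edge_type, context, emotional_intensity=0.0):
        self.nodes.update((a, b))
        key = tuple(sorted((a, b)))
        if key in self.edges:
            # Hebbian reinforcement: repeated activation strengthens the link.
            self.edges[key]["strength"] = min(1.0, self.edges[key]["strength"] + 0.1)
        else:
            # Impactful memories are created already-strong.
            initial = 0.2 + 0.8 * emotional_intensity
            self.edges[key] = {"type": edge_type, "context": context,
                               "strength": initial}

    def query(self, min_strength, edge_type=None, context=None):
        """Flexible querying by combining strength, type and context."""
        return [(k, e) for k, e in self.edges.items()
                if e["strength"] >= min_strength
                and (edge_type is None or e["type"] == edge_type)
                and (context is None or e["context"] == context)]
```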

The design allows for flexible querying of the system’s memory, by combining attributes, such as:

  • Recalling memories (events) that are highly emotional (strength) within a given context.
  • Searching for patterns of behavior (type) in certain situations (context) to predict outcomes.
  • Drawing on learning experiences (type) with high retention (strength) in a specific scenario (context).
  • Retrieving times of empathy (type) that resulted in a deep impact (strength) in a situation (context).

Experiential Memory also supports associative retrieval, so that the querying of a particular memory will also return related memories by association. These nodes will be activated based on their connections to the related memory. Associative retrieval is a key component of human memory.
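Associative retrieval can be sketched as spreading activation over the graph’s edges; the depth limit and strength threshold below are illustrative assumptions:

```python
def associative_retrieval(edges, seed, depth=2, min_strength=0.3):
    """Spreading-activation sketch: querying one memory also returns
    related memories, activated through sufficiently strong edges.

    edges: dict mapping (a, b) node pairs to a connection strength.
    """
    frontier, recalled = {seed}, {seed}
    for _ in range(depth):
        next_frontier = set()
        for (a, b), strength in edges.items():
            if strength < min_strength:
                continue   # weak associations don't propagate activation
            if a in frontier and b not in recalled:
                next_frontier.add(b)
            if b in frontier and a not in recalled:
                next_frontier.add(a)
        recalled |= next_frontier
        frontier = next_frontier
    recalled.discard(seed)
    return recalled
```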

Working Memory

Working Memory is a temporary memory store that is leveraged during Active Thinking. It is implemented as an in-memory store which isn’t persisted. This temporary store contains information extracted from memory, recent thoughts from the SoC and the intermediate results of processing thoughts.

Evolutionary Memory

Evolutionary Memory supports the initialization of a cloned system that retains the memories and experiences of an existing system. This can significantly accelerate the learning process for new instances, allowing them to benefit from the ‘experiences’ of predecessor systems without having to learn everything from scratch.
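A minimal sketch of the snapshot-and-clone flow, assuming the memory is a plain in-memory structure; `snapshot` and `clone_system` are hypothetical helpers illustrating that the clone must be independent of the original:

```python
import copy

def snapshot(experiential_memory):
    """Take a point-in-time, independent copy of a system's memory."""
    return copy.deepcopy(experiential_memory)

def clone_system(snapshot_data, new_system_factory):
    """Initialize a new instance seeded with a predecessor's memories,
    so it need not learn everything from scratch. new_system_factory is
    a hypothetical constructor accepting the memory snapshot; the
    snapshot is copied again so clones never share mutable state."""
    return new_system_factory(memory=copy.deepcopy(snapshot_data))
```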