Neuro-Symbolic AI

Why blending neural intuition with deterministic logic is the path to AI systems you can actually trust in production.

Every LLM is stochastic. Even at temperature zero, a transformer-based model relies on probability distributions over tokens - and probability distributions produce variation. Neuro-symbolic AI is the practice of blending that stochastic neural behavior with deterministic software to reduce variation to a level that production systems can tolerate.

What Is Neuro-Symbolic AI?

The term combines two words that describe two fundamentally different computing paradigms.

Neuro refers to neural networks - the architecture behind LLMs, vision models, and embedding systems. Neural networks learn patterns from data. They are flexible, creative, and capable of extraordinary generalization. They are also inherently stochastic. A transformer model doesn't compute an answer - it samples from a probability distribution over possible answers. Even with a temperature of zero, the model is selecting the highest-probability token at each step from a distribution that shifts with context length, quantization, batching, and model updates. The output is probabilistic by nature.
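As a toy illustration of why temperature zero is still probabilistic in character: greedy decoding picks the argmax of a softmax distribution, and that argmax can flip when the underlying logits shift even slightly (as they do with context length, quantization, and batching). The logit values below are invented for illustration.

```python
import math

def softmax(logits):
    """Convert raw model scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate tokens.
logits = [2.1, 2.0, 0.5]
probs = softmax(logits)

# Temperature zero = greedy decoding: always pick the argmax.
choice = max(range(len(probs)), key=lambda i: probs[i])
print(choice)  # 0

# A tiny perturbation of the logits flips the argmax, so even
# "deterministic" decoding varies in practice.
perturbed = softmax([2.0, 2.05, 0.5])
flipped = max(range(len(perturbed)), key=lambda i: perturbed[i])
print(flipped)  # 1
```

The point is not the arithmetic but the fragility: the "deterministic" choice is one numeric wobble away from a different output.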

Symbolic refers to predefined rules, deterministic software, formal logic, or pre-trained specialized small models that behave predictably. A SQL grammar is symbolic. An abstract syntax tree compiler is symbolic. A rule that says "revenue = SUM(line_amount) grouped by fiscal quarter" is symbolic. These systems do exactly what they are told, every time, without variation.
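The revenue rule above can be encoded as a tiny deterministic template. Table and column names here are illustrative, not any particular schema:

```python
def revenue_by_quarter_sql(table: str = "sales") -> str:
    """A symbolic rule: revenue = SUM(line_amount) grouped by fiscal
    quarter. Same input, same SQL, every time."""
    return (
        f"SELECT fiscal_quarter, SUM(line_amount) AS revenue "
        f"FROM {table} GROUP BY fiscal_quarter"
    )

# Deterministic: repeated calls produce byte-identical output.
assert revenue_by_quarter_sql() == revenue_by_quarter_sql()
print(revenue_by_quarter_sql())
```

There is no distribution to sample from: the function is the rule.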

Neuro-symbolic AI is the practice of composing these two paradigms - using the neural component where flexibility and language understanding are needed, and the symbolic component where correctness, consistency, and auditability are required. The goal is not to eliminate stochastic behavior entirely, but to constrain it - to let the neural network do what it's good at (understanding intent, resolving ambiguity, generalizing across language) while the symbolic system ensures the output meets a deterministic standard.

The core insight:

LLMs will continue to be stochastic as long as they are built on transformer architectures. Transformers are neural networks at their core - they rely on probability distributions and will always exhibit variation. Neuro-symbolic AI accepts this reality and builds systems that tolerate it rather than pretending it doesn't exist.

AlphaGeometry - Google DeepMind · Nature, January 2024

AlphaGeometry solves International Mathematical Olympiad-level geometry problems by combining a neural language model with a symbolic deduction engine. The language model, trained on 100 million synthetically generated theorems, suggests auxiliary geometric constructions - new points, lines, or circles that might help prove a theorem. The symbolic deduction engine then applies formal rules of geometry to verify whether those constructions lead to a valid proof.

On a benchmark of 30 IMO geometry problems, AlphaGeometry solved 25 - approaching the average score of human gold medalists (25.9). The previous state-of-the-art symbolic-only system solved 10. Neither the neural model nor the symbolic engine could achieve this alone.

The neural component provides creative leaps - the kind of intuition that says "draw a line from A to the midpoint of BC." The symbolic component provides rigor — the formal chain of deductions that proves why that construction leads to the answer. This is neuro-symbolic AI at its clearest: intuition constrained by proof.
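The propose-and-verify loop can be sketched in a few lines. This shows only the control flow, not DeepMind's implementation: `propose_construction` stands in for the language model and `deduce` for the symbolic deduction engine, and the toy "geometry" below is invented.

```python
def solve(problem, propose_construction, deduce, max_steps=10):
    """Neural-propose / symbolic-verify loop in the spirit of
    AlphaGeometry (a sketch, not the published system)."""
    state = set(problem["premises"])
    goal = problem["goal"]
    for _ in range(max_steps):
        state |= deduce(state)        # exhaust formal deductions (symbolic)
        if goal in state:
            return True               # the deduction engine closed the proof
        state.add(propose_construction(state))  # neural "creative leap"
    return False

# Toy instance: the goal G follows only once the auxiliary
# construction AUX has been proposed.
problem = {"premises": {"A"}, "goal": "G"}

def deduce(state):
    return {"G"} if {"A", "AUX"} <= state else set()

def propose_construction(state):
    return "AUX"  # stands in for "draw a line from A to the midpoint of BC"

print(solve(problem, propose_construction, deduce))  # True
```

Neither half suffices alone: `deduce` is stuck without the auxiliary construction, and the proposal is worthless until the symbolic engine verifies where it leads.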

Trinh et al., "Solving olympiad geometry without human demonstrations," Nature (2024)

Neuro-Symbolic AI in Practice: How Walt Approaches Text-to-SQL

Text-to-SQL - converting a natural language question into a database query - is one of the most commercially relevant applications of neuro-symbolic AI today. It is also one of the most instructive, because the failure modes of a purely neural approach are immediately visible: the query either returns the right data or it doesn't. There is no room for "approximately correct."

Walt's architecture separates the problem into what an LLM is good at and what it is not.

1. Stable Logic Models — The Deterministic Engine

At the core of Walt's approach is an analytical inference engine that does not ask the LLM to write SQL. Instead, the LLM interprets the user's question - understanding intent, resolving ambiguity, selecting the right analytical pattern - and the deterministic engine constructs the SQL through a formal, auditable process.

Three components work together:

  • ReasonBase - a data context graph that defines the ontology, metrics, entities, and relationships in the data. This is the semantic knowledge the system reasons over.
  • Logic Models - predefined analytical patterns (time-over-time comparisons, cohort analysis, fanout-protected joins, multigrain models) that encode safe, proven query structures. These are not prompts. They are deterministic logic.
  • AST SQL Compiler - transforms the structured logic into optimized SQL through an abstract syntax tree, building each clause step by step. Every query is traceable back to the exact logic model and semantic definition that produced it.

Stable Logic Models (SLM) — a three-part engine that converts user questions into safe, consistent, and accurate SQL.

The LLM's role is interpretation, not generation. It understands what the user is asking. The symbolic system - templates, logic models, the AST compiler - ensures that the resulting SQL is correct, consistent, and auditable. Same question, same SQL, every time.
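A minimal sketch of this interpretation-then-compilation split, assuming a made-up `Intent` structure and metric registry (the real engine is far richer; none of these names are Walt's actual internals):

```python
from dataclasses import dataclass, field

# The LLM's job ends at producing a structured interpretation (Intent).
# Deterministic code owns everything after that point.

METRICS = {"revenue": "SUM(line_amount)"}  # semantic-layer definitions (illustrative)

@dataclass
class Intent:
    metric: str
    group_by: str
    filters: dict = field(default_factory=dict)

def compile_sql(intent: Intent, table: str = "sales") -> str:
    """Build SQL clause by clause from the structured interpretation."""
    if intent.metric not in METRICS:
        raise ValueError(f"unknown metric: {intent.metric}")  # refuse, don't guess
    select = f"SELECT {intent.group_by}, {METRICS[intent.metric]} AS {intent.metric}"
    where = ""
    if intent.filters:
        conds = " AND ".join(f"{c} = '{v}'" for c, v in sorted(intent.filters.items()))
        where = f" WHERE {conds}"
    return f"{select} FROM {table}{where} GROUP BY {intent.group_by}"

# Same interpretation, same SQL, every time.
q = compile_sql(Intent("revenue", "fiscal_quarter", {"region": "NA"}))
print(q)
```

The design choice worth noticing: the stochastic component never touches a SQL string. If the interpretation doesn't map to a known metric, compilation refuses rather than improvising.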

2. The Value Resolution Problem — Where Neither Side Works Alone

Dashboards have dropdowns. A user clicks "Region," selects "North America" from a list, and the filter is exact. Natural language doesn't work that way.

When a user types "How is Boondocs doing in NA?", two problems emerge simultaneously:

The right side of the WHERE clause is unknown. Even if we knew which column to match against, we don't know how "Boondocs" is stored in the database. It might be "Boondocks Inc.", "BOONDOCKS", "Boondocks International Ltd.", or a product code. It will almost never be an exact string match. And fuzzy matching with LIKE '%boondocs%' is both unreliable and dangerous - it either misses the value or matches too broadly.

The left side of the WHERE clause is unknown. "Boondocs" could be a customer name, a product line, a brand, a vendor, or a campaign. There may be dozens of columns across multiple tables where this value could plausibly live. How does the system determine which column the user is referring to?

Put simply: we don't know either side of the WHERE clause.

This is a problem that neither a pure LLM nor a pure deterministic system can solve. An LLM will guess - and guessing at filter values in SQL produces queries that return empty results or, worse, wrong data that looks right. A deterministic system needs exact inputs it doesn't have.

Walt's approach is neuro-symbolic: the system automatically detects low-cardinality attributes across the data, vectorizes every distinct value, and builds a semantic index. When the user says "Boondocs," the neural component (embedding similarity) resolves it to the closest matching value in the index - "Boondocks Inc." in the customer_name column - and the symbolic component ensures that resolution is applied as an exact filter in the generated SQL. The creativity of semantic matching is bounded by the determinism of what actually exists in the database.
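The resolution step can be sketched end to end. A production system would use a learned embedding model over a vector index; here, character-trigram cosine similarity stands in for the neural match, and the indexed column values are invented:

```python
from collections import Counter
import math

def trigrams(s: str) -> Counter:
    """Crude stand-in for an embedding: character trigram counts."""
    s = s.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[k] * b[k] for k in a)
    den = (math.sqrt(sum(v * v for v in a.values())) *
           math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

# Semantic index built from distinct values of low-cardinality columns.
index = {"customer_name": ["Boondocks Inc.", "Acme Corp", "Globex Ltd."]}

def resolve(term: str):
    """Neural half: fuzzy-match the user's term against indexed values.
    Symbolic half: return an exact (column, value) pair for the WHERE clause."""
    q = trigrams(term)
    col, val, _ = max(
        ((c, v, cosine(q, trigrams(v))) for c, vals in index.items() for v in vals),
        key=lambda t: t[2],
    )
    return col, val

col, val = resolve("Boondocs")
print(f"WHERE {col} = '{val}'")  # WHERE customer_name = 'Boondocks Inc.'
```

The fuzzy match decides *which* value the user meant; the generated filter is an exact equality on a value that provably exists in the database, never a `LIKE` pattern.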

This is the neuro-symbolic pattern in miniature: the neural component handles the fuzzy, ambiguous, human part (language → meaning → approximate match). The symbolic component handles the precise, structural part (exact value → correct column → valid SQL). Neither works without the other.

How to Build Neuro-Symbolic Systems (And What Doesn't Work)

It is tempting to take existing deterministic software - a BI tool, a workflow engine, a rules system - and layer an LLM chatbot or agent on top of it. This approach feels pragmatic: the software already works, you're just adding a natural language interface.

It doesn't work.

The existing software was built for a human operator. That operator was trained - through documentation, onboarding, trial and error - to provide inputs in the exact format the software expects. They know which buttons to click, which fields to fill, which sequences to follow. The software's "interface" is a set of rigid constraints that a trained human navigates.

An LLM is not a trained human operator. It doesn't read manuals. It doesn't follow click sequences. It interprets intent from natural language and produces outputs that are approximately what you asked for. When you point an LLM at software designed for exact human input, you get the worst of both worlds: the LLM's stochastic outputs collide with the software's rigid expectations, and the system fails in unpredictable ways.

The anti-pattern: Taking deterministic software built for humans and wrapping it with an LLM. The software expects precision. The LLM provides approximation. The mismatch produces brittle, unreliable systems.

Building a neuro-symbolic system requires designing from the ground up with an understanding of stochastic behavior. The system must:

  • Define tolerance boundaries. Where can the system accept variation, and where must it be exact? The LLM can vary in how it interprets a question. It cannot vary in the SQL it produces for a given interpretation.
  • Design the interface between neural and symbolic components. The handoff point - where neural output becomes symbolic input - is the most critical design decision. It must be structured enough that the symbolic system can consume it reliably, and flexible enough that the neural system isn't artificially constrained.
  • Validate at the boundary. Every output from the neural component should be validated before it enters the symbolic pipeline. If the LLM's interpretation doesn't map to a known analytical pattern, the system should refuse the query rather than guess.
  • Preserve the LLM's strengths. The goal is not to eliminate the neural component - it's to channel it. Over-constraining the LLM defeats the purpose. Users speak in natural language because it's flexible. The system must preserve that flexibility while bounding the downstream consequences.
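The "validate at the boundary" rule can be sketched as a gatekeeper function sitting between the two components. The pattern names and interpretation shape here are illustrative:

```python
# Every neural output passes through this gate before it can enter
# the symbolic pipeline. Unknown patterns are refused, not guessed at.

KNOWN_PATTERNS = {"time_over_time", "cohort_analysis", "simple_aggregate"}

def accept_interpretation(interp: dict) -> dict:
    """Gatekeeper between neural output and the symbolic pipeline."""
    pattern = interp.get("pattern")
    if pattern not in KNOWN_PATTERNS:
        raise ValueError(f"no logic model for pattern: {pattern!r}")
    if not isinstance(interp.get("metric"), str):
        raise ValueError("interpretation is missing a metric")
    return interp

accept_interpretation({"pattern": "time_over_time", "metric": "revenue"})  # passes
try:
    accept_interpretation({"pattern": "freeform_sql", "metric": "revenue"})
except ValueError as e:
    print("refused:", e)
```

A refusal here is a feature: the user gets "I can't answer that" instead of a plausible-looking query built on a guess.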

The right mental model is not "LLM on top of software." It is two systems designed to collaborate - one that understands human intent, and one that enforces computational correctness - with a carefully engineered interface between them.

Why This Matters Now

The AI industry is moving past the phase where "an LLM can do X" is impressive. The question has shifted to: can it do X reliably, consistently, and safely enough to deploy in production?

For most production use cases - financial reporting, medical diagnosis support, autonomous systems, legal analysis, data analytics - the answer with a purely neural approach is no. Not because the models aren't capable, but because stochastic variation is fundamentally incompatible with systems that require deterministic guarantees.

Neuro-symbolic AI is not a compromise. It is a recognition that intelligence - whether human or artificial - has always been a combination of intuition and rigor. The neural network provides intuition. The symbolic system provides rigor. Together, they produce systems that are both flexible and trustworthy.

The companies that figure out this integration - not just in theory, but in production, with real users, at scale - will define the next era of applied AI.

Further Reading

Research

  1. Trinh, T. H. et al. "Solving olympiad geometry without human demonstrations." Nature 625, 476–482 (2024). nature.com
  2. Colelough, S. & Regli, W. "Neuro-Symbolic AI in 2024: A Systematic Review." arXiv:2501.05435 (2025). A PRISMA-based review of 167 papers covering learning, inference, logic, reasoning, and knowledge representation in neuro-symbolic systems. arxiv.org
  3. Feldstein, A. et al. "Mapping the Neuro-Symbolic AI Landscape by Architectures: A Handbook on Augmenting Deep Learning Through Symbolic Reasoning." arXiv:2410.22077 (2024). The first systematic mapping of neuro-symbolic techniques into architectural families. arxiv.org
  4. Delvecchio, M., Molfetta, D. & Moro, G. "Neuro-Symbolic Artificial Intelligence: A Task-Directed Survey in the Black-Box Models Era." IJCAI 2025. Examines how symbolic systems enhance explainability and reasoning in NLP and computer vision. ijcai.org