Sequel to Making Math Learn: Bauer vs Spivak on a Type Theory for AI
ChatGPT Prompt (condensed)
Pick three specific diverse experts in related fields and write a dialogue where, starting from a lay definition of the real-world [AI dimensionality] challenges, they formalize and address this problem in a novel way called the “Grammagraph”.
Participants:
- Emily Riehl – Category Theorist, Johns Hopkins University
- Max Tegmark – Physicist & ML Researcher, MIT
- Andrej Karpathy – AI Researcher, OpenAI/Tesla
1. Real-World Challenge
Moderator:
“Today’s AI struggles to extract low-dimensional semantic meaning from high-dimensional syntactic input, like raw text or source code. How can we move from noise to structure?”
2. Clarifying the Problem
Karpathy:
“Transformers read millions of tokens but don’t understand them. They often fail to form structured, reusable representations of meaning.”
Tegmark:
“In physics, we extract simple, elegant laws from messy observations using symmetry and geometry. Can we do the same in ML?”
Riehl:
“In category theory, we formalize deep structure by modeling how different systems relate. The core issue is finding a representation that preserves meaning across variation.”
3. The “Grammagraph” Concept
Karpathy:
“What if instead of raw strings, we model syntax as structured graphs—not just tokens, but typed relationships—then extract semantics from their shape?”
Riehl:
“Let’s call it a Grammagraph: a graph encoding both syntax and semantic constraints. We can define a functor from a category of syntax to a category of meanings.”
Tegmark:
“Like a tensor network in physics—graphs representing high-dimensional data flows, compressed into latent representations.”
4. Formalizing the Grammagraph
Riehl:
“We define:
- A Syntactic Category (Syn)
  - Objects: structured expressions (e.g., parse trees, ASTs).
  - Morphisms: transformations preserving grammatical structure.
- A Semantic Category (Sem)
  - Objects: concepts or types of meanings.
  - Morphisms: valid semantic transformations (e.g., logical entailment).
The Grammagraph is a functor:
\[ F: \textbf{Syn} \rightarrow \textbf{Sem} \]
It maps syntactic objects to semantic meanings while preserving identities and composition.”
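To make the functorial reading concrete, here is a minimal sketch in Lean 4, assuming Mathlib's category theory library; Syn and Sem are left as abstract categories standing in for the syntactic and semantic categories just described, rather than being constructed explicitly.

```lean
import Mathlib.CategoryTheory.Functor.Basic

open CategoryTheory

universe u v

-- Abstract stand-ins for the syntactic and semantic categories described above.
variable (Syn : Type u) (Sem : Type v) [Category Syn] [Category Sem]

-- On this reading, a Grammagraph is exactly the data of a functor F : Syn ⥤ Sem:
-- an object map, a morphism map, and proofs that identities and composition
-- are preserved. Functoriality is the "meaning-preservation" requirement:
example (F : Syn ⥤ Sem) {X Y Z : Syn} (f : X ⟶ Y) (g : Y ⟶ Z) :
    F.map (f ≫ g) = F.map f ≫ F.map g :=
  F.map_comp f g
```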
5. Implementation in ML
Karpathy:
“Graph Neural Networks (GNNs) or Transformers could embed syntax graphs. But instead of raw embedding, we’d constrain learning to preserve functorial structure.”
Tegmark:
“That would enforce invariance and equivariance—meaning-preserving transformations. Physics meets linguistics.”
Riehl:
“This structure would support compositional reasoning, type inference, and semantic interpretability.”
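One way to read the constraint Karpathy describes is as a soft auxiliary loss rather than a hard guarantee: penalize the network whenever embedding a composite syntactic transformation disagrees with composing the embeddings of its parts in latent space. The PyTorch sketch below is illustrative only; the encoder, the learned latent composition operator, and the assumption that training data arrives as triples of composable morphisms are all modelling choices, not something specified in the dialogue.

```python
import torch
import torch.nn as nn

class FunctorialRegularizer(nn.Module):
    """Soft penalty encouraging F(g ∘ f) ≈ F(g) ∘ F(f) in latent space.

    `embed` maps a syntactic morphism (here: a feature vector for an edge in
    the syntax graph) to a latent "semantic morphism"; `compose` is a learned
    stand-in for composition in the semantic category.
    """

    def __init__(self, morphism_dim: int, latent_dim: int):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Linear(morphism_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # Learned binary operation on latent morphisms (a modelling assumption).
        self.compose = nn.Sequential(
            nn.Linear(2 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, f: torch.Tensor, g: torch.Tensor,
                g_after_f: torch.Tensor) -> torch.Tensor:
        """f, g: composable syntactic morphisms; g_after_f: their composite."""
        lhs = self.embed(g_after_f)                         # F(g ∘ f)
        rhs = self.compose(torch.cat([self.embed(f), self.embed(g)], dim=-1))
        return torch.mean((lhs - rhs) ** 2)                 # functoriality gap

# Usage: add the penalty to the ordinary task loss.
# reg = FunctorialRegularizer(morphism_dim=64, latent_dim=32)
# loss = task_loss + 0.1 * reg(f_batch, g_batch, composite_batch)
```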
6. Novel Contributions
- Type Theory: Imposes semantic constraints on syntax
- Category Theory: Formalizes transformations and compositional structure
- Geometric Deep Learning: Embeds structure respecting invariance/symmetry
- Information Bottlenecks: Filters relevant meaning from high-dimensional form
- Graph-Based Models: Provide syntactic inductive bias
7. Impact and Applications
Karpathy:
“Grammagraphs guide models toward semantically valid representations.”
Tegmark:
“They offer low-dimensional, interpretable structure—aligned with physical and causal intuition.”
Riehl:
“They form a formal foundation for learnable mathematics, merging symbolic and geometric paradigms.”
8. References and Related Work
- Emily Riehl – Category Theory
- Max Tegmark – Life 3.0, AI + Physics
- Andrej Karpathy – Neural Networks and ML
- Seven Sketches in Compositionality – Brendan Fong & David Spivak
- Functors and Compositionality in NLP
- Graph Neural Networks
- Equivariant Neural Networks
- Type Theory and ML
- Information Bottleneck
- DreamCoder – Neural program synthesis using symbolic priors
9. Summary
Grammagraphs offer a formal, compositional, and learnable structure that bridges syntax and semantics, allowing AI systems to extract low-dimensional meaning from high-dimensional expression—guided by types, structured by categories, and compressed via geometry.
Appendix A: Feasibility
ChatGPT Prompt
Write an Appendix analyzing the feasibility.
The Grammagraph—a structured, category-theoretic model mapping syntactic graphs to semantic representations—aims to unify deep learning, type theory, and categorical semantics into a system for learnable, interpretable meaning. This appendix evaluates its feasibility across five key dimensions.
A1. Theoretical Feasibility
Strengths
- Built on well-established mathematical foundations:
- Functorial semantics (Lawvere)
- Dependent type theory (Martin-Löf type theory)
- Invariance principles from geometric deep learning
- Naturally compositional:
- Category theory models how complex meanings arise from simpler parts
- Reinforced by existing work:
- Formal syntax-semantics interfaces in NLP
- Cartesian closed categories in programming language theory
- Compositional models in functional programming and logic
Risks
- Abstraction barrier: High conceptual complexity may limit practical usability without supporting infrastructure
A2. Computational Feasibility
Strengths
- Graphs are a natural fit:
- Syntax (ASTs, parse trees) aligns with graph-based ML models
- Existing neural mechanisms for constraint integration:
- Type constraints can function as inductive biases or regularization (see the sketch at the end of this subsection)
- Hybrid symbolic-neural models already exist:
- Examples include DreamCoder and LeanDojo
Risks
- Training complexity: Enforcing functorial or type-theoretic constraints during backpropagation is nontrivial
- Scale limitations: Type checking and symbolic reasoning can be computationally expensive and difficult to parallelize
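As a small illustration of the "type constraints as inductive bias" point above, one cheap mechanism is to precompile a type-compatibility mask over the nodes of a syntax graph and let it gate attention, so no symbolic solving happens inside the training loop. The node types, the compatibility relation, and the masked-attention wrapper below are hypothetical placeholders for what a real type checker would produce.

```python
import torch
import torch.nn as nn

# Hypothetical node types for an AST-like syntax graph and a hand-written
# compatibility relation; in a real system these would come from a type checker.
COMPATIBLE = {("expr", "expr"), ("expr", "literal"), ("expr", "ident"),
              ("stmt", "expr"), ("stmt", "stmt")}

def type_mask(node_types: list[str]) -> torch.Tensor:
    """Boolean [N, N] mask: True where attention between two nodes is allowed."""
    n = len(node_types)
    mask = torch.zeros(n, n, dtype=torch.bool)
    for i, ti in enumerate(node_types):
        for j, tj in enumerate(node_types):
            mask[i, j] = (ti, tj) in COMPATIBLE or i == j
    return mask

class TypedSelfAttention(nn.Module):
    """Standard multi-head self-attention, restricted by a precompiled type mask."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # In PyTorch, True in attn_mask means "not allowed", so invert the type mask.
        out, _ = self.attn(x, x, x, attn_mask=~mask)
        return out

# Usage (illustrative): one graph with four nodes, embedding dim 16.
# x = torch.randn(1, 4, 16)
# m = type_mask(["stmt", "expr", "ident", "literal"])
# y = TypedSelfAttention(16)(x, m)
```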
A3. Engineering Feasibility
Opportunities
- Can leverage and extend existing toolchains (e.g., graph neural network libraries, standard parsers, and proof assistants such as Lean)
- Modular by design:
- Components (syntax graph parsing, functorial decoding, type-checking) can be developed independently
Challenges
- Lack of unified framework:
- No current ML system integrates categorical semantics, dependent types, and gradient-based learning
- Tooling immaturity:
- Limited debugging and developer tools for such hybrid systems
A4. Human Feasibility
Advantages
- Naturally attracts interdisciplinary researchers:
- Formal methods (PL/FM), category theory, ML, symbolic reasoning
- Aligns with current trends:
- Increasing interest in interpretable, verifiable, and structured AI
Limitations
- High learning curve:
- Requires deep knowledge across formal logic, algebraic structures, and modern ML
- Scarcity of cross-domain expertise:
- Success depends on coordinated collaboration between specialized communities
A5. Long-Term Potential
Short-Term
- Improve interpretability and correctness of neural-symbolic models
- Applications in code, math, logic, and other structured domains
Mid-Term
- Enable compositional and transferable learning across problem domains
- Encourage architectural standardization via typed and modular interfaces
Long-Term
- Establish a unified mathematical foundation for structured learning systems
- Form the basis for formal AI, where semantics, reasoning, and generalization are grounded in types and categories
A6. Overall Feasibility Summary
| Dimension | Assessment |
| --- | --- |
| Theoretical foundations | Solid and well-documented |
| Computational frameworks | Viable, but in need of innovation |
| Engineering approach | Parts exist; integration needed |
| Collaborative interest | Present but fragmented |
| Strategic value | Very high, especially long-term |
A7. Recommendations
- Prototype the Grammagraph in Lean or Dex on structured domains (e.g., code, proofs)
- Train GNNs or Transformers constrained by functorial or type-theoretic loss functions
- Develop a minimal tooling layer to bridge symbolic typing and neural training (a sketch follows this list)
- Build interdisciplinary collaborations across category theory, ML, and programming language theory
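As a sketch of what the "minimal tooling layer" recommendation might mean in practice, the snippet below defines one tiny typed-graph IR that both a symbolic check and a neural feature extractor consume, so the symbolic and neural sides at least share a data structure. All names, node kinds, and the arity check are hypothetical; they are not drawn from any existing tool.

```python
from dataclasses import dataclass

import torch

@dataclass
class TypedNode:
    """One node of a typed syntax graph, shared by symbolic and neural code."""
    node_id: int
    syntax_kind: str      # e.g. "App", "Lambda", "Var"
    sem_type: str         # e.g. "Nat -> Nat", produced by an external type checker
    children: list[int]

def check_application_heads(nodes: dict[int, TypedNode]) -> list[str]:
    """Symbolic side: flag applications whose head is not a function type."""
    errors = []
    for node in nodes.values():
        if node.syntax_kind == "App":
            head = nodes[node.children[0]]
            if "->" not in head.sem_type:
                errors.append(f"node {node.node_id}: applying non-function {head.sem_type}")
    return errors

def to_features(nodes: dict[int, TypedNode],
                kind_vocab: dict[str, int],
                type_vocab: dict[str, int]) -> torch.Tensor:
    """Neural side: the same IR rendered as integer features for an embedding layer."""
    rows = [[kind_vocab[node.syntax_kind], type_vocab[node.sem_type]]
            for node in sorted(nodes.values(), key=lambda node: node.node_id)]
    return torch.tensor(rows, dtype=torch.long)
```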
Appendix B: Revisiting Computational Feasibility
ChatGPT Prompt
Computational feasibility seems to actually be quite low. Reflect on this in a second appendix.
Upon further reflection, the computational feasibility of the Grammagraph is lower than previously characterized. While theoretically promising and conceptually elegant, several deep practical barriers emerge when attempting to realize it at scale within modern ML systems.
B1. Fundamental Computational Barriers
1. Symbolic vs. Continuous Mismatch
- Neural networks operate in continuous, differentiable vector spaces.
- Category theory and type theory operate in discrete, symbolic, compositional structures.
- Bridging these two worlds often introduces non-differentiable operations or expensive symbolic computation that breaks GPU/TPU optimization pipelines.
2. Constraint Propagation Is Computationally Expensive
- Enforcing type soundness, functorial consistency, or semantic correctness requires runtime or compile-time constraint solving.
- This can be NP-hard, especially when dealing with:
- Higher-order types
- Recursive structures
- Rich dependent types
3. Scalability of Symbolic Verification
- Even systems like Coq, Lean, and Agda struggle with verification at scale.
- Attempting to apply these tools during training (e.g. gradient steps) is far beyond current capabilities.
B2. Limitations of Current ML Frameworks
1. Training-Time Symbolic Supervision is Prohibitively Slow
- Most ML toolchains assume stateless, dense numerical flows.
- Injecting symbolic evaluation or constraint logic into each minibatch introduces a massive bottleneck.
2. Differentiable Programming is Not Yet Type-Aware
- Frameworks like JAX, PyTorch, and TensorFlow focus on function transformation, not type safety.
- There is no native support for:
- Tracking types during forward/backward passes
- Enforcing type-level constraints in model weights or activations
B3. Empirical Limitations of Hybrid Systems
1. Neural-Symbolic Systems Are Fragile
- Systems like DreamCoder and Neural Theorem Provers work well on toy problems but scale poorly.
- They require huge search spaces, complex heuristics, or slow symbolic inference steps.
2. GNNs and Transformers Lack Structural Guarantees
- While graph-based models seem like a good fit for syntactic structures, they typically do not guarantee functoriality or compositional correctness.
- Adding such constraints explicitly during learning is still an open research problem.
B4. Reframing the Role of the Grammagraph
Given these challenges, the Grammagraph should not be positioned as a core architecture for end-to-end training, but rather as a:
- Semantic interface layer: A constraint-checking wrapper or adapter around learned representations.
- Post-processing verifier: A formal layer to check and refine outputs from standard models.
- Training-time regularizer: Used intermittently or softly to nudge models toward compositional semantics.
This modular role makes the system more feasible without requiring full symbolic integration during each training step.
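The first of these roles can be pictured very simply: a thin wrapper that decodes a learned representation into a symbolic term and reports which constraints it violates, without touching the training loop at all. The decoder and the constraint checks below are placeholder assumptions, sketched only to show the shape of such an interface layer.

```python
from typing import Callable, List, Tuple

import torch

class SemanticInterface:
    """Hypothetical semantic interface layer sitting between a trained model and
    downstream consumers: decode a latent vector into a symbolic term, then list
    which Grammagraph constraints the term violates."""

    def __init__(self,
                 decode: Callable[[torch.Tensor], str],
                 constraints: List[Tuple[str, Callable[[str], bool]]]):
        self.decode = decode
        self.constraints = constraints

    def interpret(self, latent: torch.Tensor) -> Tuple[str, List[str]]:
        """Return (decoded term, names of violated constraints)."""
        term = self.decode(latent)
        violations = [name for name, check in self.constraints if not check(term)]
        return term, violations

# Usage with toy stand-ins (a real decoder and real constraints would replace these):
# iface = SemanticInterface(
#     decode=lambda z: "add(x, y)" if z.sum() > 0 else "add(x)",
#     constraints=[("binary-arity", lambda t: t.count(",") == 1)],
# )
# term, violated = iface.interpret(torch.randn(8))
```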
B5. Strategic Directions for Overcoming Barriers
- Soft Constraints over Hard Verification
- Use loss-based penalties rather than symbolic rejection to encourage type-safe or functorial behavior.
- Inspired by differentiable logic and relaxed constraint satisfaction.
- Compiled Grammagraph Priors
- Precompile type-theoretic or categorical structures into graph priors that influence learning indirectly (e.g., masking, initialization, sparsity).
- Hybrid Inference Pipelines
- Use deep models to propose candidate structures.
- Apply symbolic Grammagraph constraints as a post-hoc verifier or filter (a sketch follows this list).
- Analogous to how AlphaZero combines neural search with the symbolic rules of the game.
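A minimal sketch of such a hybrid pipeline, under the assumption that the neural model exposes a `propose` function and the symbolic side exposes a boolean `type_check`; both are placeholders here:

```python
from typing import Callable, Iterable, List

def hybrid_decode(propose: Callable[[int], Iterable[str]],
                  type_check: Callable[[str], bool],
                  num_candidates: int = 32) -> List[str]:
    """Neural proposer suggests candidate structures; the symbolic Grammagraph
    check keeps only those satisfying the constraints, preserving order."""
    return [cand for cand in propose(num_candidates) if type_check(cand)]

# Usage with toy stand-ins (a real system would sample from a trained model
# and call a real type checker):
# survivors = hybrid_decode(
#     propose=lambda k: (f"term_{i}" for i in range(k)),
#     type_check=lambda term: term.endswith(("0", "2", "4", "6", "8")),
# )
```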
B6. Revised Assessment
Computational feasibility is currently low for real-time, end-to-end integration, but:
- Partial implementations are viable as interfaces, regularizers, or analysis tools.
- Substantial progress will require:
- Novel programming languages (e.g., Dex)
- Typed autodiff systems
- Compiler-level support for typed tensors and symbolic gradients
B7. Conclusion
The Grammagraph remains a powerful conceptual framework, but its computational realization is limited by current paradigms in deep learning. Its near-term future is likely as a modular semantic scaffold rather than an all-encompassing architecture.
A successful path forward will balance formal structure and pragmatic approximation, gradually tightening the semantic net around learning systems as tools evolve.
