
Independent Research Proposal

Deep Concept Model (DCM)

March 2026 — Present

A proposed neuro-symbolic AI research programme for building a formal concept algebra over mathematics, using dependent type theory, Lean 4, Mathlib, ontology graphs, hyperbolic embeddings, and verified concept generation to explore reasoning beyond natural language.

Overview

Deep Concept Model (DCM) is a research proposal for building a formal system that reasons directly over concepts rather than over natural language tokens. The central idea is that concepts can be represented as compositional mathematical structures, generated from primitive elements, verified through formal proof tools, and registered into an expanding ontology.

The proposal argues that language may be a bottleneck for machine reasoning. Current large language models operate primarily through token prediction, which means their reasoning is mediated by linguistic sequences. DCM explores a different route: define concepts formally, compose them through algebraic and type-theoretic operations, verify whether the resulting structures are meaningful, and use the verified results to grow a computable ontology.

The first target domain is formal mathematics, especially elementary number theory, because mathematical concepts are naturally compositional and can be verified using proof assistants. Lean 4 and Mathlib provide the formal infrastructure: definitions, theorems, proof dependencies, type checking, and a large existing library of verified mathematical knowledge.

The project is scoped as a research programme with five conceptual layers:

  1. Philosophical foundation.
  2. Ontological structure.
  3. Concept generation and verification.
  4. Model training and architecture.
  5. Curriculum traversal and reasoning evaluation.

The current proposal focuses on Layers 0 through 2: defining what a concept is, extracting and structuring a mathematical ontology, and building a verified combine-verify-register loop for generating new concepts.

Research Motivation

The motivation behind DCM comes from a question about the future of AI reasoning:

What if meaningful machine reasoning should not be forced to happen only through language?

Large language models have achieved impressive results, but their core training objective remains linguistic: predict the next token. This gives them broad fluency, but it also ties their reasoning process to the expressive limits of language. Inspired by Wittgenstein's claim that the limits of language shape the limits of thought, DCM asks whether AI systems could reason more fundamentally at the level of concepts themselves.

The proposal also draws inspiration from observations about systems such as AlphaGo, where powerful strategies emerged that were effective but difficult for humans to interpret. If future AI systems develop reasoning pathways that do not map cleanly onto human linguistic categories, then a concept-native system may be better suited for representing and studying those forms of reasoning.

A second motivation comes from creativity research. Many novel ideas arise from combining previously separate concepts. DCM formalizes this intuition as a computational mechanism:

existing concepts
      ↓
composition operators
      ↓
candidate concepts
      ↓
formal verification
      ↓
ontology registration
      ↓
future concept generation

In this view, creativity is not random invention. It is structured concept combination followed by validation.

Problem

Modern AI systems are strong at generating language, but less transparent about the conceptual structures underlying their outputs. This creates several problems:

  1. Language dependence - Reasoning is expressed through words even when the underlying structure may be mathematical, logical, spatial, or conceptual.
  2. Weak concept boundaries - Embeddings can capture similarity, but they do not always provide explicit definitions, compositional structure, or verification guarantees.
  3. Limited verifiability - Token-level models can generate plausible statements without proving that the underlying concept is well-formed or meaningful.
  4. Opaque novelty - LLMs can suggest new combinations of ideas, but there is no built-in mechanism to check whether those combinations are valid, non-trivial, or useful.
  5. Combinatorial explosion - If concepts can be freely combined, the search space grows rapidly unless generation is guided by type constraints, ontology structure, and interestingness metrics.

DCM addresses these problems by grounding concept generation in formal mathematics, where well-formedness and verification can be handled by Lean's type checker and proof infrastructure.

Core Thesis

The central thesis of DCM is:

A concept can be represented as a type in dependent type theory, and new concepts can be generated by composing existing types through formally defined operations.

Under this framing:

This makes DCM different from embedding-only approaches. It does not treat a concept as just a vector or sentence representation. Instead, a concept has a formal identity, a type-theoretic definition, provenance in an ontology, and a verification status.
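The thesis can be made concrete in a few lines of Lean 4. The following is a minimal sketch, not part of the proposal's codebase: `EvenPrime` is an illustrative name, while `Nat.Prime`, `Even`, and `Nat.prime_two` are existing Mathlib identifiers.

```lean
import Mathlib

-- Two primitive concepts, each a predicate (a family of propositions):
-- `Nat.Prime` and `Even` already exist in Mathlib.

-- A composed concept: the conjunction of two existing concepts.
def EvenPrime (n : ℕ) : Prop := Nat.Prime n ∧ Even n

-- The composed concept has formal identity and a verification status:
-- it is inhabited, witnessed by 2 (since 2 = 1 + 1 and 2 is prime).
example : EvenPrime 2 := ⟨Nat.prime_two, 1, rfl⟩
```

Here the concept is a type-level object: its definition, its provenance (the two parent concepts), and its witness are all machine-checkable.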

Relationship to Existing Work

Meta's Large Concept Model

Meta FAIR's Large Concept Model moved from token-level modeling to sentence-level modeling by predicting sentence embeddings in the SONAR representation space. DCM accepts the broad intuition that reasoning should operate above individual tokens, but argues that sentence embeddings are still language-derived representations.

DCM pushes the idea further by defining concepts independently from sentences. In DCM, a concept is not just a compressed sentence; it is a formally constructed and verifiable object.

Conceptual Spaces

Peter Gärdenfors' theory of conceptual spaces represents concepts as geometric regions in quality spaces. This is useful for perceptual concepts, similarity, and graded categories. DCM borrows the intuition that concepts can have geometric structure, especially for embedding and clustering the ontology, but its primary formal representation is type-theoretic rather than purely geometric.

Type Theory and Formal Mathematics

Dependent type theory provides the formal foundation for DCM. In proof assistants such as Lean 4, propositions can be treated as types and proofs as programs. This makes it possible to represent mathematical concepts in a way that is both compositional and machine-checkable.

Lean 4's Mathlib serves as the starting knowledge base. Rather than constructing a mathematical ontology from scratch, DCM proposes extracting, annotating, and augmenting a subgraph of Mathlib.

Library Learning and Program Synthesis

DreamCoder provides an important analogy. It learns reusable program abstractions through a wake-sleep process: solve problems, discover patterns, compress them into reusable library elements, and use those elements in future search.

DCM proposes a similar loop for concepts:

compose candidate concept
      ↓
verify it
      ↓
register it
      ↓
reuse it in later generation

The difference is that DCM targets mathematical concepts and formal ontologies rather than program libraries alone.

Proposed Architecture

The DCM architecture is organized into layered stages.

Layer 0: Philosophical foundation
        ↓
Layer 1: Ontological structure
        ↓
Layer 2: Concept generation and verification
        ↓
Layer 3: Concept-native model training
        ↓
Layer 4: Curriculum traversal and reasoning evaluation

The current research proposal focuses on Layers 0 through 2.

Layer 0: Philosophical Foundation

Layer 0 defines the theoretical commitments of the system.

The main deliverable is a position paper that specifies:

The target proof-of-concept domain is elementary number theory. The proposal suggests curating 20-50 primitive concepts, such as definitions and basic lemmas around natural numbers, divisibility, primality, factorization, and the Euclidean algorithm.

These primitives become the initial conceptual atoms from which later candidate concepts can be composed.

Layer 1: Ontological Structure

Layer 1 builds the computable ontology.

The proposed starting point is the dependency subgraph of the Fundamental Theorem of Arithmetic from Lean 4's Mathlib. This theorem is a strong initial target because it is conceptually rich but still bounded: it involves natural numbers, primes, divisibility, products, uniqueness, and factorization.

The ontology construction process has three stages.

Extraction

Use Lean metaprogramming and LeanDojo-style extraction tools to extract a dependency subgraph from Mathlib.

Expected output:

Mathlib definitions and theorems
        ↓
dependency subgraph
        ↓
50-100 concept nodes
        ↓
Lean-linked ontology seed

Each node remains connected to its original Lean object, preserving formal traceability.
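As a sketch of what the extracted output enables, the dependency closure of a theorem node recovers its conceptual prerequisites. The declaration names below are real Mathlib identifiers, but the edge lists are simplified by hand and `ancestors` is a hypothetical helper, not actual LeanDojo output:

```python
# Assumed shape of extracted data: each Lean declaration name maps to the
# names it directly depends on (hand-simplified, for illustration only).
deps = {
    "Nat.Prime": ["Nat"],
    "Nat.gcd": ["Nat"],
    "Nat.Coprime": ["Nat.gcd"],
    "Nat.factorization": ["Nat.Prime"],
    "Nat.factorization_prod_pow_eq_self": ["Nat.factorization"],
}

def ancestors(node, deps):
    """All transitive dependencies of `node` (its dependency closure)."""
    seen = set()
    stack = [node]
    while stack:
        for parent in deps.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# The closure of a theorem node is exactly its ontology seed neighborhood.
print(sorted(ancestors("Nat.factorization_prod_pow_eq_self", deps)))
# → ['Nat', 'Nat.Prime', 'Nat.factorization']
```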

Curation and Annotation

Raw dependency graphs show which definitions and lemmas depend on others, but they do not fully describe conceptual structure. DCM proposes augmenting the graph with typed relationships:

The intended representation is a labelled property graph using tools such as NetworkX.
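A labelled property graph can be sketched in a few lines. The proposal plans to use NetworkX; this dependency-free version stores edges as `(source, relation, target)` triples, and the relation labels shown are hypothetical examples of typed relationships, not a fixed schema:

```python
# Typed ontology edges beyond raw dependency: (src, relation, dst) triples.
# Relation names here are illustrative placeholders.
edges = [
    ("Nat.Prime", "specializes", "Nat"),
    ("Nat.Coprime", "defined_via", "Nat.gcd"),
    ("FTA", "depends_on", "Nat.Prime"),
    ("FTA", "depends_on", "Nat.factorization"),
]

def related(node, relation, edges):
    """Targets reachable from `node` along edges with the given label."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

# Typed queries become possible once edges carry labels, not just existence.
print(related("FTA", "depends_on", edges))
# → ['Nat.Prime', 'Nat.factorization']
```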

Embedding and Clustering

After annotation, the graph can be embedded into hyperbolic space using Poincaré embeddings. Hyperbolic geometry is useful because mathematical ontologies often have tree-like or hierarchical structure.

Community detection methods such as Louvain or spectral clustering can then identify coherent conceptual subregions of the ontology.

This creates two complementary views:

  1. A formal symbolic graph linked to Lean definitions.
  2. A geometric embedding useful for clustering, visualization, and novelty estimation.
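Why hyperbolic space suits hierarchies can be seen directly from the Poincaré ball distance formula. This is a standard-formula sketch in pure Python, not a trained embedding; the sample points are arbitrary:

```python
import math

def poincare_dist(u, v):
    """Geodesic distance in the Poincaré ball model of hyperbolic space:
    d(u, v) = arcosh(1 + 2‖u−v‖² / ((1−‖u‖²)(1−‖v‖²)))."""
    duv = sum((a - b) ** 2 for a, b in zip(u, v))   # ‖u − v‖²
    nu = 1.0 - sum(a * a for a in u)                # 1 − ‖u‖²
    nv = 1.0 - sum(b * b for b in v)                # 1 − ‖v‖²
    return math.acosh(1.0 + 2.0 * duv / (nu * nv))

# The same Euclidean gap of 0.1 grows much larger near the boundary, so the
# ball has exponentially more "room" toward its edge -- matching the
# exponential fan-out of tree-like ontologies.
near_root = poincare_dist((0.0, 0.0), (0.1, 0.0))
near_edge = poincare_dist((0.89, 0.0), (0.99, 0.0))
print(near_root < near_edge)   # → True
```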

Layer 2: Concept Generation and Verification

Layer 2 is the central technical contribution of the proposal: a system that can generate new candidate concepts by composing existing ones and verifying the results.

The proposed pipeline is:

select concepts
      ↓
choose composition operator
      ↓
construct candidate type expression
      ↓
check well-formedness in Lean
      ↓
attempt inhabitedness proof or witness construction
      ↓
check non-triviality against existing concepts
      ↓
score interestingness
      ↓
register verified concept in ontology
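The pipeline above can be sketched as a Python skeleton. The Lean-facing checks are stubbed out: `is_well_formed` and `find_witness` are hypothetical placeholder names for calls into the Lean toolchain, and the concept names and witness table are illustrative:

```python
from itertools import combinations

ontology = {"Prime", "Even", "Square"}
witnesses = {("Even", "Prime"): 2, ("Even", "Square"): 4}  # assumed examples

def is_well_formed(a, b):
    # Stand-in for Lean type-checking the candidate type expression.
    return True

def find_witness(a, b):
    # Stand-in for tactic search / witness construction in Lean.
    return witnesses.get(tuple(sorted((a, b))))

def generate_round(ontology):
    registered = []
    for a, b in combinations(sorted(ontology), 2):
        candidate = f"{a}∧{b}"
        if not is_well_formed(a, b):
            continue                      # level 1: well-formedness
        if find_witness(a, b) is None:
            continue                      # level 2: inhabitedness
        registered.append(candidate)      # levels 3-4 (non-triviality,
    return registered                     # interestingness) omitted here

print(generate_round(ontology))
# → ['Even∧Prime', 'Even∧Square']
```

Registered outputs would feed back into `ontology` for the next round, which is what makes the loop generative rather than a one-shot filter.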

Component Selection

The system selects two or more existing concepts from the ontology.

Candidate selection strategies include:

Composition Operators

Composition is not treated as one generic operation. In type theory, there are several composition mechanisms:

These operators allow the system to construct candidate concepts from existing types.

Verification Levels

The proposal defines four levels of verification.

1. Well-formedness

The first check asks whether the candidate type expression is syntactically and semantically valid in Lean.

This can often be checked quickly using Lean's type checker.

2. Inhabitedness

The next check asks whether the concept has at least one object satisfying it. A concept may be well-formed but empty.

The system can attempt to construct witnesses using Lean tactics, examples, or existing lemmas.
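The gap between the two levels can be shown in Lean. This is an illustrative sketch (`EvenOdd` is a made-up name; the Mathlib identifiers are real):

```lean
import Mathlib

-- Well-formed but empty: this definition type-checks, yet no natural
-- number is both even and odd, so no witness can ever be constructed.
def EvenOdd (n : ℕ) : Prop := Even n ∧ Odd n

-- Well-formed and inhabited: a witness closes the inhabitedness check.
example : ∃ n, Nat.Prime n ∧ Even n := ⟨2, Nat.prime_two, 1, rfl⟩
```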

3. Non-triviality

A generated concept should not simply duplicate an existing concept. Non-triviality checking compares the candidate against existing ontology entries and attempts to detect equivalence, redundancy, or direct collapse into a known definition.

4. Interestingness

The hardest criterion is whether the concept is mathematically meaningful. DCM proposes proxy metrics such as:

Registration

If a candidate passes the required checks, it is registered as a new ontology node.

The registered concept stores:

Once registered, the concept becomes available for future rounds of generation.

Managing Combinatorial Explosion

A naive system would try every possible pair of concepts with every possible operator. With N concepts and k operators, binary composition alone creates k * N^2 candidates per generation step; even a modest ontology of N = 100 concepts and k = 5 operators yields 50,000 candidates per step, before any recursion.

DCM proposes several pruning strategies:

Type-Directed Search

Lean's type system naturally rejects incompatible compositions. The search can use type signatures to avoid generating candidates that are unlikely to type-check.
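A minimal sketch of this idea, under the simplifying assumption that each concept is a predicate over one carrier type (the signatures below are illustrative): only concepts over the same carrier are candidates for conjunction, so ill-typed compositions are never even generated.

```python
from itertools import combinations

# Each concept tagged with the type it classifies (illustrative signatures).
signatures = {
    "Prime": "ℕ",
    "Even": "ℕ",
    "Continuous": "ℝ → ℝ",
    "Monotone": "ℝ → ℝ",
}

def compatible_pairs(signatures):
    """Unordered pairs whose carrier types match; the rest are pruned
    before any Lean call is made."""
    return [(a, b) for a, b in combinations(sorted(signatures), 2)
            if signatures[a] == signatures[b]]

# Of the 6 unordered pairs, only 2 survive the type filter.
print(compatible_pairs(signatures))
# → [('Continuous', 'Monotone'), ('Even', 'Prime')]
```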

Analogy-Driven Generation

Instead of composing randomly, the system can look for structural analogies. If one part of the ontology contains a useful composition pattern, the system can attempt a similar construction in another region.

Curriculum-Guided Generation

Generated concepts can be introduced gradually by depth. Concepts verified at one depth become inputs for the next generation round, preventing uncontrolled recursive explosion.

Neural-Guided Search

A lightweight learned model can be trained to predict which candidate compositions are likely to pass verification or score highly on interestingness metrics.

This is inspired by the use of heuristic search in neural theorem proving systems.

Asynchronous Concept Generation

One distinctive idea in the proposal is asynchronous concept generation. The combine-verify-register loop could run continuously, independently of any immediate user query or downstream reasoning task.

This resembles:

The ontology would grow over time, accumulating verified concepts that could later support reasoning, theorem proving, curriculum design, or concept-aware model training.

Expected Contributions

If successful, the first phase of DCM would produce four main contributions.

1. Formal Concept Algebra for Mathematics

The project would define a rigorous concept algebra where concepts are types, primitive definitions serve as domain-level axioms, and composition is expressed through type construction.

2. Computable Mathematical Ontology

The project would produce an annotated, embedded ontology derived from a bounded Mathlib subgraph, with relationship types beyond raw dependency edges.

3. Verified Concept Generation Pipeline

The project would implement a working system that composes mathematical candidate concepts and verifies them through Lean.

4. Empirical Study of Machine-Generated Concepts

The project would evaluate whether generated concepts are well-formed, inhabited, non-trivial, and interesting. This would provide early evidence about whether automated concept generation can produce meaningful mathematical abstractions.

Milestones

Phase 1: Foundation

Planned deliverables:

Phase 2: Ontology Construction

Planned deliverables:

Phase 3: Concept Generation and Evaluation

Planned deliverables:

Evaluation Plan

The evaluation should measure both formal correctness and research value.

Possible metrics include:

| Evaluation Area | Example Metric |
|---|---|
| Well-formedness | percentage of generated candidates accepted by Lean's type checker |
| Inhabitedness | percentage of well-formed concepts with constructed witnesses |
| Non-triviality | percentage not equivalent to existing ontology concepts |
| Interestingness | proof-shortening, graph-bridging, or embedding-novelty score |
| Ontology quality | consistency of typed relationships and cluster interpretability |
| Scalability | generation throughput and pruning efficiency |
| Reuse | number of later candidates or proofs that use a generated concept |

The most important question is not simply whether the system can generate valid types. The stronger research question is whether it can generate concepts that help organize or simplify mathematical reasoning.

Risks and Mitigations

Trivial Concept Generation

The system may generate concepts that are technically valid but mathematically uninteresting.

Mitigation:

Lean Integration Complexity

Lean metaprogramming and Mathlib extraction can be technically difficult.

Mitigation:

Manual Annotation Burden

Typed ontology relationships may require manual curation.

Mitigation:

Scaling Beyond a Toy Domain

A 50-100 node proof-of-concept may not scale directly to thousands of concepts.

Mitigation:

Technical Stack

Planned or relevant technologies include:

Future Directions

Layers 3 and 4 extend the proposal beyond ontology construction and concept generation.

Layer 3: Concept-Native Model Training

Once the ontology exists, a model could be trained to reason over concept structures directly.

Possible architectures include:

Layer 4: Curriculum Traversal

The ontology can also define a natural learning curriculum. Given a target theorem or reasoning problem, the system could identify prerequisite concepts and construct a traversal path through the ontology.
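Because prerequisite edges form a DAG, one valid traversal path is simply a topological order of the target's dependency closure. A stdlib sketch (the concept names and edges are illustrative, not the real ontology):

```python
from graphlib import TopologicalSorter

# Prerequisite DAG: each concept maps to the concepts it requires first.
prereqs = {
    "divisibility": {"naturals"},
    "gcd": {"divisibility"},
    "prime": {"divisibility"},
    "factorization": {"prime"},
    "FTA": {"factorization", "gcd"},
}

# A topological order is a curriculum: prerequisites always come before
# the concepts that depend on them, ending at the target theorem.
order = list(TopologicalSorter(prereqs).static_order())
print(order[0], "...", order[-1])   # → naturals ... FTA
```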

This could support:

Potential evaluation targets include formal mathematics benchmarks such as miniF2F and PutnamBench.

What This Research Could Demonstrate

The strongest version of DCM would demonstrate that machine-generated concepts can be more than linguistic paraphrases. They can be formal objects with definitions, dependencies, proofs, and reusable roles in a growing ontology.

Even a modest proof of concept would be valuable if it shows that:

Current Status

DCM is currently at the research proposal stage. The proposal defines the philosophical motivation, theoretical framing, related work, methodology, milestones, risks, and planned implementation strategy.

The immediate next step is to move from proposal to prototype:

  1. select the elementary number theory target subdomain
  2. identify primitive concept candidates
  3. extract a small Mathlib dependency graph
  4. annotate typed ontology relationships
  5. implement the first Lean-backed composition and verification loop

Summary

Deep Concept Model is a proposal for a concept-native AI system grounded in formal mathematics. Instead of treating concepts as sentence embeddings or natural language descriptions, DCM represents them as types in dependent type theory, composes them through formal operators, verifies them with Lean, and registers successful candidates into an expanding ontology.

The project sits at the intersection of neuro-symbolic AI, type theory, theorem proving, ontology engineering, concept algebra, and AI for mathematics. Its long-term goal is to explore whether machine reasoning can move beyond language by operating directly on formally verified conceptual structures.