ProofGraph

Network science and spectral graph theory applied to formalized mathematics.

ProofGraph applies network science and spectral graph theory to dependency graphs of Lean 4 libraries. A working prototype has completed full Mathlib extraction and spectral analysis at scale, revealing structural properties of formalized mathematical knowledge that are invisible in informal treatments.

View on GitHub

About the Project

The core insight comes from the Curry-Howard-Lambek correspondence: proofs are programs, propositions are types, and logical systems are categories. A formalized mathematics library like Lean’s Mathlib is a complex network whose structure can be studied empirically.

ProofGraph makes this structure visible and queryable. It extracts declaration-level dependency graphs from Lean 4 projects and applies network science and spectral graph theory to study them. A working prototype has completed full Mathlib extraction (323,489 declarations, 7,516,850 dependency edges) and spectral analysis at full scale.

The Fiedler bipartition separates 2,440 tactic and metaprogramming declarations from 314,056 mathematical declarations using dependency structure alone, with no access to module paths or declaration kinds. Recursive spectral bisection to depth 15 reveals concentric structure: algebraic connectivity increases 186x from the root as peripheral mathematical layers peel off, while algebra, analysis, geometry, category theory, and measure theory remain fused in the core.

Research Areas

  • Dependency graph extraction from Lean 4 projects (Complete: 323K declarations, 7.5M edges from Mathlib)
  • Spectral graph theory applied to proof networks (Complete: Fiedler analysis, recursive bisection to depth 15)
  • Network science metrics: community detection, centrality, core-periphery decomposition (Planned)
  • Proof-theoretic properties: constructivity, axiom dependence, computability (Planned)
  • CLI and REST API for programmatic access to graph data (Planned)