On-Chain ZKML Verification — Complete Technical Documentation

+===========================================================================+
|                                                                           |
|           100% ON-CHAIN NEURAL NETWORK VERIFICATION                      |
|                                                                           |
|   No FRI  |  No Dictionaries  |  No Recursion  |  18-TX Streaming       |
|                                                                           |
|   GKR Interactive Proofs + Sumcheck Protocol on Starknet                 |
|                                                                           |
+===========================================================================+

Table of Contents

  1. Overview
  2. What We Built
  3. The GKR Protocol
  4. Architecture Deep Dive
  5. Layer Type Verification
  6. Residual Connections (DAG Circuits)
  7. Fiat-Shamir Transcript
  8. Serialization Format
  9. Deployed Models
  10. Gas Costs
  11. Security Model
  12. Deployment Guide
  13. Codebase Map
  14. Comparison with Other Approaches

Overview

ObelyZK (STWO-ML) is the first system to verify neural network inference 100% on-chain on Starknet using GKR (Goldwasser-Kalai-Rothblum) interactive proofs. The prover runs off-chain on GPUs. The verifier is a Cairo smart contract that replays the full cryptographic transcript and checks every mathematical claim — no trusted third party, no off-chain verification.

The Key Insight

Traditional ZKML approaches use STARKs, which require FRI (the Fast Reed-Solomon Interactive Oracle Proof of Proximity) to verify polynomial commitments. On Starknet, the FRI verifier needs Felt252Dict (hash maps), which generates the squashed_felt252_dict_entries libfunc, and that libfunc is not yet on Starknet's allowed list.

Our approach bypasses this entirely: the GKR protocol reduces neural network verification to sumcheck, which only needs field arithmetic and Poseidon hashing — both natively supported on Starknet.

  Traditional ZKML Path (BLOCKED):
  ================================
  Prover --> STARK Proof --> FRI Verification --> Felt252Dict --> BLOCKED
                                                                    X
                                                                    X
                                                         "squashed_felt252_dict_entries"
                                                          not in allowed libfuncs

  ObelyZK GKR Path (WORKS):
  ==========================
  Prover --> GKR Proof --> Sumcheck Verification --> Field Arithmetic --> SUCCESS
                                                     + Poseidon Hash
                                                     (all natively supported)

What We Built

Verified On-Chain (Starknet Sepolia)

+========================================================================================+
|                                                                                        |
|  MODEL    ARCHITECTURE                        LAYERS VERIFIED          STATUS           |
|  =====    ============                        ===============          ======           |
|                                                                                        |
|  D8       Single MatMul (1x4 -> 1x2)          MatMul                  VERIFIED         |
|                                                                                        |
|  D9       MLP                                 MatMul, Activation      VERIFIED         |
|           (MatMul -> ReLU -> MatMul)          (ReLU), MatMul                           |
|                                                                                        |
|  D10      LayerNorm Chain                     MatMul, LayerNorm,      VERIFIED         |
|           (MatMul -> LayerNorm -> MatMul)     MatMul                                   |
|                                                                                        |
|  D11      Residual Network                    MatMul, Activation      VERIFIED         |
|           (MatMul -> fork -> ReLU             (ReLU), MatMul,                          |
|            -> MatMul -> Add)                  Add + Deferred Proof                     |
|                                                                                        |
+========================================================================================+

System Components

  +-----------------------------------------------------------------------+
  |                       PROVING STACK (Off-Chain)                        |
  +-----------------------------------------------------------------------+
  |                                                                       |
  |   libs/stwo-ml/ (Rust, 500+ tests)                                   |
  |   +-------------------+  +-------------------+  +-----------------+  |
  |   |  ONNX Compiler    |  |  GKR Prover       |  | GPU Kernels     |  |
  |   |  - graph builder  |  |  - prove_gkr()    |  | - CUDA sumcheck |  |
  |   |  - forward pass   |  |  - per-layer walk |  | - MLE fold      |  |
  |   |  - weight loader  |  |  - deferred proofs|  | - pad/reduce    |  |
  |   +-------------------+  +-------------------+  +-----------------+  |
  |                                                                       |
  |   +-------------------+  +-------------------+                       |
  |   |  cairo_serde.rs   |  |  starknet.rs      |                       |
  |   |  - serialize GKR  |  |  - build calldata |                       |
  |   |  - per-layer tags |  |  - register model |                       |
  |   |  - deferred proof |  |  - verify calldata|                       |
  |   +-------------------+  +-------------------+                       |
  |                                                                       |
  +-----------------------------------------------------------------------+

  +-----------------------------------------------------------------------+
  |                    VERIFICATION STACK (On-Chain)                       |
  +-----------------------------------------------------------------------+
  |                                                                       |
  |   libs/elo-cairo-verifier/ (Cairo, 249 tests)                        |
  |   +-------------------+  +-------------------+  +-----------------+  |
  |   |  model_verifier   |  |  layer_verifiers  |  | field.cairo     |  |
  |   |  - GKR walk       |  |  - verify_matmul  |  | - M31/CM31/QM31|  |
  |   |  - deferred proof |  |  - verify_add     |  | - inverse       |  |
  |   |  - IO commitment  |  |  - verify_mul     |  | - eq_eval       |  |
  |   |  - weight verify  |  |  - verify_act     |  | - fold_mle      |  |
  |   +-------------------+  |  - verify_ln      |  +-----------------+  |
  |                          |  - verify_rms     |                       |
  |   +-------------------+  |  - verify_deq     |  +-----------------+  |
  |   |  channel.cairo    |  |  - verify_attn    |  | logup.cairo     |  |
  |   |  - Poseidon2-M31  |  +-------------------+  | - batch inverse |  |
  |   |  - mix_u64        |                         | - table sum     |  |
  |   |  - draw_qm31      |  +-------------------+  +-----------------+  |
  |   |  - mix_felts       |  | verifier.cairo    |                       |
  |   +-------------------+  | - contract entry   |                       |
  |                          | - register_model   |                       |
  |   +-------------------+  | - verify_model_gkr |                       |
  |   |  sumcheck.cairo   |  | - upgrade mech.    |                       |
  |   |  - degree-2 round |  +-------------------+                       |
  |   |  - degree-3 round |                                              |
  |   +-------------------+                                              |
  |                                                                       |
  +-----------------------------------------------------------------------+

The GKR Protocol

High-Level Flow

The GKR (Goldwasser-Kalai-Rothblum) protocol verifies a layered arithmetic circuit by reducing output claims to input claims layer by layer. At each layer, the prover provides a sumcheck proof that the verifier checks.

  THE GKR VERIFICATION WALK
  =========================

  The verifier starts with a claim about the OUTPUT and walks backward
  through the network, reducing it to a claim about the INPUT.

                          CLAIM: "output[r] = v"
                                    |
                                    v
  +---[Layer N: MatMul]---+    Sumcheck proof (log_k rounds)
  | Verify:               |    Each round: degree-2 polynomial
  | sum_j A(r,j) * B(j,s) |    Verifier sends random challenge
  | = claimed_value       |    Final: check A_eval * B_eval
  +------------------------+
                |
                v
          REDUCED CLAIM: "input_to_layer_N[r'] = v'"
                |
                v
  +---[Layer N-1: ReLU]--+    LogUp proof
  | Verify:               |    Table lookup via sumcheck
  | ReLU(x) is in         |    Multiplicities match
  | precomputed table     |
  +------------------------+
                |
                v
          REDUCED CLAIM: "input_to_layer_N-1[r''] = v''"
                |
                v
  +---[Layer 0: MatMul]--+    Sumcheck proof
  +------------------------+
                |
                v
          FINAL CLAIM: "input[r_final] = v_final"
                |
                v
          CHECK: MLE(actual_input, r_final) == v_final    <-- DONE!
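That final step is just a multilinear-extension evaluation. Below is a minimal scalar sketch over the M31 base field; the real verifier folds QM31 values, and the table and evaluation point here are made up for illustration.

```rust
// Multilinear-extension evaluation, scalar sketch over M31 (the real
// verifier folds QM31 values; table and point below are made up).
const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { a * b % P }

/// Evaluate the MLE of `table` (length 2^n) at `point` by fixing one
/// variable per fold: f(r, rest) = f0 + r * (f1 - f0).
fn mle_eval(table: &[u64], point: &[u64]) -> u64 {
    let mut t = table.to_vec();
    for &r in point {
        let half = t.len() / 2;
        t = (0..half)
            .map(|j| add(t[j], mul(r, sub(t[half + j], t[j]))))
            .collect();
    }
    t[0]
}

fn main() {
    let input = [3u64, 1, 4, 1]; // flattened input tensor in M31
    // On the boolean hypercube the MLE reproduces the table exactly...
    assert_eq!(mle_eval(&input, &[0, 0]), 3);
    assert_eq!(mle_eval(&input, &[1, 1]), 1);
    // ...and at the walk's random point it gives the value the final
    // GKR claim must equal for the proof to be accepted.
    assert_eq!(mle_eval(&input, &[7, 11]), 2147483558); // p - 89
}
```

Each fold halves the table, mirroring the fold_mle primitive listed under field.cairo.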

Sumcheck Protocol (Per-MatMul Layer)

For a matrix multiplication C = A * B where A is m x k and B is k x n:

  SUMCHECK FOR MATMUL
  ===================

  Claim: C(r_i, r_j) = sum_{s in {0,1}^log_k} A(r_i, s) * B(s, r_j)

  Round 1: Prover sends degree-2 polynomial p_1(X)
           p_1(0) + p_1(1) = claimed_sum
           Verifier draws random alpha_1

  Round 2: Prover sends p_2(X)
           p_2(0) + p_2(1) = p_1(alpha_1)
           Verifier draws random alpha_2

  ...

  Round log_k: Prover sends p_{log_k}(X)
               Final check:
               p_{log_k}(alpha_{log_k}) = A(r_i, alphas) * B(alphas, r_j)

  The verifier does only O(log k) work instead of O(k)!

  For a 4096-dim hidden layer:
    Naive verification:  4096 multiplications
    Sumcheck:            12 rounds (log_2 4096) with degree-2 polys
    Reduction:           ~340x fewer verifier operations
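The round structure above can be exercised end to end with a toy prover and verifier over the M31 base field. This is a scalar sketch only: the real protocol works in QM31 and draws challenges from the Poseidon channel, whereas here the challenges are hard-coded and round polynomials are sent as evaluations at t = 0, 1, 2.

```rust
// Toy sumcheck for sum_{s in {0,1}^n} A(s)*B(s) over M31 (p = 2^31 - 1).
const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { a * b % P }

/// Fix the first variable of an MLE table to r (prover-side folding).
fn fold(t: &[u64], r: u64) -> Vec<u64> {
    let half = t.len() / 2;
    (0..half).map(|j| add(t[j], mul(r, sub(t[half + j], t[j])))).collect()
}

/// Evaluate the degree-2 round polynomial through (0,e0),(1,e1),(2,e2) at r.
fn poly_eval(e: [u64; 3], r: u64) -> u64 {
    let inv2 = (P + 1) / 2; // 2^{-1} mod p
    let l0 = mul(mul(sub(r, 1), sub(r, 2)), inv2); // (r-1)(r-2)/2
    let l1 = mul(r, sub(2, r));                    // r(2-r)
    let l2 = mul(mul(r, sub(r, 1)), inv2);         // r(r-1)/2
    add(add(mul(e[0], l0), mul(e[1], l1)), mul(e[2], l2))
}

fn main() {
    let (mut a, mut b) = (vec![3u64, 1, 4, 1], vec![5u64, 9, 2, 6]);
    let mut sum = a.iter().zip(&b).map(|(&x, &y)| mul(x, y)).fold(0, add);
    for &r in &[7u64, 11] { // stand-ins for channel.draw_qm31()
        let half = a.len() / 2;
        // Prover: round-polynomial evaluations at t = 0, 1, 2.
        let mut e = [0u64; 3];
        for t in 0..3u64 {
            for j in 0..half {
                let av = add(a[j], mul(t, sub(a[half + j], a[j])));
                let bv = add(b[j], mul(t, sub(b[half + j], b[j])));
                e[t as usize] = add(e[t as usize], mul(av, bv));
            }
        }
        // Verifier: p(0) + p(1) must equal the running sum; then reduce.
        assert_eq!(add(e[0], e[1]), sum, "round check failed");
        sum = poly_eval(e, r);
        a = fold(&a, r); // prover folds; verifier only sees final evals
        b = fold(&b, r);
    }
    // Final check: running sum == final_a_eval * final_b_eval.
    assert_eq!(sum, mul(a[0], b[0]));
}
```

The verifier touches only the round polynomials and the two final evaluations, never the full tables, which is where the O(log k) saving comes from.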

Field Tower

All arithmetic uses the M31 field tower; sumcheck challenges are drawn from the degree-4 extension QM31 (field size ≈ 2^124):

  FIELD TOWER
  ===========

  M31:   Z / (2^31 - 1)             Base field
         p = 2,147,483,647          Single-cycle modular reduction
         |
         v
  CM31:  M31[i] / (i^2 + 1)        Complex extension
         Elements: a + b*i          2x M31 components
         |
         v
  QM31:  CM31[j] / (j^2 - 2 - i)   Degree-4 extension
         Elements: (a+bi) + (c+di)j  4x M31 components
         Security: ~124 bits        (2^31 - 1)^4 ≈ 2^124

  All arithmetic in Cairo uses u64 (not felt252):
  - (2^31 - 2)^2 < 2^62 < 2^64  -- no overflow in u64
  - Avoids felt252 modular arithmetic entirely
  - ~3x faster than felt252 for M31 operations
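The no-overflow argument can be checked directly. A minimal sketch of M31 multiplication in u64 using Mersenne folding (the function name is illustrative, not the Cairo code):

```rust
// M31 multiplication entirely in u64, mirroring the bound above.
const P: u64 = (1 << 31) - 1; // 2,147,483,647

fn m31_mul(a: u64, b: u64) -> u64 {
    let x = a * b; // a, b <= 2^31 - 2  =>  x < 2^62: no u64 overflow
    // Mersenne folding: 2^31 ≡ 1 (mod p), so add high bits onto low bits.
    let x = (x & P) + (x >> 31); // now x < 2^32
    let x = (x & P) + (x >> 31); // now x <= p
    if x == P { 0 } else { x }
}

fn main() {
    assert!((P - 1) * (P - 1) < 1 << 62); // the no-overflow bound
    assert_eq!(m31_mul(P - 1, P - 1), 1); // (-1) * (-1) = 1
    assert_eq!(m31_mul(1 << 30, 2), 1);   // 2^31 ≡ 1 (mod p)
}
```

Reduction is two shifts, two masks, and two adds: no division, which is why a Mersenne prime makes reduction effectively single-cycle.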

Architecture Deep Dive

Prover Side (Rust — libs/stwo-ml/)

  PROVING PIPELINE
  ================

  1. MODEL COMPILATION
     ONNX Model --> ComputationGraph --> LayeredCircuit
                    (nodes + edges)     (layers + types)

  2. FORWARD PASS (M31 Field)
     Input --> Layer 0 --> Layer 1 --> ... --> Output
               (M31 matrix arithmetic, no floating point)

  3. GKR PROOF GENERATION
     For each layer (output to input):
       - MatMul: Sumcheck proof (log_k degree-2 rounds)
       - Add: Claim split + deferred proof queuing
       - Activation: LogUp proof (lookup table)
       - LayerNorm: Eq-sumcheck + LogUp rsqrt

  4. SERIALIZATION
     GKR Proof --> felt252 array (cairo_serde.rs)
               --> sncast calldata (starknet.rs)

  Key files:
    src/gkr/prover.rs      GKR proof generation (prove_gkr)
    src/gkr/verifier.rs     Rust reference verifier (verify_gkr)
    src/gkr/types.rs        LayerProof enum, GKRProof struct
    src/cairo_serde.rs       Serialization to felt252
    src/starknet.rs          Calldata builders
    src/aggregation.rs       Model aggregation orchestrator

Verifier Side (Cairo — libs/elo-cairo-verifier/)

  VERIFICATION PIPELINE (ON-CHAIN)
  ================================

  verify_model_gkr(model_id, raw_io_data, circuit_depth, num_layers,
                   matmul_dims, proof_data, weight_commitments, ...)

  PHASE 0: SETUP
  +-----------------------------------------------------------------+
  | 1. Check model_id is registered                                  |
  | 2. Verify circuit_hash matches registered descriptor             |
  | 3. Recompute io_commitment = Poseidon(raw_io_data)              |
  | 4. Initialize PoseidonChannel (Fiat-Shamir)                      |
  +-----------------------------------------------------------------+
                                  |
                                  v
  PHASE 1: GKR WALK (verify_gkr_model)
  +-----------------------------------------------------------------+
  | 1. Reconstruct output claim from raw IO + random point           |
  | 2. For each layer proof in proof_data:                           |
  |    - Read tag (0=MatMul, 1=Add, 3=Activation, 4=LayerNorm, ...) |
  |    - Dispatch to layer-specific verifier                         |
  |    - Update current_claim with returned claim                    |
  | 3. Process deferred proofs (skip connections)                    |
  | 4. Verify final_claim matches MLE(input)                        |
  | 5. Collect weight claims for Phase 2                             |
  +-----------------------------------------------------------------+
                                  |
                                  v
  PHASE 2: WEIGHT VERIFICATION
  +-----------------------------------------------------------------+
  | For each weight matrix:                                          |
  |   Verify: MLE(weight_i, eval_point) == expected_value           |
  |   Using weight commitment from registration                     |
  +-----------------------------------------------------------------+
                                  |
                                  v
  PHASE 3: FINALIZATION
  +-----------------------------------------------------------------+
  | 1. Store proof_hash on-chain                                     |
  | 2. Increment verification_count                                  |
  | 3. Emit ModelGkrVerified event                                   |
  +-----------------------------------------------------------------+

Data Flow Diagram

  +===========================================================================+
  |                                                                           |
  |  OFF-CHAIN (GPU Server)                 ON-CHAIN (Starknet)              |
  |                                                                           |
  |  +------------------+                  +---------------------------+     |
  |  |                  |                  |     Cairo Verifier v31     |     |
  |  |  prove_model_    |   sncast invoke  |                           |     |
  |  |  pure_gkr()      | ==============> |  register_model_gkr()     |     |
  |  |                  |                  |    - store weight commits  |     |
  |  |  Returns:        |                  |    - store circuit hash    |     |
  |  |  - GKRProof      |   sncast invoke  |                           |     |
  |  |  - IO data       | ==============> |  verify_model_gkr()       |     |
  |  |  - weight commits|                  |    - GKR walk             |     |
  |  |                  |                  |    - sumcheck verify      |     |
  |  +------------------+                  |    - IO MLE check         |     |
  |                                        |    - weight MLE check     |     |
  |  +------------------+                  |                           |     |
  |  |  build_verify_   |                  |                           |     |
  |  |  model_gkr_      |   sncast call    |  get_verification_count() |     |
  |  |  calldata()      | <============== |    - returns count         |     |
  |  +------------------+                  +---------------------------+     |
  |                                                                           |
  +===========================================================================+

Layer Type Verification

Each GKR layer has a specific verification algorithm. The on-chain verifier dispatches based on the layer tag.

Tag Encoding

  TAG     LAYER TYPE       FIAT-SHAMIR DOMAIN SEPARATOR
  ===     ==========       ===========================
   0      MatMul           mix_u64(m), mix_u64(k), mix_u64(n)
   1      Add              (none -- direct claim split)
   2      Mul              mix_u64(0x4D554C)  ["MUL"]
   3      Activation       mix_u64(0x4C4F47)  ["LOG"] + type_tag
   4      LayerNorm        mix_u64(0x4C4E)    ["LN"]
   5      Attention        mix_u64(0x4154544E) ["ATTN"]
   6      Dequantize       mix_u64(0x4445514C4F47) ["DEQLOG"] + bits
   7      MatMulDualSimd   (same as MatMul, different prover path)
   8      RMSNorm          mix_u64(0x524E)    ["RN"]
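The magic numbers above are just ASCII strings packed big-endian into a u64, which can be checked mechanically (ascii_tag is an illustrative helper, not part of the codebase):

```rust
// The domain separators are ASCII tags packed big-endian into a u64.
fn ascii_tag(s: &str) -> u64 {
    s.bytes().fold(0u64, |acc, b| (acc << 8) | b as u64)
}

fn main() {
    assert_eq!(ascii_tag("MUL"), 0x4D554C);
    assert_eq!(ascii_tag("LOG"), 0x4C4F47);
    assert_eq!(ascii_tag("LN"), 0x4C4E);
    assert_eq!(ascii_tag("ATTN"), 0x4154544E);
    assert_eq!(ascii_tag("DEQLOG"), 0x4445514C4F47);
    assert_eq!(ascii_tag("RN"), 0x524E);
}
```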

MatMul Verification (Tag 0)

  MATMUL SUMCHECK VERIFICATION
  ============================

  Input:  claim = (point, value) from previous layer
  Proof:  num_rounds round polynomials + final evaluations

  1. Mix dimensions into channel: mix(m), mix(k), mix(n)
  2. Mix claimed_sum into channel
  3. For round i = 1..num_rounds:
     a. Read degree-2 polynomial (c0, c1, c2)
     b. Check: c0 + c1 == running_sum
     c. Mix coefficients into channel
     d. Draw random challenge alpha_i
     e. Update: running_sum = poly_eval(c0, c1, c2, alpha_i)
  4. Read final_a_eval, final_b_eval
  5. Check: running_sum == final_a_eval * final_b_eval
  6. Mix final evaluations into channel

  Output: new_claim at sumcheck challenge point with final_a_eval

Add Verification (Tag 1)

  ADD LAYER VERIFICATION (WITH DEFERRED PROOFS)
  =============================================

  For y = trunk(x) + skip(x):

  1. Read lhs (trunk eval), rhs (skip eval), trunk_idx
  2. Check: lhs + rhs == claim.value   (the Add constraint)
  3. Mix lhs, rhs into channel
  4. Draw random alpha
  5. Combine: reduced = alpha * lhs + (1 - alpha) * rhs
  6. Return: claim at same point with reduced value

  TRUNK vs SKIP:
  - trunk_idx = 0: lhs is trunk (GKR walk continues with lhs path)
  - trunk_idx = 1: rhs is trunk (GKR walk continues with rhs path)
  - The "skip" branch gets a DEFERRED proof (verified after the walk)

  +-------- GKR Walk --------+        +---- Deferred Proofs ----+
  |                          |        |                          |
  | Layer 3: Add             |        | After walk completes:    |
  |   claim -> split         |------->| skip branch MatMul       |
  |   trunk -> continue walk |        | sumcheck verified here   |
  | Layer 2: MatMul (trunk)  |        |                          |
  | Layer 1: ReLU            |        +--------+-----------------+
  | Layer 0: MatMul          |                 |
  +----------+---------------+                 |
             |                                 |
             v                                 v
       input_claim                      weight_claim (skip)
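Steps 1-6 of the Add check amount to a few field operations. A scalar sketch over M31 (real claims are QM31 values and alpha comes from the Poseidon channel; verify_add and the concrete inputs are illustrative):

```rust
// Add-layer claim split, scalar sketch over M31.
const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { a * b % P }

/// Check lhs + rhs == claimed value, then reduce the two branch
/// evaluations to a single value for the continuing walk.
fn verify_add(claim_value: u64, lhs: u64, rhs: u64, alpha: u64) -> u64 {
    assert_eq!(add(lhs, rhs), claim_value, "Add constraint violated");
    // reduced = alpha * lhs + (1 - alpha) * rhs
    add(mul(alpha, lhs), mul(sub(1, alpha), rhs))
}

fn main() {
    let (lhs, rhs) = (700u64, 42); // trunk eval, skip eval
    let alpha = 99999;             // channel.draw_qm31() in the real flow
    let reduced = verify_add(add(lhs, rhs), lhs, rhs, alpha);
    assert_eq!(reduced, 65799384); // 42 + alpha * (lhs - rhs) mod p
}
```

Because both branches are evaluated at the same point, one random combination lets the walk carry a single value forward while the skip branch's own sumcheck is deferred.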

Activation Verification (Tag 3)

  ACTIVATION (RELU/GELU/SIGMOID) VERIFICATION
  ============================================

  Uses LogUp protocol to verify lookup table membership.

  1. Mix domain separator: mix_u64(0x4C4F47), mix_u64(type_tag)
  2. Draw gamma, beta (LogUp challenges)
  3. Verify LogUp table-side sum (batch Montgomery inversion)
  4. Run eq-sumcheck (degree-3) with initial sum = 1
  5. Final check: sum == eq(r,s) * w(s) * (gamma - in(s) - beta*out(s))
  6. Mix input/output evaluations

  The LogUp argument proves that every (input, output) pair
  exists in the precomputed activation table:

  +-------------------+
  | Activation Table  |
  |                   |
  | in  | out = f(in) |
  | --- | ----------- |
  |  0  |     0       |  (ReLU: max(0, x))
  |  1  |     1       |
  |  2  |     2       |
  | -1  |     0       |
  | -2  |     0       |
  | ... |    ...      |
  +-------------------+
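The table-side check is a sum of field inverses. A scalar sketch over M31 (the deployed verifier batches the inversions Montgomery-style and takes gamma, beta from the channel; the table, witness, and challenge values here are made up):

```rust
// LogUp table-side sum, scalar sketch over M31.
const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { a * b % P }
fn neg(a: u64) -> u64 { (P - a % P) % P }

/// Modular inverse via Fermat's little theorem: x^(p-2) mod p (x != 0).
fn inv(mut x: u64) -> u64 {
    let (mut acc, mut e) = (1u64, P - 2);
    while e > 0 {
        if e & 1 == 1 { acc = mul(acc, x); }
        x = mul(x, x);
        e >>= 1;
    }
    acc
}

fn main() {
    // ReLU table over inputs {-2..2}; negatives are encoded as p - |x|.
    let table = [(P - 2, 0u64), (P - 1, 0), (0, 0), (1, 1), (2, 2)];
    // (input, output) pairs produced by the activation layer.
    let witness = [(1u64, 1u64), (P - 2, 0), (0, 0), (1, 1)];
    // How often each table row occurs in the witness.
    let mult = [1u64, 0, 1, 2, 0];
    let (gamma, beta) = (12345u64, 6789);

    let denom = |i: u64, o: u64| add(add(gamma, neg(i)), neg(mul(beta, o)));
    let witness_sum = witness.iter()
        .fold(0, |s, &(i, o)| add(s, inv(denom(i, o))));
    let table_sum = table.iter().zip(&mult)
        .fold(0, |s, (&(i, o), &m)| add(s, mul(m, inv(denom(i, o)))));
    // Table side equals witness side iff every pair is a table entry.
    assert_eq!(witness_sum, table_sum);
}
```

If any witness pair were outside the table, no multiplicity vector could balance the two sums except with negligible probability over gamma, beta.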

LayerNorm Verification (Tag 4)

  LAYERNORM VERIFICATION
  ======================

  Two-part verification:

  Part 1: Linear eq-sumcheck
    - Mix: "LN" tag, mean, rsqrt, claimed_sum
    - Degree-3 eq-sumcheck
    - Final: sum == eq(r,s) * centered_final * rsqrt_final
    - centered = (input - mean), rsqrt = 1/sqrt(variance)

  Part 2: rsqrt LogUp verification (if not SIMD-combined)
    - Mix: "LOG" + "RS" tags
    - Draw gamma, beta
    - Verify LogUp table-side sum
    - Degree-3 eq-sumcheck for rsqrt lookup

  +--------------------+     +----------------------+
  | Part 1: Linear     |     | Part 2: rsqrt LogUp  |
  |                    |     |                      |
  | Verify:            |     | Verify:              |
  | out = (in - mean)  |     | rsqrt(var) is in     |
  |     * rsqrt(var)   |     | precomputed table    |
  |     * gamma + beta |     |                      |
  +--------------------+     +----------------------+

Residual Connections (DAG Circuits)

The Problem

Standard GKR operates on layered circuits (each layer feeds only into the next). Real neural networks have skip connections (residual adds) creating directed acyclic graphs (DAGs).

  LAYERED (GKR standard):         DAG (needs special handling):

  Input                           Input
    |                               |
  Layer 0                         Layer 0 (MatMul)
    |                               |
  Layer 1                         +------+------+
    |                             |  FORK        |
  Layer 2                      Layer 1(ReLU)     |
    |                             |              |
  Output                       Layer 2(MatMul)   | (skip)
                                  |              |
                                Layer 3 (ADD) <--+
                                  |
                                Output

The Solution: Trunk/Skip + Deferred Proofs

  DEFERRED PROOF ARCHITECTURE
  ===========================

  During the GKR walk, when hitting an Add layer:

  1. SPLIT: The Add claim splits into trunk + skip evaluations
  2. TRUNK: The walk continues down the trunk branch (higher layer index)
  3. DEFER: The skip branch claim is saved for later
  4. After the walk completes, verify all deferred proofs

  WALK ORDER:
  ===========
  Step 1: Start at output (Add layer)
          - Split claim: trunk_eval + skip_eval = output_eval
          - Save skip claim point in deferred_add_points[]
          - Continue walk with trunk claim

  Step 2: Verify trunk branch layers (MatMul, ReLU, MatMul)
          - Standard GKR sumcheck at each layer

  Step 3: Reach input
          - Verify MLE(input, final_point) == final_claim

  Step 4: Process deferred proofs
          - For each deferred claim:
            a. Read skip MatMul proof from proof_data
            b. Reconstruct claim using saved point
            c. Verify MatMul sumcheck
            d. Collect weight claim

  SERIALIZATION ORDER:
  ====================
  [walk_layer_proofs...] [num_deferred] [deferred_proof_0...] [weight_commits]

  Deferred proof format:
  [claim_value (QM31)] [m, k, n] [num_rounds] [round_polys...] [final_a, final_b] [weight_commit]

Claim Point Reconstruction

A critical design detail: the deferred proof serialization only stores the claim value, not the claim point. The point is reconstructed by the verifier:

  CLAIM POINT SAVING
  ==================

  During walk, when Add layer is hit:
    deferred_add_points.push(current_claim.point.clone())

  During deferred proof processing:
    point = deferred_add_points[i]
    claim = GKRClaim { point, value: read_from_proof() }

  This saves 2-8 QM31 values per deferred proof in calldata
  (each QM31 = 4 felts, each point has log_n + log_m entries)

Fiat-Shamir Transcript

The Fiat-Shamir transform converts the interactive GKR protocol into a non-interactive proof. Every operation must match exactly between prover (Rust) and verifier (Cairo).

Poseidon2-M31 Channel

  POSEIDON CHANNEL OPERATIONS
  ===========================

  State: digest (felt252)

  mix_u64(value):
    digest = poseidon_hash(digest, value as felt252, 2)

  mix_secure_field(qm31):            <-- CRITICAL: 4x mix_u64, NOT mix_felts!
    mix_u64(qm31.a.a)                    Different from mix_felts which packs
    mix_u64(qm31.a.b)                    into 2 felt252s via poseidon_hash_many
    mix_u64(qm31.b.a)
    mix_u64(qm31.b.b)

  mix_felts(values: Array<QM31>):
    Pack the 4 M31 components of each QM31; 4 QM31s (16 M31s) fit in 2 felt252s
    poseidon_hash_many([packed...])

  draw_qm31():
    digest = poseidon_hash(digest, 0, 0)
    Extract 4 M31 components from digest bits
    Return QM31
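The mix/draw discipline matters more than the hash itself. A sketch using std's DefaultHasher as a stand-in for Poseidon2-M31 (the struct and method names mirror the operations above; the hash function and bias-free field sampling do not match the real channel):

```rust
// Fiat-Shamir channel sketch. DefaultHasher (SipHash) stands in for
// Poseidon here: only the mix/draw discipline is the point.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const P: u64 = (1 << 31) - 1;

struct Channel { digest: u64 }

impl Channel {
    fn mix_u64(&mut self, v: u64) {
        let mut h = DefaultHasher::new();
        (self.digest, v).hash(&mut h);
        self.digest = h.finish();
    }
    /// 4x mix_u64, one per M31 component: NOT the packed mix_felts path.
    fn mix_secure_field(&mut self, q: [u64; 4]) {
        for c in q { self.mix_u64(c); }
    }
    fn draw_m31(&mut self) -> u64 {
        self.mix_u64(0); // absorb before squeezing
        self.digest % P  // the real channel extracts unbiased M31 limbs
    }
    fn draw_qm31(&mut self) -> [u64; 4] {
        [self.draw_m31(), self.draw_m31(), self.draw_m31(), self.draw_m31()]
    }
}

fn main() {
    // Prover and verifier replay identical mix/draw sequences,
    // so they derive identical challenges.
    let mut prover = Channel { digest: 0 };
    let mut verifier = Channel { digest: 0 };
    for ch in [&mut prover, &mut verifier] {
        ch.mix_u64(3);                     // num_layers
        ch.mix_secure_field([1, 2, 3, 4]); // claimed_sum
    }
    assert_eq!(prover.draw_qm31(), verifier.draw_qm31());
}
```

Both sides must perform byte-identical mix/draw sequences; a single out-of-order mix changes every subsequent challenge, which is exactly what makes the transcript binding.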

Transcript Order for GKR Walk

  FIAT-SHAMIR TRANSCRIPT ORDER
  ============================

  1. INITIALIZATION
     mix_u64(num_layers)
     mix_u64(output_rows), mix_u64(output_cols)

  2. OUTPUT CLAIM
     Draw random output point: r_out = [draw_qm31() x log_n]
     Evaluate MLE(output, r_out) = output_value
     mix_secure_field(output_value)

  3. PER-LAYER (output to input)
     For each layer:
       [domain separator tags]        // e.g., mix_u64(m), mix_u64(k), mix_u64(n)
       mix_secure_field(claimed_sum)   // bind the claimed sum
       For each sumcheck round:
         mix_felts(round_poly_coeffs)  // bind polynomial
         draw_qm31()                   // random challenge
       mix_secure_field(final_evals)   // bind final evaluations

  4. DEFERRED PROOFS (after walk)
     For each deferred proof:
       mix_secure_field(claim_value)
       [same sumcheck transcript as MatMul layer]

  5. WEIGHT OPENINGS (Phase 2)
     For each weight commitment:
       [MLE opening proof transcript]

Serialization Format

Calldata Layout for verify_model_gkr

  CALLDATA STRUCTURE
  ==================

  model_id: felt252
  raw_io_data: Array<felt252>     [len, in_rows, in_cols, in_len, in_data...,
                                        out_rows, out_cols, out_len, out_data...]
  circuit_depth: u32               log2(max_layer_size)
  num_layers: u32                  number of GKR layers
  matmul_dims: Array<u32>         [len, m0, k0, n0, m1, k1, n1, ...]
  dequantize_bits: Array<u64>     [len, bits0, bits1, ...]
  proof_data: Array<felt252>      [len, ...layer proofs... num_deferred, ...deferred proofs...]
  weight_commitments: Array<felt252>  [len, commit0, commit1, ...]
  weight_opening_proofs: Array<felt252>  [len, ...]  (empty for Phase 1)
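A hypothetical builder for the raw_io_data segment, following the length-prefixed layout above (u64 stands in for felt252; build_raw_io_data is illustrative, not the actual starknet.rs API):

```rust
// Assemble [len, in_rows, in_cols, in_len, in_data...,
//           out_rows, out_cols, out_len, out_data...].
fn build_raw_io_data(
    in_dims: (u64, u64), input: &[u64],
    out_dims: (u64, u64), output: &[u64],
) -> Vec<u64> {
    let mut body = vec![in_dims.0, in_dims.1, input.len() as u64];
    body.extend_from_slice(input);
    body.extend_from_slice(&[out_dims.0, out_dims.1, output.len() as u64]);
    body.extend_from_slice(output);
    let mut data = vec![body.len() as u64]; // leading Array<felt252> length
    data.extend(body);
    data
}

fn main() {
    // 1x4 input, 1x2 output (the D8 shape).
    let data = build_raw_io_data((1, 4), &[3, 1, 4, 1], (1, 2), &[5, 9]);
    assert_eq!(data[0], 12);              // 3 + 4 + 3 + 2 payload felts
    assert_eq!(&data[1..4], &[1, 4, 4]);  // in_rows, in_cols, in_len
    assert_eq!(&data[8..11], &[1, 2, 2]); // out_rows, out_cols, out_len
}
```

The contract recomputes io_commitment = Poseidon(raw_io_data) over exactly this array, so prover and verifier must agree on the layout byte for byte.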

Per-Layer Proof Data (Inside proof_data)

  MATMUL LAYER (tag = 0):
  [num_rounds] [c0,c1,c2 (QM31) x num_rounds] [final_a (QM31)] [final_b (QM31)]

  ADD LAYER (tag = 1):
  [lhs (QM31)] [rhs (QM31)] [trunk_idx (u32)]

  ACTIVATION LAYER (tag = 3):
  [type_tag] [log_size] [table_commitment (felt252)]
  [num_mult] [multiplicities...] [claimed_sum (QM31)]
  [eq_num_rounds] [eq_round_polys... (degree-3)] [final_evals (QM31 x 3)]

  LAYERNORM LAYER (tag = 4):
  [mean (QM31)] [rsqrt (QM31)] [claimed_sum (QM31)] [simd_combined (u32)]
  [eq_num_rounds] [eq_round_polys...] [centered_final (QM31)] [rsqrt_final (QM31)]
  [logup proof if !simd_combined]

  DEFERRED PROOFS (at end):
  [num_deferred]
  For each:
    [claim_value (QM31)] [m, k, n] [matmul sumcheck data] [weight_commitment (felt252)]

QM31 Serialization

  QM31 = (a + bi) + (c + di)j

  Serialized as 4 consecutive felt252 values:
  [a (u64 as felt252)] [b (u64 as felt252)] [c (u64 as felt252)] [d (u64 as felt252)]

  Example: QM31 { a: CM31(1234, 5678), b: CM31(9012, 3456) }
  --> [0x4D2, 0x162E, 0x2334, 0xD80]
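The worked example can be reproduced in a few lines (u64 stands in for felt252; the struct and function names are illustrative):

```rust
// Flatten a QM31 into 4 felts in transcript order: a.a, a.b, b.a, b.b.
struct Cm31 { a: u64, b: u64 }
struct Qm31 { a: Cm31, b: Cm31 }

fn serialize_qm31(q: &Qm31, out: &mut Vec<u64>) {
    out.extend_from_slice(&[q.a.a, q.a.b, q.b.a, q.b.b]);
}

fn main() {
    let q = Qm31 { a: Cm31 { a: 1234, b: 5678 }, b: Cm31 { a: 9012, b: 3456 } };
    let mut felts = Vec::new();
    serialize_qm31(&q, &mut felts);
    assert_eq!(felts, vec![0x4D2, 0x162E, 0x2334, 0xD80]);
}
```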

Deployed Models

D8: Single MatMul

  Architecture: Input(1x4) --> MatMul(4x2) --> Output(1x2)

  +--------+     +---------+     +--------+
  | 1x4    |---->| 4x2     |---->| 1x2    |
  | Input  |     | Weight  |     | Output |
  +--------+     +---------+     +--------+

  model_id: 0xD8
  Layers: 1 (MatMul only)
  Weight commitments: 1
  Circuit descriptor: [1, 0]

D9: MLP (Multi-Layer Perceptron)

  Architecture: Input(1x4) --> MatMul(4x4) --> ReLU --> MatMul(4x2) --> Output(1x2)

  +--------+     +---------+     +------+     +---------+     +--------+
  | 1x4    |---->| 4x4     |---->| ReLU |---->| 4x2     |---->| 1x2    |
  | Input  |     | Weight1 |     |      |     | Weight2 |     | Output |
  +--------+     +---------+     +------+     +---------+     +--------+

  model_id: 0xD9
  Layers: 3 (MatMul, Activation, MatMul)
  Weight commitments: 2
  Circuit descriptor: [3, 0, 3, 0]  (tags: MatMul, Activation, MatMul)

D10: LayerNorm Chain

  Architecture: Input(1x4) --> MatMul(4x4) --> LayerNorm --> MatMul(4x2) --> Output(1x2)

  +--------+     +---------+     +------------+     +---------+     +--------+
  | 1x4    |---->| 4x4     |---->| LayerNorm  |---->| 4x2     |---->| 1x2    |
  | Input  |     | Weight1 |     | (mean,var) |     | Weight2 |     | Output |
  +--------+     +---------+     +------------+     +---------+     +--------+

  model_id: 0xDA
  Layers: 3 (MatMul, LayerNorm, MatMul)
  Weight commitments: 2
  Circuit descriptor: [3, 0, 4, 0]  (tags: MatMul, LayerNorm, MatMul)

D11: Residual Network

  Architecture: Input(1x4) --> MatMul(4x4) --> fork --> ReLU --> MatMul(4x4) --> Add
                                                 |                               ^
                                                 +---------- skip connection ----+

  +-------+     +--------+     +------+------+
  | 1x4   |---->| 4x4    |---->| FORK        |
  | Input  |     | Weight1 |     |             |
  +-------+     +--------+   +-v----+    +----v----+
                              | ReLU |    | (skip)  |
                              +--+---+    |         |
                                 |        |         |
                              +--v-----+  |         |
                              | 4x4    |  |         |
                              | Weight2 |  |         |
                              +--+-----+  |         |
                                 |        |         |
                              +--v--------v----+
                              |     ADD        |
                              | trunk + skip   |
                              +------+---------+
                                     |
                              +------v---------+
                              |   Output (1x4) |
                              +----------------+

  model_id: 0xDB
  Layers: 4 (Add, MatMul, Activation, MatMul)
  Weight commitments: 2
  Deferred proofs: 1 (skip connection MatMul)
  Circuit descriptor: [4, 1, 0, 3, 0]  (tags: Add, MatMul, Activation, MatMul)

  Status:    VERIFIED ON-CHAIN

Gas Costs

Measured On-Chain Gas (Starknet Sepolia)

  Operation                          l2_gas    Notes
  ---------                          ------    -----
  register_model_gkr                 ~120K     Model registration + circuit hash
  verify_model_gkr (D8, 1 layer)     ~1.5M     Single MatMul sumcheck
  verify_model_gkr (D9, 3 layers)    ~3.5M     MatMul + ReLU + MatMul
  verify_model_gkr (D10, 3 layers)   ~4.2M     MatMul + LayerNorm + MatMul
  verify_model_gkr (D11, 4 layers)   ~5.0M     MatMul + ReLU + MatMul + Add + deferred

Estimated Gas per Layer Type

  Layer Type              Per-Layer Gas        Notes
  ----------              -------------        -----
  Add                     ~500                 1 QM31 add + 2 mix + 1 draw
  MatMul (log_k rounds)   ~3,500 * log_k       Degree-2 sumcheck + MLE openings
  Activation (n vars)     ~3,000 * n           LogUp eq-sumcheck
  LayerNorm (n vars)      ~6,000 * n           Linear eq-sumcheck + LogUp
  RMSNorm (n vars)        ~5,500 * n           Same as LayerNorm, no mean
  Dequantize (n vars)     ~3,000 * n           Same as Activation
  Attention               sum of sub-matmuls   4 + 2H matmul sub-proofs

Scaling Estimates

  MODEL SIZE vs. ESTIMATED GAS
  ============================

  Layers    Type              Est. Gas     Multi-tx?
  ------    ----              --------     ---------
  3         MLP               ~3.5M        No (1 tx)
  4         Residual          ~5.0M        No (1 tx)
  10        Small transformer ~15M         No (1 tx)
  40        Medium model      ~60M         Maybe (chunked)
  160       Qwen3-14B         ~80M         No (aggregated mode 4)

Security Model

Threat Model

  WHAT THE VERIFIER GUARANTEES
  ============================

  1. COMPUTATIONAL INTEGRITY
     If verify_model_gkr returns true, then with probability > 1 - 2^{-128}:
     - The prover executed the neural network correctly
     - The output is the actual result of the claimed input
     - Every intermediate computation (matmul, activation, norm) is correct

  2. IO BINDING
     The IO commitment is recomputed on-chain from raw data:
     - Prover cannot claim different I/O than what was actually used
     - MLE evaluation at random points binds the proof to exact I/O values

  3. WEIGHT BINDING
     Weight commitments are registered before verification:
     - Prover cannot switch model weights between registration and verification
     - Weight MLE openings prove the proof used the registered weights

  4. FIAT-SHAMIR SOUNDNESS
     Poseidon2-M31 channel provides cryptographic randomness:
     - Prover cannot predict challenges (hash preimage resistance)
     - All domain separators prevent cross-protocol attacks
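The IO- and weight-binding guarantees above both rest on evaluating a multilinear extension (MLE) at a random point: the whole value table collapses into one field element, and changing any entry changes that element with overwhelming probability. Below is a toy fold-based MLE evaluator using base-field M31 arithmetic only; the real verifier draws QM31 extension-field challenges from the channel, so this is a sketch of the idea, not the contract's code.

```rust
// Sketch: MLE evaluation, the primitive behind IO/weight binding.
// Toy base-field M31 arithmetic; the real verifier works over QM31.

const M31: u64 = (1 << 31) - 1; // Mersenne-31 modulus

fn add(a: u64, b: u64) -> u64 { (a + b) % M31 }
fn mul(a: u64, b: u64) -> u64 { (a * b) % M31 } // operands < 2^31, no overflow
fn sub(a: u64, b: u64) -> u64 { (a + M31 - b) % M31 }

/// Evaluate the MLE of `table` (length 2^n) at point `r` (length n)
/// by folding one variable at a time:
///   f(r1, x2, ...) = (1 - r1) * f(0, x2, ...) + r1 * f(1, x2, ...)
/// (The lowest-order bit is folded first; variable order is a convention.)
fn mle_eval(table: &[u64], r: &[u64]) -> u64 {
    assert_eq!(table.len(), 1 << r.len());
    let mut vals = table.to_vec();
    for &ri in r {
        vals = vals
            .chunks(2)
            .map(|pair| add(mul(sub(1, ri), pair[0]), mul(ri, pair[1])))
            .collect();
    }
    vals[0]
}

fn main() {
    // 1-variable table [3, 5] extends to f(x) = 3 + 2x, so f(2) = 7.
    assert_eq!(mle_eval(&[3, 5], &[2]), 7);
}
```

On the boolean hypercube the MLE agrees with the table exactly, so a valid opening at a random point binds the prover to the exact I/O values.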

What Is NOT Verified On-Chain (Phase 1)

  CURRENT LIMITATIONS
  ===================

  Weight MLE openings (Phase 2 TODO):
  - Currently weight_opening_proofs is passed as empty array
  - The contract checks weight commitments match registration
  - Full MLE opening verification needs Merkle proof support

  STARK data (FRI blocked):
  - squashed_felt252_dict_entries not in Starknet allowed libfuncs
  - Workaround: hash-bind STARK data on-chain, verify off-chain
  - GKR path avoids this entirely (no STARK/FRI needed)

Deployment Guide

Step 1: Build the Contract

cd libs/elo-cairo-verifier
scarb build

Step 2: Declare on Starknet Sepolia

sncast --account deployer \
  --accounts-file ~/.starknet_accounts/starknet_open_zeppelin_accounts.json \
  declare \
  --url https://api.cartridge.gg/x/starknet/sepolia \
  --contract-name SumcheckVerifierContract

Step 3: Deploy

sncast --account deployer \
  --accounts-file ~/.starknet_accounts/starknet_open_zeppelin_accounts.json \
  deploy \
  --url https://api.cartridge.gg/x/starknet/sepolia \
  --class-hash <CLASS_HASH_FROM_STEP_2> \
  --arguments '<YOUR_WALLET_ADDRESS>'

Step 4: Generate Proof Artifacts

cd libs/stwo-ml
cargo test --test e2e_cairo_verify test_d11_export_residual_onchain_calldata -- --nocapture
# Artifacts written to tests/artifacts/d11_*.txt

Step 5: Register Model

sncast invoke \
  --contract-address <CONTRACT_ADDRESS> \
  --function register_model_gkr \
  --calldata $(cat tests/artifacts/d11_register_gkr_calldata.txt)

Step 6: Verify Model

sncast invoke \
  --contract-address <CONTRACT_ADDRESS> \
  --function verify_model_gkr \
  --calldata $(cat tests/artifacts/d11_verify_gkr_calldata.txt)

Step 7: Confirm

sncast call \
  --contract-address <CONTRACT_ADDRESS> \
  --function get_verification_count \
  --calldata <MODEL_ID>
# Should return: 1

Codebase Map

Rust Prover (libs/stwo-ml/)

  src/
  +-- gkr/
  |   +-- mod.rs              LayeredCircuit, LayerType enum
  |   +-- prover.rs           prove_gkr() -- full GKR proof generation
  |   +-- verifier.rs         verify_gkr() -- reference Rust verifier (1900 lines)
  |   +-- types.rs            GKRProof, LayerProof enum (8 variants)
  |
  +-- components/
  |   +-- matmul.rs           Sumcheck prover for matrix multiplication
  |   +-- activation.rs       LogUp prover for ReLU/GELU/Sigmoid
  |   +-- attention.rs        Multi-head attention composed prover
  |   +-- quantize.rs         INT4/INT8 quantization LogUp
  |   +-- dequantize.rs       Dequantization LogUp
  |   +-- tiled_matmul.rs     Large-matrix tiled sumcheck
  |   +-- rmsnorm.rs          RMSNorm LogUp prover
  |   +-- rope.rs             Rotary position encoding
  |
  +-- compiler/
  |   +-- graph.rs            ComputationGraph, GraphBuilder, GraphWeights
  |   +-- prove.rs            prove_model_pure_gkr(), prove_model_aggregated_onchain()
  |   +-- mod.rs              ONNX model loading
  |
  +-- cairo_serde.rs          Serialization: Rust proof -> felt252 array (2000+ lines)
  +-- starknet.rs             Calldata builders for Starknet transactions (1000+ lines)
  +-- aggregation.rs          Model proof aggregation orchestrator
  +-- gpu_sumcheck.rs         CUDA kernels for GPU-accelerated sumcheck
  +-- gpu.rs                  GPU backend selection and dispatch

Cairo Verifier (libs/elo-cairo-verifier/)

  src/
  +-- field.cairo             M31/CM31/QM31 field tower + qm31_inverse (220 lines)
  +-- channel.cairo           Poseidon2-M31 Fiat-Shamir channel (180 lines)
  +-- types.cairo             Proof type definitions with Serde (300 lines)
  +-- sumcheck.cairo          Sumcheck round verification (150 lines)
  +-- mle.cairo               MLE opening proof verification (200 lines)
  +-- gkr.cairo               GKR batch verification (350 lines)
  +-- logup.cairo             LogUp table-side sum + batch inversion (120 lines)
  +-- layer_verifiers.cairo   Per-layer verifiers: 8 layer types (800 lines)
  +-- model_verifier.cairo    Full GKR model walk + deferred proofs (500 lines)
  +-- ml_air.cairo            ML Air trait + STARK verification (950 lines)
  +-- verifier.cairo          ISumcheckVerifier contract (1300 lines)
  +-- lib.cairo               Module registry

  tests/
  +-- test_field.cairo             27 field arithmetic tests
  +-- test_channel.cairo           10 Poseidon channel tests
  +-- test_verifier.cairo          6 single/batched matmul tests
  +-- test_batch.cairo             25 batch sumcheck tests
  +-- test_gkr.cairo               34 GKR verification tests
  +-- test_unified.cairo           IO binding + proof hash + serde tests
  +-- test_direct.cairo            ml_air + LookupElements + upgrade tests
  +-- test_model_gkr_contract.cairo  10 GKR registration tests
  +-- test_layer_verifiers.cairo   Layer-type verification tests
  +-- test_logup.cairo             LogUp table verification tests
  +-- test_model_verifier.cairo    Full model GKR walk tests
  +-- test_sp3_cross_language.cairo Cross-language Serde roundtrip tests

Comparison with Other Approaches

  +===========================================================================+
  |                    ZKML APPROACH COMPARISON                               |
  +===========================================================================+
  |                                                                           |
  | APPROACH        | ON-CHAIN VERIFY | LAYER TYPES | TRUSTLESS | FRI FREE   |
  | ============    | =============== | =========== | ========= | ========   |
  | ObelyZK GKR     |     YES         |    ALL      |   YES     |   YES      |
  | EZKL            |     Partial     |    Limited  |   YES     |   NO       |
  | zkLLM           |     NO          |    MatMul   |   NO      |   N/A      |
  | Modulus         |     Partial     |    Some     |   YES     |   NO       |
  | opML            |     NO          |    All      |   NO      |   N/A      |
  |                                                                           |
  +===========================================================================+
  |                                                                           |
  | "ON-CHAIN VERIFY":                                                       |
  |   YES  = Full cryptographic verification in smart contract                |
  |   Partial = Proof submitted on-chain but verified off-chain               |
  |   NO   = Attestation only (no crypto verification)                       |
  |                                                                           |
  | "FRI FREE":                                                              |
  |   YES  = Does not require FRI/dictionary libfuncs                        |
  |   NO   = Needs FRI which may be blocked on some chains                   |
  |                                                                           |
  +===========================================================================+

Why GKR Instead of SNARKs/STARKs for On-Chain?

  STARK/SNARK ON-CHAIN VERIFICATION
  ==================================
  Requires: FRI polynomial commitment scheme
  FRI needs: Random oracle + hash map (Felt252Dict)
  Felt252Dict: Generates squashed_felt252_dict_entries libfunc
  Starknet:    Does NOT allow this libfunc
  Result:      BLOCKED

  GKR ON-CHAIN VERIFICATION
  ==========================
  Requires: Sumcheck protocol
  Sumcheck needs: Field arithmetic + random challenges
  Field arith:    M31/QM31 operations (u64, no special libfuncs)
  Randomness:     Poseidon hash (natively supported)
  Result:         WORKS ON STARKNET TODAY
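To illustrate why sumcheck "only needs field arithmetic", here is the core per-round verifier check, sketched over toy base-field M31 arithmetic. The real verifier works over QM31 and draws each challenge from the Poseidon2-M31 channel; `sumcheck_round` is an illustrative helper, not the contract's function.

```rust
// Sketch: one degree-2 sumcheck verifier round.
// The prover sends g_i(x) = c0 + c1*x + c2*x^2; the verifier checks
//   g_i(0) + g_i(1) == running claim
// and updates the claim to g_i(r_i) for a channel challenge r_i.
// Toy M31 arithmetic; the real verifier uses QM31 + Poseidon2-M31.

const M31: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % M31 }
fn mul(a: u64, b: u64) -> u64 { (a * b) % M31 }

/// Horner evaluation of g(x) = c[0] + c[1]*x + c[2]*x^2.
fn eval(c: &[u64; 3], x: u64) -> u64 {
    add(c[0], mul(x, add(c[1], mul(x, c[2]))))
}

/// One verifier round: returns the updated claim, or None if the
/// prover's polynomial is inconsistent with the running claim.
fn sumcheck_round(claim: u64, c: &[u64; 3], r: u64) -> Option<u64> {
    if add(eval(c, 0), eval(c, 1)) != claim {
        return None; // g(0) + g(1) must equal the claim
    }
    Some(eval(c, r))
}

fn main() {
    // g(x) = 1 + 2x + 3x^2: g(0) + g(1) = 1 + 6 = 7, so claim 7 passes
    // and the claim becomes g(2) = 17; claim 8 is rejected.
    assert_eq!(sumcheck_round(7, &[1, 2, 3], 2), Some(17));
    assert_eq!(sumcheck_round(8, &[1, 2, 3], 2), None);
}
```

Repeating this check for each round, then verifying the final claim against an MLE opening, is all the field work the on-chain verifier needs.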


Built by the ObelyZK team. Powered by STWO from StarkWare. Licensed under Apache 2.0.