BASIS
Documentation

Architecture

Basis is built from a Vercel-hosted API gateway, a separately-hosted long-lived orchestrator, contributor worker runtimes, three ledgers, and a settlement keeper on Base. This page maps each component and is explicit about three boundaries that matter: what runs on Vercel versus elsewhere, what is durable versus temporary, and what is in the inference hot path versus the delayed settlement path.

System diagram

The diagram traces a request from client to gateway to an inference backend (either a hosted upstream or a self-hosted orchestrator and its workers), with the keeper and Base settlement path drawn below the hot-path line.

                          VERCEL  (serverless web + API)
                ┌───────────────────────────────────────────────┐
   USER / APP   │                                               │
   AGENT  ───POST /api/v1/chat/completions──▶  API GATEWAY      │
       ▲        │   - auth, validate, resolve model             │
       │        │   - price job, reserve $BASIS credits         │
       │        │   - proxy the hosted upstream backend          │
       │        └──────────────┬─────────────────────┬──────────┘
       │  SSE token stream      │ (UPSTREAM_URL)      │ (ORCHESTRATOR_URL:
       │                        ▼                     │  self-hosted, NOT
       │        ┌───────────────────────────┐         │  dialed by Vercel)
       │        │ HOSTED OAI-COMPAT UPSTREAM │         ▼
       │        │ (canonical serverless      │  ┌────────────────────────────────┐
       │        │  backend; metered, no      │  │  ORCHESTRATOR                   │
       │        │  contributor reward)        │  │  long-lived Socket.io process   │
       │        └───────────────────────────┘  │  - worker registry              │
       │                                        │  - weighted-random job matching  │
       │                                        └───────────────┬─────────────────┘
       │   EITHER backend serves a request                      │
       └─────────────────────────────────────┐  ┌──────────────┼──────────────────┐
                                              │  ▼              ▼                  ▼
                                         WORKER (GPU)     WORKER (GPU)       WORKER (GPU)
                                         Ollama native    Ollama native      WebGPU (planned)

   ── inference hot path above ─────────────────────────────────────────
   ── delayed settlement below (no on-chain writes in the hot path) ────

      RECEIPT LEDGER ── CREDIT LEDGER ── REWARD LEDGER
                    (memory by default; durable Postgres adapter wired)
                                  │
                                  ▼
                       SETTLEMENT KEEPER ──── batch ───▶  BASE
                       (idempotent; runs                 RewardDistributor
                        as a dry-run until                (address pending
                        the distributor is set)            until configured)

   Bankr is NOT shown above: it is only the LAUNCH rail for the $BASIS token
   on Base (read-only fee observation + operator-signed claims after launch),
   never an inference backend, credit ledger, payment router, or keeper.

User / app client

Any OpenAI-compatible client: an application, an agent, or a plain HTTP caller. It points its base URL at Basis and sends standard chat-completions requests. No Basis-specific SDK is required, and streaming uses ordinary Server-Sent Events.
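As a rough illustration of what "no Basis-specific SDK" means in practice, the sketch below builds a plain streaming chat-completions request and reads the text deltas out of a buffered SSE payload. The host `basis.example`, the model name, and the use of an `Authorization: Bearer` header for the API key are assumptions, not taken from this page; the chunk shape follows the usual OpenAI streaming format.

```typescript
// Hypothetical host: replace with the real Basis deployment URL.
const BASIS_BASE_URL = "https://basis.example/api/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build a standard OpenAI-style streaming request (nothing Basis-specific).
// Assumption: the API key travels as a Bearer token.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${BASIS_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}

// Extract text deltas from SSE "data: {...}" lines; the stream ends with "data: [DONE]".
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice(5).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload);
    const text = chunk.choices?.[0]?.delta?.content;
    if (typeof text === "string") deltas.push(text);
  }
  return deltas;
}
```

The returned `url` and `init` can be passed straight to `fetch`; an existing OpenAI client library pointed at the same base URL would send an equivalent request.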

API gateway (Vercel)

The web app and API routes run on Vercel as a Next.js App Router deployment. /api/v1/chat/completions authenticates, validates the body, resolves the model, prices the job at the active pricing epoch, and reserves the caller's $BASIS credits. For serving, this serverless route dials exactly one backend: if BASIS_INFERENCE_UPSTREAM_URL is set it proxies to that OpenAI-compatible upstream and streams tokens back; otherwise it returns a structured runtime_pending (503).

The gateway is serverless on Vercel. Vercel cannot host a persistent WebSocket server, so the orchestrator is a separate, long-lived process — not a Vercel route. The serverless route therefore never dials the orchestrator: BASIS_ORCHESTRATOR_URL is the self-hosted worker mesh's socket target, reached by a trusted internal gateway, not by this fetch-based route. If both vars are set, the upstream proxy still wins.
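The backend choice described above can be sketched as a small pure function. The environment variable names come from this page; the `BackendDecision` type and the function name are hypothetical, and the real route may structure this differently.

```typescript
type BackendDecision =
  | { kind: "upstream"; url: string }
  | { kind: "runtime_pending"; httpStatus: 503 };

// Decide what the serverless route serves from.
// BASIS_ORCHESTRATOR_URL is deliberately never consulted here: the
// fetch-based route does not dial the orchestrator.
function selectBackend(env: Record<string, string | undefined>): BackendDecision {
  const upstream = env["BASIS_INFERENCE_UPSTREAM_URL"]?.trim();
  if (upstream) return { kind: "upstream", url: upstream };
  return { kind: "runtime_pending", httpStatus: 503 };
}
```

Because the orchestrator variable is ignored entirely, setting both variables behaves exactly like setting only the upstream one.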

Authentication

On a protected route the gateway resolves a caller to a BasisIdentity from one of two credentials: a Privy access token — verified against Privy's public JWKS with jose (issuer privy.io, app-id audience, DID subject), no app secret in the verify path — or an API key (sk-basis-...), matched by its peppered hash. That identity carries the DID and linked wallets and authorizes the user routes and (when BASIS_AUTH_REQUIRED is true) inference. Public pages and read endpoints need no auth. Payment and worker routes additionally require a linked wallet. Internal /api/internal/* routes are a separate trust domain — gated by INTERNAL_API_SECRET, not by a Privy identity.
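A minimal sketch of the first step, telling the two credential types apart, might look like the following. It assumes both credentials arrive in an `Authorization: Bearer` header, which this page does not state; token verification and the peppered-hash lookup are out of scope here, and all names are hypothetical.

```typescript
type Credential =
  | { kind: "api_key"; key: string }
  | { kind: "privy_token"; token: string }
  | { kind: "none" };

// Classify the presented credential. Assumption: both kinds use a Bearer header.
function classifyCredential(authorization: string | undefined): Credential {
  if (!authorization || !authorization.startsWith("Bearer ")) return { kind: "none" };
  const value = authorization.slice("Bearer ".length).trim();
  if (value === "") return { kind: "none" };
  // API keys carry the sk-basis- prefix; anything else is treated as a Privy access token.
  if (value.startsWith("sk-basis-")) return { kind: "api_key", key: value };
  return { kind: "privy_token", token: value };
}

// Internal routes are a separate trust domain, gated by INTERNAL_API_SECRET
// rather than by a resolved identity.
function isInternalRoute(path: string): boolean {
  return path.startsWith("/api/internal/");
}
```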

See authentication and security.

Orchestrator (hosted deploy pending)

The orchestrator is a long-lived Socket.io process that holds the live worker registry and matches each job to a worker. Because it maintains persistent WebSocket connections to workers it cannot run on Vercel; it runs locally or self-hosted today, and a hosted deployment is pending. It is the boundary between the stateless gateway and the stateful pool of GPU workers.

Worker registry

The registry tracks each contributor worker: its worker_id, EVM reward wallet, runtime, the models it serves, its heartbeat, and its measured throughput. A worker is eligible for a job only while it is idle, serving the requested model, and heartbeating within the active window. Matching is weighted-random by measured tokens-per-second across that eligible set.
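The eligibility filter and weighted-random pick can be sketched as below. The field names, the 30-second heartbeat window, and the injectable `rng` parameter are assumptions made for illustration; only the eligibility conditions and the throughput weighting come from the description above.

```typescript
interface WorkerRecord {
  workerId: string;
  rewardWallet: string;
  models: string[];
  idle: boolean;
  lastHeartbeatMs: number;
  tokensPerSecond: number; // measured throughput
}

const HEARTBEAT_WINDOW_MS = 30_000; // assumed window length

// A worker is eligible only while idle, serving the model, and heartbeating in-window.
function eligibleWorkers(all: WorkerRecord[], model: string, nowMs: number): WorkerRecord[] {
  return all.filter(
    (w) => w.idle && w.models.includes(model) && nowMs - w.lastHeartbeatMs <= HEARTBEAT_WINDOW_MS,
  );
}

// Weighted-random pick: selection probability is proportional to tokens-per-second.
function pickWorker(
  eligible: WorkerRecord[],
  rng: () => number = Math.random,
): WorkerRecord | undefined {
  const total = eligible.reduce((sum, w) => sum + Math.max(w.tokensPerSecond, 0), 0);
  if (total <= 0) return eligible[0];
  let target = rng() * total;
  for (const w of eligible) {
    target -= Math.max(w.tokensPerSecond, 0);
    if (target < 0) return w;
  }
  return eligible[eligible.length - 1];
}
```

With two eligible workers measured at 10 and 30 tokens per second, the faster one is chosen about three times out of four, while the slower one still receives work.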

Worker runtimes

Ollama (native) live

A native worker process backs inference with a local Ollama runtime, connects to the orchestrator, heartbeats, and serves jobs for the models it advertises.

WebGPU (browser) soon

A browser-based worker running models on WebGPU is planned — letting a tab contribute compute without a native install. Not live.

Pricing engine

The pricing engine (lib/pricing/*) turns a job into a deterministic, epoch-locked price. A versioned pricing epoch fixes the per-model credit price table and the BASIS-per-credit rate; usage is metered in credits (Basis's usage-accounting unit) and converted to $BASIS at the active epoch's rate. All of it is bigint, basis-point, floor-division math — no floating point in the money path.
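To make the "bigint, basis-point, floor-division" rule concrete, here is a minimal sketch. The `PricingEpoch` shape, the per-thousand-token pricing unit, and every function name are assumptions for illustration; the actual table layout in lib/pricing/* is not shown on this page.

```typescript
const BPS_DENOMINATOR = 10_000n;

interface PricingEpoch {
  epochId: number;
  // Hypothetical layout: credits charged per 1,000 tokens, keyed by model.
  creditsPerThousandTokens: Record<string, bigint>;
  // BASIS-per-credit rate, in $BASIS base units.
  basisBaseUnitsPerCredit: bigint;
}

// Meter usage in credits. bigint division truncates, which is floor for non-negative values.
function creditsForJob(epoch: PricingEpoch, model: string, tokens: bigint): bigint {
  const price = epoch.creditsPerThousandTokens[model];
  if (price === undefined) throw new Error(`no price for model ${model} in epoch ${epoch.epochId}`);
  return (tokens * price) / 1_000n;
}

// Convert credits to $BASIS base units at the epoch's locked rate.
function basisForCredits(epoch: PricingEpoch, credits: bigint): bigint {
  return credits * epoch.basisBaseUnitsPerCredit;
}

// Apply a basis-point factor (10,000 bps = 100%) with floor division.
function applyBps(amount: bigint, bps: bigint): bigint {
  return (amount * bps) / BPS_DENOMINATOR;
}
```

Multiplying before dividing keeps the only rounding step at the end, so the same inputs always yield the same price regardless of platform.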

Credits are deterministic inside a quote, reservation, pricing epoch, and receipt, but the network does not promise that one credit maps to the same amount of $BASIS forever. When a job is reserved the rate is locked into a hashed snapshot the receipt preserves, and the worker reward is drawn from that same snapshot. New epochs apply prospectively only: there is no retroactive repricing, and epoch-to-epoch moves are capped by maxEpochChangeBps. Pre-launch the engine runs an honest placeholder epoch; the live rate is derived from a real price reading only when a token, liquidity, and a price source exist. The active epoch, the rate, and the epoch table are served by the GET routes /api/pricing, /api/pricing/epochs, and /api/pricing/quote; an operator recompute is POST /api/internal/pricing/refresh (secret-protected).
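One way to read the maxEpochChangeBps cap is as a clamp on the new epoch's rate relative to the previous one. The sketch below assumes the engine clamps an out-of-range proposal to the nearest bound; it might instead reject it, and the function name is hypothetical.

```typescript
// Clamp a proposed BASIS-per-credit rate so it moves at most
// maxEpochChangeBps (basis points) away from the previous epoch's rate.
// Assumption: out-of-range proposals are clamped rather than rejected.
function clampEpochRate(
  previousRate: bigint,
  proposedRate: bigint,
  maxEpochChangeBps: bigint,
): bigint {
  const maxDelta = (previousRate * maxEpochChangeBps) / 10_000n; // floor division
  const upper = previousRate + maxDelta;
  const lower = previousRate - maxDelta;
  if (proposedRate > upper) return upper;
  if (proposedRate < lower) return lower;
  return proposedRate;
}
```

With a previous rate of 1000 and a 500 bps cap, a proposal of 1200 lands at 1050 and a proposal of 900 lands at 950. Jobs already reserved keep the rate in their snapshot either way.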

See credits for the epoch model and the quote TTL.

Credit ledger

The credit ledger records what users spend in $BASIS base units: reservations, debits, releases, and refunds. The per-job charge is computed deterministically with integer (bigint) math, and a failed job is not charged. Balances are derived by summing signed ledger entries.
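A minimal sketch of "balances are derived by summing signed ledger entries" follows. The sign convention (a reservation is negative, its release positive, the final debit negative) and the `deposit` entry kind are assumptions for illustration, not the documented schema.

```typescript
type EntryKind = "deposit" | "reservation" | "debit" | "release" | "refund";

interface LedgerEntry {
  account: string;
  jobId?: string;
  kind: EntryKind;
  amount: bigint; // signed, in $BASIS base units
}

// A balance is never stored; it is the sum of the account's signed entries.
function balanceOf(entries: LedgerEntry[], account: string): bigint {
  return entries
    .filter((e) => e.account === account)
    .reduce((sum, e) => sum + e.amount, 0n);
}

// Assumed lifecycle: a successful job releases its hold and debits the actual charge;
// a failed job only releases the hold, so nothing is charged.
function settleJob(account: string, jobId: string, reserved: bigint, charge: bigint | null): LedgerEntry[] {
  const out: LedgerEntry[] = [{ account, jobId, kind: "release", amount: reserved }];
  if (charge !== null) out.push({ account, jobId, kind: "debit", amount: -charge });
  return out;
}
```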

See credits for the accounting model.

Receipt ledger

The receipt ledger stores one canonical-JSON, SHA-256-hashed receipt per settled job. It is the idempotency gate: a duplicate jobId or receiptHash is rejected before any credit or reward mutation, so a job is recorded — and settled — at most once. Anyone can re-derive a receipt's hash from its body.
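The hash-then-gate behaviour can be sketched as follows. This page does not specify the canonicalization scheme, so recursively sorted object keys with no whitespace is an assumption, as are the class and function names; the hash uses Node's built-in `node:crypto`.

```typescript
import { createHash } from "node:crypto";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumed canonical form: object keys sorted recursively, no insignificant whitespace.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  const keys = Object.keys(value).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k] as Json)}`).join(",")}}`;
}

// Anyone holding the receipt body can re-derive this hash.
function receiptHash(body: Json): string {
  return createHash("sha256").update(canonicalize(body)).digest("hex");
}

class ReceiptLedger {
  private byJobId = new Map<string, string>();
  private hashes = new Set<string>();

  // Returns false for a duplicate jobId or receiptHash, so the caller can
  // stop before any credit or reward mutation.
  record(jobId: string, body: Json): boolean {
    const hash = receiptHash(body);
    if (this.byJobId.has(jobId) || this.hashes.has(hash)) return false;
    this.byJobId.set(jobId, hash);
    this.hashes.add(hash);
    return true;
  }
}
```

Because key order is normalized before hashing, two serializations of the same receipt body produce the same hash.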

See inference receipts.

Reward ledger

The reward ledger accrues each worker's earned reward against its EVM address, computed from server-counted tokens and the model multiplier. It is an off-chain accrual record; nothing in it touches the chain until the settlement keeper batches it. Gateway-served jobs have no worker wallet and accrue no reward.
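As an illustration only, accrual might look like the sketch below. The exact reward formula is not given on this page: the per-token rate, the multiplier expressed in basis points, and all field names are assumptions. What the sketch does take from the text is that rewards key on the worker's EVM address and that a job with no worker wallet accrues nothing.

```typescript
interface SettledJob {
  jobId: string;
  workerWallet?: string;           // absent for gateway-served jobs
  serverCountedTokens: bigint;
  rewardPerTokenBaseUnits: bigint; // hypothetical rate from the locked snapshot
  modelMultiplierBps: bigint;      // hypothetical: 10,000 bps = 1.0x
}

// Assumed formula: tokens x rate x multiplier, floor-divided by 10,000.
function rewardFor(job: SettledJob): bigint {
  if (!job.workerWallet) return 0n;
  return (job.serverCountedTokens * job.rewardPerTokenBaseUnits * job.modelMultiplierBps) / 10_000n;
}

// Off-chain accrual only: this mutates an in-memory map and never touches the chain.
function accrueReward(accruals: Map<string, bigint>, job: SettledJob): void {
  const reward = rewardFor(job);
  if (!job.workerWallet || reward === 0n) return;
  const wallet = job.workerWallet.toLowerCase();
  accruals.set(wallet, (accruals.get(wallet) ?? 0n) + reward);
}
```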

Payment router (pending)

The payment router lets a payer fund credits with a token they already hold. It produces a PaymentQuote that routes ETH, WETH, or USDC on Base into $BASIS, with a slippage floor and a short quote TTL. It is a deliberate opt-in — BASIS_PAYMENT_ROUTER_ENABLED — and is pending until the $BASIS token, a quote provider, and the enable flag are all set. It never signs or holds funds: it returns calldata for the user/agent wallet to sign.

The router lives on the payment/deposit path, not the inference hot path. Quoting, signing, and depositing happen out-of-band — funding a balance is a distinct flow from running a completion, and a user waiting for tokens never waits on a swap.
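The slippage floor and quote TTL can be sketched with the same integer math used elsewhere. The `PaymentQuote` fields shown here, the 60-second default TTL, and the function names are assumptions; the real quote also carries calldata for the wallet to sign, which is omitted.

```typescript
type SellToken = "ETH" | "WETH" | "USDC";

interface PaymentQuote {
  quoteId: string;
  sellToken: SellToken;
  sellAmount: bigint;
  expectedBasisOut: bigint;
  minBasisOut: bigint; // slippage floor
  expiresAtMs: number;
}

// Build a quote with a slippage floor in basis points and a short TTL.
// Assumption: a 60-second default TTL; the real value is not stated here.
function buildQuote(
  quoteId: string,
  sellToken: SellToken,
  sellAmount: bigint,
  expectedBasisOut: bigint,
  slippageBps: bigint,
  nowMs: number,
  ttlMs: number = 60_000,
): PaymentQuote {
  const minBasisOut = (expectedBasisOut * (10_000n - slippageBps)) / 10_000n;
  return { quoteId, sellToken, sellAmount, expectedBasisOut, minBasisOut, expiresAtMs: nowMs + ttlMs };
}

function isQuoteExpired(quote: PaymentQuote, nowMs: number): boolean {
  return nowMs >= quote.expiresAtMs;
}
```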

Quote providers

The router obtains the sell-token → $BASIS price and calldata from a swap quote provider. The provider is configured by env; the router surfaces only booleans for which are wired — provider API keys never leave the server.

  • configured

    0x Swap API

    The default aggregator path. Set ZEROX_API_KEY (server-only) and optionally ZEROX_SWAP_API_URL.

  • planned

    Uniswap (Universal Router + Permit2)

    Direct routing via the Universal Router and Permit2. Planned — wired by UNISWAP_UNIVERSAL_ROUTER_ADDRESS / UNISWAP_PERMIT2_ADDRESS.

  • planned

    Aerodrome

    Base-native router. Planned — wired by AERODROME_ROUTER_ADDRESS.

0x is the configured provider path; Uniswap and Aerodrome are planned. No quote is fabricated — with no provider configured the quote route returns a structured quote_provider_pending (503).
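A minimal sketch of "surfaces only booleans" and the pending response, using the environment variable names listed above. The function names and return shapes are hypothetical.

```typescript
interface ProviderWiring {
  zeroX: boolean;
  uniswap: boolean;
  aerodrome: boolean;
}

// Report only whether each provider is wired; key values never leave this function.
function providerWiring(env: Record<string, string | undefined>): ProviderWiring {
  const isSet = (key: string): boolean => Boolean(env[key]?.trim());
  return {
    zeroX: isSet("ZEROX_API_KEY"),
    uniswap: isSet("UNISWAP_UNIVERSAL_ROUTER_ADDRESS") && isSet("UNISWAP_PERMIT2_ADDRESS"),
    aerodrome: isSet("AERODROME_ROUTER_ADDRESS"),
  };
}

type QuoteGate =
  | { ok: true; provider: "0x" }
  | { ok: false; httpStatus: 503; error: "quote_provider_pending" };

// Only 0x is a configured path today; without it, no quote is fabricated.
function quoteGate(env: Record<string, string | undefined>): QuoteGate {
  if (providerWiring(env).zeroX) return { ok: true, provider: "0x" };
  return { ok: false, httpStatus: 503, error: "quote_provider_pending" };
}
```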

Payment confirmation (verification pending)

After the user/agent wallet signs and submits the swap (and deposit), confirmation binds the resulting deposit to the quote and credits the balance. The quote is single-use: a replayed confirm is rejected (409 duplicate_payment), so a deposit cannot be double-counted. A payment is credited as final only after on-chain proof on Base, never on a user-submitted tx hash alone. In strict mode (BASIS_PAYMENT_VERIFICATION_MODE, the default when the router is enabled) verification checks the receipt the chain returns: tx success, chain id 8453, the correct $BASIS token, the configured receiver (BASIS_PAYMENT_RECEIVER_ADDRESS), $BASIS out ≥ the quote's expected or min-out amount, at least BASIS_PAYMENT_CONFIRMATIONS (default 1) confirmations, and that the quote is unexpired and unreplayed. In provisional mode a credit is applied before proof (dev/test only, clearly labelled); disabled mode applies when the router is off. The async verifier that performs the on-chain lookup is the remaining launch task, so today a credit is provisional until that wiring lands.

Basis prepares the transaction; the user signs and submits it. A failed, wrong-chain, wrong-token, wrong-receiver, under-filled, expired, or replayed payment never credits. Confirm is idempotent and quotes expire, so neither a replay nor a stale quote can credit twice. The network takes no custody at any step. See security for the full verification model.
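The strict-mode checks can be sketched as a single function that returns the first reason a payment must not credit. The record shapes, every rejection code except duplicate_payment, and the order of the checks are assumptions; the async lookup that would populate `ChainReceipt` is the pending verifier and is not shown.

```typescript
const BASE_CHAIN_ID = 8453;

interface ChainReceipt {
  success: boolean;
  chainId: number;
  tokenAddress: string;
  receiverAddress: string;
  basisOut: bigint;
  confirmations: number;
}

interface QuoteRecord {
  minBasisOut: bigint;
  expiresAtMs: number;
  consumed: boolean; // true once an earlier confirm used this quote
}

interface VerifierConfig {
  basisTokenAddress: string; // the configured $BASIS token
  receiverAddress: string;   // BASIS_PAYMENT_RECEIVER_ADDRESS
  minConfirmations: number;  // BASIS_PAYMENT_CONFIRMATIONS, default 1
}

type Rejection =
  | "duplicate_payment" | "quote_expired" | "tx_failed" | "wrong_chain"
  | "wrong_token" | "wrong_receiver" | "under_filled" | "insufficient_confirmations";

const sameAddress = (a: string, b: string): boolean => a.toLowerCase() === b.toLowerCase();

// Returns null when every check passes, otherwise the first failing reason.
function verifyStrict(
  receipt: ChainReceipt,
  quote: QuoteRecord,
  config: VerifierConfig,
  nowMs: number,
): Rejection | null {
  if (quote.consumed) return "duplicate_payment";
  if (nowMs >= quote.expiresAtMs) return "quote_expired";
  if (!receipt.success) return "tx_failed";
  if (receipt.chainId !== BASE_CHAIN_ID) return "wrong_chain";
  if (!sameAddress(receipt.tokenAddress, config.basisTokenAddress)) return "wrong_token";
  if (!sameAddress(receipt.receiverAddress, config.receiverAddress)) return "wrong_receiver";
  if (receipt.basisOut < quote.minBasisOut) return "under_filled";
  if (receipt.confirmations < config.minConfirmations) return "insufficient_confirmations";
  return null;
}
```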

Credit vault (pending)

The credit vault is the on-chain destination for $BASIS that funds credits. Deposits are only enabled once its address is configured; until then credit balances may be internal and pending rather than backed by an on-chain deposit, and the accounting runs honestly without a deployed vault. No vault address is hardcoded in source.

Settlement keeper

The keeper is a batch process, deliberately outside the inference hot path. It periodically reads accrued rewards, forms a batch, computes a batch hash, and settles on Base idempotently — a batch hash settles exactly once, and a failed batch stays recoverable rather than double-paying. Until the reward distributor address is configured it can run as a dry-run that forms batches and computes hashes without writing on-chain.

No on-chain writes happen in the inference hot path. A user waiting for tokens never waits on the chain. Settlement runs separately as a keeper/batch process; failures are recoverable and nothing settles twice.
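The idempotency and dry-run behaviour can be sketched as below. The batch hash is taken as an input because its derivation is not described on this page; `KeeperState`, the outcome names, and the `submit` callback standing in for the on-chain write are all hypothetical.

```typescript
interface KeeperState {
  settled: Set<string>; // batch hashes that have already settled
}

type SettleOutcome = "dry_run" | "settled" | "already_settled" | "failed_recoverable";

// Settle one batch at most once. `submit` stands in for the on-chain write
// and returns true on success.
function settleBatch(
  state: KeeperState,
  batchHash: string,
  distributorAddress: string | undefined,
  submit: () => boolean,
): SettleOutcome {
  if (state.settled.has(batchHash)) return "already_settled";
  // Without a configured distributor the keeper forms and hashes batches but writes nothing.
  if (!distributorAddress) return "dry_run";
  let ok = false;
  try {
    ok = submit();
  } catch {
    ok = false;
  }
  // A failed batch is not marked settled, so a later run can retry it safely.
  if (!ok) return "failed_recoverable";
  state.settled.add(batchHash);
  return "settled";
}
```

Marking the hash as settled only after a successful write is what keeps a failure recoverable without risking a second payout.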

See settlement.

Persistence & durability (durable DB pending)

All three ledgers — credit, receipt, and reward — plus the worker registry, quotes, and the /data state share one store, selected by BASIS_STORE_DRIVER. The default, memory, is process-local and non-durable: it lives in process memory and resets on a cold start, so receipts, credits, quotes, and reward accruals do not survive a restart. Payments recorded under memory are not durably recorded — it must not be treated as a system of record.

The durable Postgres adapter is already wired — the store interface is async and the adapter is complete, so enabling durability is an operator migration, not a code change: provisioning a pooled Postgres endpoint flips persistence from the process-local memory store to durable. Production runs on the memory store today. The operator guide covers the migration.

The idempotency and accounting guarantees hold within a process lifetime; durability across restarts requires the Postgres driver. A cold start clears in-memory state, so production must run on a durable store before real balances or rewards depend on it.
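Driver selection might be sketched as follows. BASIS_STORE_DRIVER and the memory default come from this page; the literal value `postgres` for the durable driver and the function name are assumptions.

```typescript
type StoreDriver = "memory" | "postgres";

interface StoreSelection {
  driver: StoreDriver;
  durable: boolean;
}

// Resolve the single shared store from configuration.
// Assumption: the durable driver is selected with the value "postgres".
function resolveStoreDriver(env: Record<string, string | undefined>): StoreSelection {
  const raw = (env["BASIS_STORE_DRIVER"] ?? "memory").trim().toLowerCase();
  if (raw === "" || raw === "memory") return { driver: "memory", durable: false };
  if (raw === "postgres") return { driver: "postgres", durable: true };
  // Fail loudly rather than silently falling back to a non-durable store.
  throw new Error(`unknown BASIS_STORE_DRIVER: ${raw}`);
}
```

Throwing on an unrecognized value is a design choice in this sketch: a typo in the driver name should not quietly leave production on process-local memory.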

Base contracts (interfaces only)

Three Base contracts are referenced by the network: the $BASIS token, the credit vault, and the reward distributor. They are described by their interfaces here; none is deployed in this configuration. Every address renders pending until its environment variable is set, and no address is ever hardcoded in source.

  • pending

    $BASIS token

    Planned ERC-20 denominating inference credits and worker rewards. Address pending until BASIS_TOKEN_ADDRESS is set.

  • pending

    BasisCreditVault

    Holds user credit deposits. Deposits are only enabled once the vault address is configured.

  • pending

    BasisRewardDistributor

    Receives settlement batches from the keeper. Reward settlement is only enabled once its address is configured.

Launch status system

A single status module derives every "live vs pending" value from environment configuration at evaluation time. It is the one source of truth the gateway, the docs, and the launch surfaces all read, so a deployment with no addresses set reports honestly and never invents one. The inference runtime currently reports pending, and persistence reports process-local — the store is non-durable and resets on a cold start until a durable database is configured.
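A minimal sketch of such a module, assuming the variable names used elsewhere on this page. The `LaunchStatus` shape, the literal values `true` and `postgres`, and the function name are hypothetical; the real module covers more surfaces than shown here.

```typescript
type LaunchState = "live" | "pending";

interface LaunchStatus {
  token: LaunchState;
  inferenceRuntime: LaunchState;
  paymentRouter: LaunchState;
  persistence: "durable" | "process-local";
}

// Derive every value from configuration at call time; nothing is cached or invented.
function deriveLaunchStatus(env: Record<string, string | undefined>): LaunchStatus {
  const has = (key: string): boolean => Boolean(env[key]?.trim());
  const token: LaunchState = has("BASIS_TOKEN_ADDRESS") ? "live" : "pending";
  // The router needs the token, a quote provider, and the enable flag all set.
  const routerReady =
    env["BASIS_PAYMENT_ROUTER_ENABLED"] === "true" && token === "live" && has("ZEROX_API_KEY");
  return {
    token,
    inferenceRuntime: has("BASIS_INFERENCE_UPSTREAM_URL") ? "live" : "pending",
    paymentRouter: routerReady ? "live" : "pending",
    persistence: env["BASIS_STORE_DRIVER"] === "postgres" ? "durable" : "process-local",
  };
}
```

An empty environment yields pending for every component and process-local persistence, which matches what the page reports today.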

Live vs pending

Derived from real configuration, not a roadmap.
