User guide
Go from zero to a streamed completion against the Basis inference API. Basis is OpenAI-compatible, so each step is the same call you already know — just pointed at the Basis base URL.
Choose a model
Pick a model id for the model field. Start with basis-default — it is the baseline. Use basis-small for cheaper, lighter calls or basis-large for a larger context window.
| Model | Context | Multiplier | Status |
|---|---|---|---|
| basis-small | 16,384 | 0.50x | soon |
| basis-default | 32,768 | 1.00x | soon |
| basis-large | 131,072 | 3.00x | soon |
The multiplier scales deterministic per-token accounting (1.00x baseline). Full model details are in the API reference.
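As a rough illustration of the trade-off in the table, the sketch below picks the lowest-multiplier model whose context window fits a request. The context sizes and multipliers are copied from the table; `pick_model` and the `multiplier_x100` field are hypothetical names, and "cheapest model that fits" is only one possible selection policy.

```python
# Context windows and multipliers copied from the model table.
# Multipliers are stored as integer hundredths (0.50x -> 50) to avoid floats.
MODELS = {
    "basis-small": {"context": 16_384, "multiplier_x100": 50},
    "basis-default": {"context": 32_768, "multiplier_x100": 100},
    "basis-large": {"context": 131_072, "multiplier_x100": 300},
}


def pick_model(required_tokens: int) -> str:
    """Return the lowest-multiplier model id whose context fits (hypothetical helper)."""
    candidates = [
        (spec["multiplier_x100"], model_id)
        for model_id, spec in MODELS.items()
        if spec["context"] >= required_tokens
    ]
    if not candidates:
        raise ValueError(f"no model fits {required_tokens} tokens")
    # Tuples sort by multiplier first, so min() yields the cheapest fit.
    return min(candidates)[1]
```

For example, `pick_model(20_000)` skips basis-small because 20,000 tokens exceed its 16,384-token window.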
Sign in & dashboard
To manage keys, credits, receipts, and a worker, sign in on basis.watch with Privy — connect a wallet or use an email. Reading the docs, the data, and the public pages needs no account; you only sign in to manage your own things.
- 01 Sign in with a wallet or email (Privy).
- 02 Open /dashboard: your DID, linked wallet, credits, keys, receipts, and worker.
- 03 Link a wallet so you can fund credits and (optionally) register a worker against it.
- 04 Check your credit balance under billing.
Full sign-in, dashboard, and linked-wallet details are on the authentication page.
Get an API key
Once signed in, mint an API key in the dashboard. The raw key is shown once — copy it immediately, because Basis stores only a peppered hash and cannot show it again. Authenticate with it as a bearer token prefixed sk-basis- in the Authorization header, just like OpenAI.
Authorization: Bearer sk-basis-...
API key issuance is pending. Pre-launch, the API is open, so a key is accepted but not required; you can send a placeholder sk-basis-... today and follow the steps below.
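One way to keep the key out of source code is to read it from the environment and fall back to the placeholder while keys are not yet required. This is a minimal sketch: the `BASIS_API_KEY` variable name and the `auth_headers` helper are assumptions for illustration, not part of the Basis API.

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build request headers from a key stored in the environment.

    BASIS_API_KEY is a hypothetical variable name. The fallback is the
    placeholder key from this guide, usable only while keys are optional.
    """
    key = env.get("BASIS_API_KEY", "sk-basis-...")
    if not key.startswith("sk-basis-"):
        raise ValueError("expected a key with the sk-basis- prefix")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```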
Check launch & runtime status
Before sending traffic, confirm what is configured. The network exposes its honest state at GET /api/launch-status — a fresh deployment with nothing configured reads as pending, never a promise.
curl -s https://basis.watch/api/launch-status
The same response reports the inference runtime mode under inferenceRuntime, which mirrors the server's inferenceMode(): "proxy" when an inference backend is configured (models report available: true), or "pending" when none is. Right now the runtime is "pending" and models report available: false.
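The same check can be scripted. The sketch below fetches the status and extracts the runtime mode. The exact shape of `inferenceRuntime` is not spelled out here, so the code assumes it is either the mode string itself or an object with a `mode` field, and it treats anything else as "pending". Both helper names are hypothetical.

```python
import json
import urllib.request

LAUNCH_STATUS_URL = "https://basis.watch/api/launch-status"


def runtime_mode(status: dict) -> str:
    """Extract the runtime mode from a parsed launch-status body.

    Assumption: inferenceRuntime is the mode string, or an object
    carrying it under "mode". Missing or unexpected values read as "pending".
    """
    runtime = status.get("inferenceRuntime")
    if isinstance(runtime, dict):
        runtime = runtime.get("mode")
    return runtime if isinstance(runtime, str) else "pending"


def fetch_launch_status(url: str = LAUNCH_STATUS_URL, timeout: float = 10.0) -> dict:
    """GET the launch status and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `runtime_mode(fetch_launch_status())` before sending traffic tells you whether to expect completions or a pending-runtime response.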
Send your first request
Post a model and a messages array to /api/v1/chat/completions. This is the standard OpenAI chat-completions call.
curl https://basis.watch/api/v1/chat/completions \
  -H "Authorization: Bearer sk-basis-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"basis-default","messages":[{"role":"user","content":"Summarize this for an agent."}]}'
If a backend is configured, you get an OpenAI-shaped completion. If not, you get a structured runtime_pending (503) error; see Handle a pending-runtime response.
Stream the result
Set stream=True to receive tokens as Server-Sent Events. The OpenAI SDK handles the wire format and the terminating [DONE] sentinel for you — iterate the chunks and print the deltas.
from openai import OpenAI

# Point the standard OpenAI client at the Basis base URL.
client = OpenAI(base_url="https://basis.watch/api/v1", api_key="sk-basis-...")

stream = client.chat.completions.create(
    model="basis-default",
    messages=[{"role": "user", "content": "Write a one paragraph summary."}],
    stream=True,
)
# Each chunk carries an incremental delta; content can be None, so fall back to "".
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
The streaming wire format is documented in the API reference.
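If you are not using the SDK, the stream can be consumed by hand. The sketch below assumes Basis emits the same framing as OpenAI: `data: <json>` lines carrying chunks, closed by `data: [DONE]`. The `iter_deltas` helper is hypothetical, and the API reference remains the authority on the wire format.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from OpenAI-style SSE lines (assumed framing)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # terminating sentinel: the stream is complete
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            yield content
```

Feed it any iterable of decoded lines, for example the line iterator of a streaming HTTP response.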
Look up a receipt
Every settled job produces a verifiable inference receipt — a canonical-JSON, SHA-256-hashed accounting record. Fetch one by its hash; the response includes the receipt and a verified boolean from a server-side re-derivation of the hash.
curl -s https://basis.watch/api/inference/receipts/<receipt_hash>
An unknown hash returns 404 not_found. The full receipt schema and the hashing scheme are in Inference receipts.
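The server-side check can in principle be mirrored on the client. The sketch below is illustrative only: it assumes the canonical form is JSON with sorted keys, compact separators, and UTF-8 encoding, and that the hash covers the whole receipt object. The real canonicalization and hashing rules are the ones in Inference receipts, and the helper names are hypothetical.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize deterministically (assumed canonical form, not the official one)."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def receipt_hash(receipt: dict) -> str:
    """SHA-256 hex digest over the canonical serialization."""
    return hashlib.sha256(canonical_json(receipt)).hexdigest()


def verify_receipt(receipt: dict, expected_hash: str) -> bool:
    """Re-derive the hash locally and compare it to the one used in the lookup."""
    return receipt_hash(receipt) == expected_hash.lower()
```

Because keys are sorted before hashing, two receipts with the same fields in a different order produce the same digest, which is the property that makes a re-derivation meaningful.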
Understand credits
Usage is metered with deterministic per-token accounting, denominated in planned $BASIS credits and computed with integer math — no floating point in the money path. A model's multiplier scales its per-token cost. Basis does not publish a USD price.
See Credits for the per-token formula and the reserve → debit → release → refund lifecycle. $BASIS is planned, not live.
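To make "integer math in the money path" concrete, here is a hypothetical sketch. The base rate of 1,000 units per token, the representation of multipliers as integer hundredths, and the floor rounding are all assumptions for illustration. The actual per-token formula is the one on the Credits page.

```python
def token_cost(tokens: int, multiplier_x100: int, base_units_per_token: int = 1_000) -> int:
    """Cost in smallest credit units, using integers only.

    multiplier_x100 is the model multiplier in hundredths (1.00x -> 100).
    base_units_per_token is a made-up rate, not a published price.
    """
    if tokens < 0 or multiplier_x100 < 0 or base_units_per_token < 0:
        raise ValueError("inputs must be non-negative")
    # Multiply first, divide last, so no precision is lost before rounding.
    # Floor division is an assumed rounding rule.
    return tokens * base_units_per_token * multiplier_x100 // 100
```

With these assumed numbers, 1,000 tokens on basis-small (0.50x) cost half of what they cost on basis-default (1.00x), and the result is always an exact integer.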
Handle a pending-runtime response
When no inference backend is configured, chat completions return 503 with a structured runtime_pending error in the OpenAI error envelope. This is expected: the API contract is live, the worker backend is pending. Detect it by status code plus error.code — "runtime_pending" — and surface a clear message instead of retrying blindly.
import httpx

resp = httpx.post(
    "https://basis.watch/api/v1/chat/completions",
    headers={"Authorization": "Bearer sk-basis-..."},
    json={
        "model": "basis-default",
        "messages": [{"role": "user", "content": "ping"}],
    },
)

if resp.status_code == 503:
    # The error sits in the OpenAI error envelope: {"error": {"code": ...}}
    err = resp.json()["error"]
    if err["code"] == "runtime_pending":
        print("Runtime backend is pending; try again once a backend is configured.")
    else:
        resp.raise_for_status()  # a different 503: surface it
else:
    resp.raise_for_status()  # raise on any other 4xx/5xx
    print(resp.json())
Why this happens
Basis ships the OpenAI-compatible contract independently of any hosted backend. Until an upstream backend is configured (inferenceMode() === "proxy"), the route is honest about being unable to serve tokens rather than returning fabricated output. Poll runtime status to know when it flips to configured.
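Polling can be as simple as a bounded loop with exponential backoff. In the sketch below, `get_mode` is any callable that returns the current runtime mode string, for example a small wrapper around GET /api/launch-status. The function name, attempt count, and delays are illustrative assumptions.

```python
import time
from typing import Callable


def wait_for_proxy(
    get_mode: Callable[[], str],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the runtime mode is "proxy" or attempts run out.

    Returns True once a backend is configured, False otherwise.
    The delay doubles after each miss, capped at max_delay.
    """
    delay = base_delay
    for attempt in range(attempts):
        if get_mode() == "proxy":
            return True
        if attempt < attempts - 1:  # no need to sleep after the last check
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. In production the default `time.sleep` is used.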