Streaming responses

Server-Sent Events on /v1/invoke. Chunk-by-chunk pass-through. Billing settles when the Provider sends completed.

Concept

Some capabilities produce output incrementally: LLM tokens, TTS audio, image batches. Streaming lets the agent process the first byte before the last byte exists. JECP exposes streaming through the same /v1/invoke endpoint via content negotiation: send Accept: text/event-stream and the Hub forwards the Provider's SSE stream straight to you. The Hub does no buffering of its own beyond the TCP socket buffers. The wallet is debited once the Provider emits completed.
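If you consume the stream without the SDK, you have to split the byte stream into SSE frames yourself. The following is a minimal sketch of that step, assuming the `event:` / `data:` framing with JSON payloads used by the Provider example on this page. `parseSseFrames` is a hypothetical helper, not part of @jecpdev/sdk, and it simplifies the SSE rules (it expects `\n` line endings and ignores `id:` and `retry:` fields).

```typescript
// Hypothetical helper (not part of @jecpdev/sdk): splits buffered SSE text
// into complete frames and returns any trailing partial frame.
interface SseFrame {
  event: string;
  data: unknown;
}

function parseSseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const frames: SseFrame[] = [];
  // Frames are separated by a blank line; the last piece may be incomplete.
  const parts = buffer.split('\n\n');
  const rest = parts.pop() ?? '';
  for (const part of parts) {
    let event = 'message'; // SSE default when no `event:` field is present
    const dataLines: string[] = [];
    for (const line of part.split('\n')) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length === 0) continue; // comment or keep-alive frame
    // Assumption: every data payload is JSON, as in the Provider example.
    frames.push({ event, data: JSON.parse(dataLines.join('\n')) });
  }
  return { frames, rest };
}
```

Feed each decoded network chunk into the function as `rest + chunk` and carry the returned `rest` forward, so a frame split across two reads is parsed only once it is complete.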

TypeScript SDK

import { JecpClient } from '@jecpdev/sdk';

const jecp = new JecpClient({ agentId, apiKey });

const stream = jecp.invokeStream('llm/chat', 'complete', {
  prompt: 'Write a haiku about TCP backpressure.',
}, {
  mandate: { budget_usdc: 0.10 },
});

for await (const event of stream) {
  switch (event.type) {
    case 'chunk':
      process.stdout.write(event.delta);
      break;
    case 'meter':
      // periodic usage update from the Provider
      console.log(`\n[meter] tokens=${event.tokens}`);
      break;
    case 'completed':
      console.log('\nbilling:', event.billing);
      break;
    case 'error':
      console.error('stream error:', event.error);
      break;
    case 'cancelled':
      console.log('cancelled — partial billing:', event.billing);
      break;
  }
}

Cancel mid-stream

const ctl = new AbortController();
setTimeout(() => ctl.abort(), 5000); // cap to 5s

const stream = jecp.invokeStream('llm/chat', 'complete', input, {
  signal: ctl.signal,
});

for await (const event of stream) { /* ... */ }
// On abort: Hub closes the upstream connection. You receive a `cancelled`
// event and a partial bill computed from the last `meter` snapshot.

Event types

Event	Direction	Meaning
`open`	Hub → Agent	Upstream connection established. Sent once.
`chunk`	Provider → Agent	Incremental output. `{ delta, index }`.
`meter`	Provider → Hub → Agent	Usage update for the running tally — tokens, audio_seconds, chunks, elapsed_ms.
`completed`	Provider → Hub → Agent	Terminal. Includes `result`, `billing`, `provider`, `meter_summary`. Hub charges the wallet here.
`error` / `cancelled`	Hub → Agent	Terminal. Stream ends. `cancelled` may include partial billing.

Exactly one of completed, error, or cancelled is the last event. If the connection drops without one of those, both sides treat it as cancelled.
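A client can enforce the terminal-event rule above with a small state tracker. This is a sketch under simplified assumptions: `TerminalTracker` is a hypothetical name, and only the event type string is inspected.

```typescript
// Hypothetical client-side guard for the "exactly one terminal event" rule.
type TerminalType = 'completed' | 'error' | 'cancelled';

const TERMINAL: ReadonlySet<string> = new Set(['completed', 'error', 'cancelled']);

class TerminalTracker {
  private terminal: TerminalType | null = null;

  // Call for every event received. Throws if anything follows a terminal event.
  observe(type: string): void {
    if (this.terminal !== null) {
      throw new Error(`event "${type}" received after terminal "${this.terminal}"`);
    }
    if (TERMINAL.has(type)) this.terminal = type as TerminalType;
  }

  // Call when the connection closes. A close without a terminal event
  // is treated as `cancelled`.
  outcome(): TerminalType {
    return this.terminal ?? 'cancelled';
  }
}
```

Call `observe(event.type)` inside the `for await` loop and `outcome()` once the loop exits; a dropped connection then resolves to `cancelled` without special-casing.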

Pricing models

Each streaming action picks one model in its manifest. The Provider's meter events MUST report units consistent with the chosen model.

1. Flat — fixed price per stream

actions:
  - id: simple-stream
    streaming: true
    pricing:
      base: 0.05
      model: flat

2. Per token — LLMs

actions:
  - id: chat
    streaming: true
    pricing:
      base: 0.000003
      model: per_token
      input_per_token_usdc: 0.000003
      output_per_token_usdc: 0.000015

3. Per second — TTS / live audio

actions:
  - id: tts
    streaming: true
    pricing:
      base: 0.0002
      model: per_second
      audio_per_second_usdc: 0.0002

4. Per chunk — image batches / structured streams

actions:
  - id: image-batch
    streaming: true
    pricing:
      base: 0.005
      model: per_chunk
      per_chunk_usdc: 0.005

Phase A (live now) charges the manifest pricing.base as a flat rate when completed arrives. Variable per-token / per-second metering is on the Phase B roadmap and ships transparently — your manifest is forward-compatible.
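To see how the two phases can differ, here is a worked sketch in TypeScript. The Phase A function follows the rule stated above. The metered formula (rate times units) is only an assumption about what Phase B might compute, and the `Usage` field names `input_tokens` and `output_tokens` are hypothetical.

```typescript
type Pricing =
  | { model: 'flat'; base: number }
  | { model: 'per_token'; base: number; input_per_token_usdc: number; output_per_token_usdc: number }
  | { model: 'per_second'; base: number; audio_per_second_usdc: number }
  | { model: 'per_chunk'; base: number; per_chunk_usdc: number };

// Hypothetical usage totals; the input/output token split is an assumption.
interface Usage {
  input_tokens?: number;
  output_tokens?: number;
  audio_seconds?: number;
  chunks?: number;
}

// Phase A: the charge is always pricing.base, regardless of usage.
function phaseACharge(pricing: Pricing): number {
  return pricing.base;
}

// Assumed Phase B shape: rate x units. Not a confirmed formula.
function estimateMeteredCost(pricing: Pricing, usage: Usage): number {
  switch (pricing.model) {
    case 'flat':
      return pricing.base;
    case 'per_token':
      return (usage.input_tokens ?? 0) * pricing.input_per_token_usdc
        + (usage.output_tokens ?? 0) * pricing.output_per_token_usdc;
    case 'per_second':
      return (usage.audio_seconds ?? 0) * pricing.audio_per_second_usdc;
    case 'per_chunk':
      return (usage.chunks ?? 0) * pricing.per_chunk_usdc;
  }
}
```

With the chat manifest above, 1,000 input tokens and 500 output tokens would come to 1000 × 0.000003 + 500 × 0.000015 = 0.0105 USDC under the assumed metered formula, while Phase A charges the 0.000003 base.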

Backpressure, timeouts, and Mandate

Backpressure: native HTTP/TCP flow control. If the agent reads slowly, the Hub stops pulling from the Provider. No internal Hub buffer beyond kernel sockets.
No-progress timeout: 30 s between successive Provider chunks. Hub emits cancelled with reason PROVIDER_TIMEOUT.
Total stream timeout: 5 minutes hard cap. Hub emits cancelled with reason STREAM_TIMEOUT.
Mandate budget: enforced up-front against pricing.base. Mid-stream Mandate enforcement (live cost vs. running meter) ships in Phase B.
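The limits above can be mirrored on the client, for example to predict which reason a `cancelled` event will carry or to fail fast before opening a stream. This is a sketch only: the function names are hypothetical, and both the precedence between the two timeouts and the inclusive boundaries are assumptions.

```typescript
const NO_PROGRESS_MS = 30_000;      // 30 s between successive Provider chunks
const TOTAL_STREAM_MS = 5 * 60_000; // 5-minute hard cap on the whole stream

type TimeoutReason = 'PROVIDER_TIMEOUT' | 'STREAM_TIMEOUT';

// Returns which limit (if any) has been reached at `nowMs`.
// Checking the total cap first is an assumption about precedence.
function timeoutReason(startMs: number, lastChunkMs: number, nowMs: number): TimeoutReason | null {
  if (nowMs - startMs >= TOTAL_STREAM_MS) return 'STREAM_TIMEOUT';
  if (nowMs - lastChunkMs >= NO_PROGRESS_MS) return 'PROVIDER_TIMEOUT';
  return null;
}

// Phase A Mandate check: the budget is compared with pricing.base
// before the stream starts.
function mandateAllows(budgetUsdc: number, pricingBase: number): boolean {
  return pricingBase <= budgetUsdc;
}
```

For instance, a Mandate of 0.10 USDC admits the flat `simple-stream` action priced at 0.05, while a Mandate of 0.01 would not.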

Errors specific to streaming

Code	HTTP	Meaning
`NOT_STREAMABLE`	406	You sent `Accept: text/event-stream` for an action that does not declare `streaming: true`.
`PROVIDER_TIMEOUT`	SSE event	30 s no-progress timeout from Provider.
`STREAM_TIMEOUT`	SSE event	5-minute total stream cap reached.
`STREAM_INCOMPLETE`	SSE event	Provider closed the stream without a terminal event.
`PROVIDER_DISCONNECT`	SSE event	TCP read error mid-stream.

Run your own streaming Provider

Any HTTP server that emits text/event-stream works. Here is a minimal Express example. The Hub signs streaming requests with the same HMAC headers (x-jecp-signature, x-jecp-timestamp) as the non-streaming path, and your Provider verifies them, here via the verifyJecpHmac middleware.

app.post('/jecp/llm-chat', verifyJecpHmac, async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.flushHeaders();

  const send = (event, data) =>
    res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);

  const t0 = Date.now(); // stream start, used for elapsed_ms in meter events
  let tokens = 0;
  for await (const delta of myLlm.stream(req.body.input.prompt)) {
    send('chunk', { delta, index: tokens });
    tokens += 1;
    if (tokens % 16 === 0) send('meter', { tokens, elapsed_ms: Date.now() - t0 });
  }

  send('completed', {
    result: { full_text: '...' },
    billing: { tokens },
  });
  res.end();
});

Provider implementations SHOULD send meter events at least every 5 seconds so the Hub can apply backpressure and (in Phase B) enforce live Mandate caps.
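A count-based trigger alone (such as every 16 tokens) can miss the 5-second cadence when output is slow. One way to combine both triggers is sketched below; `createMeterCadence` is a hypothetical helper, and the clock is injectable so the logic can be tested without waiting.

```typescript
// Hypothetical helper: decides when a Provider should emit a `meter` event.
// A meter is due after `everyUnits` units of output or `maxIntervalMs` of
// wall time since the last meter, whichever comes first.
function createMeterCadence(
  everyUnits: number,
  maxIntervalMs: number = 5_000,
  now: () => number = Date.now,
) {
  let lastEmitMs = now();
  let unitsSinceEmit = 0;
  return {
    // Call once per unit of output; returns true when a meter event is due.
    tick(): boolean {
      unitsSinceEmit += 1;
      const due = unitsSinceEmit >= everyUnits || now() - lastEmitMs >= maxIntervalMs;
      if (due) {
        unitsSinceEmit = 0;
        lastEmitMs = now();
      }
      return due;
    },
  };
}
```

Because `tick()` runs only when output is produced, it does not cover a fully stalled generator; a Provider that can stall for longer than 5 seconds would need a separate timer to send meter events in between.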

From the CLI

jecp invoke llm/chat complete \
  --input '{"prompt":"haiku about TCP"}' \
  --stream

Next steps

→ Cap autonomous spend with Mandate
→ Become a streaming Provider
→ Webhook events for completed / refunded