Streaming responses

Server-Sent Events on /v1/invoke. Chunk-by-chunk pass-through. Billing settles when the Provider sends completed.

Concept

Some capabilities produce output incrementally: LLM tokens, TTS audio, image batches. Streaming lets the agent process the first byte before the last byte exists. JECP exposes streaming through the same /v1/invoke endpoint via content negotiation: send Accept: text/event-stream and the Hub forwards the Provider's SSE stream straight to you. There is no Hub-side buffering beyond TCP. The wallet is debited once the Provider emits completed.
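To make the wire format concrete, here is a minimal sketch of an incremental SSE frame parser. It assumes frames shaped like `event: <name>`, then `data: <json>`, then a blank line, which is the shape the Provider example later on this page writes. `SseFrameParser` and `parseFrame` are hypothetical names for illustration, not part of @jecpdev/sdk.

```typescript
// Hypothetical helper, not part of @jecpdev/sdk.
// Parses SSE frames ("event: <name>\ndata: <json>\n\n") from text chunks
// that may be split at arbitrary byte boundaries.
interface SseFrame {
  event: string;
  data: unknown;
}

function parseFrame(raw: string): SseFrame | null {
  let event = 'message'; // SSE default when no "event:" line is present
  const dataLines: string[] = [];
  for (const line of raw.split('\n')) {
    if (line.startsWith(':')) continue; // comment / keep-alive line
    if (line.startsWith('event:')) {
      event = line.slice(6).trim();
    } else if (line.startsWith('data:')) {
      dataLines.push(line.slice(5).replace(/^ /, ''));
    }
  }
  if (dataLines.length === 0) return null;
  // ASSUMPTION: every data payload is JSON, as in the Provider example.
  return { event, data: JSON.parse(dataLines.join('\n')) };
}

class SseFrameParser {
  private buffer = '';

  // Feed one decoded text chunk; returns every frame completed by it.
  push(chunk: string): SseFrame[] {
    // Normalize CRLF on the whole buffer so a "\r" / "\n" split across
    // two chunks is still handled. Bare "\r" line endings are not handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, '\n');
    const frames: SseFrame[] = [];
    let end = this.buffer.indexOf('\n\n');
    while (end !== -1) {
      const frame = parseFrame(this.buffer.slice(0, end));
      this.buffer = this.buffer.slice(end + 2);
      if (frame) frames.push(frame);
      end = this.buffer.indexOf('\n\n');
    }
    return frames;
  }
}
```

A frame is only returned once its terminating blank line has arrived, so a chunk that ends mid-JSON simply stays in the buffer until the next `push`.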

TypeScript SDK

import { JecpClient } from '@jecpdev/sdk';

const jecp = new JecpClient({ agentId, apiKey });

const stream = jecp.invokeStream('llm/chat', 'complete', {
  prompt: 'Write a haiku about TCP backpressure.',
}, {
  mandate: { budget_usdc: 0.10 },
});

for await (const event of stream) {
  switch (event.type) {
    case 'chunk':
      process.stdout.write(event.delta);
      break;
    case 'meter':
      // periodic usage update from the Provider
      console.log(`\n[meter] tokens=${event.tokens}`);
      break;
    case 'completed':
      console.log('\nbilling:', event.billing);
      break;
    case 'error':
      console.error('stream error:', event.error);
      break;
    case 'cancelled':
      console.log('cancelled — partial billing:', event.billing);
      break;
  }
}

Cancel mid-stream

const ctl = new AbortController();
setTimeout(() => ctl.abort(), 5000); // cap to 5s

const stream = jecp.invokeStream('llm/chat', 'complete', input, {
  signal: ctl.signal,
});

for await (const event of stream) { /* ... */ }
// On abort: Hub closes the upstream connection. You receive a `cancelled`
// event and a partial bill computed from the last `meter` snapshot.

Five event types

Event | Direction | Meaning
open | Hub → Agent | Upstream connection established. Sent once.
chunk | Provider → Agent | Incremental output. { delta, index }.
meter | Provider → Hub → Agent | Usage update for the running tally: tokens, audio_seconds, chunks, elapsed_ms.
completed | Provider → Hub → Agent | Terminal. Includes result, billing, provider, meter_summary. Hub charges the wallet here.
error / cancelled | Hub → Agent | Terminal. Stream ends. cancelled may include partial billing.

Exactly one of completed, error, or cancelled is the last event. If the connection drops without one of those, both sides treat it as cancelled.
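The terminal-event rule can be sketched as a small state tracker. Event names come from the table above; `TerminalTracker` and the `synthesized` flag are hypothetical, introduced here only to show how a consumer might treat a dropped connection as cancelled.

```typescript
// Hypothetical helper illustrating the terminal-event rule; not an SDK API.
type TerminalType = 'completed' | 'error' | 'cancelled';

interface StreamEvent {
  type: 'open' | 'chunk' | 'meter' | TerminalType;
}

interface Outcome {
  type: TerminalType;
  // true when no terminal event arrived and `cancelled` was assumed
  synthesized: boolean;
}

function isTerminal(type: StreamEvent['type']): type is TerminalType {
  return type === 'completed' || type === 'error' || type === 'cancelled';
}

class TerminalTracker {
  private outcome: Outcome | null = null;

  // Call for every event received from the stream.
  observe(event: StreamEvent): void {
    if (this.outcome !== null) {
      // A terminal event must be the last event on the stream.
      throw new Error(
        `event "${event.type}" arrived after terminal "${this.outcome.type}"`,
      );
    }
    if (isTerminal(event.type)) {
      this.outcome = { type: event.type, synthesized: false };
    }
  }

  // Call once the connection has ended, cleanly or not.
  finish(): Outcome {
    if (this.outcome !== null) return this.outcome;
    // Connection dropped without a terminal event: treat as cancelled.
    return { type: 'cancelled', synthesized: true };
  }
}
```

Calling `observe` inside the `for await` loop and `finish` after it gives the caller a single outcome to branch on, whether or not the stream ended cleanly.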

Pricing models

Each streaming action picks one model in its manifest. The Provider's meter events MUST report units consistent with the chosen model.

1. Flat — fixed price per stream

actions:
  - id: simple-stream
    streaming: true
    pricing:
      base: 0.05
      model: flat

2. Per token — LLMs

actions:
  - id: chat
    streaming: true
    pricing:
      base: 0.000003
      model: per_token
      input_per_token_usdc: 0.000003
      output_per_token_usdc: 0.000015

3. Per second — TTS / live audio

actions:
  - id: tts
    streaming: true
    pricing:
      base: 0.0002
      model: per_second
      audio_per_second_usdc: 0.0002

4. Per chunk — image batches / structured streams

actions:
  - id: image-batch
    streaming: true
    pricing:
      base: 0.005
      model: per_chunk
      per_chunk_usdc: 0.005

Phase A (live now) charges the manifest pricing.base as a flat rate when completed arrives. Variable per-token / per-second metering is on the Phase B roadmap and ships transparently; your manifest is forward-compatible.
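As a worked illustration of the difference between the two phases, the sketch below computes the Phase A charge and a possible Phase B charge for the manifests above. The Phase B formula (metered units multiplied by the per-unit rate) and the `Usage` field names, including the input/output token split, are assumptions; this page does not specify how Phase B will compute the bill.

```typescript
// Manifest pricing shapes copied from the four examples above.
type Pricing =
  | { model: 'flat'; base: number }
  | {
      model: 'per_token';
      base: number;
      input_per_token_usdc: number;
      output_per_token_usdc: number;
    }
  | { model: 'per_second'; base: number; audio_per_second_usdc: number }
  | { model: 'per_chunk'; base: number; per_chunk_usdc: number };

// Hypothetical usage summary; field names are assumptions.
interface Usage {
  input_tokens?: number;
  output_tokens?: number;
  audio_seconds?: number;
  chunks?: number;
}

// Phase A: pricing.base is charged once, when `completed` arrives.
function phaseACharge(pricing: Pricing): number {
  return pricing.base;
}

// ASSUMPTION: Phase B multiplies metered units by the per-unit rate.
function estimatePhaseBCharge(pricing: Pricing, usage: Usage): number {
  switch (pricing.model) {
    case 'flat':
      return pricing.base;
    case 'per_token':
      return (
        (usage.input_tokens ?? 0) * pricing.input_per_token_usdc +
        (usage.output_tokens ?? 0) * pricing.output_per_token_usdc
      );
    case 'per_second':
      return (usage.audio_seconds ?? 0) * pricing.audio_per_second_usdc;
    case 'per_chunk':
      return (usage.chunks ?? 0) * pricing.per_chunk_usdc;
  }
}

// The `chat` action from example 2.
const chatPricing: Pricing = {
  model: 'per_token',
  base: 0.000003,
  input_per_token_usdc: 0.000003,
  output_per_token_usdc: 0.000015,
};
const chatUsage: Usage = { input_tokens: 1000, output_tokens: 500 };

console.log('Phase A:', phaseACharge(chatPricing));
console.log('Phase B (estimate):', estimatePhaseBCharge(chatPricing, chatUsage));
```

Under these assumptions, a stream with 1,000 input tokens and 500 output tokens costs 0.000003 USDC in Phase A and roughly 0.0105 USDC in Phase B, so the value chosen for base matters while Phase A is live.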

Backpressure, timeouts, and Mandate

Errors specific to streaming

Code | HTTP | Meaning
NOT_STREAMABLE | 406 | You sent Accept: text/event-stream for an action that does not declare streaming: true.
PROVIDER_TIMEOUT | SSE event | 30 s no-progress timeout from Provider.
STREAM_TIMEOUT | SSE event | 5-minute total stream cap reached.
STREAM_INCOMPLETE | SSE event | Provider closed the stream without a terminal event.
PROVIDER_DISCONNECT | SSE event | TCP read error mid-stream.
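One practical consequence of the table: NOT_STREAMABLE arrives as an HTTP status before any event is sent, while the other codes arrive inside an already-open stream. A minimal sketch of that distinction follows; the codes and delivery channels are taken from the table, and the helper names are hypothetical.

```typescript
// Codes and delivery channels mirror the table above; names are illustrative.
type StreamErrorCode =
  | 'NOT_STREAMABLE'
  | 'PROVIDER_TIMEOUT'
  | 'STREAM_TIMEOUT'
  | 'STREAM_INCOMPLETE'
  | 'PROVIDER_DISCONNECT';

const DELIVERY: Record<StreamErrorCode, 'http' | 'sse'> = {
  NOT_STREAMABLE: 'http', // HTTP 406, before the stream opens
  PROVIDER_TIMEOUT: 'sse',
  STREAM_TIMEOUT: 'sse',
  STREAM_INCOMPLETE: 'sse',
  PROVIDER_DISCONNECT: 'sse',
};

// True when the error is delivered as an SSE event, which means the caller
// may already have consumed partial output before it arrived.
function mayHavePartialOutput(code: StreamErrorCode): boolean {
  return DELIVERY[code] === 'sse';
}
```

A caller could use this to decide whether to discard buffered chunks or keep them alongside the error.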

Run your own streaming Provider

Any HTTP server that emits text/event-stream works. Here is a minimal Express example. The Hub sends the same HMAC headers (x-jecp-signature, x-jecp-timestamp) as on the non-streaming path, and your handler should verify them before streaming anything back.

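// verifyJecpHmac (HMAC middleware) and myLlm (model client) are assumed to be defined elsewhere.
app.post('/jecp/llm-chat', verifyJecpHmac, async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.flushHeaders();

  // Write one SSE frame: an event name plus a JSON payload.
  const send = (event, data) =>
    res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);

  const t0 = Date.now(); // stream start, used for elapsed_ms
  let tokens = 0;
  for await (const delta of myLlm.stream(req.body.input.prompt)) {
    send('chunk', { delta, index: tokens });
    tokens += 1;
    // Report running usage every 16 tokens.
    if (tokens % 16 === 0) send('meter', { tokens, elapsed_ms: Date.now() - t0 });
  }

  // Terminal event: the Hub settles billing when it sees this.
  send('completed', {
    result: { full_text: '...' },
    billing: { tokens },
  });
  res.end();
});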

Provider implementations SHOULD send meter events at least every 5 seconds so the Hub can apply backpressure and (in Phase B) enforce live Mandate caps.
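A token-count trigger such as "every 16 tokens" does not by itself guarantee a meter event every 5 seconds when output is slow. One way to approach this is a time-based check run after each chunk, sketched below. `createMeterTicker` is a hypothetical helper, and the injectable `now` clock exists only to make the behaviour easy to test.

```typescript
// Hypothetical helper: sends a `meter` event whenever at least `intervalMs`
// has passed since the previous one, however many chunks were produced.
type Send = (event: string, data: Record<string, unknown>) => void;

function createMeterTicker(
  send: Send,
  intervalMs: number = 5000,
  now: () => number = Date.now,
) {
  const startedAt = now();
  let lastSentAt = startedAt;

  return {
    // Call after every chunk. Returns true if a meter event was sent.
    tick(tokens: number): boolean {
      const t = now();
      if (t - lastSentAt < intervalMs) return false;
      send('meter', { tokens, elapsed_ms: t - startedAt });
      lastSentAt = t;
      return true;
    },
  };
}
```

In the Express handler this would replace the `tokens % 16` check with `ticker.tick(tokens)`. Because the check only runs when a chunk is produced, a model that stalls for longer than the interval still sends nothing; covering that case would need a timer such as setInterval alongside it.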

From the CLI

jecp invoke llm/chat complete \
  --input '{"prompt":"haiku about TCP"}' \
  --stream