Streaming responses
Server-Sent Events on /v1/invoke. Chunk-by-chunk pass-through. Billing settles when the Provider sends completed.
Concept
Some capabilities produce output incrementally: LLM tokens, TTS audio, image batches. Streaming lets the agent process the first byte before the last byte exists. JECP exposes streaming through the same /v1/invoke endpoint via content negotiation: send Accept: text/event-stream and the Hub forwards the Provider's SSE stream straight to you. The Hub adds no buffering beyond the TCP socket buffers. The wallet is debited once the Provider emits completed.
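For agents that read the stream without the SDK, the wire format is plain Server-Sent Events. The sketch below shows one way to split received text into events, assuming each frame carries an event: line and a JSON data: payload (the shape the Provider example on this page writes). The names parseSseFrames and SseFrame are illustrative and are not part of the JECP SDK.

```typescript
// Illustrative helper, not part of the JECP SDK.
interface SseFrame {
  event: string;
  data: unknown;
}

// Splits buffered SSE text into complete frames plus a trailing partial frame.
function parseSseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const frames: SseFrame[] = [];
  // Normalize CRLF so a frame boundary is always a blank line ("\n\n").
  const parts = buffer.replace(/\r\n/g, '\n').split('\n\n');
  // The last part has not seen its terminating blank line yet.
  const rest = parts.pop() ?? '';
  for (const part of parts) {
    let event = 'message'; // SSE default when no event: line is present
    const dataLines: string[] = [];
    for (const line of part.split('\n')) {
      if (line.startsWith(':')) continue; // comment or keep-alive line
      if (line.startsWith('event:')) {
        event = line.slice(6).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    if (dataLines.length === 0) continue; // nothing to dispatch
    // Assumption: every data payload is JSON, as in the Provider example.
    frames.push({ event, data: JSON.parse(dataLines.join('\n')) });
  }
  return { frames, rest };
}
```

Prepend the returned rest to the next network read before calling the function again, so a frame split across two reads is parsed only once it is complete.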
TypeScript SDK
import { JecpClient } from '@jecpdev/sdk';

const jecp = new JecpClient({ agentId, apiKey });

const stream = jecp.invokeStream('llm/chat', 'complete', {
  prompt: 'Write a haiku about TCP backpressure.',
}, {
  mandate: { budget_usdc: 0.10 },
});

for await (const event of stream) {
  switch (event.type) {
    case 'chunk':
      process.stdout.write(event.delta);
      break;
    case 'meter':
      // periodic usage update from the Provider
      console.log(`\n[meter] tokens=${event.tokens}`);
      break;
    case 'completed':
      console.log('\nbilling:', event.billing);
      break;
    case 'error':
      console.error('stream error:', event.error);
      break;
    case 'cancelled':
      console.log('cancelled, partial billing:', event.billing);
      break;
  }
}
Cancel mid-stream
const ctl = new AbortController();
setTimeout(() => ctl.abort(), 5000); // cap to 5s

const stream = jecp.invokeStream('llm/chat', 'complete', input, {
  signal: ctl.signal,
});

for await (const event of stream) { /* ... */ }

// On abort: Hub closes the upstream connection. You receive a `cancelled`
// event and a partial bill computed from the last `meter` snapshot.
Five event types
| Event | Direction | Meaning |
|---|---|---|
| open | Hub → Agent | Upstream connection established. Sent once. |
| chunk | Provider → Agent | Incremental output. { delta, index }. |
| meter | Provider → Hub → Agent | Usage update for the running tally: tokens, audio_seconds, chunks, elapsed_ms. |
| completed | Provider → Hub → Agent | Terminal. Includes result, billing, provider, meter_summary. Hub charges the wallet here. |
| error / cancelled | Hub → Agent | Terminal. Stream ends. cancelled may include partial billing. |
Exactly one of completed, error, or cancelled is the last event. If the connection drops without one of those, both sides treat it as cancelled.
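The terminal-event rule above can be enforced on the agent side with a small state tracker. This is a minimal sketch: the event names come from the table, while StreamOutcomeTracker is a hypothetical helper, and throwing on events that arrive after a terminal one is a design choice made here rather than documented behavior.

```typescript
// Event names are taken from the table above; the class is hypothetical.
type TerminalType = 'completed' | 'error' | 'cancelled';
type EventType = 'open' | 'chunk' | 'meter' | TerminalType;

const TERMINAL: ReadonlySet<string> = new Set(['completed', 'error', 'cancelled']);

class StreamOutcomeTracker {
  private outcome: TerminalType | null = null;

  // Record each event as it arrives. Anything after a terminal event is a
  // protocol violation, which this sketch surfaces by throwing.
  observe(type: EventType): void {
    if (this.outcome !== null) {
      throw new Error(`event "${type}" received after terminal "${this.outcome}"`);
    }
    if (TERMINAL.has(type)) {
      this.outcome = type as TerminalType;
    }
  }

  // Call when the connection closes. A close without a terminal event
  // is treated as cancelled, per the rule stated above.
  close(): TerminalType {
    return this.outcome ?? 'cancelled';
  }
}
```

Call observe(event.type) inside the for await loop and close() after the loop exits (or in a finally block), then branch on the returned outcome.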
Pricing models
Each streaming action picks one model in its manifest. The Provider's meter events MUST report units consistent with the chosen model.
1. Flat — fixed price per stream
actions:
  - id: simple-stream
    streaming: true
    pricing:
      base: 0.05
      model: flat
2. Per token — LLMs
actions:
  - id: chat
    streaming: true
    pricing:
      base: 0.000003
      model: per_token
      input_per_token_usdc: 0.000003
      output_per_token_usdc: 0.000015
3. Per second — TTS / live audio
actions:
  - id: tts
    streaming: true
    pricing:
      base: 0.0002
      model: per_second
      audio_per_second_usdc: 0.0002
4. Per chunk — image batches / structured streams
actions:
  - id: image-batch
    streaming: true
    pricing:
      base: 0.005
      model: per_chunk
      per_chunk_usdc: 0.005
For now, the Hub charges pricing.base as a flat rate when completed arrives, whichever model the manifest declares. Variable per-token / per-second metering is on the Phase B roadmap and ships transparently, so your manifest is forward-compatible.
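To make the difference between the flat charge and a metered charge concrete, here is a hedged sketch of the arithmetic. The pricing field names mirror the manifest examples; the meter fields input_tokens and output_tokens are assumptions (the event table only lists tokens), and meteredEstimate is a guess at how Phase B could combine them, not the Hub's actual formula.

```typescript
// Pricing fields mirror the manifest examples above.
type Pricing =
  | { model: 'flat'; base: number }
  | { model: 'per_token'; base: number; input_per_token_usdc: number; output_per_token_usdc: number }
  | { model: 'per_second'; base: number; audio_per_second_usdc: number }
  | { model: 'per_chunk'; base: number; per_chunk_usdc: number };

// Assumed meter shape; input_tokens / output_tokens are hypothetical fields.
interface MeterSummary {
  input_tokens?: number;
  output_tokens?: number;
  audio_seconds?: number;
  chunks?: number;
}

// The flat charge described above: pricing.base, regardless of model.
function flatCharge(pricing: Pricing): number {
  return pricing.base;
}

// Hypothetical metered cost: units reported by the meter times the unit rate.
function meteredEstimate(pricing: Pricing, meter: MeterSummary): number {
  switch (pricing.model) {
    case 'flat':
      return pricing.base;
    case 'per_token':
      return (meter.input_tokens ?? 0) * pricing.input_per_token_usdc
        + (meter.output_tokens ?? 0) * pricing.output_per_token_usdc;
    case 'per_second':
      return (meter.audio_seconds ?? 0) * pricing.audio_per_second_usdc;
    case 'per_chunk':
      return (meter.chunks ?? 0) * pricing.per_chunk_usdc;
  }
}
```

With the chat manifest above, a stream of 1,000 input tokens and 200 output tokens would come to roughly 0.006 USDC under this metered estimate, while the flat charge is the 0.000003 USDC base.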
Backpressure, timeouts, and Mandate
- Backpressure: native HTTP/TCP flow control. If the agent reads slowly, the Hub stops pulling from the Provider. No internal Hub buffer beyond kernel sockets.
- No-progress timeout: 30 s between successive Provider chunks. Hub emits cancelled with reason PROVIDER_TIMEOUT.
- Total stream timeout: 5 minutes hard cap. Hub emits cancelled with reason STREAM_TIMEOUT.
- Mandate budget: enforced up-front against pricing.base. Mid-stream Mandate enforcement (live cost vs. running meter) ships in Phase B.
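The limits in the list above can be mirrored client-side, for example to predict when the Hub will cancel a stream. The following is a sketch under stated assumptions: the constants come from the text, but the function names, the inclusive boundaries, and the choice to check the total cap first are illustrative, not documented Hub behavior.

```typescript
const NO_PROGRESS_MS = 30_000;      // 30 s between successive Provider chunks
const TOTAL_STREAM_MS = 5 * 60_000; // 5-minute hard cap on the whole stream

type TimeoutReason = 'PROVIDER_TIMEOUT' | 'STREAM_TIMEOUT' | null;

// Returns the cancellation reason that would apply at `now`, or null.
// All arguments are millisecond timestamps on the same clock.
function checkTimeouts(startedAt: number, lastChunkAt: number, now: number): TimeoutReason {
  // Assumption: the total cap takes precedence when both limits are hit.
  if (now - startedAt >= TOTAL_STREAM_MS) return 'STREAM_TIMEOUT';
  if (now - lastChunkAt >= NO_PROGRESS_MS) return 'PROVIDER_TIMEOUT';
  return null;
}

// Up-front Mandate check: the budget is compared against pricing.base only.
function withinMandate(pricingBase: number, budgetUsdc: number): boolean {
  return pricingBase <= budgetUsdc;
}
```

For instance, withinMandate(0.05, 0.10) passes for the flat-priced action above, whereas a budget of 0.01 would be rejected before the stream opens.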
Errors specific to streaming
| Code | HTTP | Meaning |
|---|---|---|
| NOT_STREAMABLE | 406 | You sent Accept: text/event-stream for an action that does not declare streaming: true. |
| PROVIDER_TIMEOUT | SSE event | 30 s no-progress timeout from Provider. |
| STREAM_TIMEOUT | SSE event | 5-minute total stream cap reached. |
| STREAM_INCOMPLETE | SSE event | Provider closed the stream without a terminal event. |
| PROVIDER_DISCONNECT | SSE event | TCP read error mid-stream. |
Run your own streaming Provider
Any HTTP server that emits text/event-stream works. Here is a minimal Express example. Requests from the Hub carry the same HMAC headers (x-jecp-signature, x-jecp-timestamp) as on the non-streaming path, and the verifyJecpHmac middleware checks them before the handler runs.
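// Assumes `app` (Express), `verifyJecpHmac`, and `myLlm` are defined elsewhere.
app.post('/jecp/llm-chat', verifyJecpHmac, async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.flushHeaders();

  // Write one SSE frame: an event name plus a JSON payload.
  const send = (event, data) =>
    res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);

  const t0 = Date.now(); // stream start, used for elapsed_ms
  let tokens = 0;
  for await (const delta of myLlm.stream(req.body.input.prompt)) {
    send('chunk', { delta, index: tokens });
    tokens += 1;
    // Report usage every 16 tokens.
    if (tokens % 16 === 0) send('meter', { tokens, elapsed_ms: Date.now() - t0 });
  }

  // Terminal event: the Hub settles billing when it sees this.
  send('completed', {
    result: { full_text: '...' },
    billing: { tokens },
  });
  res.end();
});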
Emit meter events at least every 5 seconds so the Hub can apply backpressure and (in Phase B) enforce live Mandate caps.
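The Express example paces meter events by token count only, which can exceed a 5-second gap when tokens arrive slowly. One possible way to combine both triggers is sketched below; MeterPacer is a hypothetical helper, and the 16-unit default simply copies the cadence used in the example.

```typescript
// Hypothetical helper: decides when a Provider should emit a `meter` event.
class MeterPacer {
  private lastEmitMs: number;
  private unitsSinceEmit = 0;
  private readonly maxIntervalMs: number;
  private readonly maxUnits: number;

  constructor(startMs: number, maxIntervalMs = 5_000, maxUnits = 16) {
    this.lastEmitMs = startMs;
    this.maxIntervalMs = maxIntervalMs; // "at least every 5 seconds"
    this.maxUnits = maxUnits;           // same cadence as the Express example
  }

  // Call once per chunk sent. Returns true when a meter event is due,
  // either because enough units accumulated or enough time has passed.
  tick(nowMs: number): boolean {
    this.unitsSinceEmit += 1;
    const due =
      this.unitsSinceEmit >= this.maxUnits ||
      nowMs - this.lastEmitMs >= this.maxIntervalMs;
    if (due) {
      this.lastEmitMs = nowMs;
      this.unitsSinceEmit = 0;
    }
    return due;
  }
}
```

In the handler, create the pacer with new MeterPacer(t0) and replace the tokens % 16 test with pacer.tick(Date.now()). Because tick runs only when a chunk is sent, a model that stalls completely would still need a timer to keep meter events flowing.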
From the CLI
jecp invoke llm/chat complete \
--input '{"prompt":"haiku about TCP"}' \
--stream