This documentation describes the API contract required to integrate a custom backend with the AI Workbench UI using the Proxy GPT provider.
The Proxy GPT provider supports two communication methods:

- HTTP Streaming (Server-Sent Events)
- Socket.IO (fallback)

Your backend should ideally support both, but HTTP Streaming is prioritized.
The UI checks if your proxy server is reachable before attempting to connect.
Endpoint: `GET /health` or `HEAD /`

Expected response: `200 OK`

To support dynamic model selection and expose specific capabilities (e.g., multimodal I/O, thinking), implement the models endpoint.
Endpoint: `GET /v1/models`

Response:

```json
{
  "data": [
    {
      "id": "my-custom-model",
      "object": "model",
      "created": 1677610602,
      "owned_by": "custom",
      "capabilities": {
        "imageInput": true,
        "imageOutput": false,
        "thinking": true,
        "textInput": true,
        "textOutput": true,
        "internetBrowsing": false,
        "fileOutput": false,
        "videoInput": false,
        "videoOutput": false
      }
    }
  ]
}
```

This is the primary method for generating chat responses.
Endpoint: `POST /api/chat`

Content-Type: `application/json`

Request body:

```json
{
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}
```

The server must stream the response using the SSE format. Each chunk of data should be prefixed with `data: `.
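This framing can be factored into a small helper. The sketch below is illustrative, not part of the contract; the names `sseData` and `sseDone` are assumptions:

```typescript
// Frame one SSE data chunk: JSON payload on a single line,
// prefixed with "data: " and terminated by a blank line.
function sseData(payload: object): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// End-of-stream marker expected by the UI.
function sseDone(): string {
  return "data: [DONE]\n\n";
}

// In an Express handler you would write each fragment as it is produced:
//   res.setHeader("Content-Type", "text/event-stream");
//   res.write(sseData({ content: "Hello" }));
//   res.write(sseData({ content: " world" }));
//   res.write(sseDone());
//   res.end();
```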
```
data: {"content": "Hello"}
data: {"content": " world"}
data: [DONE]
```

You can stream task updates to visualize the agent's plan and progress in the UI.
Plan Update Event: Defines the initial plan or updates the list of steps.
```
event: plan_update
data: {"current_task_id": "task-1", "steps": [{"id": "step-1", "status": "running", "title": "Analyze request"}, {"id": "step-2", "status": "pending", "title": "Execute code"}]}
```

Todo Update Event: Updates the status of existing tasks.
```
event: todo_update
data: {"items": [{"id": "step-1", "status": "completed", "title": "Analyze request"}, {"id": "step-2", "status": "running", "title": "Execute code"}]}
```

Thinking/Reasoning Event: Displays a collapsible "Reasoning" block for internal thought processes.
```
event: thinking
data: {"content": "I need to check the database first..."}
```

If HTTP streaming fails, the UI attempts to connect via Socket.IO.
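The plan, todo, and thinking events above all use the standard SSE `event:`/`data:` framing and can be produced with one helper. This is a sketch; the name `sseEvent` is an assumption, not part of the contract:

```typescript
// Frame a named SSE event (e.g. plan_update, todo_update, thinking):
// an "event:" line, a single-line JSON "data:" line, then a blank line.
function sseEvent(event: string, payload: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Usage: res.write(sseEvent("thinking", { content: "Checking the database..." }));
```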
Event name: `chat`

Ensure your server allows Cross-Origin Resource Sharing (CORS) from the domain where the AI Workbench UI is hosted.
```js
// Express/Node.js example
app.use(cors({
  origin: "*", // Or a specific domain
  methods: ["GET", "POST", "OPTIONS"]
}));
```
A `GET /v1/models` response can advertise multiple models, each with its own capabilities:

```json
{
  "data": [
    {
      "id": "my-custom-model",
      "object": "model",
      "created": 1677610602,
      "owned_by": "custom",
      "capabilities": {
        "imageInput": true,
        "imageOutput": false,
        "thinking": true,
        "textInput": true,
        "textOutput": true,
        "internetBrowsing": false,
        "fileOutput": false,
        "videoInput": false,
        "videoOutput": false
      }
    },
    {
      "id": "my-vision-model",
      "object": "model",
      "capabilities": {
        "imageInput": true,
        "imageOutput": true,
        "textInput": true,
        "textOutput": true
      }
    }
  ]
}
```

The example implementation below supports the following models:
- `mock`: Echoes back the input.
- `text-in-image-out`: Returns a hardcoded image URL.
- `text-img-in-text-img-out`: Handles text/image input and returns text/image output.
- `reflection-and-thinking`: Simulates a thinking process before responding.
- `text-in-html-out`: Returns a hardcoded HTML document.

The `capabilities` object allows you to define what your model can do. All fields are optional and default to `false` (except `textInput` and `textOutput`, which default to `true`).
- `imageInput` (boolean): Can accept images in the prompt
- `imageOutput` (boolean): Can generate images
- `thinking` (boolean): Supports chain-of-thought or reasoning steps
- `textInput` (boolean): Can accept text input
- `textOutput` (boolean): Can generate text output
- `internetBrowsing` (boolean): Can access the internet
- `fileOutput` (boolean): Can generate/return files
- `videoInput` (boolean): Can accept video input
- `videoOutput` (boolean): Can generate video

This is the primary method for generating chat responses.
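For type-checking a backend written in TypeScript, the capability fields above can be modeled as an interface. This is a sketch; the interface name is an assumption, not part of the contract:

```typescript
// Sketch of the capabilities object. All fields are optional; absent
// fields are treated as false, except textInput/textOutput which
// default to true.
interface ModelCapabilities {
  imageInput?: boolean;       // can accept images in the prompt
  imageOutput?: boolean;      // can generate images
  thinking?: boolean;         // supports chain-of-thought / reasoning steps
  textInput?: boolean;        // can accept text input
  textOutput?: boolean;       // can generate text output
  internetBrowsing?: boolean; // can access the internet
  fileOutput?: boolean;       // can generate/return files
  videoInput?: boolean;       // can accept video input
  videoOutput?: boolean;      // can generate video
}

// Example: a text-only model that exposes reasoning steps.
const caps: ModelCapabilities = { textInput: true, textOutput: true, thinking: true };
```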
Endpoint: `POST /api/chat`

Content-Type: `application/json`

Request body:

```json
{
  "messages": [
    { "role": "user", "content": "Hello, how are you?" },
    { "role": "assistant", "content": "I am doing well, thank you!" }
  ]
}
```

The server must stream the response using the SSE format. Each chunk of data should be prefixed with `data: `.
```
data: {"content": "Hello"}
data: {"content": " world"}
data: {"content": "!"}
data: [DONE]
```

- Each chunk is a JSON object with a `content` field containing the text fragment.
- Send `data: [DONE]` to signal the end of the stream.

If HTTP streaming fails, the UI attempts to connect via Socket.IO.
Event name: `chat`

```js
io.on("connection", (socket) => {
  socket.on("chat", async ({ messages }) => {
    try {
      // Process chat request...

      // Emit chunks
      socket.emit("chunk", { content: "Hello" });
      socket.emit("chunk", { content: " world" });

      // Signal completion
      socket.emit("complete");
    } catch (error) {
      socket.emit("error", { message: "Something went wrong" });
    }
  });
});
```

Ensure your server allows Cross-Origin Resource Sharing (CORS) from the domain where the AI Workbench UI is hosted.
```js
// Express/Node.js example
app.use(cors({
  origin: "*", // Or a specific domain
  methods: ["GET", "POST", "OPTIONS"]
}));
```

You can run this example using `npx ts-node example-proxy.ts`.
````ts
import express from 'express';
import cors from 'cors';
import bodyParser from 'body-parser';

const app = express();
const port = 3001;

app.use(cors());
app.use(bodyParser.json());

// Mock Models Definition
const MODELS = [
  {
    id: "mock",
    object: "model",
    created: Date.now(),
    owned_by: "custom",
    capabilities: { textInput: true, textOutput: true }
  },
  {
    id: "text-in-image-out",
    object: "model",
    created: Date.now(),
    owned_by: "custom",
    capabilities: { textInput: true, imageOutput: true }
  },
  {
    id: "text-img-in-text-img-out",
    object: "model",
    created: Date.now(),
    owned_by: "custom",
    capabilities: { textInput: true, imageInput: true, textOutput: true, imageOutput: true }
  },
  {
    id: "reflection-and-thinking",
    object: "model",
    created: Date.now(),
    owned_by: "custom",
    capabilities: { textInput: true, textOutput: true, thinking: true }
  },
  {
    id: "text-in-html-out",
    object: "model",
    created: Date.now(),
    owned_by: "custom",
    capabilities: { textInput: true, textOutput: true }
  }
];

// Health Check
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});

// List Models
app.get('/v1/models', (req, res) => {
  res.json({ data: MODELS });
});

// Chat Completions
app.post('/v1/chat/completions', (req, res) => {
  const { model, messages, stream } = req.body;
  const lastMessage = messages[messages.length - 1].content;

  if (!stream) {
    return res.status(400).json({ error: "Only streaming is supported in this example" });
  }

  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  const sendChunk = (content: string) => {
    res.write(`data: ${JSON.stringify({ choices: [{ delta: { content } }] })}\n\n`);
  };

  const sendThinkingChunk = (content: string) => {
    res.write(`data: ${JSON.stringify({
      choices: [{ delta: { content, type: "thinking" } }] // Custom type for thinking
    })}\n\n`);
  };

  const endStream = () => {
    res.write('data: [DONE]\n\n');
    res.end();
  };

  // Model Logic
  if (model === "mock") {
    sendChunk("This is a response from the mock model. I received your message: ");
    sendChunk(typeof lastMessage === 'string' ? lastMessage : "Multimedia content");
    endStream();
  } else if (model === "text-in-image-out") {
    sendChunk("Here is your generated image:\n");
    sendChunk("");
    endStream();
  } else if (model === "text-img-in-text-img-out") {
    if (typeof lastMessage === 'string' && lastMessage.toLowerCase().includes("get me an image")) {
      sendChunk("Here is the image you requested:\n");
      sendChunk("");
    } else {
      sendChunk("I can process images and text. Try asking 'get me an image' or sending an image.");
      // Simulate returning a generated image url for image input
      sendChunk("\n");
    }
    endStream();
  } else if (model === "reflection-and-thinking") {
    // Simulate thinking process
    sendThinkingChunk("Analyzing the request...\n");
    setTimeout(() => {
      sendThinkingChunk("Identifying key concepts...\n");
      setTimeout(() => {
        sendChunk("After deep reflection, I have concluded that ");
        sendChunk(typeof lastMessage === 'string' ? lastMessage : "this input");
        sendChunk(" is indeed interesting.");
        endStream();
      }, 1000);
    }, 1000);
  } else if (model === "text-in-html-out") {
    sendChunk("Here is the HTML document you requested:\n");
    sendChunk("```html\n");
    sendChunk("<!DOCTYPE html>\n<html>\n<body>\n<h1>Hello World</h1>\n<p>This is a hardcoded HTML document.</p>\n</body>\n</html>\n");
    sendChunk("```");
    endStream();
  } else {
    sendChunk(`Unknown model: ${model}`);
    endStream();
  }
});

app.listen(port, () => {
  console.log(`Example proxy API listening at http://localhost:${port}`);
});
````
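To sanity-check a stream end-to-end, the `data:` frames can be reassembled on the client side. The sketch below is illustrative: the `parseSse` name is an assumption, and it parses the contract's `{"content": ...}` chunk shape, not the OpenAI-style `choices` deltas emitted by the example server above:

```typescript
// Parse a raw SSE body into its text fragments, stopping at [DONE].
function parseSse(raw: string): string[] {
  const contents: string[] = [];
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data: ")) continue;   // skip event:/blank lines
    const data = line.slice("data: ".length);
    if (data === "[DONE]") break;               // end-of-stream marker
    const parsed = JSON.parse(data);
    if (typeof parsed.content === "string") contents.push(parsed.content);
  }
  return contents;
}

// Example: the stream shown earlier reassembles to the full message.
const body =
  'data: {"content": "Hello"}\n' +
  'data: {"content": " world"}\n' +
  'data: {"content": "!"}\n' +
  'data: [DONE]\n';
console.log(parseSse(body).join("")); // → Hello world!
```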