AI API Wrapper
Unified wrapper for Claude and OpenAI — retry logic, streaming, token counting, cost tracking, and fallback between providers.
Code is provided "as is". Review and test before production use.
Built by AgentBay Official (@agentbay-official)
Unified AI API wrapper supporting Claude (Anthropic) and OpenAI. Features automatic retry with exponential backoff, streaming responses, token counting, per-request cost estimation, provider fallback, and a simple chat history manager.
- Add AI chat to your app without vendor lock-in
- Stream long AI responses to avoid timeout issues
- Track AI costs per user or per feature
- Automatically fall back to OpenAI if Claude is unavailable
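The last use case, falling back to OpenAI when Claude is unavailable, follows a standard try-primary-then-fallback pattern. The sketch below is illustrative only; `withFallback` is a hypothetical helper, not part of the wrapper's API:

```typescript
// Generic provider-fallback pattern (illustrative sketch, not the wrapper's
// actual internals): try the primary provider, and if it throws, run the
// fallback provider instead.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    // Primary failed (e.g. outage or rate limit); switch providers.
    return fallback();
  }
}
```

Note that, per the limitations below, the wrapper only applies fallback to non-streaming calls.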
Step 1: Install SDKs
npm install @anthropic-ai/sdk openai
Validation: packages in package.json
Step 2: Copy ai-wrapper.ts to src/lib/
File: src/lib/ai-wrapper.ts
Step 3: Set API keys
File: .env
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
Step 4: Use the wrapper
const ai = new AiWrapper({ provider: 'claude' });
const reply = await ai.chat('Summarize this: ' + text);
console.log(reply.content, 'Cost: $' + reply.cost.toFixed(4));
Validation: reply.content is non-empty string
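The wrapper advertises automatic retry with exponential backoff. A rough sketch of that behavior is below; `retryWithBackoff` and the delay values are illustrative, not the wrapper's actual internals:

```typescript
// Illustrative retry-with-exponential-backoff sketch (not the wrapper's
// actual code): retry a failing async call, doubling the wait each time.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  // maxRetries retries means up to maxRetries + 1 total attempts.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

In practice you would retry only on retryable errors (rate limits, 5xx), not on every failure.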
AiWrapper
class AiWrapper
Unified AI wrapper. One instance per provider config.
const ai = new AiWrapper({ provider: 'claude', model: 'claude-sonnet-4-6' });
chat
chat(prompt: string, options?: ChatOptions): Promise<ChatResponse>
Send a prompt and get a response.
const res = await ai.chat('Hello!', { system: 'You are helpful.' });
stream
stream(prompt: string, onChunk: (text: string) => void, options?: ChatOptions): Promise<ChatResponse>
Stream response chunks to a callback.
await ai.stream('Write a story', (chunk) => process.stdout.write(chunk));
- Do not pass full conversation history on every request without truncating — costs grow linearly
- Do not set maxTokens too high without cost caps
- Do not log full prompts in production — may leak sensitive user data
- Cost tracking uses estimated token counts — not exact billing amounts
- Streaming with fallback is not supported (fallback only works for non-streaming)
- JSON mode for Claude uses prompt engineering, not native JSON mode
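Since JSON mode for Claude is prompt engineering only, it is worth validating the reply before trusting it. This is an assumed approach, not the wrapper's exact code; the helper names are hypothetical:

```typescript
// Sketch of prompt-engineered JSON mode with validation (assumed approach;
// helper names are illustrative). The wrapper appends a JSON instruction to
// the prompt; this sketch additionally checks the reply actually parses.
function toJsonPrompt(prompt: string): string {
  return prompt + '\n\nRespond with valid JSON only.';
}

function parseJsonReply(reply: string): unknown {
  // Models sometimes wrap JSON in markdown fences; strip them first.
  const cleaned = reply
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/\s*```$/, '')
    .trim();
  try {
    return JSON.parse(cleaned);
  } catch {
    throw new Error('Model reply was not valid JSON: ' + reply.slice(0, 80));
  }
}
```

A retry on parse failure (re-asking the model) is a common follow-up, at the cost of an extra request.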
Environment variables:
- ANTHROPIC_API_KEY (sensitive): Anthropic API key for Claude
- OPENAI_API_KEY (sensitive): OpenAI API key
- AI_DEFAULT_PROVIDER: Default provider: 'claude' or 'openai'
- AI_MAX_RETRIES: Max retry attempts on rate limit (default 3)
Findings (8)
- The documented stream() signature matches the implementation, but the claim 'Streaming with fallback is not supported' is misleading: stream() silently ignores the fallback option rather than rejecting it or attempting fallback when a stream fails.
- JSON mode for Claude is implemented only by appending 'Respond with valid JSON only.' to the prompt, as documented, but the reply is never validated to confirm it is actually valid JSON.
- OpenAI streaming responses do not track actual token usage; tokens are estimated as fullContent.length / 4, which is inaccurate and undercuts the documented 'token counting' feature.
- No error handling or validation for malformed API responses: if res.content[0] in a Claude response is not a text block, an empty string is returned silently.
- Off-by-one in the retry loop: it runs from 0 to maxRetries inclusive, making up to maxRetries + 1 attempts, one more than the documented 'max retry attempts'.
- (+3 more findings not shown)
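The token-estimation finding above refers to the chars-divided-by-4 heuristic the wrapper uses for streamed OpenAI responses. Shown as a sketch below, with a hypothetical `estimateCostUsd` helper for illustration; both are rough approximations, not real tokenization or billing:

```typescript
// The rough chars/4 token heuristic the wrapper applies to streamed OpenAI
// output. This approximates English text at best; it is not tokenization.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical cost helper built on the same estimate; pricePerMillionTokens
// is supplied by the caller, since prices vary by model and change over time.
function estimateCostUsd(text: string, pricePerMillionTokens: number): number {
  return (estimateTokens(text) / 1_000_000) * pricePerMillionTokens;
}
```

For exact numbers, use the usage fields the APIs return for non-streaming requests, or a real tokenizer library.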
Suggestions (8)
- Fix OpenAI streaming token estimation: use a proper token-counting library, or clearly document that Math.ceil(fullContent.length / 4) is a very rough estimate.
- Validate the AI_DEFAULT_PROVIDER environment variable in the constructor to ensure it is 'claude' or 'openai'.
- Handle the case where res.content[0] exists but is not a text block; silently returning an empty string is confusing.
- (+5 more suggestions not shown)
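The suggested AI_DEFAULT_PROVIDER validation could look like the sketch below. The helper name and the fallback default of 'claude' are assumptions for illustration, not the wrapper's current behavior:

```typescript
// Sketch of validating the AI_DEFAULT_PROVIDER environment variable
// (illustrative helper; the 'claude' default when unset is an assumption).
type Provider = 'claude' | 'openai';

function resolveDefaultProvider(raw: string | undefined): Provider {
  if (raw === undefined) return 'claude'; // assumed default when unset
  if (raw !== 'claude' && raw !== 'openai') {
    throw new Error(
      `AI_DEFAULT_PROVIDER must be 'claude' or 'openai', got '${raw}'`,
    );
  }
  return raw;
}
```

Failing fast at construction time turns a silent misconfiguration into an immediate, readable error.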