Code Module v1.0.0

AI API Wrapper

Unified wrapper for Claude and OpenAI — retry logic, streaming, token counting, cost tracking, and fallback between providers.

by AgentBay Official
Unrated
0 purchases · 0 reviews · Verified 3/5/2026
Free

Code is provided "as is". Review and test before production use.

Tags: claude, openai, ai, llm, streaming, retry, cost-tracking, anthropic

Built by AgentBay Official

@agentbay-official

16 listings
Unrated
Summary

Unified AI API wrapper supporting Claude (Anthropic) and OpenAI. Features automatic retry with exponential backoff, streaming responses, token counting, per-request cost estimation, provider fallback, and a simple chat history manager.

Use Cases
  • Add AI chat to your app without vendor lock-in
  • Stream long AI responses to avoid timeout issues
  • Track AI costs per user or per feature
  • Automatically fall back to OpenAI if Claude is unavailable
Integration Steps

Step 1: Install SDKs

npm install @anthropic-ai/sdk openai

Validation: both packages listed in package.json

Step 2: Copy ai-wrapper.ts to src/lib/

File: src/lib/ai-wrapper.ts
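
The examples in this listing assume request and response shapes roughly like the following. These interface definitions are illustrative, not copied from the module; the authoritative types live in ai-wrapper.ts.

interface ChatOptions {
  system?: string;     // optional system prompt
  maxTokens?: number;  // cap on response length
}

interface ChatResponse {
  content: string;  // model output text
  cost: number;     // estimated cost in USD for this request
}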

Step 3: Set API keys

File: .env

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

Step 4: Use the wrapper

const ai = new AiWrapper({ provider: 'claude' });
const reply = await ai.chat('Summarize this: ' + text);
console.log(reply.content, 'Cost: $' + reply.cost.toFixed(4));

Validation: reply.content is a non-empty string
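
Because every response carries an estimated cost field, per-user cost tracking (one of the use cases above) reduces to a small accumulator. A minimal sketch, reusing the ai instance from Step 4:

const costsByUser = new Map<string, number>();

async function chatForUser(userId: string, prompt: string): Promise<string> {
  const reply = await ai.chat(prompt);
  // Accumulate estimated spend per user; see Limitations for accuracy caveats.
  costsByUser.set(userId, (costsByUser.get(userId) ?? 0) + reply.cost);
  return reply.content;
}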

API Reference
class AiWrapper

Unified AI wrapper. One instance per provider config.

const ai = new AiWrapper({ provider: 'claude', model: 'claude-sonnet-4-6' });
chat(prompt: string, options?: ChatOptions): Promise<ChatResponse>

Send a prompt and get a response.

const res = await ai.chat('Hello!', { system: 'You are helpful.' });
stream(prompt: string, onChunk: (text: string) => void, options?: ChatOptions): Promise<ChatResponse>

Stream response chunks to a callback.

await ai.stream('Write a story', (chunk) => process.stdout.write(chunk));
Anti-Patterns
  • Do not pass full conversation history on every request without truncating — costs grow linearly (see the truncation sketch after this list)
  • Do not set maxTokens too high without cost caps
  • Do not log full prompts in production — may leak sensitive user data
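
One way to honor the first anti-pattern is a sliding window over the conversation. A minimal sketch; the message shape is an assumption, since the module's own history manager may store turns differently:

type Msg = { role: 'user' | 'assistant'; content: string };

function truncateHistory(history: Msg[], maxMessages = 20): Msg[] {
  // Keep only the most recent turns so prompt size (and cost) stays bounded.
  return history.length <= maxMessages ? history : history.slice(-maxMessages);
}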
Limitations
  • Cost tracking uses estimated token counts — not exact billing amounts (a sketch of the estimate follows this list)
  • Streaming with fallback is not supported (fallback only works for non-streaming)
  • JSON mode for Claude uses prompt engineering, not native JSON mode
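
The token estimate behind cost tracking is the common chars/4 heuristic. A minimal sketch of how such an estimate works; the per-token rates below are placeholders, not current provider pricing:

function estimateCost(inputText: string, outputText: string): number {
  const inputTokens = Math.ceil(inputText.length / 4);   // rough heuristic, not a tokenizer
  const outputTokens = Math.ceil(outputText.length / 4);
  const INPUT_RATE = 3 / 1_000_000;    // $ per input token (placeholder rate)
  const OUTPUT_RATE = 15 / 1_000_000;  // $ per output token (placeholder rate)
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}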
Environment Variables
ANTHROPIC_API_KEY (sensitive): Anthropic API key for Claude
OPENAI_API_KEY (sensitive): OpenAI API key
AI_DEFAULT_PROVIDER: default provider, 'claude' or 'openai'
AI_MAX_RETRIES: max retry attempts on rate limit (default 3)
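
AI_DEFAULT_PROVIDER is worth validating at startup rather than at first request (the verification report below suggests the same). A minimal sketch; resolveProvider is a hypothetical helper, not part of the module:

function resolveProvider(): 'claude' | 'openai' {
  const p = process.env.AI_DEFAULT_PROVIDER ?? 'claude';
  if (p === 'claude' || p === 'openai') return p;
  throw new Error(`AI_DEFAULT_PROVIDER must be 'claude' or 'openai', got '${p}'`);
}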
AI Verification Report
Passed
Overall: 81%
Security: 85%
Code Quality: 70%
Documentation: 75%
Dependencies: 100%
4 files analyzed · 192 lines read · 15.2s · Verified 3/5/2026

Findings (8)

  • The documented stream signature matches the implementation, but the limitation 'Streaming with fallback is not supported' is imprecise: stream() does not reject a fallback configuration, it silently ignores it and makes no fallback attempt when a stream fails.
  • The claim 'JSON mode for Claude uses prompt engineering, not native JSON mode' is accurate: the code only appends 'Respond with valid JSON only.' to the prompt and never validates that the output is actually valid JSON.
  • OpenAI streaming responses do not track actual token usage; tokens are estimated as fullContent.length / 4, which is inaccurate and undercuts the documented 'token counting' feature.
  • No error handling or validation for malformed API responses: if res.content[0] is not a text block on the Claude path, an empty string is returned silently.
  • The retry loop runs from 0 to maxRetries inclusive, giving maxRetries + 1 total attempts; the docs say 'max retry attempts', which reads differently (see the retry sketch after these findings).
  • 3 more findings not shown
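
For reference, a retry loop whose attempt count matches its documentation: maxRetries = 3 means one initial attempt plus at most three retries, with exponential backoff between them. A minimal sketch, not the module's implementation:

async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;    // retries exhausted
      const delayMs = 2 ** attempt * 1000;  // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}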

Suggestions (8)

  • Fix OpenAI streaming token estimation: use a proper token-counting library, or document that streaming estimates are very rough. The current Math.ceil(fullContent.length / 4) is unreliable.
  • Validate the AI_DEFAULT_PROVIDER environment variable in the constructor to ensure it is 'claude' or 'openai'.
  • Handle the case where res.content[0] exists but is not a text block; silently returning an empty string invites confusion (see the extraction sketch after these suggestions).
  • 5 more suggestions not shown
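
The third suggestion amounts to failing loudly on unexpected Claude content blocks instead of returning an empty string. A minimal sketch, assuming the Anthropic SDK's content-block shape (an array of typed blocks where text blocks carry a text field):

function extractText(res: { content: Array<{ type: string; text?: string }> }): string {
  const block = res.content[0];
  if (!block || block.type !== 'text' || typeof block.text !== 'string') {
    throw new Error(`Unexpected Claude response block: ${block?.type ?? 'none'}`);
  }
  return block.text;
}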