Helicone LLM Observability
Add LLM observability with Helicone: a one-line proxy integration for OpenAI, Anthropic, and OpenRouter. Sets up request logging, cost tracking, prompt versioning, response caching, rate limiting, and user-level analytics.
This skill integrates Helicone into LLM applications through the proxy pattern (a base-URL change plus a Helicone auth header) or through the async logger. It configures custom properties for user and session tagging, sets up prompt templates with version pinning, enables response caching with a TTL, configures per-user rate limits, and exports metrics to your observability stack. It also covers self-hosted Helicone setup, Postgres backend tuning, and dashboards for cost, latency, and quality views.
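As a rough illustration of the proxy header pattern, the sketch below builds the per-request headers for user tagging, custom properties, and caching. The base URL and header names (`Helicone-Auth`, `Helicone-User-Id`, `Helicone-Property-*`, `Helicone-Cache-Enabled`, `Cache-Control`) reflect Helicone's documented conventions as best understood here, so check them against the current docs. The helper name `helicone_headers` and the placeholder key are assumptions made for illustration.

```python
import os

# Assumed proxy endpoint for OpenAI traffic; confirm against Helicone's docs.
HELICONE_OPENAI_BASE_URL = "https://oai.helicone.ai/v1"


def helicone_headers(helicone_api_key, user_id, properties=None, cache_ttl_seconds=None):
    """Build Helicone proxy headers (hypothetical helper, not part of any SDK)."""
    headers = {
        # Authenticates the request with Helicone (separate from the provider key).
        "Helicone-Auth": f"Bearer {helicone_api_key}",
        # Enables user-level analytics.
        "Helicone-User-Id": user_id,
    }
    # Custom properties become filterable dimensions, e.g. Feature=search.
    for name, value in (properties or {}).items():
        headers[f"Helicone-Property-{name}"] = str(value)
    # Response caching with a TTL, only when requested.
    if cache_ttl_seconds is not None:
        headers["Helicone-Cache-Enabled"] = "true"
        headers["Cache-Control"] = f"max-age={cache_ttl_seconds}"
    return headers


if __name__ == "__main__":
    key = os.environ.get("HELICONE_API_KEY", "placeholder-key")
    for name, value in helicone_headers(key, "user-123", {"Feature": "search"}, 3600).items():
        print(f"{name}: {value}")
```

With the OpenAI Python SDK, these headers would typically be passed as `default_headers` together with `base_url=HELICONE_OPENAI_BASE_URL` when constructing the client.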
When to use
Use this skill when you need visibility into LLM costs, want to A/B test prompt versions in production, need per-user rate limiting, or need full request/response traces to debug LLM failures.
Examples
One-line OpenAI integration
Add Helicone proxy to an existing OpenAI client
Integrate Helicone with my Next.js app's OpenAI calls, tag requests with user ID and feature name, and set up a $50/day per-user spending cap
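A per-user spending cap like the one in this prompt could be expressed as a rate-limit policy header. The sketch below assumes the `Helicone-RateLimit-Policy` format `[quota];w=[window_seconds];u=[unit];s=[segment]`, with `u=cents` for cost-based limits and `s=user` for per-user segmentation. Treat that format as an assumption to verify against the current Helicone docs. The function name `daily_spend_cap_policy` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_spend_cap_policy(dollars_per_day):
    """Return a cost-based, per-user policy string (assumed header format)."""
    if dollars_per_day <= 0:
        raise ValueError("dollars_per_day must be positive")
    cents = round(dollars_per_day * 100)  # the policy quota is expressed in cents
    return f"{cents};w={SECONDS_PER_DAY};u=cents;s=user"


if __name__ == "__main__":
    headers = {
        "Helicone-User-Id": "user-123",  # the segment the cap applies to
        "Helicone-RateLimit-Policy": daily_spend_cap_policy(50),
    }
    print(headers["Helicone-RateLimit-Policy"])  # prints 5000;w=86400;u=cents;s=user
```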
Prompt versioning
Set up versioned prompts with rollout
Use Helicone Prompts to version our customer support prompt, A/B test v3 against v2 on 20% of traffic, and compare CSAT scores
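One way to route 20% of traffic to the newer prompt version is a deterministic, application-side split keyed on the user ID, with the chosen version attached as a custom property so the two arms can be compared later. This is a generic hashing sketch, not Helicone's own experiment mechanism. The experiment name, the property name `PromptVersion`, and the function `prompt_version_for` are assumptions made for illustration.

```python
import hashlib

BUCKETS = 10_000  # bucket resolution: 0.01% of traffic per bucket


def prompt_version_for(user_id, experiment="support-prompt", treatment_share=0.20):
    """Assign a user to v3 (treatment) or v2 (control), stable across requests."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Integer bucketing avoids floating-point edge cases at the boundaries.
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return "v3" if bucket < round(treatment_share * BUCKETS) else "v2"


if __name__ == "__main__":
    version = prompt_version_for("user-123")
    # Tag the request so dashboards can split metrics by prompt version.
    headers = {"Helicone-Property-PromptVersion": version}
    print(headers)
```

Hashing the experiment name together with the user ID keeps each user on one version for the whole test while letting a different experiment reshuffle the assignment.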