Building a Tier-Based LLM API Gateway with LiteLLM, Stripe, and k3s
Author: Dzianis Vashchuk | Site: Medium | Published: 2025-12-29T23:03:17Z
This post covers the architecture we built for Vibe Browser’s (vibebrowser.app) API gateway: a self-hosted solution that routes LLM requests through a unified endpoint while enforcing subscription-based access tiers.

Surprisingly, there seems to be no open-source project that does this out of the box. We will probably open-source what we have, because deploying an API gateway is such a common task for GenAI startups right now that nobody should have to write this custom code every time.

Stack:

- LiteLLM: OpenAI-compatible proxy with per-key budget/model restrictions
- Supabase: Authentication (OAuth, email/password) and PostgreSQL for LiteLLM
- Stripe: Subscription management and tier assignment
- Azure OpenAI: Model deployments (gpt-5-mini, gpt-5.1, gpt-5.2, grok-4, deepseek-r1)
- k3s: Lightweight Kubernetes for self-hosted deployment
- Traefik: Ingress with automatic TLS via Let’s Encrypt
- Cloudflare: DNS management

System Architecture

    ┌──────────────────────────────────────────────────────────────────┐
    │                      AZURE VM (k3s cluster)                      │
    │  ┌────────────────────────────────────────────────────────────┐  │
    │  │                      TRAEFIK INGRESS                       │  │
    │  │                 (TLS termination, routing)                 │  │
    │  │   api.vibebrowser.app          portal.vibebrowser.app      │  │
    │  └────────────┬──────────────────────────────┬────────────────┘  │
    │               │                              │                   │
    │               ▼                              ▼                   │
    │  ┌─────────────────────────┐    ┌─────────────────────────┐      │
    │  │         LITELLM         │    │       USER-PORTAL       │      │
    │  │   (API Gateway/Proxy)   │    │    (OAuth + Key Gen)    │      │
    │  │                         │    │                         │      │
    │  │ - Model routing         │    │ - PKCE auth flow        │      │
    │  │ - Per-key budgets       │    │ - LiteLLM key creation  │      │
    │  │ - Rate limiting         │    │ - Tier assignment       │      │
    │  │ - Usage tracking        │    │                         │      │
    │  └────────────┬────────────┘    └────────────┬────────────┘      │
    │               │                              │                   │
    │               │        ┌─────────────────────┘                   │
    │               │        │                                         │
    │               ▼        ▼                                         │
    │  ┌─────────────────────────┐    ┌─────────────────────────┐      │
    │  │     STRIPE-SERVICE      │    │      VIBE-SECRETS       │      │
    │  │    (Webhook Handler)    │    │      (k8s Secret)       │      │
    │  │                         │    │                         │      │
    │  │ - subscription.created  │    │ - API keys              │      │
    │  │ - subscription.updated  │    │ - DB credentials        │      │
    │  │ - Tier changes          │    │ - Stripe secrets        │      │
    │  └─────────────────────────┘    └─────────────────────────┘      │
    │                                                                  │
    └──────────────────────────────────────────────────────────────────┘
                     │                                   │
            ┌────────┴────────┐                 ┌────────┴────────┐
            ▼                 ▼                 ▼                 ▼
    ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
    │ AZURE OPENAI  │ │ AZURE OPENAI  │ │   SUPABASE    │ │    STRIPE     │
    │   (eastus)    │ │   (eastus2)   │ │               │ │               │
    │               │ │               │ │ - Auth (JWT)  │ │ - Payments    │
    │ - gpt-5-mini  │ │ - gpt-5.2     │ │ - PostgreSQL  │ │ - Webhooks    │
    │ - gpt-5.1     │ │ - grok-4      │ │ - User data   │ │ - Products    │
    │ - grok-4-fast │ │ - deepseek-r1 │ │               │ │               │
    └───────────────┘ └───────────────┘ └───────────────┘ └───────────────┘

Authentication Flow

    ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
    │   USER   │      │ SUPABASE │      │  PORTAL  │      │ LITELLM  │      │   API    │
    └────┬─────┘      └────┬─────┘      └────┬─────┘      └────┬─────┘      └────┬─────┘
         │                 │                 │                 │                 │
         │ 1. Sign In      │                 │                 │                 │
         │   (email/pass)  │                 │                 │                 │
         │────────────────▶│                 │                 │                 │
         │                 │                 │                 │                 │
         │  access_token   │                 │                 │                 │
         │◀────────────────│                 │                 │                 │
         │                 │                 │                 │                 │
         │ 2. Request Code (PKCE)            │                 │                 │
         │    + code_challenge               │                 │                 │
         │──────────────────────────────────▶│                 │                 │
         │                 │                 │                 │                 │
         │                 │   Verify JWT    │                 │                 │
         │                 │◀────────────────│                 │                 │
         │                 │────────────────▶│                 │                 │
         │                 │                 │                 │                 │
         │ auth_code                         │                 │                 │
         │◀──────────────────────────────────│                 │                 │
         │                 │                 │                 │                 │
         │ 3. Exchange Code                  │                 │                 │
         │    + code_verifier                │                 │                 │
         │──────────────────────────────────▶│                 │                 │
         │                 │                 │                 │                 │
         │                 │                 │  Get User Tier  │                 │
         │                 │                 │  (from Stripe)  │                 │
         │                 │                 │─────────────────│                 │
         │                 │                 │                 │                 │
         │                 │                 │  4. Create Key  │                 │
         │                 │                 │(models, budget) │                 │
         │                 │                 │────────────────▶│                 │
         │                 │                 │                 │                 │
         │                 │                 │    sk-xxxxx     │                 │
         │                 │                 │◀────────────────│                 │
         │                 │                 │                 │                 │
         │ api_key                           │                 │                 │
         │◀──────────────────────────────────│                 │                 │
         │                 │                 │                 │                 │
         │ 5. API Request (Bearer sk-xxxxx)                                      │
         │──────────────────────────────────────────────────────────────────────▶│
         │                 │                 │                 │                 │
         │                 │                 │                 │  Validate Key   │
         │                 │                 │                 │◀────────────────│
         │                 │                 │                 │────────────────▶│
         │                 │                 │                 │                 │
         │                 │                 │                 │  Check Model    │
         │                 │                 │                 │ Access + Budget │
         │                 │                 │                 │────────────────▶│
         │                 │                 │                 │                 │
         │ Response                                                              │
         │◀──────────────────────────────────────────────────────────────────────│
         │                 │                 │                 │                 │

User authentication uses Supabase with a PKCE OAuth flow:

    import crypto from 'node:crypto';

    // 1. Sign in with Supabase (supabase-js v2 returns { data, error })
    const { data, error } = await supabase.auth.signInWithPassword({ email, password });
    if (error) throw error;
    const accessToken = data.session.access_token;

    // 2. Request auth code with PKCE challenge
    const codeVerifier = crypto.randomBytes(32).toString('base64url');
    const codeChallenge = crypto.createHash('sha256')
      .update(codeVerifier).digest('base64url');
    const codeRes = await fetch('/api/auth/code', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${accessToken}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ code_challenge: codeChallenge, code_challenge_method: 'S256' })
    });
    const { code } = await codeRes.json();

    // 3. Exchange code for LiteLLM API key
    const tokenRes = await fetch('/api/auth/token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ code, code_verifier: codeVerifier })
    });
    const { key } = await tokenRes.json();

Subscription Tier Change Flow

    ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
    │  STRIPE  │      │  STRIPE  │      │ LITELLM  │      │ SUPABASE │
    │ CHECKOUT │      │ SERVICE  │      │          │      │          │
    └────┬─────┘      └────┬─────┘      └────┬─────┘      └────┬─────┘
         │                 │                 │                 │
         │ 1. User upgrades to PRO           │                 │
         │    (Stripe Checkout)              │                 │
         │                 │                 │                 │
         │ 2. Webhook: subscription.updated  │                 │
         │────────────────▶│                 │                 │
         │                 │                 │                 │
         │                 │ 3. Get user_id from metadata      │
         │                 │──────────────────────────────────▶│
         │                 │◀──────────────────────────────────│
         │                 │                 │                 │
         │                 │ 4. Determine new tier             │
         │                 │    (price_id -> tier)             │
         │                 │                 │                 │
         │                 │5. List user keys│                 │
         │                 │────────────────▶│                 │
         │                 │◀────────────────│                 │
         │                 │                 │                 │
         │                 │ 6. Update each key:               │
         │                 │    - models: PRO_MODELS           │
         │                 │    - max_budget: 50               │
         │                 │    - tpm_limit: 100000            │
         │                 │────────────────▶│                 │
         │                 │                 │                 │
         │                 │ 7. Log tier change                │
         │                 │──────────────────────────────────▶│
         │                 │                 │                 │
         │     200 OK      │                 │                 │
         │◀────────────────│                 │                 │
         │                 │                 │                 │

Request Flow (Model Access Check)

    ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
    │  CLIENT  │      │ LITELLM  │      │ POSTGRES │      │  AZURE   │
    │          │      │          │      │(Supabase)│      │  OPENAI  │
    └────┬─────┘      └────┬─────┘      └────┬─────┘      └────┬─────┘
         │                 │                 │                 │
         │ POST /v1/chat/completions         │                 │
         │ Authorization: Bearer sk-xxx      │                 │
         │ model: gpt-5.1                    │                 │
         │────────────────▶│                 │                 │
         │                 │                 │                 │
         │                 │  1. Lookup key  │                 │
         │                 │────────────────▶│                 │
         │                 │                 │                 │
         │                 │ key_info:                         │
         │                 │  - user_id                        │
         │                 │  - models: [gpt-5-mini, gpt-5.1]  │
         │                 │  - max_budget: 50                 │
         │                 │  - spend: 12.50                   │
         │                 │◀────────────────│                 │
         │                 │                 │                 │
         │                 │ 2. Check model access             │
         │                 │    gpt-5.1 in allowed_models? YES │
         │                 │                 │                 │
         │                 │ 3. Check budget                   │
         │                 │    spend < max_budget? YES        │
         │                 │                 │                 │
         │                 │ 4. Route to Azure                 │
         │                 │──────────────────────────────────▶│
         │                 │                 │                 │
         │                 │ 5. Response                       │
         │                 │◀──────────────────────────────────│
         │                 │                 │                 │
         │                 │ 6. Update spend │                 │
         │                 │────────────────▶│                 │
         │                 │                 │                 │
         │    Response     │                 │                 │
         │◀────────────────│                 │                 │
         │                 │                 │                 │
         │─────────────────────────────────────────────────────│
         │ POST /v1/chat/completions         │                 │
         │ Authorization: Bearer sk-xxx      │                 │
         │ model: gpt-5.2 (MAX only)         │                 │
         │────────────────▶│                 │                 │
         │                 │                 │                 │
         │                 │ Lookup key...                     │
         │                 │ models: [gpt-5-mini, gpt-5.1]     │
         │                 │                 │                 │
         │                 │ gpt-5.2 in allowed_models? NO     │
         │                 │                 │                 │
         │  403 Forbidden  │                 │                 │
         │ "model not in allowed list"       │                 │
         │◀────────────────│                 │                 │
         │                 │                 │                 │

Tier-Based Model Access

Three tiers with different model access:

    model_list:
      # FREE TIER
      - model_name: gpt-5-mini
        litellm_params:
          model: azure/gpt-5-mini
          api_base: https://eastus.api.cognitive.microsoft.com/
          api_key: os.environ/AZURE_OPENAI_API_KEY
      - model_name: gpt-oss-120b
        litellm_params:
          model: azure/gpt-oss-120b
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY

      # PRO TIER
      - model_name: gpt-5.1
        litellm_params:
          model: azure/gpt-5-1
          api_base: https://eastus.api.cognitive.microsoft.com/
          api_key: os.environ/AZURE_OPENAI_API_KEY
      - model_name: grok-4-fast
        litellm_params:
          model: azure/grok-4-fast-non-reasoning
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY
      - model_name: deepseek-v3.2
        litellm_params:
          model: azure/DeepSeek-V3-0324
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY

      # MAX TIER
      - model_name: gpt-5.2
        litellm_params:
          model: azure/gpt-5-2
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY
      - model_name: grok-4
        litellm_params:
          model: azure/grok-4
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY
      - model_name: deepseek-r1
        litellm_params:
          model: azure/DeepSeek-R1
          api_base: https://eastus2.cognitiveservices.azure.com/
          api_key: os.environ/EASTUS2_OPENAI_API_KEY

Tier definitions in the Stripe webhook handler:

    // Models each tier may call; higher tiers include everything below them
    const TIER_MODELS = {
      free: ['gpt-5-mini', 'gpt-oss-120b'],
      pro: ['gpt-5-mini', 'gpt-oss-120b', 'gpt-5.1', 'grok-4-fast', 'deepseek-v3.2'],
      max: ['gpt-5-mini', 'gpt-oss-120b', 'gpt-5.1', 'grok-4-fast', 'deepseek-v3.2',
            'gpt-5.2', 'grok-4', 'grok-4-fast-reasoning', 'deepseek-r1']
    };

    // Per-key spend budget and token/request rate limits for each tier
    const BUDGETS = {
      free: { max_budget: 5, tpm_limit: 10000, rpm_limit: 100 },
      pro: { max_budget: 50, tpm_limit: 100000, rpm_limit: 1000 },
      max: { max_budget: 500, tpm_limit: 1000000, rpm_limit: 5000 }
    };

Stripe Integration

The webhook handler updates LiteLLM keys when a subscription changes:

    // Stripe signature verification needs the raw request body, hence express.raw
    app.post('/webhook', express.raw({ type: 'application/json' }), async (req, res) => {
      let event;
      try {
        event = stripe.webhooks.constructEvent(
          req.body,
          req.headers['stripe-signature'],
          WEBHOOK_SECRET
        );
      } catch (err) {
        // Reject payloads whose signature does not verify
        return res.status(400).send(`Webhook Error: ${err.message}`);
      }

      // Declared outside the switch so every case can read it
      const subscription = event.data.object;
      switch (event.type) {
        case 'customer.subscription.created':
        case 'customer.subscription.updated': {
          const tier = getTierFromPriceId(subscription.items.data[0].price.id);
          const userId = subscription.metadata.user_id;
          // Update all of the user's keys with the new tier permissions
          await updateUserKeys(userId, tier);
          break;
        }
        case 'customer.subscription.deleted':
          await downgradeToFree(subscription.metadata.user_id);
          break;
      }
      // Acknowledge receipt so Stripe does not retry the event
      res.sendStatus(200);
    });

Cloudflare DNS Configuration

We use Cloudflare for DNS with the proxy disabled (DNS-only), since Traefik handles TLS:

    ┌─────────────────────────────────────────────────────────────────┐
    │  CLOUDFLARE DNS                                                 │
    │                                                                 │
    │  Type  Name                    Content        Proxy             │
    │  ────  ────                    ───────        ─────             │
    │  A     api.vibebrowser.app     4.246.58.241   DNS only          │
    │  A     portal.vibebrowser.app  4.246.58.241   DNS only          │
    │  A     vibebrowser.app         (Vercel IP)    Proxied           │
    │                                                                 │
    │  Why DNS-only for api/portal:                                   │
    │  - Traefik terminates TLS for Let's Encrypt ACME challenge      │
    │  - Cloudflare proxy would break cert renewal                    │
    │  - Direct connection enables WebSocket streaming                │
    │                                                                 │
    └─────────────────────────────────────────────────────────────────┘

Traefik Configuration

k3s ships with Traefik.
Configure it for automatic TLS:

    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: traefik
      namespace: kube-system
    spec:
      valuesContent: |-
        ports:
          websecure:
            tls:
              enabled: true
        certResolvers:
          letsencrypt:
            email: admin@vibebrowser.app
            storage: /data/acme.json
            httpChallenge:
              entryPoint: web

IngressRoute with path-based routing:

    apiVersion: traefik.io/v1alpha1
    kind: IngressRoute
    metadata:
      name: vibe-ingress
      namespace: vibe
    spec:
      entryPoints:
        - websecure
      routes:
        # LiteLLM API (everything on the api host except /stripe)
        - match: Host(`api.vibebrowser.app`) && !PathPrefix(`/stripe`)
          kind: Rule
          services:
            - name: litellm
              port: 4000
        # Stripe webhooks
        - match: Host(`api.vibebrowser.app`) && PathPrefix(`/stripe`)
          kind: Rule
          services:
            - name: stripe-service
              port: 3002
          middlewares:
            - name: strip-stripe-prefix
        # User portal
        - match: Host(`portal.vibebrowser.app`)
          kind: Rule
          services:
            - name: user-portal
              port: 3001
      tls:
        certResolver: letsencrypt
        domains:
          - main: api.vibebrowser.app
          - main: portal.vibebrowser.app
    ---
    apiVersion: traefik.io/v1alpha1
    kind: Middleware
    metadata:
      name: strip-stripe-prefix
      namespace: vibe
    spec:
      stripPrefix:
        prefixes:
          - /stripe

k3s Deployment with Kustomize

We use Kustomize to manage the k8s manifests together with their secrets:

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    namespace: vibe
    resources:
      - namespace.yaml
      - litellm.yaml
      - stripe-service.yaml
      - user-portal.yaml
      - ingress.yaml
    secretGenerator:
      - name: vibe-secrets
        envs:
          - .env.secrets
        options:
          disableNameSuffixHash: true

LiteLLM deployment:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: litellm
    spec:
      replicas: 1
      # selector and pod labels are required by the Deployment API;
      # the label name "app: litellm" is an assumption
      selector:
        matchLabels:
          app: litellm
      template:
        metadata:
          labels:
            app: litellm
        spec:
          containers:
            - name: litellm
              image: ghcr.io/berriai/litellm:main-latest
              args: ["--config=/app/config.yaml", "--port=4000"]
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: vibe-secrets
                      key: DATABASE_URL
                - name: LITELLM_MASTER_KEY
                  valueFrom:
                    secretKeyRef:
                      name: vibe-secrets
                      key: LITELLM_MASTER_KEY

Supabase as LiteLLM Database

LiteLLM requires PostgreSQL for keys, budgets, and usage.
We use the Supabase connection pooler:

    DATABASE_URL=postgresql://postgres.PROJECT_REF:PASSWORD@aws-1-us-east-1.pooler.supabase.com:5432/litellm

Key considerations:

- Create a separate litellm database (the default postgres database has Supabase schema conflicts)
- Use port 5432 (session mode), not 6543 (transaction mode): LiteLLM needs session persistence
- The pooler host varies by project region

Managing Kubernetes Secrets from GitHub Actions

Secrets flow: GitHub Secrets -> Kustomize secretGenerator -> k8s Secret

    ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
    │ GitHub Secrets  │      │  .env.secrets   │      │   k8s Secret    │
    │  (encrypted)    │─────▶│   (ephemeral)   │─────▶│  vibe-secrets   │
    │                 │      │                 │      │ (base64 in etcd)│
    └─────────────────┘      └─────────────────┘      └─────────────────┘

GitHub Actions generates .env.secrets at deploy time:

    on:
      push:
        branches: [master]
        paths: ['services/subscription/**']

    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          - name: Setup kubeconfig
            run: |
              echo "${{ secrets.K3S_KUBECONFIG }}" | base64 -d > /tmp/kubeconfig
              echo "KUBECONFIG=/tmp/kubeconfig" >> $GITHUB_ENV

          - name: Generate secrets file
            working-directory: services/subscription/k8s
            run: |
              cat > .env.secrets << 'EOF'
              LITELLM_MASTER_KEY=${{ secrets.LITELLM_MASTER_KEY }}
              AZURE_OPENAI_API_KEY=${{ secrets.AZURE_OPENAI_API_KEY }}
              DATABASE_URL=${{ secrets.DATABASE_URL }}
              STRIPE_SECRET_KEY=${{ secrets.STRIPE_SECRET_KEY }}
              STRIPE_WEBHOOK_SECRET=${{ secrets.STRIPE_WEBHOOK_SECRET }}
              EOF

          - name: Deploy with Kustomize
            run: |
              kubectl apply -k services/subscription/k8s
              kubectl -n vibe rollout status deployment/litellm --timeout=120s

          - name: Run tier tests
            env:
              TEST_FREE_EMAIL: ${{ secrets.TEST_FREE_EMAIL }}
              TEST_FREE_PASSWORD: ${{ secrets.TEST_FREE_PASSWORD }}
            run: npx tsx tests/tier-models.test.ts

Testing Tier Access

An integration test verifies model access per tier:

    // Returns true when the gateway accepts the request for this key and model
    async function testModelAccess(apiKey: string, model: string): Promise<boolean> {
      const response = await fetch('https://api.vibebrowser.app/v1/chat/completions', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          model,
          messages: [{ role: 'user', content: 'test' }],
          max_tokens: 5
        })
      });
      return response.ok;
    }

    // Test results:
    // FREE: gpt-5-mini OK, gpt-5.1 BLOCKED
    // PRO:  gpt-5-mini OK, gpt-5.1 OK, gpt-5.2 BLOCKED
    // MAX:  all models OK

Cost

Our Setup (k3s on Azure VM)

    Component                                      Monthly Cost
    ---------------------------------------------  -------------
    Azure VM (Standard_D2s_v3, 2 vCPU, 8GB RAM)    $50
    Supabase (Free tier, 500MB database)           $0
    Cloudflare (DNS only)                          $0
    Let's Encrypt TLS                              $0
    Azure OpenAI                                   Pay-per-token
    Total fixed cost                               $50/month

Alternative: Azure Kubernetes Service (AKS), Managed k8s

If we used Azure’s managed Kubernetes (AKS) instead of self-hosted k3s:

    Component                                        Monthly Cost
    -----------------------------------------------  ---------------
    AKS cluster (free control plane)                 $0
    Node pool (Standard_D2s_v3 x 2 for HA)           $100
    Azure Load Balancer (Standard)                   $18
    Azure Database for PostgreSQL (Basic, 2 vCore)   $50
    Azure Application Gateway (optional, WAF)        $175
    Total fixed cost                                 ~$170–345/month

Cost Comparison

    ┌─────────────────────────────────────────────────────────────────┐
    │                   Monthly Infrastructure Cost                   │
    ├─────────────────────────────────────────────────────────────────┤
    │                                                                 │
    │  k3s on VM       ██████ $50                                     │
    │                                                                 │
    │  AKS (minimal)   ████████████████████ $170                      │
    │                                                                 │
    │  AKS (with WAF)  █████████████████████████████████████████ $345 │
    │                                                                 │
    └─────────────────────────────────────────────────────────────────┘

Why k3s for an MVP:

This is our MVP infrastructure, optimized for cost and speed of iteration rather than enterprise scale. The goal is to validate the product with real users while keeping fixed costs minimal.

- ~$50/month vs ~$170–345/month: 70–85% cost savings let us run longer on limited runway.
- Supabase free tier: Instead of paying ~$50/month for Azure Database for PostgreSQL, we use Supabase’s free tier with connection pooling. Good enough for MVP scale.
- Traefik included: k3s ships with Traefik and automatic Let’s Encrypt. No need for Azure Load Balancer ($18) or Application Gateway ($175).
- GitOps from day one: All configuration lives in GitHub (services/subscription/k8s/). Merge to master = auto-deploy. Full audit trail, no manual kubectl, easy rollbacks.
- Fast iteration: Direct SSH access, no abstraction layers, deploys in seconds. When something breaks at 2am, we can fix it immediately.
- Easy to migrate later: The same Kubernetes manifests work on AKS. When we need HA and auto-scaling, migration is straightforward.

When to migrate to AKS:

When traffic grows and we need auto-scaling, the migration path is simple:

1. Create an AKS cluster with az aks create
2. Point kubeconfig to AKS
3. Run kubectl apply -k services/subscription/k8s/ (same manifests, same GitOps workflow)
4. Update DNS to the new load balancer IP

The Kubernetes manifests are portable. No vendor lock-in, no rewrite required.
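Appendix: webhook helpers

The Stripe webhook handler calls getTierFromPriceId and updateUserKeys without showing them. Below is a minimal sketch of what such helpers might look like, not the actual implementation. The price IDs are placeholders, buildKeyUpdate and updateKeys are hypothetical names, and the HTTP call assumes LiteLLM's POST /key/update endpoint accepts key, models, max_budget, tpm_limit and rpm_limit in the body (worth checking against the LiteLLM version you deploy). Unlike the updateUserKeys(userId, tier) call in the handler, updateKeys takes the list of keys explicitly, so looking up a user's keys is left out.

```typescript
type Tier = 'free' | 'pro' | 'max';

// Placeholder price IDs (assumption); real ones come from the Stripe dashboard.
const PRICE_TO_TIER = new Map<string, Tier>([
  ['price_pro_example', 'pro'],
  ['price_max_example', 'max'],
]);

// Tier tables copied from the tier definitions in the post.
const TIER_MODELS: Record<Tier, string[]> = {
  free: ['gpt-5-mini', 'gpt-oss-120b'],
  pro: ['gpt-5-mini', 'gpt-oss-120b', 'gpt-5.1', 'grok-4-fast', 'deepseek-v3.2'],
  max: ['gpt-5-mini', 'gpt-oss-120b', 'gpt-5.1', 'grok-4-fast', 'deepseek-v3.2',
        'gpt-5.2', 'grok-4', 'grok-4-fast-reasoning', 'deepseek-r1'],
};

const BUDGETS: Record<Tier, { max_budget: number; tpm_limit: number; rpm_limit: number }> = {
  free: { max_budget: 5, tpm_limit: 10000, rpm_limit: 100 },
  pro: { max_budget: 50, tpm_limit: 100000, rpm_limit: 1000 },
  max: { max_budget: 500, tpm_limit: 1000000, rpm_limit: 5000 },
};

// Unknown price IDs fall back to the free tier.
function getTierFromPriceId(priceId: string): Tier {
  return PRICE_TO_TIER.get(priceId) ?? 'free';
}

interface KeyUpdate {
  key: string;
  models: string[];
  max_budget: number;
  tpm_limit: number;
  rpm_limit: number;
}

// Hypothetical helper: builds the request body for one key.
function buildKeyUpdate(key: string, tier: Tier): KeyUpdate {
  return { key, models: [...TIER_MODELS[tier]], ...BUDGETS[tier] };
}

// Hypothetical helper: applies the tier limits to each key in turn.
// Assumes the LiteLLM proxy exposes POST /key/update behind the master key.
async function updateKeys(
  keys: string[],
  tier: Tier,
  baseUrl: string,
  masterKey: string,
): Promise<void> {
  for (const key of keys) {
    const res = await fetch(`${baseUrl}/key/update`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${masterKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(buildKeyUpdate(key, tier)),
    });
    if (!res.ok) {
      throw new Error(`key update failed with status ${res.status}`);
    }
  }
}
```

Keeping the payload construction in a pure function (buildKeyUpdate) makes the tier mapping testable without a running LiteLLM instance; only updateKeys touches the network.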