Uplink API Documentation
OpenAI-compatible access to 170+ models, arbitrage-aware routing, workflows, and tenant analytics delivered through a single Cloudflare Worker. This guide covers authentication, common patterns, and every public endpoint.
Platform Overview
Uplink is a drop-in replacement for the OpenAI REST API that routes traffic across Groq, Together.ai, OpenRouter, and bespoke research providers. The worker continuously synchronizes pricing and health signals so the arbitrage engine can deliver 20-40% cost savings without sacrificing quality.
Our Endpoints Use Our Endpoints™: the agent mode calls search endpoints, workflows orchestrate chat completions, and every capability is built by composing other capabilities. This keeps behavior consistent and reliable, and means every feature we add makes every other feature more powerful.
Quickstart
Authenticate with any valid tenant key and hit the v3 chat endpoint. The worker automatically selects the cheapest healthy provider for the requested capability.
Using curl
curl https://api.frnds.cloud/v3/chat/completions \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b-versatile",
"messages": [{"role": "user", "content": "Summarize Uplink."}],
"arbitrage_mode": "auto",
"stream": false
}'
Using Python SDK
import asyncio
from uplink_client import UplinkClient
client = UplinkClient(
    base_url="https://api.frnds.cloud",
    api_key="ak_your_key"
)
async def main():
    # Basic chat
    async with client:
        response = await client.chat({
            "model": "llama-3.3-70b-versatile",
            "messages": [{"role": "user", "content": "Summarize Uplink."}]
        })
        print(response.choices[0].message.content)
    # Agent mode with auto-research
    async with client:
        response = await client.agent_chat({
            "model": "llama-3.3-70b-versatile",
            "messages": [{"role": "user", "content": "What are the latest AI breakthroughs?"}],
            "tools": ["search", "extract"],
            "max_iterations": 5
        })
        print(response.choices[0].message.content)
# The SDK is async, so the calls run inside an event loop
asyncio.run(main())
Using JavaScript SDK
import { UplinkClient } from '@frnd/uplink-sdk'
const client = new UplinkClient({
baseUrl: 'https://api.frnds.cloud',
apiKey: 'ak_your_key'
})
// Basic chat
const response = await client.chat({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'Summarize Uplink.' }]
})
console.log(response.choices[0].message.content)
// Agent mode with auto-research
const agentResponse = await client.agentChat({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'What are the latest AI breakthroughs?' }],
tools: ['search', 'extract'],
max_iterations: 5
})
console.log(agentResponse.choices[0].message.content)
Install SDKs:
- Python: pip install uplink-client (or from sdk/python/)
- JavaScript: npm install @frnd/uplink-sdk (or from sdk/javascript/)
Streaming works with "stream": true; partial tokens emit as data: events that mirror the OpenAI spec and include arbitrage metadata on the first chunk.
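The streaming format described above can be consumed with a small line parser. This is a minimal sketch, assuming the stream follows the OpenAI convention of data: lines carrying JSON and ending with a data: [DONE] sentinel, and that the first chunk carries the _arbitrage block shown in the Response Model section. iter_sse_chunks and the sample stream are hypothetical and not part of the Uplink SDK.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from the `data:` lines of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed OpenAI-style end-of-stream sentinel
            return
        yield json.loads(payload)


# Hypothetical stream shaped like OpenAI streaming chunks.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}], "_arbitrage": {"provider_used": "groq"}}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

chunks = list(iter_sse_chunks(sample_stream))
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
provider = chunks[0].get("_arbitrage", {}).get("provider_used")
print(text, provider)
```

In a real client the lines would come from the HTTP response body rather than a list; the parsing logic stays the same.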
⚠️ Common Mistakes to Avoid
- Wrong API URL: Always use https://api.frnds.cloud (not .workers.dev or other variants)
- Missing Headers: Include both Authorization: Bearer ak_your_key AND Content-Type: application/json
- Malformed JSON: Ensure your request body is valid JSON - use -d '{...}' with curl
- HTTP Method: Most endpoints use POST for mutations, GET for reads
- API Key Format: Keys start with ak_ prefix (e.g., ak_dev_abc123)
SDK v2.0 - Built-in OAuth & Developer Onboarding
NEW! The Uplink SDK now includes complete OAuth authentication, sub-tenant management, and payment gateway integration. Build AI-powered apps with zero configuration.
Installation
NPM (Recommended):
npm install @frnd/uplink-sdk
Quick Install (curl):
curl -fsSL https://api.frnds.cloud/sdk/install.sh | bash
Direct from URL:
# Install from Cloudflare Pages
npm install -g https://api.frnds.cloud/sdk/uplink-sdk-2.0.0.tgz
# Or from GitHub
npm install -g github:your-org/uplink-worker#main
Quick Start: 3-Step Setup
Step 1: Initialize Developer Account (one-time)
import { initializeDeveloperAccount } from '@frnd/uplink-sdk'
// Opens browser for OAuth, creates sub-tenant, saves credentials
const result = await initializeDeveloperAccount({
appName: 'My Awesome App',
plan: 'pro', // starter, pro, or enterprise
developerEmail: '[email protected]'
})
console.log('Sub-tenant:', result.subTenant.id)
console.log('API Key:', result.apiKey)
// Credentials saved to .uplink/config.json
Step 2: Use Auto-Configured Client
import { getUplinkClient } from '@frnd/uplink-sdk'
// Automatically loads credentials from .uplink/config.json
const uplink = await getUplinkClient()
const response = await uplink.chat({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'Hello!' }]
})
console.log(response.choices[0].message.content)
Step 3: Deploy Your App
That's it! Your app now has:
- OAuth authentication
- Isolated sub-tenant with quotas
- Automatic credential management
- Zero configuration for end users
Payment Gateway Integration
Monetize your app with built-in Stripe/Paddle support:
import { initializeDeveloperAccount } from '@frnd/uplink-sdk'
const result = await initializeDeveloperAccount({
appName: 'My Paid App',
plan: 'pro',
requirePayment: true,
paymentConfig: {
provider: 'stripe',
priceId: 'price_xxxxx', // Your Stripe price ID
successUrl: 'https://myapp.com/success',
cancelUrl: 'https://myapp.com/cancel'
}
})
// Payment is validated before sub-tenant creation
Custom Payment Validation:
const result = await initializeDeveloperAccount({
appName: 'My App',
requirePayment: true,
paymentConfig: {
provider: 'custom',
validateFn: async (paymentToken) => {
// Your custom validation logic
return await myPaymentGateway.verify(paymentToken)
}
}
})
💰 Pricing SDK - Real-Time Cost Tracking
Setup:
import { createPricingSDK } from '@frnd/uplink-sdk'
const pricing = createPricingSDK({
apiKey: 'your-api-key',
baseUrl: 'https://api.frnds.cloud',
useActualCosts: true, // Use real usage data
lookbackDays: 30 // 30-day average
})
Calculate Operation Costs:
// Get cost for a specific operation
const chatCost = await pricing.getOperationCost('chat', {
inputTokens: 1000,
outputTokens: 500
})
console.log(`Chat cost: $${chatCost.toFixed(6)}`)
// Calculate document processing cost
const docCost = await pricing.getOperationCost('document', {
documentSizeKB: 500
})
// Calculate embedding cost
const embeddingCost = await pricing.getOperationCost('embedding', {
inputTokens: 2000
})
Calculate Total User Costs:
// Calculate monthly cost for a user's usage
const monthlyCost = await pricing.calculateUserCost({
documents: 100, // Documents processed
chats: 1000, // Chat requests
storageGB: 5, // Storage used
voiceMinutes: 60 // Voice transcription
})
console.log(`Monthly cost: $${monthlyCost.totalCost}`)
console.log(`Breakdown:`)
console.log(` LLM: $${monthlyCost.breakdown.llm}`)
console.log(` Storage: $${monthlyCost.breakdown.storage}`)
console.log(` Voice: $${monthlyCost.breakdown.voice}`)
Get Infrastructure Pricing:
// Get full pricing configuration
const config = await pricing.getPricingConfig()
console.log('LLM Pricing:')
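// inputTokenCost and outputTokenCost are per-token values in the /v3/pricing/config response, so scale to 1M tokens
console.log(` Input: $${(config.llm.inputTokenCost * 1e6).toFixed(2)} per 1M tokens`)
console.log(` Output: $${(config.llm.outputTokenCost * 1e6).toFixed(2)} per 1M tokens`)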
console.log('Embeddings:')
console.log(` Cost: $${config.embeddings.costPerMillionTokens} per 1M tokens`)
console.log('Storage:')
console.log(` Vectorize: $${config.vectorize.storageCostPer100MDimensions} per 100M dims`)
console.log(` R2: $${config.r2.storageCostPerGB} per GB/month`)
Build Custom Pricing Tiers:
// Generate optimal pricing tiers based on usage patterns
const tiers = await pricing.generateTiers({
targetMargin: 0.25, // 25% margin
expectedMonthlyUsers: 1000,
avgDocumentsPerUser: 50,
avgChatsPerUser: 500
})
console.log('Recommended Tiers:')
tiers.forEach(tier => {
console.log(`${tier.name}: $${tier.price}/month`)
console.log(` Includes: ${tier.quotas.documents} docs, ${tier.quotas.chats} chats`)
console.log(` Margin: ${tier.margin}%`)
})
API Client Features
The SDK includes a full-featured OpenAI-compatible client:
Chat Completions:
const response = await uplink.chat({
model: 'llama-3.3-70b-versatile',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Hello!' }
],
temperature: 0.7,
max_tokens: 1000
})
Streaming:
for await (const chunk of uplink.streamChat({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'Tell me a story...' }]
})) {
process.stdout.write(chunk.choices[0]?.delta?.content || '')
}
Multi-Provider Arbitrage:
// Use canonical model name for automatic routing
const response = await uplink.chatWithArbitrage({
model: 'llama-3.3-70b', // No provider suffix
messages: [{ role: 'user', content: 'Hello!' }]
}, {
mode: 'cost', // 'auto', 'cost', 'speed', 'quality'
max_cost: 0.001
})
// Check savings
console.log(`Provider: ${response._arbitrage.provider_used}`)
console.log(`Saved: ${response._arbitrage.savings_percentage}%`)
Magic API (Ultra-Simple):
import { Magic } from '@frnd/uplink-sdk'
const magic = new Magic('https://api.frnds.cloud')
// Simple question
const answer = await magic.ask('What is quantum computing?')
// Web search
const results = await magic.search('latest AI developments 2024')
// Extract content
const content = await magic.extract('https://example.com')
// Conversational chat
const chat = magic.chat()
await chat.say('Hello!')
await chat.say('How do I use async/await?')
console.log(chat.history)
TypeScript Support
Full TypeScript support with comprehensive types:
import {
UplinkClient,
initializeDeveloperAccount,
type OnboardingOptions,
type OnboardingResult,
type UplinkCredentials,
type UplinkPricingConfig
} from '@frnd/uplink-sdk'
const options: OnboardingOptions = {
appName: 'My App',
plan: 'pro',
requirePayment: true
}
const result: OnboardingResult = await initializeDeveloperAccount(options)
const credentials: UplinkCredentials = result.credentials
Configuration Management
The SDK stores credentials in .uplink/config.json:
{
"UPLINK_URL": "https://api.frnds.cloud",
"API_KEY": "ak_sub_xyz789...",
"TENANT_ID": "sub_abc123",
"APP_NAME": "My App",
"PLAN": "pro",
"CREATED_AT": "2025-11-01T12:00:00.000Z",
"MASTER_API_KEY": "ak_master123...",
"MASTER_TENANT_ID": "tenant_parent456"
}
Load Credentials:
import { loadConfig } from '@frnd/uplink-sdk'
const config = loadConfig() // Loads from .uplink/config.json
if (!config) {
console.error('Run: npx @frnd/uplink-sdk init')
process.exit(1)
}
console.log(`API Key: ${config.API_KEY}`)
console.log(`Tenant: ${config.TENANT_ID}`)
Warning: Never commit .uplink/config.json to git. The file contains your API keys, including MASTER_API_KEY. Add it to .gitignore immediately.
Plans & Quotas
| Plan | Monthly Tokens | Daily Tokens | RPM | Concurrent | Storage |
|---|---|---|---|---|---|
| Starter | 100K | 5K | 10 | 2 | 1 GB |
| Pro | 1M | 50K | 60 | 10 | 10 GB |
| Enterprise | 10M | 500K | 300 | 50 | 100 GB |
✨ Magic EVS - Zero-Config Document RAG
NEW! Upload files and chat with them in 2 API calls - no chunking, no embeddings, no configuration required.
Quick Example
# 1. Upload a file
curl -X POST https://api.frnds.cloud/v3/evs/magic/upload \
-H "Authorization: Bearer ak_your_key" \
-F "[email protected]"
# Response includes source_id
# {"success":true,"sourceId":"magic_employee_handbook_pdf_123",...}
# 2. Ask questions
curl "https://api.frnds.cloud/v3/evs/magic/search?q=How%20many%20PTO%20days?" \
-H "Authorization: Bearer ak_your_key"
Magic EVS Endpoints
Upload a file with automatic chunking and indexing. Supports PDF, TXT, MD, JSON, CSV.
| Field | Type | Description |
|---|---|---|
| file | File | File to upload (multipart/form-data) |
| chat_id | string | Associate with specific conversation |
| tags | string | Comma-separated tags (e.g., "hr,policies") |
| category | string | Document category for filtering |
Semantic search across uploaded documents with automatic query optimization.
curl "https://api.frnds.cloud/v3/evs/magic/search?q=vacation%20policy&limit=10" \
-H "Authorization: Bearer ak_your_key"
Ask questions about your documents - RAG automatically retrieves context.
curl -X POST https://api.frnds.cloud/v3/evs/magic/chat \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"message": "What is the PTO policy?",
"source_id": "magic_handbook_pdf_123"
}'
Upload multiple files at once with automatic grouping.
curl -X POST https://api.frnds.cloud/v3/evs/magic/batch \
-H "Authorization: Bearer ak_your_key" \
-F "[email protected]" \
-F "[email protected]" \
-F "[email protected]" \
-F "group_id=company_docs"
Why Magic EVS? Upload a file, get instant search and chat. No manual chunking, no embedding management, no vector DB setup. 98%+ accuracy with 50ms P50 latency. Production-ready RAG in 1 API call.
Authentication
All API endpoints require an API key. Keys are tenant scoped and can be rotated without downtime. Three authentication patterns are accepted for backward compatibility:
| Method | Format | Use Case |
|---|---|---|
| Authorization header | Authorization: Bearer ak_xxx | Recommended for all OpenAI-compatible clients. |
| Query string | ?api_key=ak_xxx | Ollama-compatible integrations and simple demos. |
| Request body | { "api_key": "ak_xxx" } | Legacy SDKs that cannot set headers. |
Uplink enforces global and per-key rate limiting, KV-backed quota tracking, and tenant suspension checks before any request is proxied to a provider.
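The three key placements in the table can be compared side by side. The sketch below only builds the requests with the Python standard library and does not send them; build_chat_request is a hypothetical helper, and the use of /v3/chat/completions for all three methods is an assumption carried over from the Quickstart.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.frnds.cloud"


def build_chat_request(api_key: str, method: str = "header") -> urllib.request.Request:
    """Build (but do not send) a chat request using one of the three key placements."""
    url = f"{BASE_URL}/v3/chat/completions"
    body = {
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": "Hello"}],
    }
    headers = {"Content-Type": "application/json"}

    if method == "header":      # recommended placement
        headers["Authorization"] = f"Bearer {api_key}"
    elif method == "query":     # ?api_key=... in the URL
        url += "?" + urllib.parse.urlencode({"api_key": api_key})
    elif method == "body":      # api_key field inside the JSON body
        body["api_key"] = api_key
    else:
        raise ValueError(f"unknown auth method: {method}")

    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


req = build_chat_request("ak_your_key", "header")
print(req.get_header("Authorization"))
```

Query-string keys tend to end up in access logs and browser history, which is one reason to prefer the header form whenever the client can set headers.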
Self-Service Signup & Signin
Uplink provides secure email-verified authentication endpoints for self-service user onboarding with automatic tier-based API token generation.
Create a new account and receive verification code via email (expires in 15 minutes).
curl -X POST https://api.frnds.cloud/public/signup \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"first_name": "Jane",
"last_name": "Doe",
"phone": "+1234567890"
}'
Verify email with 6-digit code and receive dev token (7 days, 5 requests, 50K tokens).
curl -X POST https://api.frnds.cloud/public/verify-email \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"code": "123456"
}'
Sign in to existing account and receive verification code via email.
curl -X POST https://api.frnds.cloud/public/signin \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]"
}'
Verify signin code and retrieve your active API token.
curl -X POST https://api.frnds.cloud/public/verify-signin \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"code": "654321"
}'
Security: All codes are time-limited (15 min), single-use, and require email verification. Failed attempts are logged for security monitoring. Tokens are only shown once at creation.
📱 Telegram Login Widget
NEW! Passwordless signup and signin using Telegram. Automatic account creation with 30-day developer tier (100 requests/day).
Verify Telegram Login Widget authentication data and create/sign-in user.
curl -X POST https://api.frnds.cloud/auth/telegram/verify \
-H "Content-Type: application/json" \
-d '{
"id": 123456789,
"first_name": "John",
"last_name": "Doe",
"username": "johndoe",
"photo_url": "https://...",
"auth_date": 1698765432,
"hash": "abc123..."
}'
Response
{
"success": true,
"message": "Welcome back, John!",
"session_id": "sess_abc123...",
"token": "ak_dev_xyz789...",
"user": {
"id": "user_xyz789",
"telegram_id": 123456789,
"tier": "telegram_signup",
"first_name": "John",
"username": "johndoe"
}
}
Link Telegram account to an existing email-based account. Requires authenticated session.
curl -X POST https://api.frnds.cloud/api/link-telegram \
-H "Content-Type: application/json" \
-H "Cookie: uplink_session=sess_..." \
-d '{
"id": 123456789,
"first_name": "John",
"username": "johndoe",
"auth_date": 1698765432,
"hash": "abc123..."
}'
Response
{
"success": true,
"message": "Telegram account linked successfully",
"linked_methods": ["email", "telegram"]
}
Benefits of Telegram Auth
- Instant signup - No email verification needed
- Secure - Cryptographically signed by Telegram
- Generous limits - 30-day trial with 100 requests/day
- Multi-auth - Link both email and Telegram to one account
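The "cryptographically signed" property comes from the check that Telegram documents for its Login Widget: the receiver sorts the received fields, joins them as key=value lines, and compares an HMAC-SHA256 of that string (keyed with the SHA-256 of the bot token) against the hash field. The sketch below shows that check. Whether Uplink's /auth/telegram/verify endpoint is implemented exactly this way is an assumption, and the bot token is a placeholder.

```python
import hashlib
import hmac


def telegram_hash(fields: dict, bot_token: str) -> str:
    """Compute the Login Widget signature over every field except `hash`."""
    data_check_string = "\n".join(
        f"{key}={value}" for key, value in sorted(fields.items()) if key != "hash"
    )
    secret_key = hashlib.sha256(bot_token.encode()).digest()
    return hmac.new(secret_key, data_check_string.encode(), hashlib.sha256).hexdigest()


def verify_telegram_login(payload: dict, bot_token: str) -> bool:
    """Return True when the payload's `hash` matches the recomputed signature."""
    expected = telegram_hash(payload, bot_token)
    return hmac.compare_digest(expected, str(payload.get("hash", "")))


# Placeholder values for illustration only; this is not a real bot token.
BOT_TOKEN = "123456:hypothetical-bot-token"
payload = {"id": 123456789, "first_name": "John", "username": "johndoe", "auth_date": 1698765432}
payload["hash"] = telegram_hash(payload, BOT_TOKEN)

print(verify_telegram_login(payload, BOT_TOKEN))
tampered = dict(payload, username="mallory")
print(verify_telegram_login(tampered, BOT_TOKEN))
```

A production check would normally also reject payloads whose auth_date is too old, so that a captured login cannot be replayed later.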
User Dashboard & Session Management
NEW! Authenticated dashboard for viewing usage, managing tokens, and upgrading tiers. Uses HttpOnly cookies for security.
Create Session
Create an authenticated session using your API token. Returns HttpOnly cookie for dashboard access.
curl -X POST https://api.frnds.cloud/auth/session \
-H "Content-Type: application/json" \
-d '{
"token": "ak_dev_your_token_here"
}'
Response
{
"success": true,
"message": "Session created",
"session_id": "sess_abc123..."
}
Get Dashboard Data
Retrieve user information, usage statistics, and limits. Requires authenticated session (cookie).
curl https://api.frnds.cloud/api/dashboard \
-H "Cookie: uplink_session=sess_..."
Response
{
"user": {
"id": "user_xyz789",
"email": "[email protected]",
"first_name": "Jane",
"last_name": "Doe",
"tier": "email_verified",
"created_at": "2025-10-20T10:30:00Z",
"expires_at": "2025-10-27T10:30:00Z",
"telegram_id": 123456789,
"telegram_username": "janedoe",
"linked_auth_methods": ["email", "telegram"],
"can_upgrade": true,
"needs_email": false
},
"token": "ak_dev_your_token_here",
"usage": {
"total_requests": 45,
"total_tokens": 12500
},
"limits": {
"max_requests": 100,
"max_tokens": 100000,
"expires_in_days": 7
}
}
Create Checkout Session
Create Stripe checkout session for tier upgrade. Requires authenticated session and verified email.
curl "https://api.frnds.cloud/api/create-checkout?tier=tier2" \
-H "Cookie: uplink_session=sess_..."
Response
{
"url": "https://checkout.stripe.com/c/pay/cs_...",
"session_id": "cs_..."
}
Sign Out
Destroy session and sign out. Clears HttpOnly cookie.
curl -X POST https://api.frnds.cloud/api/signout \
-H "Cookie: uplink_session=sess_..."
SDK Usage
JavaScript/TypeScript
import { UplinkClient } from '@frnd/uplink-sdk'
const client = new UplinkClient({
baseUrl: 'https://api.frnds.cloud'
})
// Create session (cookies handled automatically in browser)
await client.createSession('ak_dev_your_token')
// Get dashboard data
const dashboard = await client.getDashboard()
console.log(`Usage: ${dashboard.usage.total_requests}/${dashboard.limits.max_requests}`)
// Create checkout for upgrade
if (dashboard.user.can_upgrade) {
const checkout = await client.createCheckout('tier2')
window.location.href = checkout.url
}
Python
import asyncio
from uplink_client import UplinkClient
client = UplinkClient(
    base_url="https://api.frnds.cloud"
)
async def main():
    # Create session (note: cookie handling requires requests session)
    session_resp = await client.create_session("ak_dev_your_token")
    # Get dashboard data
    dashboard = await client.get_dashboard()
    print(f"Usage: {dashboard.usage.total_requests}/{dashboard.limits.max_requests}")
    # Create checkout for upgrade
    if dashboard.user.can_upgrade:
        checkout = await client.create_checkout("tier2")
        print(f"Checkout URL: {checkout.url}")
asyncio.run(main())
Security Notes
- HttpOnly cookies - Session tokens not accessible via JavaScript
- 30-day sessions - Automatic expiration
- Email required for upgrades - Telegram-only users must add email first
- Stripe webhooks - Automatic tier upgrades after payment
Pricing & Billing
Uplink offers tiered subscription plans with included tokens and pay-as-you-go overage billing.
Billing Tiers
| Tier | Price | Included Tokens | Overage Rate | Rate Limits |
|---|---|---|---|---|
| Free | $0/month | 100K tokens | $0.15 per 1M tokens | 60 req/hour, 1 req/min |
| Hobby | $9/month | 500K tokens | $0.12 per 1M tokens (20% savings) | 600 req/hour, 10 req/min |
| Starter | $29/month | 2M tokens | $0.10 per 1M tokens (33% savings) | 6K req/hour, 100 req/min |
| Pro | $99/month | 10M tokens | $0.08 per 1M tokens (47% savings) | 30K req/hour, 500 req/min |
How Billing Works
- Base Subscription: Monthly charge for your tier with included token allowance
- Metered Usage: Tokens beyond your included amount are billed at your tier's overage rate
- Hourly Reporting: Token usage is aggregated hourly and reported to Stripe
- Promo Codes: Apply discount codes during checkout or in your billing portal
- No Surprises: Monitor your usage in real-time via response headers
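The billing rules above reduce to simple arithmetic: the base price plus any tokens beyond the included allowance, charged at the tier's per-million overage rate. A minimal sketch using the figures from the Billing Tiers table; estimate_monthly_bill is a hypothetical helper, and actual invoices may differ through proration, promo codes, rounding, or hard limits on the Free tier.

```python
# Tier data copied from the Billing Tiers table:
# (monthly price in USD, included tokens, overage rate in USD per 1M tokens)
TIERS = {
    "free": (0.0, 100_000, 0.15),
    "hobby": (9.0, 500_000, 0.12),
    "starter": (29.0, 2_000_000, 0.10),
    "pro": (99.0, 10_000_000, 0.08),
}


def estimate_monthly_bill(tier: str, tokens_used: int) -> dict:
    """Estimate the base charge plus metered overage for one billing period."""
    price, included, overage_rate = TIERS[tier]
    overage_tokens = max(0, tokens_used - included)
    overage_cost = overage_tokens / 1_000_000 * overage_rate
    return {
        "overage_tokens": overage_tokens,
        "overage_cost": round(overage_cost, 4),
        "total": round(price + overage_cost, 4),
    }


bill = estimate_monthly_bill("hobby", 750_000)
print(bill)
```

For the Hobby example, 250,000 tokens over the allowance at $0.12 per 1M tokens adds $0.03 to the $9 base charge.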
Usage Tracking Headers
Every API response includes usage tracking headers:
X-Token-Usage: 12450 # Total tokens used this period
X-Token-Limit: 500000 # Included tokens for your tier
X-Token-Remaining: 487550 # Tokens remaining before overage
X-Token-Reset: 1735689600 # Unix timestamp when usage resets
X-Token-Overage: 0 # Overage tokens (if any)
X-Estimated-Overage-Cost: 0.00 # Estimated overage cost in USD
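A client can turn these headers into a simple usage check. This is a sketch that assumes the header names and integer formats shown above; summarize_usage and the 80% alert threshold are hypothetical choices made for illustration.

```python
def summarize_usage(headers: dict, alert_at_percent: float = 80.0) -> dict:
    """Summarize token usage from the X-Token-* response headers."""
    used = int(headers["X-Token-Usage"])
    limit = int(headers["X-Token-Limit"])
    remaining = int(headers["X-Token-Remaining"])
    percent_used = round(100 * used / limit, 2)
    return {
        "used": used,
        "remaining": remaining,
        "percent_used": percent_used,
        "in_overage": int(headers.get("X-Token-Overage", 0)) > 0,
        "should_alert": percent_used >= alert_at_percent,
    }


# Values taken from the header example above.
sample_headers = {
    "X-Token-Usage": "12450",
    "X-Token-Limit": "500000",
    "X-Token-Remaining": "487550",
    "X-Token-Reset": "1735689600",
    "X-Token-Overage": "0",
    "X-Estimated-Overage-Cost": "0.00",
}

summary = summarize_usage(sample_headers)
print(summary)
```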
Billing API Endpoints
Get all available billing tiers with pricing details.
curl https://api.frnds.cloud/api/billing/tiers
Response:
[
{
"id": "free",
"name": "Free",
"price_monthly": 0,
"included_tokens": 100000,
"overage_rate": 0.15,
"rate_limit": {
"requests_per_hour": 60,
"requests_per_minute": 1
},
"features": [
"100K tokens/month",
"Pay-as-you-go after limit",
"Community support"
]
}
// ... more tiers
]
Apply a promotional code to your subscription.
curl -X POST https://api.frnds.cloud/api/billing/promo-code \
-H "Cookie: uplink_session=YOUR_SESSION_ID" \
-H "Content-Type: application/json" \
-d '{"code": "LAUNCH2025"}'
Response:
{
"success": true,
"message": "Promo code applied successfully",
"discount": {
"id": "di_123abc",
"coupon_id": "LAUNCH2025",
"percent_off": 20,
"duration": "repeating",
"duration_in_months": 3
}
}
Get URL for Stripe customer billing portal.
curl https://api.frnds.cloud/api/billing/portal \
-H "Cookie: uplink_session=YOUR_SESSION_ID"
Response:
{
"url": "https://billing.stripe.com/session/..."
}
SDK Examples
JavaScript/TypeScript
import { UplinkClient } from '@frnd/uplink-sdk'
const client = new UplinkClient({
baseUrl: 'https://api.frnds.cloud',
apiKey: 'your-api-key'
})
// Get billing info
const billing = await client.getBillingInfo()
console.log(`Tier: ${billing.tier}`)
console.log(`Status: ${billing.subscription_status}`)
// Get usage details
const usage = await client.getUsageDetails()
console.log(`Tokens used: ${usage.tokens_used} / ${usage.tokens_included}`)
console.log(`Overage cost: $${usage.estimated_cost}`)
// Get available tiers
const tiers = await client.getBillingTiers()
for (const tier of tiers) {
console.log(`${tier.name}: $${tier.price_monthly}/mo`)
}
// Apply promo code
const result = await client.applyPromoCode('LAUNCH2025')
if (result.success && result.discount) {
console.log(`Applied ${result.discount.percent_off}% discount`)
}
// Get billing portal URL
const portal = await client.getBillingPortalUrl()
console.log(`Manage subscription: ${portal.url}`)
Python
import asyncio
from uplink_client import UplinkClient
client = UplinkClient(
    base_url='https://api.frnds.cloud',
    api_key='your-api-key'
)
async def main():
    # Get billing info
    billing = await client.get_billing_info()
    print(f"Tier: {billing.tier}")
    print(f"Status: {billing.subscription_status}")
    # Get usage details
    usage = await client.get_usage_details()
    print(f"Tokens: {usage.tokens_used} / {usage.tokens_included}")
    print(f"Overage: ${usage.estimated_cost:.2f}")
    # Get available tiers
    tiers = await client.get_billing_tiers()
    for tier in tiers:
        print(f"{tier.name}: ${tier.price_monthly}/mo")
    # Apply promo code
    result = await client.apply_promo_code('LAUNCH2025')
    if result['success']:
        print(f"Applied discount: {result['discount']['percent_off']}%")
    # Get billing portal
    portal_url = await client.get_billing_portal_url()
    print(f"Manage subscription: {portal_url}")
asyncio.run(main())
💡 Billing Best Practices
- Monitor usage via response headers to avoid unexpected overage
- Use the billing portal to manage payment methods and view invoices
- Apply promo codes before upgrading to maximize savings
- Free tier has hard limits - upgrade for overage billing
- Paid tiers allow unlimited overage at transparent rates
Pricing API (for Arepo & Partners)
The Pricing API provides real-time infrastructure costs and pricing calculations for building your own pricing models. Designed for Arepo and other platform integrators who need accurate cost data to set their own margins and tier pricing.
Include tenant_id=arepo in requests to enable enterprise volume pricing.
Quick Start
All endpoints require Bearer token authentication:
curl https://api.frnds.cloud/v3/pricing/config \
-H "Authorization: Bearer YOUR_API_KEY"
Endpoints
Get complete infrastructure cost configuration with volume discounts applied.
Query Parameters:
- tenant_id - Tenant identifier (use "arepo" for enterprise pricing)
- markup - Optional markup percentage (0.0-0.5) for non-first-party tenants
curl "https://api.frnds.cloud/v3/pricing/config?tenant_id=arepo" \
-H "Authorization: Bearer YOUR_API_KEY"
Response:
{
"version": "2025-01-27",
"lastUpdated": "2025-01-27T00:00:00Z",
"llm": {
"inputTokenCost": 0.00000008,
"outputTokenCost": 0.00000024,
"avgTokensPerRequest": 750
},
"embeddings": {
"costPerMillionTokens": 0.02,
"costPerRequest": 0.000015,
"dimensions": 768
},
"vectorize": {
"storageCostPer100MDimensions": 0.25,
"queryCostPer1MDimensions": 0.0025,
"indexMaintenanceCost": 0.01
},
"voice": {
"whisperCostPerHour": 0.006,
"ttsCostPerMillionChars": 15.0,
"avgTranscriptionMinutes": 1.5,
"avgSynthesisChars": 500
}
}
Calculate cost for a specific operation.
Query Parameters:
- operation - Operation type: chat, embedding, vector_search, voice_transcribe, voice_synthesize
- inputTokens - Number of input tokens
- outputTokens - Number of output tokens
- dimensions - Vector dimensions (for embedding/search)
- minutes - Voice minutes (for transcription)
- characters - Text characters (for TTS)
- tenant_id - Tenant for volume discounts
curl "https://api.frnds.cloud/v3/pricing/operations?operation=chat&inputTokens=500&outputTokens=500&tenant_id=arepo" \
-H "Authorization: Bearer YOUR_API_KEY"
Response:
{
"cost": 0.00016,
"currency": "USD",
"per": "request",
"operation": "chat",
"params": {
"inputTokens": 500,
"outputTokens": 500
}
}
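The cost in this response can be reproduced by hand from the per-token rates in the /v3/pricing/config example: 500 × 0.00000008 + 500 × 0.00000024 = 0.00016 USD. A small sketch of that arithmetic follows; chat_cost is a hypothetical helper, and it assumes no markup or extra per-request fee is applied.

```python
INPUT_TOKEN_COST = 0.00000008   # USD per input token, from the /v3/pricing/config example
OUTPUT_TOKEN_COST = 0.00000024  # USD per output token, from the same example


def chat_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one chat request as a plain sum of per-token charges."""
    return input_tokens * INPUT_TOKEN_COST + output_tokens * OUTPUT_TOKEN_COST


cost = chat_cost(500, 500)
print(f"{cost:.5f}")  # prints 0.00016

# The same rates expressed per 1M tokens, which is how tier pricing is usually quoted.
per_million_in = INPUT_TOKEN_COST * 1_000_000
per_million_out = OUTPUT_TOKEN_COST * 1_000_000
```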
Calculate total monthly cost for a user's usage pattern.
curl -X POST https://api.frnds.cloud/v3/pricing/calculate \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"documents": 50,
"chats": 200,
"storageGB": 0.5,
"voiceMinutes": 10,
"ttsCharacters": 2000,
"tenantId": "arepo"
}'
Response:
{
"totalCost": 0.32,
"breakdown": {
"llm": 0.15,
"evs": 0.12,
"voice": 0.03,
"infrastructure": 0.02
},
"tierRecommendation": "free",
"estimatedRequests": 200,
"estimatedTokens": 150000
}
Generate recommended tier pricing based on parameters.
curl -X POST https://api.frnds.cloud/v3/pricing/tiers \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"npMultiplier": 1.0,
"targetMargin": 0.50,
"markup": 0.0,
"tenantId": "arepo"
}'
Response:
[
{
"id": "free",
"name": "Free",
"npMultiplier": 1.0,
"price": 0,
"includedValue": 100000,
"overageRate": 0.15,
"costToServe": 1.5,
"netProfit": -1.5,
"profitMargin": -1.0
},
{
"id": "hobby",
"name": "Hobby",
"npMultiplier": 1.0,
"price": 9,
"includedValue": 500000,
"overageRate": 0.12,
"costToServe": 7.5,
"netProfit": 1.5,
"profitMargin": 0.17
}
// ... more tiers
]
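The profit fields in this response appear to follow from price and costToServe. The sketch below assumes netProfit = price - costToServe and profitMargin = netProfit / price, which matches the Hobby row (1.5 / 9 ≈ 0.17). The Free row reports -1.0 even though the price is zero, so the service evidently handles that case differently; tier_economics is hypothetical and returns None there.

```python
from typing import Optional, Tuple


def tier_economics(price: float, cost_to_serve: float) -> Tuple[float, Optional[float]]:
    """Return (net profit, profit margin); margin is None when the price is zero."""
    net_profit = price - cost_to_serve
    margin = net_profit / price if price else None
    return net_profit, margin


hobby_profit, hobby_margin = tier_economics(price=9, cost_to_serve=7.5)
free_profit, free_margin = tier_economics(price=0, cost_to_serve=1.5)
print(hobby_profit, round(hobby_margin, 2))
print(free_profit, free_margin)
```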
SDK Integration
For TypeScript/JavaScript projects, use the Pricing SDK for type-safe access:
import { createPricingSDK } from '@uplink/pricing-sdk'
const pricing = createPricingSDK({
apiKey: process.env.UPLINK_API_KEY!,
tenantId: 'arepo',
markup: 0.0, // 0% for first-party
isFirstParty: true, // Gets enterprise volume discounts
cacheEnabled: true,
cacheTTL: 3600 // Cache for 1 hour
})
// Get infrastructure costs
const costs = await pricing.getPricingConfig()
// Calculate user cost
const userCost = await pricing.calculateUserCost({
documents: 50,
chats: 200,
storageGB: 0.5
})
// Get operation cost
const chatCost = await pricing.getOperationCost('chat', {
inputTokens: 500,
outputTokens: 500
})
// Generate recommended tiers
const tiers = await pricing.generateTiers({
npMultiplier: 1.0,
targetMargin: 0.50
})
Python SDK
import os
from uplink_pricing import PricingClient
pricing = PricingClient(
api_key=os.environ["UPLINK_API_KEY"],
tenant_id="arepo",
markup=0.0,
is_first_party=True
)
# Get infrastructure costs
costs = pricing.get_pricing_config()
# Calculate user cost
user_cost = pricing.calculate_user_cost(
documents=50,
chats=200,
storage_gb=0.5
)
# Get operation cost
chat_cost = pricing.get_operation_cost(
operation="chat",
input_tokens=500,
output_tokens=500
)
Additional Endpoints
- GET /v3/pricing/np-impact - Calculate impact of NP multiplier changes
- POST /v3/pricing/feature-impact - Calculate impact of feature toggles on pricing
- GET /v3/pricing/optimize-freemium - Calculate optimal freemium limits for a cost target
- GET /v3/pricing/refresh - Force refresh pricing cache (admin only)
See the Arepo Integration Guide for complete documentation and examples.
Rate Limits & Usage Controls
- Global IP limit: All requests pass through a global limiter; violations return 429 with too_many_requests.
- Authentication limiter: Failed logins are throttled separately to block brute force attacks.
- Per-key limiter: Every API key has a dedicated bucket; rate-limit headers mirror OpenAI's X-RateLimit-Remaining semantics.
- Quota tracking: UsageTracker stores daily and monthly usage in KV. If a tenant exceeds quota, responses return 429 quota_exceeded.
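Clients usually treat the two kinds of 429 differently: a rate-limit hit is worth retrying after a pause, while an exhausted quota is not. The sketch below shows one way to do that with exponential backoff and jitter. It assumes the too_many_requests and quota_exceeded strings arrive in the error.code field of the OpenAI-style error payload; call_with_backoff and the simulated responses are hypothetical.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Retry `send` on HTTP 429 with exponential backoff plus jitter."""
    status, body = 0, {}
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if body.get("error", {}).get("code") == "quota_exceeded":
            return status, body  # quota resets on a daily or monthly cycle; retrying will not help
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    return status, body


# Simulated server: two rate-limit responses, then success.
responses = iter([
    (429, {"error": {"code": "too_many_requests"}}),
    (429, {"error": {"code": "too_many_requests"}}),
    (200, {"id": "chatcmpl-123"}),
])
delays = []  # records the pauses instead of actually sleeping
status, body = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(status, len(delays))
```

Passing sleep as a parameter keeps the helper easy to test; in production the default time.sleep applies.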
Response Model
Successful chat responses include the standard OpenAI payload plus an _arbitrage block describing the provider decision:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1732400000,
"model": "llama-3.3-70b-versatile",
"choices": [ ... ],
"usage": { "prompt_tokens": 120, "completion_tokens": 45, "total_tokens": 165 },
"_arbitrage": {
"provider_used": "groq",
"model_used": "llama-3.3-70b-versatile",
"arbitrage_mode": "auto",
"reasoning": "Selected for 23% cost savings",
"savings_percentage": 23.1,
"estimated_cost": 0.0007
}
}
Error payloads follow the OpenAI structure with an error object that includes message, type, and code. Admin and workflow routes return detailed validation errors when Zod schema validation fails.
Endpoint Reference
Create a chat completion using the arbitrage engine. Supports streaming, tool invocation, JSON mode, and structured response metadata.
| Field | Type | Description |
|---|---|---|
| model | string | Canonical model slug (e.g. llama-3.3-70b-versatile). Use /v3/models for the full catalog. |
| messages | Message[] | OpenAI-style conversation array with role and content. |
| arbitrage_mode | string | auto (default), cost, speed, quality, or manual. |
| max_latency | number | Upper bound in milliseconds; providers exceeding this are skipped. |
| min_quality | number | Minimum quality score between 0 and 1. |
| max_cost | number | Cap the per-request USD spend. |
| stream | boolean | Emit SSE chunks that mirror OpenAI streaming responses. |
| tools | Tool[] | Function-calling definitions. Provider selection respects tool compatibility. |
Returns all available models with pricing, provider coverage, and capability metadata aggregated from Groq, Together, and OpenRouter.
Lists models that have cross-provider coverage—ideal arbitrage targets.
Dry-run routing for a proposed request. Returns the selected provider, canonical model, reasoning, and predicted savings without invoking the downstream API.
Raw opportunity ledger from the arbitrage database including provider spreads and health signals.
System health snapshot, sync cadence, and the timestamp of the latest provider refresh.
Plain-text summary of current arbitrage conditions—perfect for CLI dashboards.
Advanced RAG Features
NEW! Production-tested retrieval strategies delivering 98-100% accuracy with 50ms P50 latency.
Hybrid Search
Combines BM25 keyword ranking with semantic vector search. Best for exact-match queries (IPs, version numbers, specific terms).
| Field | Type | Description |
|---|---|---|
| query | string | Search query (required) |
| chat_id | string | Filter to specific conversation |
| semantic_weight | number | Weight for semantic search (0-1, default: 0.6) |
| keyword_weight | number | Weight for keyword search (0-1, default: 0.4) |
| fusion_method | string | rrf (Reciprocal Rank Fusion) or linear |
curl -X POST https://api.frnds.cloud/v3/evs/hybrid-search \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"query": "database server IP address",
"semantic_weight": 0.3,
"keyword_weight": 0.7
}'
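To make the two fusion_method options concrete, here is a minimal sketch of Reciprocal Rank Fusion and of a weighted linear blend. It uses the textbook RRF formula with the commonly used constant k = 60; whether the service uses that constant, or how it applies semantic_weight and keyword_weight under rrf, is not documented here, so treat both functions as illustrative assumptions.

```python
def rrf_fuse(semantic_ranking, keyword_ranking, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in (semantic_ranking, keyword_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


def linear_fuse(semantic_scores, keyword_scores,
                semantic_weight=0.6, keyword_weight=0.4):
    """Weighted sum of per-document scores, assumed normalized to 0..1."""
    doc_ids = set(semantic_scores) | set(keyword_scores)
    combined = {
        d: semantic_weight * semantic_scores.get(d, 0.0)
        + keyword_weight * keyword_scores.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(combined, key=lambda d: (-combined[d], d))


# Toy data: "b" ranks well in both lists, so RRF places it first.
rrf_order = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])

# Keyword-heavy weights, matching the curl example above.
linear_order = linear_fuse({"a": 0.9, "b": 0.5}, {"b": 0.8, "c": 0.7},
                           semantic_weight=0.3, keyword_weight=0.7)
print(rrf_order, linear_order)
```

RRF only looks at rank positions, so it is insensitive to how each retriever scales its scores; the linear blend depends on both score sets being comparable.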
Progressive Retrieval
Multi-stage retrieval that starts fast and deepens if confidence is low. Delivers 70% of queries in <50ms via early exit.
| Stage | Latency Budget | Method | Confidence Threshold |
|---|---|---|---|
| Stage 1 | 50ms | Cache + keywords | 0.9 (exit if met) |
| Stage 2 | 200ms | Hybrid search | 0.8 (exit if met) |
| Stage 3 | 800ms | Multi-query ensemble | 0.7 (final) |
curl -X POST https://api.frnds.cloud/v3/evs/progressive-search \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"query": "How many PTO days?",
"chat_id": "support_123"
}'
Response includes metadata showing which stage completed and final confidence score.
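The early-exit behaviour in the stage table can be sketched as a simple loop. The stage functions below are stubs returning fixed confidences, and the result keys `stage` and `confidence` are placeholder names rather than the documented response schema; latency budgets are carried along but not enforced in this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Stage:
    name: str
    budget_ms: int      # informational only; this sketch does not enforce it
    threshold: float    # exit as soon as confidence reaches this value
    search: Callable[[str], Tuple[list, float]]  # returns (results, confidence)


def progressive_search(query, stages):
    """Run stages in order and stop at the first one that meets its threshold."""
    results, confidence, stage_name = [], 0.0, None
    for stage in stages:
        results, confidence = stage.search(query)
        stage_name = stage.name
        if confidence >= stage.threshold:
            break  # confident enough, skip the slower stages
    return {"results": results, "stage": stage_name, "confidence": confidence}


# Stub searchers standing in for cache lookup, hybrid search, and the ensemble.
stages = [
    Stage("stage_1", 50, 0.9, lambda q: (["cached hit"], 0.62)),
    Stage("stage_2", 200, 0.8, lambda q: (["hybrid hit"], 0.85)),
    Stage("stage_3", 800, 0.7, lambda q: (["ensemble hit"], 0.95)),
]
outcome = progressive_search("How many PTO days?", stages)
print(outcome)
```

Here stage 1 falls short of 0.9, stage 2 clears 0.8, and stage 3 never runs.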
Adaptive Retrieval
Auto-detects query type (lookup, analytical, procedural, comparison) and optimizes retrieval parameters accordingly.
| Query Type | Semantic Weight | Keyword Weight | Chunk Size |
|---|---|---|---|
| Lookup (facts, numbers) | 0.3 | 0.7 | Small (200-400 chars) |
| Analytical (deep) | 0.8 | 0.2 | Large (800-1200 chars) |
| Procedural (how-to) | 0.6 | 0.4 | Medium (400-800 chars) |
curl -X POST https://api.frnds.cloud/v3/evs/adaptive-search \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"query": "Compare PTO policies for full-time vs part-time",
"auto_optimize": true
}'
Set auto_optimize: true to let the system analyze and optimize automatically.
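The parameter table can be read as a lookup keyed by query type. The sketch below copies the three listed rows and pairs them with a deliberately crude keyword heuristic; the real classifier is not described in this guide, so `classify_query` is purely a hypothetical stand-in, and the comparison type is omitted because its parameters are not listed.

```python
# Values copied from the table above; chunk sizes are (min, max) characters.
RETRIEVAL_PROFILES = {
    "lookup": {"semantic_weight": 0.3, "keyword_weight": 0.7, "chunk_chars": (200, 400)},
    "analytical": {"semantic_weight": 0.8, "keyword_weight": 0.2, "chunk_chars": (800, 1200)},
    "procedural": {"semantic_weight": 0.6, "keyword_weight": 0.4, "chunk_chars": (400, 800)},
}


def classify_query(query):
    """Toy heuristic (assumption): a stand-in for the undocumented classifier."""
    q = query.lower()
    if q.startswith("how do") or q.startswith("how to") or "steps" in q:
        return "procedural"
    if any(word in q for word in ("why", "analyze", "explain")):
        return "analytical"
    return "lookup"


def retrieval_params(query):
    """Pick the retrieval profile for a query based on its detected type."""
    return RETRIEVAL_PROFILES[classify_query(query)]


for example in ("What is the database server IP?",
                "How to request PTO?",
                "Explain why churn rose last quarter"):
    print(example, "->", classify_query(example), retrieval_params(example))
```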
Query Classification Improvements
Enhanced UUID Detection (Oct 2025)
The EVS query classifier now uses a full UUID v4 pattern matcher, improving accuracy and reducing false positives:
- Previous: Partial pattern /\b[0-9a-f]{8}-[0-9a-f]{4}\b/ matched short segments like "20231026-1234"
- Current: Full pattern /\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/i
Impact on Search Behavior
Queries with full UUIDs are now correctly routed to keyword/exact-match search, while queries with UUID-like patterns (but not valid UUIDs) use semantic search for better relevance.
- "Find document 550e8400-e29b-41d4-a716-446655440000" → Keyword search
- "Documents from 20231026-1234" → Semantic search (no longer a false positive)
- "Transaction abc123-def4" → Semantic search (not a valid UUID)
No action required: This improvement is automatic and backward compatible. Your existing queries will automatically benefit from improved classification accuracy.
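The difference between the two patterns can be checked locally. The sketch below ports both regular expressions to Python's `re` module and applies them to the example queries; the `route` function is a simplified assumption about how a match maps to keyword or semantic search. Note that the full pattern checks the 8-4-4-4-12 shape only and does not inspect the UUID version digit.

```python
import re

# Previous partial pattern: matches any 8-hex + 4-hex segment.
OLD_PATTERN = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}\b")

# Current full pattern: requires the complete 8-4-4-4-12 layout, case-insensitive.
NEW_PATTERN = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)


def route(query, pattern=NEW_PATTERN):
    """Return 'keyword' when the query contains a match, else 'semantic'."""
    return "keyword" if pattern.search(query) else "semantic"


queries = [
    "Find document 550e8400-e29b-41d4-a716-446655440000",
    "Documents from 20231026-1234",
    "Transaction abc123-def4",
]
for q in queries:
    print(f"{q!r}: old={route(q, OLD_PATTERN)}, new={route(q)}")
```

The date-like string "20231026-1234" consists entirely of valid hex digits, which is why the partial pattern treated it as a UUID.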
Performance: Progressive retrieval achieves 50ms P50 latency (70% of queries exit at Stage 1). Hybrid search improves accuracy by 10-15% on exact-match queries. Adaptive retrieval optimizes automatically with zero configuration.
Research Tools
AI-powered web search with multi-provider support (Brave, NewsAPI, Diary). Returns search results with optional AI-generated answers and content extraction.
| Field | Type | Description |
|---|---|---|
| query | string | Search query (required) |
| search_provider | string | Provider to use: brave, newsapi, diary, hybrid (default: brave) |
| search_depth | string | basic or advanced (default: basic) |
| max_results | number | Maximum results to return (1-20, default: 5) |
| include_answer | boolean | Generate AI summary of results (default: false) |
| time_range | string | Filter by time: day, week, month, year |
Related routes:
- POST /v3/extract – Extract and parse content from a URL (markdown, text, images)
- POST /v3/crawl – Deep crawl a website with link following and content extraction
- POST /v3/map – Map website structure and discover pages
Agent-enabled chat completions with automatic tool injection. Drop-in OpenAI replacement where the model can autonomously search the web, extract content, and crawl websites to answer your questions.
| Field | Type | Description |
|---|---|---|
| model | string | Model that supports function calling (e.g. llama-3.3-70b-versatile) |
| messages | Message[] | OpenAI-style conversation array |
| tools | string[] | Tools to enable: search, extract, crawl (default: all) |
| max_iterations | number | Maximum tool call loops (default: 5) |
| stream | boolean | Stream response with tool call events |
curl https://api.frnds.cloud/v3/chat/agent \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b-versatile",
"messages": [{"role": "user", "content": "What are the latest AI breakthroughs this week?"}],
"tools": ["search"],
"stream": true
}'
The model automatically decides when to search the web, extract content from URLs, or crawl websites. Tool calls and results are handled server-side, and the final answer is returned after all tool iterations complete.
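Conceptually, the server-side handling described above is a bounded loop: ask the model, run any tool it requests, feed the result back, and stop at a final answer or when max_iterations is reached. The sketch below shows that control flow with a stubbed model and a stubbed search tool; the step dictionary shape and all function names are assumptions for illustration, not the worker's actual implementation.

```python
def run_agent(messages, model_step, tools, max_iterations=5):
    """Ask the model, execute any requested tool, and feed the result back."""
    history = list(messages)
    for _ in range(max_iterations):
        step = model_step(history)
        if step["type"] == "final":
            return step["content"], history
        tool = tools[step["tool"]]
        result = tool(step["arguments"])
        history.append({"role": "tool", "name": step["tool"], "content": result})
    # Iteration budget exhausted without a final answer.
    return "Stopped after reaching max_iterations.", history


def fake_model(history):
    """Stand-in for an LLM: search once, then answer from the tool result."""
    tool_messages = [m for m in history if m["role"] == "tool"]
    if not tool_messages:
        return {"type": "tool_call", "tool": "search",
                "arguments": "AI news this week"}
    return {"type": "final",
            "content": f"Summary of: {tool_messages[-1]['content']}"}


tools = {"search": lambda query: f"3 results for '{query}'"}
answer, transcript = run_agent(
    [{"role": "user", "content": "What are the latest AI breakthroughs this week?"}],
    fake_model,
    tools,
)
print(answer)
```

The max_iterations cap is what keeps a model that never stops calling tools from looping indefinitely.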
Smart agent endpoint with semantic memory search and web search enabled by default. Designed for Arepo integration with chatId-scoped document retrieval and flexible tool configuration. Streaming is enabled by default.
| Field | Type | Description |
|---|---|---|
| model | string | Model that supports function calling (e.g. llama-3.3-70b-versatile) |
| messages | Message[] | OpenAI-style conversation array |
| tools | string[] | Tools to enable (default: ["memory_search", "search"]). Additional tools can be added. |
| chat_id | string | Optional chat session ID for scoping EVS document retrieval (isolates user documents) |
| max_iterations | number | Maximum tool call loops (default: 5) |
| stream | boolean | Stream response with tool call events (default: true) |
# Default behavior: semantic memory + web search
curl https://api.frnds.cloud/v3/chat/smart-agent \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b-versatile",
"messages": [{"role": "user", "content": "What did I say about Q3 projections?"}],
"chat_id": "user-session-abc123"
}'
# Custom tools: add stock quotes and news
curl https://api.frnds.cloud/v3/chat/smart-agent \
-H "Authorization: Bearer ak_your_key" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b-versatile",
"messages": [{"role": "user", "content": "Analyze NVDA stock performance"}],
"tools": ["memory_search", "search", "get_stock_quote", "get_recent_news"],
"chat_id": "user-session-abc123"
}'
Default Tools:
- memory_search – Semantic search across user's EVS documents (chatId-scoped for data isolation)
- search – Real-time web search via Brave/Tavily APIs
Available Additional Tools: extract, get_stock_quote, get_crypto_quote, get_recent_news, get_top_headlines, get_current_weather
The smart agent automatically routes between semantic memory retrieval (for user-specific context) and web search (for current information). Cost tracking scales with the number of tools used and iterations performed.
List saved workflows for the authenticated tenant. Workflows are stored in KV and can be executed with persistent state.
Related routes:
- GET /v3/workflows/templates – discover built-in templates.
- POST /v3/workflows – create or update a workflow definition.
- POST /v3/workflows/execute – run a stored or inline workflow.
- POST /v3/workflows/from-template – instantiate a template with overrides.
- GET /v3/workflows/:id, PUT /v3/workflows/:id, DELETE /v3/workflows/:id – manage lifecycle.
Strict OpenAI v1 compatibility for SDKs that have not yet migrated. Accepts the same payload as v3 but does not return the _arbitrage object. Session persistence is available via session_id.
- GET /v1/models – legacy model catalog.
- GET /v1/sessions/:id – retrieve stored conversation history.
- DELETE /v1/sessions/:id – purge a stored session and any R2 context snapshots.
- POST /v1/embeddings – currently returns 501 Not Implemented.
List registered function-calling tools. Use POST /v1/tools/register with a JSON schema to add custom tools at runtime.
Tenant self-service usage dashboard. Query parameters start_date and end_date (ISO yyyy-mm-dd) slice the reporting window. Returns real-time quota status.
Use GET /v1/billing/history for six months of pre-aggregated monthly spend.
Machine-readable documentation. Append ?format=markdown for Markdown output or switch to /v3/docs/openapi for the OpenAPI 3.0 schema.
Magic endpoints provide curl-friendly shortcuts with automatic key detection. Supply the prompt as the path segment (e.g. /ask/your-question) or POST JSON for richer control.
Document Compression & RAG
Upload documents to Embedded Vector Storage (baseline, 1x compression). Supports semantic search via embeddings.
NEW: Human-Readable Citations with displayName
Add an optional displayName field during ingestion to get human-readable citations in LLM responses. No post-processing needed!
Request Example:
{
"tenantId": "tenant_123",
"chatId": "user_docs",
"source": {
"type": "text",
"content": "Employee PTO policy...",
"displayName": "Employee-Handbook-2025.pdf"
}
}
LLM Response (automatic):
Employees receive 15 days PTO per year [Source: Employee-Handbook-2025.pdf (chunk 3)]
API Response Metadata:
{
"_vector_metadata": {
"sources": [{
"id": "abc123...",
"title": "Employee Handbook",
"displayName": "Employee-Handbook-2025.pdf",
"score": 0.92
}]
}
}
Benefits: Without displayName, citations show hex IDs like [Source: 1d34d363:8709...]. With it, you get readable filenames automatically in both LLM context and API responses. Fully backward compatible - optional field that defaults to title if not provided.
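The fallback behaviour can be pictured as a small formatting function. This is a sketch under stated assumptions: the function name `citation_label` is hypothetical, the label format is copied from the LLM response example above, and the order of fallbacks (displayName, then title, then the raw id) is inferred from the description rather than taken from the worker's code.

```python
def citation_label(source, chunk_index):
    """Prefer displayName, then title, then the raw id (assumed fallback order)."""
    name = source.get("displayName") or source.get("title") or source["id"]
    return f"[Source: {name} (chunk {chunk_index})]"


with_name = citation_label(
    {"id": "abc123", "title": "Employee Handbook",
     "displayName": "Employee-Handbook-2025.pdf"},
    3,
)
without_name = citation_label({"id": "abc123", "title": "Employee Handbook"}, 3)
print(with_name)
print(without_name)
```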
Compress documents using DeepSeek-OCR visual encoding (10x compression). Metadata search only. Drop-in replacement for /evs/*.
Compress documents using Memvid video encoding (50-100x compression). Full semantic search via QR-encoded frames. Drop-in replacement for /evs/*.
Analyze a document and get compression recommendations without ingesting. Returns size, complexity analysis, and optimal method selection.
Smart auto-routing: analyzes document and automatically routes to optimal compression method (EVS/OCR/OCV) based on size, structure, and content.
Benchmark all available compression methods on a document. Compares compression ratio, speed, and cost. Warning: Tests all methods (expensive).
Get comparison table of compression methods (EVS 1x, OCR 10x, OCV 75x) with recommendations for different use cases.
Search compressed documents. EVS and OCV support full semantic search. OCR supports metadata search only. Identical API across all three families.
RAG-enhanced chat with document context. Automatically retrieves relevant chunks and injects into chat. Works with all compression methods.
Admin APIs for provisioning, invoicing, and analytics. All routes require the ADMIN_API_KEY bearer token and are intended for operator dashboards.
Admin Analytics
The analytics API provides comprehensive business metrics including user counts, API usage, revenue data, and conversion funnels. Requires ADMIN_API_KEY authentication.
Get Analytics Metrics
Returns comprehensive analytics metrics for business insights and monitoring.
curl -H "Authorization: Bearer $ADMIN_API_KEY" \
https://api.frnds.cloud/admin/analytics
Response Schema
{
// User metrics
"total_users": 1250,
"users_by_tier": {
"email_verified": 1000,
"phone_verified": 500,
"payment_verified": 200,
"telegram_signup": 50
},
"new_signups_today": 15,
"new_signups_this_week": 85,
"new_signups_this_month": 320,
// API usage metrics
"total_api_calls_today": 5000,
"total_api_calls_this_week": 28000,
"total_tokens_consumed_today": 250000,
"total_tokens_consumed_this_week": 1400000,
// Revenue metrics
"mrr": 5000.00,
"total_paying_customers": 200,
"churn_rate": 0.03,
// Conversion funnel
"funnel": {
"signups": 1250,
"email_verified": 1000,
"phone_verified": 500,
"payment_verified": 200
},
"conversion_rates": {
"signup_to_email": 0.80,
"email_to_phone": 0.50,
"phone_to_payment": 0.40,
"overall": 0.16
},
// Data quality indicator (NEW in Oct 2025)
"is_estimate": false
}
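The conversion_rates values follow directly from the funnel counts: each step rate divides one stage by the stage before it, and overall divides paying users by signups. The sketch below reproduces the sample figures; rounding to two decimals is an assumption about presentation, not a documented server behaviour.

```python
def conversion_rates(funnel):
    """Derive step-by-step and overall conversion from raw funnel counts."""
    def rate(numerator, denominator):
        total = funnel[denominator]
        return round(funnel[numerator] / total, 2) if total else 0.0

    return {
        "signup_to_email": rate("email_verified", "signups"),
        "email_to_phone": rate("phone_verified", "email_verified"),
        "phone_to_payment": rate("payment_verified", "phone_verified"),
        "overall": rate("payment_verified", "signups"),
    }


# Counts taken from the sample response above.
funnel = {
    "signups": 1250,
    "email_verified": 1000,
    "phone_verified": 500,
    "payment_verified": 200,
}
rates = conversion_rates(funnel)
print(rates)
```

The overall rate is also the product of the three step rates (0.80 × 0.50 × 0.40 = 0.16), which is a quick consistency check on any response.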
Understanding is_estimate
The is_estimate field indicates the data quality of the response:
- false - Data is from recent cache (accurate, <5 minutes old)
- true - Data is from daily counters (estimated, summary may be stale)
When is_estimate: true, consider calling the refresh endpoint for accurate real-time data.
Refresh Analytics Cache
Triggers a recalculation of analytics metrics from actual user data. Use this when you need the most accurate data.
curl -X POST \
-H "Authorization: Bearer $ADMIN_API_KEY" \
https://api.frnds.cloud/admin/analytics/refresh
Response
{
"success": true,
"message": "Analytics cache refreshed successfully"
}
SDK Usage
Both JavaScript and Python SDKs support the analytics API with proper typing for the is_estimate field.
JavaScript/TypeScript
import { UplinkClient } from 'uplink-client'
const client = new UplinkClient({
  baseUrl: 'https://api.frnds.cloud',
  apiKey: process.env.ADMIN_API_KEY
})
// Get analytics
const metrics = await client.getAnalytics()
console.log(`Total users: ${metrics.total_users}`)
console.log(`Data is estimated: ${metrics.is_estimate}`)
// If data is estimated, refresh for accuracy
if (metrics.is_estimate) {
  await client.refreshAnalytics()
  const updated = await client.getAnalytics()
  console.log(`Updated MRR: $${updated.mrr}`)
}
Python
import asyncio
import os
from uplink_client import UplinkClient
client = UplinkClient(
    base_url="https://api.frnds.cloud",
    api_key=os.environ["ADMIN_API_KEY"]
)
async def main():
    # Same async context-manager pattern as the Quickstart examples
    async with client:
        # Get analytics
        metrics = await client.get_analytics()
        print(f"Total users: {metrics.total_users}")
        print(f"Data is estimated: {metrics.is_estimate}")
        # If data is estimated, refresh for accuracy
        if metrics.is_estimate:
            await client.refresh_analytics()
            updated = await client.get_analytics()
            print(f"Updated MRR: ${updated.mrr}")
# await is only valid inside a coroutine, so run main() on an event loop
asyncio.run(main())
Best Practices
- Check is_estimate - Always check this field before making business decisions
- Refresh when needed - Call the refresh endpoint when accurate data is critical
- Cache client-side - Analytics data is expensive to compute, cache responses appropriately
- Monitor performance - Large user counts may take longer to calculate
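One way to follow the client-side caching advice is a small time-to-live wrapper around whatever function fetches the metrics. The class below is a generic sketch and not part of either SDK; the 300-second default mirrors the "<5 minutes old" freshness window mentioned earlier and is an assumption, as is the injectable clock used to make the behaviour easy to test.

```python
import time


class AnalyticsCache:
    """Tiny TTL cache around any zero-argument fetch function."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Demo with a fake clock and a fetch function that counts its calls.
calls = []
fake_now = [0.0]


def fetch_metrics():
    calls.append(1)
    return {"total_users": 1250, "is_estimate": False}


cache = AnalyticsCache(fetch_metrics, ttl_seconds=300, clock=lambda: fake_now[0])
cache.get()
cache.get()            # served from cache, no second fetch
fake_now[0] = 301.0
cache.get()            # TTL expired, fetches again
print(len(calls))
```

In real use, `fetch` would wrap the analytics call, and a caller that sees is_estimate set to true could bypass the cache after triggering a refresh.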
Schemas & SDK Integration
- OpenAPI: GET /v3/docs/openapi returns a complete 3.0 document for SDK code generation.
- Markdown: GET /v3/docs?format=markdown mirrors this guide for CLI or knowledge-base ingestion.
- TypeScript: Worker-side request validation relies on Zod schemas in src/schemas. Reuse them when extending the worker to guarantee compatibility.