Describe what you need. The system builds it — writes the instructions, tests them, picks the right model. You get a typed, production-ready function you can call from anywhere. That's it.
$ npm i aifunctions-js
You write a prompt. It works. Then edge cases break it. You add retries, JSON parsing, validation. Two days later you have 200 lines of glue code for one function.
Next project, you need something similar. You copy-paste, it drifts. There's no contract, no tests, no way to know if it's still working. The prompt is buried in a codebase where nobody else can find it.
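The glue code described above tends to look the same everywhere. A hypothetical sketch, not this library's API: `callModel` stands in for whatever inference client you happen to use, and the retry, parse, and validate steps are the 200 lines in miniature.

```javascript
// Hypothetical sketch of the glue code that piles up around one prompt:
// retries, JSON parsing, hand-rolled validation. `callModel` stands in for
// whatever inference client you use; nothing here is the library's API.
async function classifyTicket(text, callModel, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const raw = await callModel(`Classify this support ticket: ${text}`);
    try {
      const parsed = JSON.parse(raw);
      // Validate the shape by hand: no schema, no contract.
      if (typeof parsed.category === "string" && parsed.category.length > 0) {
        return parsed;
      }
    } catch {
      // Malformed JSON: fall through and retry.
    }
  }
  throw new Error(`classifyTicket failed after ${maxRetries} attempts`);
}
```

Every copy of this wrapper drifts independently, which is exactly the problem described above.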
Describe what you need: input, output, a few examples. The system autonomously writes the instructions, tests them, and picks the right model. You get a function that works.
Call it from any project, any language. It's tested, typed, and stays in a shared library so your whole team can use it. When you want to be sure it's production-ready, release it with a quality gate.
You do two things: say what you want, and use the result. Everything in between is autonomous.
Say what the function should do. Add a few real examples of good and bad output. That's enough.
The system writes the instructions, derives scoring rules from your examples, and selects the right model for your cost/quality tradeoff.
Run → judge → fix → repeat. Autonomous loop against your examples until the quality threshold is met.
Call it like any function — from Node.js, over HTTP, from any language. Typed input, typed output. Done.
Every function works as a library call or a REST endpoint. No boilerplate, no prompt engineering.
```javascript
import { classify, summarize, run } from "aifunctions-js/functions";

// Built-in function — one line
const { categories } = await classify({
  text: "I was charged twice this month.",
  categories: ["Billing", "Auth", "Support"],
});

// Your custom function — same simplicity
const { lines } = await run("extract-invoices", { text: invoiceText });
```
```
# Create a function
POST /functions
{ "id": "extract-invoices", "description": "Extract line items", "scoreGate": 0.85 }

# Call it — typed input, typed output
POST /functions/extract-invoices/run
{ "input": { "text": "Invoice #1234..." } }
→ { result, usage: { tokens, model, latencyMs }, requestId }

# Works from Python, Go, curl, anything
# No SDK required
```
You provide a description and a few real examples. The system autonomously writes instructions, tests them, scores them, and rewrites until they pass your quality bar.
1. You: good & bad output
2. Autonomous: instructions + rules
3. Autonomous: score against rules
4. Autonomous: rewrite & retry
5. You: quality bar met ✓
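The loop above can be sketched as a plain function. This is illustrative only: `generate`, `score`, and `rewrite` are stand-ins for what the system runs server-side, and the names are hypothetical.

```javascript
// Hypothetical sketch of the run → judge → fix → repeat loop.
// `generate` produces output from instructions, `score` grades it against
// the derived rules, `rewrite` fixes the instructions. All are stand-ins
// for the real system's internals.
function improveUntilGateMet(instructions, examples, { generate, score, rewrite, gate = 0.85, maxRounds = 5 }) {
  for (let round = 1; round <= maxRounds; round++) {
    const outputs = examples.map((ex) => generate(instructions, ex.input));
    const avg = outputs.reduce((sum, out, i) => sum + score(out, examples[i]), 0) / outputs.length;
    if (avg >= gate) return { instructions, score: avg, rounds: round };
    instructions = rewrite(instructions, outputs, examples); // fix and retry
  }
  throw new Error("quality gate not met");
}
```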
Give it test cases and a description. It writes instructions from scratch, runs them, tests the output, and loops autonomously until your quality bar is met.
Provide a few real examples labeled good or bad — with a brief note on why. The system derives the scoring rules autonomously. You review them. Human judgment in, not AI-judges-AI.
Benchmark your function across models — or sweep temperatures on a single model. Winners are stored as profiles: run with mode: "best" or "cheapest" and the system uses the winning config automatically.
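A model race reduces to: run every candidate config on the same cases, score them, keep the winner as the profile. A local sketch under assumed names; the function mirrors the `raceModels` feature in spirit, but the body and signature are illustrative, not the actual API.

```javascript
// Hypothetical sketch of a model race: score each candidate config on the
// same test cases and return the winner, which would be stored as a profile.
function raceModels(candidates, cases, scoreCase) {
  const results = candidates.map((candidate) => ({
    candidate,
    score: cases.reduce((sum, c) => sum + scoreCase(candidate, c), 0) / cases.length,
  }));
  results.sort((a, b) => b.score - a.score);
  return { best: results[0].candidate, leaderboard: results };
}
```

The same shape covers a temperature sweep: the candidates are one model at several temperatures.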
For production use, functions can go through a release pipeline. Nothing goes live without passing your quality bar.
New functions are callable immediately for testing. Responses include "draft": true so you know you're in sandbox mode. Iterate freely.
Run :validate to check quality. Schema validation confirms the output shape. Semantic scoring tests every case against your rules. Both must pass your gate. Use it in CI to fail the build.
Quality gate passes — immutable version tagged. Pinned contract, pinned model. Roll back to any previous version if something regresses. Stable forever.
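Immutable versioning amounts to an append-only registry: releasing appends a frozen snapshot, and rollback reads an older entry rather than mutating anything. A minimal sketch with hypothetical names, not the platform's storage layer:

```javascript
// Hypothetical sketch of immutable release versions: each release appends a
// frozen snapshot; rolling back means reading an older entry, never editing one.
function createRegistry() {
  const versions = [];
  return {
    release(contract, model) {
      const snapshot = Object.freeze({ version: versions.length + 1, contract, model });
      versions.push(snapshot);
      return snapshot.version;
    },
    get(version) {
      return versions[version - 1]; // pinned: same contract, same model, forever
    },
    latest() {
      return versions[versions.length - 1];
    },
  };
}
```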
One developer described what they needed and provided 8 test cases. Here's what the autonomous loop did.
Seed instruction: "Extract line items from this invoice." — one sentence. Missed currency fields, broke on multi-page invoices, inconsistent output structure.
System rewrote instructions 4 times. Generated 6 scoring rules from examples. Final instructions: 340 words, explicit about edge cases. Selected gpt-4o via raceModels (beat Sonnet by 4% on this task).
You're running production traffic through this. Here's how we make that safe.
We log requestId, function name, model, latency, and token count. We do not log your input data, output data, or API keys. The server is a stateless proxy — your payloads pass through and are never stored.
Every response includes a requestId. Enable "trace": true on any call to get the full prompt, model selection reasoning, and scores — for that request only, returned to you, not stored. Replay any request against a pinned version.
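A traced call would then just add the flag to the request body from the earlier `/run` example. The field name is taken from the description above; the exact request shape may differ:

```
POST /functions/extract-invoices/run
{ "input": { "text": "Invoice #1234..." }, "trace": true }
```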
The system picks the best model based on race results. Pin a specific model when you need exact reproducibility. Every response tells you exactly which model answered. Full control when you want it, smart defaults when you don't.
The system asks you for real examples before generating scoring rules. You review and approve the rules before they're used. The evaluator is grounded in your judgment — not AI grading its own homework.
Tag any call with projectId, traceId, and custom tags. The system adds functionId automatically. Every usage response carries these back — so you can group costs by project, trace requests across systems, and filter analytics by any dimension.
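Grouping costs by project then reduces to a plain fold over the usage records each response carries back. The record shape here is assumed from the fields named above (`projectId`, `tokens`), not taken from a documented schema:

```javascript
// Hypothetical sketch: roll up token usage by projectId from tagged usage
// records. Record shape is assumed from the fields described above.
function costsByProject(usageRecords) {
  return usageRecords.reduce((totals, record) => {
    const key = record.projectId ?? "untagged";
    totals[key] = (totals[key] ?? 0) + record.tokens;
    return totals;
  }, {});
}
```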
Check your OpenRouter balance and generation history, or pull OpenAI usage and cost data — all from the same API. Filter by date, model, project, or function. Your provider keys, your data, proxied directly. No separate dashboards needed.
The full platform is free with your own inference key. A managed Pro tier is on the roadmap.
Rate limits are reported through the standard X-RateLimit-Remaining and X-RateLimit-Reset response headers.
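Clients can use those headers to back off. A minimal sketch, assuming X-RateLimit-Reset carries a Unix timestamp in seconds (verify the actual semantics against the API; some services send seconds-until-reset instead):

```javascript
// Sketch: how long to wait before the next call, based on the standard
// rate-limit headers. Assumes X-RateLimit-Reset is a Unix timestamp in
// seconds; header keys are lowercased as Node's HTTP clients do.
function msUntilReset(headers, nowMs) {
  const remaining = Number(headers["x-ratelimit-remaining"]);
  if (remaining > 0) return 0; // quota left, call immediately
  const resetAtSec = Number(headers["x-ratelimit-reset"]);
  return Math.max(0, resetAtSec * 1000 - nowMs);
}
```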
Describe what you need. The system builds it. Call it like any function. That's it.