A production-ready pattern for AI in WordPress: async jobs, signed webhooks, and external workers

Why this pattern
– WordPress is great at routing and rendering, not long-running I/O.
– AI calls are slow, variable, and expensive; they need retries, quotas, and tracing.
– The solution: push jobs to an external worker and accept results via signed webhooks.

Architecture (high level)
– Client (WP admin or theme) submits an AI request to a WP REST route.
– WordPress writes a job row (pending), enqueues to an external queue (or HTTP to a worker gateway).
– Worker (Python/Node) pulls the job, calls the AI provider, then POSTs a signed webhook back to WordPress.
– WordPress verifies the signature, stores result, and invalidates relevant cache.
– Frontend polls or uses SSE/WS via a lightweight proxy for updates.

Database schema (custom table)
– wp_ai_jobs
– id (bigint PK)
– user_id (bigint)
– status (enum: pending, running, succeeded, failed)
– input_hash (char(64)) for idempotency
– request_json (longtext)
– result_json (longtext, nullable)
– error_text (text, nullable)
– created_at, updated_at (datetime)
– idempotency_key (varchar(64), unique)
– webhook_ts (datetime, nullable)

Create the table on plugin activation
– dbDelta with utf8mb4, proper indexes:
– INDEX status_created (status, created_at)
– UNIQUE idempotency_key (idempotency_key)
– INDEX input_hash (input_hash)

Plugin structure (minimal)
– ai-integration/
– ai-integration.php (bootstrap, routes, activation)
– includes/
– class-ai-controller.php (REST endpoints)
– class-ai-webhook.php (webhook verifier)
– class-ai-repo.php (DB access)
– class-ai-queue.php (enqueue out to worker)
– helpers.php (crypto, validation)
– Do not store secrets in options; put them in wp-config.php.

Secrets and config (wp-config.php)
– define('AI_WORKER_URL', 'https://worker.example.com/jobs');
– define('AI_WEBHOOK_SECRET', 'base64-32-bytes');
– define('AI_JWT_PRIVATE_KEY', '-----BEGIN PRIVATE KEY-----…');
– define('AI_QUEUE_TIMEOUT', 2); // seconds for outbound enqueue

REST endpoint: create job (POST /wp-json/ai/v1/jobs)
– Validate capability (logged-in or signed public token).
– Build idempotency_key from client or hash(input_json + user_id + model).
– Insert row (pending).
– Enqueue to worker:
– POST to AI_WORKER_URL with signed JWT (kid, iat, exp, sub=user_id, jti=idempotency_key).
– Timeout <= 2s. If enqueue fails, leave job pending; a retry worker (Action Scheduler) can re-enqueue.
– Return { job_id, status: "pending" }.
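The derived-key variant above (hash of input + user + model) should be deterministic so retries map to the same job. A minimal sketch in Python (the worker's language in this post); the function name and canonical-JSON normalization are illustrative, not part of the plugin API:

```python
import hashlib
import json

def idempotency_key(input_json: dict, user_id: int, model: str) -> str:
    """Deterministic key: identical input + user + model always maps to one job."""
    # Canonicalize JSON so key order and whitespace don't change the hash
    canon = json.dumps(input_json, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256(f'{canon}|{user_id}|{model}'.encode()).hexdigest()
```

Sorting keys matters: clients that serialize the same request with different key order would otherwise produce duplicate jobs.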

Example: tiny enqueue
– Headers: Authorization: Bearer &lt;JWT&gt;
– Body: { job_id, idempotency_key, request: {…}, callback_url: "https://site.com/wp-json/ai/v1/webhook" }

Webhook endpoint: receive result (POST /wp-json/ai/v1/webhook)
– Require HMAC-SHA256 signature header: X-AI-Signature: base64(hmac(secret, body))
– Require idempotency_key and job_id in body.
– Verify:
– Constant-time compare HMAC.
– Check timestamp drift <= 2 minutes (X-AI-Timestamp).
– Enforce replay guard: cache "webhook:{jti}" in Redis for 10m.
– Update row (status to succeeded/failed, set result_json or error_text, webhook_ts).
– Return 204.
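In production the replay guard is a single Redis `SET webhook:{jti} 1 NX EX 600`. A minimal in-memory sketch of the same semantics (class name hypothetical), useful for unit-testing the handler:

```python
import time

class ReplayGuard:
    """In-memory stand-in for the Redis SET NX + TTL replay check (sketch only)."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._seen = {}  # jti -> expiry timestamp

    def first_delivery(self, jti: str, now=None) -> bool:
        now = time.time() if now is None else now
        # Drop expired entries (Redis does this for us via EX)
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if jti in self._seen:
            return False  # replayed delivery: reject
        self._seen[jti] = now + self.ttl
        return True
```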

Minimal verification (PHP)
– $rawBody = file_get_contents('php://input');
– $sig = base64_decode($_SERVER['HTTP_X_AI_SIGNATURE'] ?? '');
– $calc = hash_hmac('sha256', $rawBody, AI_WEBHOOK_SECRET, true);
– hash_equals($calc, $sig) or wp_die('invalid sig', '', 403); // known string first, then user input

Frontend polling pattern
– Client gets job_id, then polls GET /wp-json/ai/v1/jobs/{id} every 1–2s (cap at 30s).
– Cache-Control: private, max-age=0. Derive an ETag from updated_at and return 304 Not Modified when nothing changed.
– Optional: stream via SSE proxied through PHP only if your infra supports long-lived requests without PHP-FPM worker starvation.

Idempotency and dedupe
– On create:
– If idempotency_key exists, return existing job.
– Also check input_hash + user_id within time window to reduce duplicates from flaky clients.

Rate limiting
– Per-user sliding window: e.g., 60 jobs/10m.
– Use wp_cache (Redis/Memcached). Key: rl:{user}:{minute-epoch}. Increment and check.
– On limit exceed, 429 with Retry-After.
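The minute-epoch counter scheme can be modeled in a few lines; this Python sketch (class name hypothetical) mirrors incrementing rl:{user}:{minute} keys and summing the last N buckets, which is what the wp_cache-based version does:

```python
import time
from collections import defaultdict

class MinuteBucketLimiter:
    """Sliding window approximated by per-minute counters (sketch of the
    rl:{user}:{minute-epoch} increment-and-check scheme)."""
    def __init__(self, limit: int, window_minutes: int):
        self.limit = limit
        self.window = window_minutes
        self.buckets = defaultdict(int)  # (user, minute) -> count

    def allow(self, user: int, now=None) -> bool:
        minute = int((time.time() if now is None else now) // 60)
        # Sum the counters covering the window, then increment if allowed
        total = sum(self.buckets[(user, m)]
                    for m in range(minute - self.window + 1, minute + 1))
        if total >= self.limit:
            return False
        self.buckets[(user, minute)] += 1
        return True
```

With Redis, each bucket key gets a TTL slightly longer than the window so stale counters expire on their own.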

Background retries
– Action Scheduler job scans pending/running older than N minutes:
– Re-enqueue if no worker ack.
– Mark failed if exceeded retry budget; store error_text.

Security checklist
– Do not accept webhooks without HMAC and timestamp.
– JWT to worker uses short exp (<=60s). Sign with ES256 or RS256; rotate keys quarterly.
– Sanitize and escape all fields when rendering.
– Disable file edits in prod; restrict wp-admin to known IPs if possible.
– Log minimal PII; encrypt sensitive request_json fields at rest if needed (sodium_crypto_secretbox).

Performance considerations
– Never call AI providers inside a WP page render path.
– Outbound enqueue must be non-blocking (<2s). Use Requests::post with short timeouts and no redirects.
– Store only necessary parts of result_json; large blobs to object storage (S3) with signed URLs.
– Use indexes to keep dashboard queries fast; paginate admin list by created_at DESC.
– Cache job summaries with wp_cache_set on read path; invalidate on webhook.

Worker reference (Python, outline)
– Pull from queue, call provider with circuit breaker and retry/backoff (e.g., 100ms→2s jitter).
– On completion, POST result to callback_url with:
– Headers: X-AI-Signature, X-AI-Timestamp
– Body: { job_id, idempotency_key, status, result_json, usage: {tokens, ms} }
– Keep results small; upload big artifacts elsewhere first.
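The worker's signing step can be sketched as follows, matching what the plugin verifies: base64(HMAC-SHA256(secret, raw body)) plus a separate timestamp header. The function name is illustrative; the critical detail is signing the exact bytes that go on the wire:

```python
import base64
import hashlib
import hmac
import json
import time

def signed_webhook_request(secret: bytes, payload: dict):
    """Build headers + body for the result webhook.
    X-AI-Signature = base64(HMAC-SHA256(secret, raw_body)), as the plugin expects."""
    body = json.dumps(payload, separators=(',', ':')).encode()  # sign these exact bytes
    sig = base64.b64encode(hmac.new(secret, body, hashlib.sha256).digest()).decode()
    headers = {
        'Content-Type': 'application/json',
        'X-AI-Signature': sig,
        'X-AI-Timestamp': str(int(time.time())),  # WP rejects drift > 2 minutes
    }
    return headers, body
```

Send `body` verbatim (e.g., `requests.post(callback_url, data=body, headers=headers)`); re-serializing it would invalidate the signature.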

Minimal job table index DDL
– INDEX status_created (status, created_at)
– INDEX user_created (user_id, created_at)
– UNIQUE idempotency_key (idempotency_key)

Observability
– Add a request_id to all flows; return it to client.
– Store provider latency, tokens, and error codes in result_json. Useful for cost/perf dashboards.
– Emit Server-Timing headers on job reads: worker;dur=123,provider;dur=456.

Admin UI ideas
– List jobs with filters (status, user, model).
– Re-enqueue button (capability checked).
– Export CSV of usage by date/user.

Deployment checklist
– HTTPS everywhere; verify real client IP behind any CDN.
– Set AI_WEBHOOK_SECRET via environment, not version control.
– Protect webhook with allowlist of worker IPs if static.
– Enable object cache. Prefer Redis with persistence.
– Load test: 200 req/s create → ensure PHP-FPM pool and DB connections stay healthy.
– Back up the table and rotate old rows to cold storage monthly.

What to avoid
– Synchronous AI calls in templates.
– Storing provider keys in options.
– Webhooks without signature or timestamp.
– Unbounded job payload sizes.

This pattern scales from small sites to high-traffic publishers, keeps your PHP requests fast, and centralizes reliability and security where they belong: in the worker and webhook boundary.

A production-grade pattern for AI endpoints in WordPress (secure proxy, caching, rate limits, observability)

This post shows a real, production-ready pattern for running AI inference from WordPress without exposing API keys to the browser. We’ll build a secure REST endpoint, add caching and rate limits, handle retries and webhooks, and log everything for observability.

Use case examples:
– Generate product descriptions or summaries from authenticated admin screens
– Enrich form submissions (classify, route, extract fields)
– Power a custom block or dashboard tool that fetches AI results server-side

High-level architecture
– Frontend (block/admin page) → WP REST endpoint (server) → AI provider (OpenAI/Anthropic/etc.)
– Caching at the WordPress layer (transient or object cache)
– Rate limiting per user/site to prevent abuse
– Background jobs for long-running tasks via Action Scheduler
– Optional webhooks from AI provider back to WordPress
– Audit logs stored in a custom table with PII minimization

Prerequisites
– WordPress 6.4+
– PHP 8.1+
– Persistent object cache (Redis/Memcached) recommended
– Action Scheduler plugin or equivalent job runner
– Environment configuration for secrets (wp-config.php or environment variables)

1) Minimal plugin scaffold
Create wp-content/plugins/ai-secure-proxy/ai-secure-proxy.php

&lt;?php
/**
 * Plugin Name: AI Secure Proxy
 * Description: Server-side AI proxy with caching, rate limits, and logging.
 */
if (!defined('ABSPATH')) exit;

class AI_Secure_Proxy {
    // Illustrative defaults; tune per site.
    const NAMESPACE  = 'ai-secure-proxy/v1';
    const CAPABILITY = 'edit_posts';
    const OPT_PREFIX = 'aisp_';
    const RATE_LIMIT = 100;  // requests per user per hour
    const CACHE_TTL  = 3600; // seconds

    public function __construct() {
        register_activation_hook(__FILE__, [$this, 'activate']);
        add_action('rest_api_init', [$this, 'register_routes']);
    }

    public function activate() {
        global $wpdb;
        $table = $wpdb->prefix . 'aisp_logs';
        $charset = $wpdb->get_charset_collate();
        $sql = "CREATE TABLE IF NOT EXISTS $table (
            id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
            user_id BIGINT UNSIGNED NULL,
            route VARCHAR(191) NOT NULL,
            req_hash CHAR(64) NOT NULL,
            tokens_in INT UNSIGNED DEFAULT 0,
            tokens_out INT UNSIGNED DEFAULT 0,
            duration_ms INT UNSIGNED DEFAULT 0,
            status_code INT DEFAULT 0,
            error TEXT NULL,
            created_at DATETIME DEFAULT CURRENT_TIMESTAMP
        ) $charset;";
        require_once ABSPATH . 'wp-admin/includes/upgrade.php';
        dbDelta($sql);
    }

    public function register_routes() {
        register_rest_route(self::NAMESPACE, '/infer', [
            'methods' => WP_REST_Server::CREATABLE,
            'permission_callback' => [$this, 'can_access'],
            'callback' => [$this, 'handle_infer'],
            'args' => [
                'prompt' => ['required' => true, 'type' => 'string', 'minLength' => 1, 'maxLength' => 8000],
                'model' => ['required' => false, 'type' => 'string', 'default' => 'gpt-4o-mini'],
                'cache' => ['required' => false, 'type' => 'boolean', 'default' => true],
            ],
        ]);

        register_rest_route(self::NAMESPACE, '/webhook', [
            'methods' => WP_REST_Server::CREATABLE,
            'permission_callback' => '__return_true',
            'callback' => [$this, 'handle_webhook'],
        ]);
    }

    public function can_access(WP_REST_Request $req) {
        // Require a logged-in user with the capability. For public use, verify a signed HMAC header instead.
        return current_user_can(self::CAPABILITY);
    }

    private function rate_key($user_id) {
        return self::OPT_PREFIX . 'rate_' . $user_id . '_' . gmdate('YmdH');
    }

    private function is_rate_limited($user_id) {
        $key = $this->rate_key($user_id);
        $count = (int) wp_cache_get($key, '', false, $found);
        if (!$found) $count = (int) get_option($key, 0);
        return $count >= self::RATE_LIMIT;
    }

    private function bump_rate($user_id) {
        $key = $this->rate_key($user_id);
        $count = (int) get_option($key, 0) + 1;
        update_option($key, $count, false); // note: purge stale hourly options via a cron task
        wp_cache_set($key, $count, '', 3600);
    }

    public function handle_infer(WP_REST_Request $req) {
        $user_id = get_current_user_id();
        if ($this->is_rate_limited($user_id)) {
            return new WP_Error('rate_limited', 'Rate limit exceeded', ['status' => 429]);
        }

        $prompt = trim($req->get_param('prompt'));
        $model = sanitize_text_field($req->get_param('model'));
        $use_cache = (bool) $req->get_param('cache');

        // Hash normalized inputs so identical requests share a cache entry
        $req_hash = hash('sha256', wp_json_encode(['m' => $model, 'p' => $prompt]));

        $cache_key = self::OPT_PREFIX . 'c_' . $req_hash;
        if ($use_cache) {
            $cached = wp_cache_get($cache_key);
            if ($cached !== false) {
                $this->log($user_id, '/infer', $req_hash, 0, 0, 1, 200, null);
                return new WP_REST_Response(['cached' => true, 'result' => $cached], 200);
            }
        }

        $start = microtime(true);
        $resp = $this->call_ai_provider($model, $prompt);
        $duration_ms = (int) round((microtime(true) - $start) * 1000);

        if (is_wp_error($resp)) {
            $this->log($user_id, '/infer', $req_hash, 0, 0, $duration_ms, 500, $resp->get_error_message());
            return $resp;
        }

        $result = [
            'text' => $resp['text'] ?? '',
            'tokens_in' => $resp['tokens_in'] ?? 0,
            'tokens_out' => $resp['tokens_out'] ?? 0,
            'model' => $model,
        ];

        if ($use_cache && !empty($result['text'])) {
            wp_cache_set($cache_key, $result, '', self::CACHE_TTL);
        }

        $this->bump_rate($user_id);
        $this->log($user_id, '/infer', $req_hash, (int) $result['tokens_in'], (int) $result['tokens_out'], $duration_ms, 200, null);

        return new WP_REST_Response(['cached' => false, 'result' => $result], 200);
    }

    private function call_ai_provider($model, $prompt) {
        // Secrets come from the environment or wp-config, never the DB
        $api_key = getenv('OPENAI_API_KEY') ?: (defined('OPENAI_API_KEY') ? OPENAI_API_KEY : '');
        if (!$api_key) return new WP_Error('config_error', 'Missing AI API key', ['status' => 500]);

        $body = [
            'model' => $model,
            'input' => $prompt,
        ];

        $attempts = 0;
        $max_attempts = 3;
        $last_err = null;

        while ($attempts < $max_attempts) {
            $attempts++;
            // Endpoint and payload shape vary by provider; adjust for yours
            $response = wp_remote_post('https://api.openai.com/v1/responses', [
                'timeout' => 15,
                'headers' => [
                    'Authorization' => 'Bearer ' . $api_key,
                    'Content-Type' => 'application/json',
                ],
                'body' => wp_json_encode($body),
            ]);

            if (is_wp_error($response)) {
                $last_err = $response;
            } else {
                $code = wp_remote_retrieve_response_code($response);
                $data = json_decode(wp_remote_retrieve_body($response), true);

                // Simple provider-agnostic parse; field names depend on the provider
                if ($code >= 200 && $code < 300) {
                    $text = $data['output_text'] ?? ($data['choices'][0]['message']['content'] ?? '');
                    $tokens_in = $data['usage']['input_tokens'] ?? 0;
                    $tokens_out = $data['usage']['output_tokens'] ?? 0;
                    return ['text' => $text, 'tokens_in' => $tokens_in, 'tokens_out' => $tokens_out];
                }

                // Retry on 429/5xx with exponential backoff
                if (in_array($code, [429, 500, 502, 503, 504], true)) {
                    $last_err = new WP_Error('ai_retry', 'Transient AI error: ' . $code);
                } else {
                    return new WP_Error('ai_error', 'AI provider error: ' . $code, ['status' => $code, 'data' => $data]);
                }
            }

            // Backoff
            usleep((int) (pow(2, $attempts - 1) * 200000)); // 200ms, 400ms, 800ms
        }

        return $last_err ?: new WP_Error('ai_error', 'Unknown AI error', ['status' => 500]);
    }

    public function handle_webhook(WP_REST_Request $req) {
        // Optional: verify the HMAC signature from the provider
        $shared = getenv('AI_WEBHOOK_SECRET') ?: '';
        $sig = $req->get_header('x-aisig');
        if ($shared && $sig) {
            $calc = hash_hmac('sha256', $req->get_body(), $shared);
            if (!hash_equals($calc, $sig)) {
                return new WP_Error('forbidden', 'Invalid signature', ['status' => 403]);
            }
        }
        // Process event (store job result, update post meta, etc.)
        do_action('aisp_webhook_received', $req->get_json_params());
        return new WP_REST_Response(['ok' => true], 200);
    }

    private function log($user_id, $route, $req_hash, $tin, $tout, $dur_ms, $code, $error) {
        global $wpdb;
        $table = $wpdb->prefix . 'aisp_logs';
        $wpdb->insert($table, [
            'user_id' => $user_id ?: null,
            'route' => $route,
            'req_hash' => $req_hash,
            'tokens_in' => $tin,
            'tokens_out' => $tout,
            'duration_ms' => $dur_ms,
            'status_code' => $code,
            'error' => $error,
        ], ['%d', '%s', '%s', '%d', '%d', '%d', '%d', '%s']);
    }
}
new AI_Secure_Proxy();

2) Secure frontend usage
– Admin page or block should never expose provider keys.
– Call the endpoint with wp.apiFetch or fetch, including nonce.
Example:

const res = await wp.apiFetch({
  path: '/ai-secure-proxy/v1/infer',
  method: 'POST',
  data: { prompt, model: 'gpt-4o-mini', cache: true }
});

3) Authentication options
– Logged-in only (capability-based): safest for back-office tools.
– Public usage: require HMAC-signed requests.
– Client sends X-Sign: HMAC_SHA256(body, PUBLIC_CLIENT_SECRET).
– Server validates before processing.
– Consider IP allowlists for server-to-server.

4) Caching strategy
– Key by normalized inputs and model.
– Use persistent object cache for speed and TTL-based eviction.
– Invalidate on model/version changes or when content sources update.

5) Rate limiting
– Example above uses per-user/hour counters.
– For high-traffic public endpoints, use Redis INCR with TTL or a token bucket.
– Return 429 with Retry-After.
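The token-bucket alternative can be modeled in a few lines. In production the state would live in Redis (often behind a Lua script for atomicity) rather than in-process; this sketch just shows the refill arithmetic:

```python
class TokenBucket:
    """Token bucket: allows bursts up to `capacity`, refilled at `rate_per_s`."""
    def __init__(self, capacity: float, rate_per_s: float):
        self.capacity = capacity
        self.rate = rate_per_s
        self.tokens = capacity  # start full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Compared with fixed-hour counters, a bucket smooths bursts: a client can spend its burst allowance immediately but then proceeds at the steady refill rate.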

6) Background jobs
– For long prompts or batch operations, offload to Action Scheduler:
– POST creates a job and returns job_id.
– Worker picks up job, calls AI, stores result in post meta or custom table.
– Optional webhook updates job status when provider supports async.

7) Webhooks
– Verify HMAC signature.
– Idempotency: store last seen event IDs to avoid duplicate processing.
– Enqueue work; do not block the webhook handler.

8) Observability
– Log request hash, tokens, duration, status code.
– Build admin UI with filters by user/date/status.
– Emit wp-json logs or use error_log for quick triage in staging.
– Add metrics: P95 latency, cache hit ratio, 4xx/5xx rates.

9) Security checklist
– Store API keys in environment or wp-config, never the DB.
– Enforce capability or signed requests.
– Validate and length-limit inputs.
– Strip PII where possible before sending to provider.
– Set conservative timeouts; implement retries with backoff.
– Disable indexing for admin tools; use HTTPS everywhere.

10) Performance notes
– Keep endpoint thin; avoid loading unnecessary WP subsystems.
– Enable OPcache and a persistent object cache.
– Batch multiple small prompts into one call when possible.
– Circuit breaker: if upstream is failing consistently, short-circuit for a cooldown period.
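The circuit-breaker point can be sketched as a tiny state machine (names illustrative): after N consecutive failures it short-circuits calls for a cooldown, then admits a trial request (the classic "half-open" state):

```python
class CircuitBreaker:
    """Open after `threshold` consecutive failures; short-circuit for
    `cooldown` seconds, then allow a trial request (half-open)."""
    def __init__(self, threshold: int, cooldown: float):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None = closed

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown:
            return True  # half-open: let one trial through
        return False

    def record(self, success: bool, now: float) -> None:
        if success:
            self.failures = 0
            self.opened_at = None  # close the circuit
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = now  # open the circuit
```

Wrap the provider call with `allow()` before, `record()` after; when the circuit is open, return a cached answer or a 503 immediately instead of burning 15-second timeouts.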

11) Deployment tips
– Separate staging keys and webhooks.
– Run a load test (k6/Locust) with realistic prompts.
– Monitor Redis hit ratio and PHP-FPM slow logs.
– Backup logs table and rotate old entries.

Extending the pattern
– Add SSE endpoint for streaming tokens to an admin tool.
– Implement content moderation via a pre-check model before saving results.
– Build a “dry run” mode for previews without committing changes.

This approach keeps AI logic server-side, with guardrails for cost, performance, and security—ready for production in WordPress environments.

Streaming AI Chat in WordPress with a Django Backend, Redis Caching, and SSE (Production-Ready)

This post shows how to implement a production-ready AI chat in WordPress that streams responses with low latency and strong security. The setup:

– WordPress plugin renders the chat UI and signs requests.
– Django backend handles model calls, streams with SSE, caches with Redis, and enforces auth/rate limits.
– No API keys in the browser or WordPress DB.

Architecture
– WordPress (frontend): Shortcode [ai_chat], enqueue JS, get WP nonce, fetch a signed JWT from WP to call the backend.
– Django API: /v1/chat/stream (SSE), validates JWT, checks rate limits, optional moderation, calls OpenAI (or other LLM), streams tokens.
– Redis: Caching prompt fingerprints and user quotas, storing partial transcripts if needed.
– Nginx/Proxy: Keep-Alive, proxy_buffering off for SSE, generous timeouts.

WordPress plugin (minimal, secure)
– Store no model keys in WordPress.
– Use a secret in wp-config.php for signing short-lived JWTs.
– Nonce to mitigate CSRF.

wp-config.php
define('AIGUY_SSO_JWT_SECRET', 'rotate-this-64-bytes');
define('AIGUY_BACKEND_BASE', 'https://api.example.com');

Plugin file: wp-content/plugins/aiguy-chat/aiguy-chat.php
<?php
/**
* Plugin Name: AI Guy Chat
* Description: Streaming AI chat via secure backend.
*/
if (!defined('ABSPATH')) exit;

add_action('init', function () { add_shortcode('ai_chat', 'aiguy_chat_shortcode'); });

function aiguy_chat_shortcode() {
    if (!is_user_logged_in()) return '&lt;p&gt;Please log in to use chat.&lt;/p&gt;';
    $nonce = wp_create_nonce('aiguy_chat');
    $payload = [
        'sub' => get_current_user_id(),
        'roles' => wp_get_current_user()->roles,
        'iat' => time(),
        'exp' => time() + 300,
        'site' => get_site_url(),
    ];
    $jwt = aiguy_jwt_encode($payload, AIGUY_SSO_JWT_SECRET);
    wp_enqueue_script('aiguy-chat-js', plugin_dir_url(__FILE__).'chat.js', [], '1.0', true);
    wp_localize_script('aiguy-chat-js', 'AIGUYCFG', [
        'nonce' => $nonce,
        'jwt' => $jwt,
        'api' => AIGUY_BACKEND_BASE.'/v1/chat/stream',
    ]);
    ob_start(); ?>
    &lt;!-- minimal markup; the element IDs match what chat.js expects --&gt;
    &lt;div id="aiguy-chat"&gt;
        &lt;div id="aiguy-log"&gt;&lt;/div&gt;
        &lt;form id="aiguy-form"&gt;
            &lt;input id="aiguy-input" type="text" autocomplete="off" /&gt;
            &lt;button type="submit"&gt;Send&lt;/button&gt;
        &lt;/form&gt;
    &lt;/div&gt;
    &lt;?php return ob_get_clean();
}

function aiguy_jwt_encode($payload, $secret) {
    $h = rtrim(strtr(base64_encode(json_encode(['alg' => 'HS256', 'typ' => 'JWT'])), '+/', '-_'), '=');
    $p = rtrim(strtr(base64_encode(json_encode($payload)), '+/', '-_'), '=');
    $s = rtrim(strtr(base64_encode(hash_hmac('sha256', "$h.$p", $secret, true)), '+/', '-_'), '=');
    return "$h.$p.$s";
}

JS: wp-content/plugins/aiguy-chat/chat.js
(function(){
    const log = document.getElementById('aiguy-log');
    const form = document.getElementById('aiguy-form');
    const input = document.getElementById('aiguy-input');

    function append(role, text) {
        const el = document.createElement('div');
        el.className = role;
        el.textContent = text;
        log.appendChild(el);
        log.scrollTop = log.scrollHeight;
    }

    form.addEventListener('submit', async (e) => {
        e.preventDefault();
        const q = input.value.trim();
        if (!q) return;
        append('user', q);
        input.value = '';

        const ctrl = new AbortController();
        const res = await fetch(AIGUYCFG.api, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': 'Bearer ' + AIGUYCFG.jwt,
                'X-WP-Nonce': AIGUYCFG.nonce
            },
            body: JSON.stringify({ message: q }),
            signal: ctrl.signal
        });
        if (!res.ok) { append('error', 'Request failed.'); return; }

        const reader = res.body.getReader();
        const dec = new TextDecoder();
        let buf = '';
        append('assistant', '');
        const last = log.lastChild;

        while (true) {
            const {value, done} = await reader.read();
            if (done) break;
            buf += dec.decode(value, {stream: true});
            const lines = buf.split('\n');
            buf = lines.pop(); // keep any incomplete line for the next chunk
            for (const line of lines) {
                if (!line.startsWith('data:')) continue;
                const payload = line.slice(5).trim();
                if (payload === '[DONE]') continue;
                try {
                    const chunk = JSON.parse(payload);
                    last.textContent += chunk.delta || '';
                } catch (_) {}
            }
        }
    });
})();

Django backend (SSE with OpenAI, Redis, and limits)
– Use Django or Django + Django Ninja/DRF. Example below uses plain Django view.
– Keep OpenAI key only on the server.
– Validate JWT from WordPress using the shared secret.

settings.py (env-driven)
import os

AIGUY_JWT_SECRET = os.getenv('AIGUY_JWT_SECRET')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
REDIS_URL = os.getenv('REDIS_URL', 'redis://127.0.0.1:6379/0')
ALLOWED_ORIGINS = ['https://www.example.com']

urls.py
from django.urls import path
from .views import chat_stream
urlpatterns = [ path(‘v1/chat/stream’, chat_stream), ]

views.py
import json, time, hashlib, hmac, base64, os
from django.http import StreamingHttpResponse, HttpResponseForbidden, JsonResponse
from django.views.decorators.csrf import csrf_exempt
import redis
import openai

r = redis.from_url(os.getenv('REDIS_URL', 'redis://localhost:6379/0'))
openai.api_key = os.getenv('OPENAI_API_KEY')
JWT_SECRET = os.getenv('AIGUY_JWT_SECRET', '')

def verify_jwt(token):
    try:
        h, p, s = token.split('.')
        sig = base64.urlsafe_b64encode(hmac.new(JWT_SECRET.encode(), f'{h}.{p}'.encode(), 'sha256').digest()).rstrip(b'=').decode()
        if not hmac.compare_digest(sig, s):  # constant-time compare
            return None
        payload = json.loads(base64.urlsafe_b64decode(p + '=='))
        if payload.get('exp', 0) < time.time():
            return None
        return payload
    except Exception:
        return None

def sse_iter(model, messages, cache_key, user_id):
    # Serve repeated prompts from cache (a single "delta" carrying the full answer)
    cached = r.get(cache_key)
    if cached:
        yield f"data: {json.dumps({'delta': cached.decode()})}\n\n"
        yield "data: [DONE]\n\n"
        return
    parts = []
    stream = openai.chat.completions.create(model=model, messages=messages, stream=True)
    for event in stream:
        delta = event.choices[0].delta.content if event.choices else None
        if delta:
            parts.append(delta)
            yield f"data: {json.dumps({'delta': delta})}\n\n"
    r.setex(cache_key, 3600, ''.join(parts))
    yield "data: [DONE]\n\n"

@csrf_exempt
def chat_stream(request):
    token = (request.headers.get('Authorization') or '').removeprefix('Bearer ')
    payload = verify_jwt(token)
    if not payload:
        return HttpResponseForbidden('invalid or expired token')
    body = json.loads(request.body or '{}')
    message = (body.get('message') or '').strip()
    if not message:
        return JsonResponse({'error': 'empty message'}, status=400)

    # Prompt fingerprint cache key
    fp = hashlib.sha256(message.encode()).hexdigest()
    cache_key = f'chat:fp:{fp}'

    messages = [
        {"role": "system", "content": "You are a helpful assistant for the site."},
        {"role": "user", "content": message}
    ]

    resp = StreamingHttpResponse(
        sse_iter('gpt-4o-mini', messages, cache_key, payload['sub']),
        content_type='text/event-stream'
    )
    resp['Cache-Control'] = 'no-cache'
    resp['X-Accel-Buffering'] = 'no'  # disable buffering in Nginx
    return resp

Nginx proxy config (SSE safe)
location /v1/chat/stream {
    proxy_pass http://django:8000/v1/chat/stream;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_buffering off;
    proxy_read_timeout 3600;
    chunked_transfer_encoding on;
}

Security notes
– Keys only on backend. Never expose provider keys to WordPress or JS.
– Short-lived JWT (5 minutes). Rotate AIGUY_SSO_JWT_SECRET regularly.
– Validate CORS at the proxy. Pin ALLOWED_ORIGINS.
– Enforce rate limits with Redis. Add IP + user-based buckets.
– Log prompts and token counts with PII scrubbing.

Performance and cost
– Stream with SSE to improve UX latency.
– Cache frequent answers by prompt fingerprint (with TTL).
– Temperature low and max_tokens bounded server-side.
– Use connection pooling for Redis and HTTP.
– Add circuit breakers and provider timeouts.

Observability
– Log request_id, user_id, latency, tokens.
– Push metrics to Prometheus or Grafana via StatsD.
– Store error samples with stack traces and masked content.

Testing checklist
– Verify SSE survives 60+ seconds without proxy buffering.
– Confirm JWT expiration fails as expected.
– Confirm rate limit returns 429 and resets on TTL.
– Load test 100 concurrent chats; monitor CPU, Redis ops/sec, and open file handles.
– Validate caching hits on repeated inputs.

Deployment tips
– Put Django behind Nginx; keep workers async-friendly (e.g., gunicorn with uvicorn workers, or plain uvicorn).
– Set Cloudflare to bypass cache on /v1/chat/stream and increase 524 timeout.
– Autoscale based on concurrent connections, not just CPU.

Extensions
– Add chat history with user scoping in Redis or Postgres.
– Add tools/functions (search, orders) via backend tool calling.
– Add moderation before model call and redact PII.

This pattern keeps WordPress simple and secure, offloads AI to a hardened backend, and delivers fast, streaming responses with production safeguards.

Production-Ready AI Chat for WordPress: Async Architecture, Secure Proxy, and Streaming

Goal
Add an AI assistant to WordPress that is fast, secure, and reliable under load. We’ll avoid long-running PHP, protect API keys, and support streaming. This is a production pattern we’ve deployed in client sites.

Architecture (high level)
– WP Plugin (PHP): UI + REST endpoints, stores minimal state, never holds LLM keys.
– API Gateway/Edge (Cloudflare/AWS): Auth, rate limiting, logging.
– Worker (Python/FastAPI): Orchestrates LLM calls (OpenAI/Azure/Anthropic), tools, retries; pushes results.
– Queue (Redis/SQS): Buffer workloads, smooth spikes.
– Storage: WordPress (postmeta/options) for user-visible data; object store/DB for logs.
– Optional Streaming: SSE via edge worker or reverse proxy, not from PHP-FPM.

Why this pattern
– Security: Keys never live in WP DB or theme code.
– Performance: No long PHP requests; async jobs run outside WordPress.
– Reliability: Queue + retries handle spikes/timeouts.
– Observability: Centralized logs/metrics for LLM latency and errors.

Data flow
1) Browser -> WP REST: user prompt + context (nonce + auth).
2) WP -> API Gateway: signed JWT; enqueue job.
3) Worker -> LLM/tools: executes, streams or finalizes.
4a) Streaming: Gateway proxies SSE to browser.
4b) Async: Worker posts result webhook back to WP; WP stores and notifies client via polling or webhooks.

Security hardening
– Cap checks: Only authenticated users, or signed public sessions with rate limits.
– Nonces for UI actions, WP REST permission_callback.
– JWT from WP to gateway with short TTL; rotated HS256 secret in env, not DB.
– IP allowlist for worker->WP webhooks.
– Sanitize/escape all content rendered in WP.
– Log PII boundaries; redact before sending to LLM.

WordPress plugin skeleton (core pieces)
// Plugin header omitted for brevity
1) Settings (no LLM keys in WP):
– api_gateway_url
– jwt_issuer, jwt_kid
– public_rate_limit/window
– webhook_secret (HMAC shared with worker)

2) Admin settings + capability checks (manage_options).

3) REST routes:
– /ai/v1/chat (POST) -> enqueues
– /ai/v1/webhook (POST) -> receives results (IP + HMAC verified)
– /ai/v1/status (GET) -> poll job state (transients or custom table)

Minimal REST route example (enqueue)
```
add_action('rest_api_init', function() {
    register_rest_route('ai/v1', '/chat', [
        'methods' => 'POST',
        'callback' => 'ai_chat_enqueue',
        'permission_callback' => function($req) { return is_user_logged_in() || ai_allow_public(); }
    ]);
});

function ai_chat_enqueue(WP_REST_Request $req) {
    $prompt = wp_strip_all_tags($req->get_param('prompt') ?? '');
    if (!$prompt) return new WP_Error('bad_request', 'Missing prompt', ['status' => 400]);

    // Rate limit (user or IP)
    if (!ai_rate_ok($req)) return new WP_Error('rate_limited', 'Try later', ['status' => 429]);

    // Create job record (custom table or postmeta)
    $job_id = ai_create_job(['user_id' => get_current_user_id(), 'prompt' => $prompt, 'status' => 'queued']);

    // Sign JWT for gateway
    $jwt = ai_sign_jwt([
        'iss' => get_option('ai_jwt_issuer'),
        'sub' => (string) get_current_user_id(),
        'jti' => $job_id,
        'iat' => time(),
        'exp' => time() + 60
    ]);

    // Send to gateway
    $resp = wp_remote_post(get_option('ai_gateway_url') . '/jobs', [
        'timeout' => 5,
        'headers' => ['Authorization' => "Bearer $jwt", 'Content-Type' => 'application/json'],
        'body' => wp_json_encode([
            'job_id' => $job_id,
            'prompt' => $prompt,
            'context' => ai_collect_context(),
            'webhook' => home_url('/wp-json/ai/v1/webhook')
        ])
    ]);

    if (is_wp_error($resp) || wp_remote_retrieve_response_code($resp) >= 300) {
        ai_update_job($job_id, ['status' => 'error', 'error' => 'enqueue_failed']);
        return new WP_Error('upstream_error', 'Queue unavailable', ['status' => 502]);
    }

    return ['job_id' => $job_id, 'status' => 'queued'];
}
```

Webhook handler (worker -> WP)
```
register_rest_route('ai/v1', '/webhook', [
    'methods' => 'POST',
    'callback' => 'ai_webhook_handler',
    'permission_callback' => '__return_true'
]);

function ai_webhook_handler(WP_REST_Request $req) {
    // Verify HMAC and IP allowlist
    if (!ai_verify_hmac($req)) return new WP_Error('forbidden', 'bad signature', ['status' => 403]);

    $job_id = sanitize_text_field($req->get_param('job_id'));
    $content = wp_kses_post($req->get_param('content') ?? '');
    $usage = $req->get_param('usage') ?? [];

    ai_update_job($job_id, ['status' => 'done', 'content' => $content, 'usage' => $usage]);
    // Optionally cache a hash of the prompt->response
    set_transient('ai_job_' . $job_id, ['status' => 'done'], 600);
    return ['ok' => true];
}
```

Frontend usage (enqueue + poll)
– Enqueue via wp.ajax or fetch to /wp-json/ai/v1/chat.
– Poll /status until done or use SSE channel if implemented at gateway.
– Render with esc_html or safe HTML subset.

Python worker (FastAPI outline)
– Validates JWT (issuer, exp, kid).
– Pushes to queue (Redis/SQS) with retry/backoff.
– Consumes jobs, calls LLM with tools/function-calls as needed.
– Optional SSE: Streams tokens through gateway to client.
– Posts final result to WP webhook with HMAC.

FastAPI snippets
```
@app.post("/jobs")
def enqueue(job: Job, auth=Depends(verify_jwt)):
    q.enqueue(job.dict())
    return {"accepted": True}

def process(job):
    try:
        resp = client.chat.completions.create(
            model=os.getenv("MODEL"),
            messages=[{"role": "user", "content": job["prompt"]}],
            timeout=30
        )
        content = resp.choices[0].message.content
        usage = resp.usage.model_dump() if hasattr(resp, 'usage') else {}
        post_webhook(job["webhook"], job["job_id"], content, usage)
    except Exception as e:
        post_webhook_error(job, str(e))
```

Streaming via SSE (recommended)
– Don’t stream from PHP. Use edge worker that:
– Authenticates client cookie/session with a one-time token from WP.
– Opens SSE to Python worker that proxies LLM token stream.
– Applies per-user rate limits.
– WordPress only issues short-lived stream tokens; never touches LLM socket.

Rate limiting
– Per IP for anonymous, per user_id for logged-in.
– Server-side bucket at gateway (e.g., 30 requests/5m).
– UI throttle + backoff to avoid hammering.

Caching and cost control
– Cache deterministic prompts by normalized hash (strip whitespace, lowercased).
– Store minimal response + usage metrics for analytics.
– Deny-list high-cost paths; enforce max tokens and model allowlist at worker.
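The normalized hash described above (strip whitespace, lowercase) can be sketched directly; the function name is hypothetical, and including the model in the hash avoids serving one model's cached answer for another:

```python
import hashlib

def prompt_fingerprint(prompt: str, model: str) -> str:
    """Cache key for deterministic prompts: lowercase and collapse whitespace
    so trivially different requests share one entry; scope by model."""
    norm = ' '.join(prompt.lower().split())
    return hashlib.sha256(f'{model}:{norm}'.encode()).hexdigest()
```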

Observability
– Centralized logs with job_id correlation across WP, gateway, worker.
– Metrics: queue depth, LLM latency, error rate, token usage per user/site.

Database notes
– For scale, use a custom table ai_jobs (job_id PK, user_id, prompt_hash, status, content MEDIUMTEXT, usage JSON, created_at, updated_at).
– Index status, created_at for cleanup jobs.
– Cron: purge >30 days or anonymize prompt text.

Hardening checklist
– No LLM keys in WordPress DB or code.
– Short JWT TTL; rotate secrets; pin alg and kid.
– Webhook HMAC + IP allowlist.
– Escape output; sanitize HTML.
– Strict CORS only to site origin.
– Disable file edit in wp-config; limit plugin access.

When to choose what
– Small sites: WordPress + API Gateway + single worker (no queue) with short timeouts and SSE.
– Growing traffic: Add queue, retries, and circuit breakers.
– Enterprise: Multi-region gateway, KMS-managed secrets, private networking to WP origin.

Deliverables you can ship today
– WP plugin skeleton above + settings UI.
– FastAPI worker with JWT verify + enqueue + result postback.
– Cloudflare Worker for SSE proxy + rate limit.
– IaC to provision queue, logs, secrets.

Production-ready AI Proxy for WordPress: Django + Redis with Secure Streaming

Why you need a proxy
– Keeps vendor keys off WordPress.
– Centralizes auth, rate limits, caching, and retries.
– Enables streaming and uniform observability across sites.

High-level architecture
– WordPress site(s) → Proxy (Django ASGI) → Provider APIs (OpenAI/Anthropic/etc.)
– Redis for rate limits, idempotency, and response cache.
– Postgres optional for audit logs.
– Cloudflare → NGINX → Uvicorn (Django ASGI).

Security model
– Per-site API key pair: site_id + site_secret (stored in WP).
– Request signature: HMAC-SHA256 over body + timestamp.
– JWT issued by proxy for short-lived sessions (optional).
– Nonce in WordPress UI, capability checks for settings.
– IP allowlist and user-agent tagging for WordPress clients.
– Enforce TLS end-to-end.

Django proxy (ASGI) essentials
– Django 5.x, Python 3.11+, Redis, httpx (async), uvicorn.
– Endpoints:
– POST /v1/chat (stream or non-stream)
– POST /v1/embeddings
– GET /v1/models (capability discovery)
– Headers:
– X-Site-ID, X-Timestamp, X-Signature, X-Client-Request-ID

Example models.py (optional audit)
from django.db import models

class InferenceLog(models.Model):
    request_id = models.CharField(max_length=64, db_index=True)
    site_id = models.CharField(max_length=64, db_index=True)
    route = models.CharField(max_length=32)
    prompt_hash = models.CharField(max_length=64, db_index=True)
    provider = models.CharField(max_length=32)
    tokens_in = models.IntegerField(default=0)
    tokens_out = models.IntegerField(default=0)
    status = models.IntegerField()
    elapsed_ms = models.IntegerField()
    created_at = models.DateTimeField(auto_now_add=True)

Rate limiting (Redis, sliding window)
– Key: rl:{site_id}:{route}
– Allow N requests per window, e.g., 60/min, 600/hour.
– Return 429 with Retry-After.

Pseudo-implementation (views.py, chat streaming)
import asyncio, hmac, hashlib, time, json, os
import httpx
from django.http import StreamingHttpResponse, JsonResponse
from django.views.decorators.csrf import csrf_exempt
from django.utils.crypto import constant_time_compare
import redis

r = redis.Redis.from_url(os.environ["REDIS_URL"], decode_responses=False)
PROVIDER_KEY = os.environ["PROVIDER_KEY"]
HMAC_SECRET = os.environ["HMAC_SECRET"].encode()

def verify_sig(raw, ts, sig):
    if abs(int(time.time()) - int(ts)) > 60:
        return False
    mac = hmac.new(HMAC_SECRET, raw + ts.encode(), hashlib.sha256).hexdigest()
    return constant_time_compare(mac, sig)

def rl_allow(site_id, route, limit=60, window=60):
    key = f"rl:{site_id}:{route}"
    now = int(time.time())
    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)
    pipe.zadd(key, {str(now): now})
    pipe.zcard(key)
    pipe.expire(key, window)
    _, _, count, _ = pipe.execute()
    return count <= limit

# ... the chat view body is elided here; inside its streaming generator
# gen(), the upstream call is retried with exponential backoff and ends:
            if attempts > 3:
                yield b'event: error\ndata: {"message":"upstream_failed"}\n\n'
                break
            await asyncio.sleep(backoff)
            backoff *= 2
        yield b'event: done\ndata: {}\n\n'

    if stream:
        return StreamingHttpResponse(gen(), content_type="text/event-stream")
    else:
        # Non-stream path (ck, payload, prompt come from the elided view body)
        async with httpx.AsyncClient(timeout=60) as client:
            resp = await client.post(
                "https://api.openai.com/v1/chat/completions",
                headers={"Authorization": f"Bearer {PROVIDER_KEY}"},
                json={"model": payload.get("model", "gpt-4o-mini"), "messages": prompt},
            )
        data = resp.json()
        r.setex(ck, 60, json.dumps(data))
        return JsonResponse(data, status=resp.status_code)

NGINX (SSE buffering)
– proxy_buffering off;
– proxy_read_timeout 300s;
– add_header Cache-Control no-cache;

Gunicorn/Uvicorn
– uvicorn app.asgi:application --host 0.0.0.0 --port 8000 --workers 2 --loop uvloop --http h11

WordPress plugin (minimal)
– Stores Proxy Base URL, Site ID, Site Secret.
– Adds a shortcode [ai_chat] that renders a simple chat box.
– Uses SSE via EventSource to stream responses.
– Nonces for AJAX init; sanitize all options; only admins can edit.

Plugin main file (ai-proxy-chat/ai-proxy-chat.php)
<?php
/*
Plugin Name: AI Proxy Chat
*/
// Reconstructed file header; option/nonce key names are illustrative.
if (!defined('ABSPATH')) exit;

class AIG_Proxy_Chat {
    const OPT = 'aig_proxy_chat';

    public function __construct() {
        add_action('admin_menu', [$this, 'menu']);
        add_action('admin_init', [$this, 'register']);
        add_action('wp_enqueue_scripts', [$this, 'assets']);
        add_shortcode('ai_chat', [$this, 'shortcode']);
        add_action('rest_api_init', function () {
            register_rest_route('aig/v1', '/sig', [
                'methods' => 'POST',
                'permission_callback' => function () { return wp_verify_nonce($_POST['_wpnonce'] ?? '', 'aig_sig'); },
                'callback' => [$this, 'sign']
            ]);
        });
    }
    public function menu() {
        add_options_page('AI Proxy Chat', 'AI Proxy Chat', 'manage_options', 'aig-proxy-chat', [$this, 'settings']);
    }
    public function register() {
        register_setting(self::OPT, self::OPT, ['sanitize_callback' => [$this, 'sanitize']]);
        add_settings_section('main', 'Settings', '__return_false', 'aig-proxy-chat');
        add_settings_field('base_url', 'Proxy Base URL', [$this, 'field'], 'aig-proxy-chat', 'main', ['k' => 'base_url']);
        add_settings_field('site_id', 'Site ID', [$this, 'field'], 'aig-proxy-chat', 'main', ['k' => 'site_id']);
        add_settings_field('site_secret', 'Site Secret', [$this, 'field'], 'aig-proxy-chat', 'main', ['k' => 'site_secret']);
    }
    public function sanitize($v) {
        return [
            'base_url' => esc_url_raw($v['base_url'] ?? ''),
            'site_id' => sanitize_text_field($v['site_id'] ?? ''),
            'site_secret' => sanitize_text_field($v['site_secret'] ?? '')
        ];
    }
    public function field($args) {
        $o = get_option(self::OPT, []);
        $k = $args['k'];
        $type = $k === 'site_secret' ? 'password' : 'text';
        printf('<input type="%s" name="%s[%s]" value="%s" class="regular-text">', $type, self::OPT, esc_attr($k), esc_attr($o[$k] ?? ''));
    }
    public function settings() {
        echo '<div class="wrap"><h1>AI Proxy Chat</h1><form method="post" action="options.php">';
        settings_fields(self::OPT);
        do_settings_sections('aig-proxy-chat');
        submit_button();
        echo '</form></div>';
    }
    public function assets() {
        wp_register_script('aig-chat', plugins_url('chat.js', __FILE__), [], '1.0', true);
        wp_localize_script('aig-chat', 'AIG_CHAT', [
            'nonce' => wp_create_nonce('aig_sig'),
            'sigEndpoint' => rest_url('aig/v1/sig')
        ]);
    }
    public function shortcode() {
        wp_enqueue_script('aig-chat');
        ob_start(); ?>
        <div id="aig-log"></div>
        <form id="aig-form">
            <input id="aig-input" type="text" autocomplete="off">
            <button type="submit">Send</button>
        </form>
        <?php return ob_get_clean();
    }
    public function sign(WP_REST_Request $req) {
        $o = get_option(self::OPT, []);
        $body = $req->get_param('body') ?? '';
        $ts = time();
        $sig = hash_hmac('sha256', $body . $ts, $o['site_secret'] ?? '');
        return ['ts' => $ts, 'sig' => $sig, 'site_id' => $o['site_id'] ?? '', 'base_url' => $o['base_url'] ?? ''];
    }
}
new AIG_Proxy_Chat();

Client JS (ai-proxy-chat/chat.js)
(function(){
    const log = (t) => { const el = document.getElementById('aig-log'); el.innerHTML += t + '<br>'; el.scrollTop = el.scrollHeight; };
    document.addEventListener('submit', async (e) => {
        if (e.target.id !== 'aig-form') return;
        e.preventDefault();
        const input = document.getElementById('aig-input');
        const msg = input.value.trim(); if (!msg) return;
        log('You: ' + msg); input.value = '';

        const body = JSON.stringify({model: 'gpt-4o-mini', stream: true, messages: [{role: 'user', content: msg}]});
        const sigRes = await fetch(AIG_CHAT.sigEndpoint, {method: 'POST', credentials: 'same-origin', headers: {'Content-Type': 'application/x-www-form-urlencoded'}, body: new URLSearchParams({_wpnonce: AIG_CHAT.nonce, body})});
        const {ts, sig, site_id, base_url} = await sigRes.json();

        const url = base_url.replace(/\/+$/, '') + '/v1/chat';
        // Native EventSource cannot send custom headers or a body, so a
        // polyfill (if loaded) is required for the streaming path.
        const es = window.EventSourcePolyfill ? new EventSourcePolyfill(url, {
            headers: {'X-Site-ID': site_id, 'X-Timestamp': String(ts), 'X-Signature': sig, 'Content-Type': 'application/json'},
            payload: body
        }) : null;

        if (es) {
            let acc = '';
            es.onmessage = (ev) => { try {
                const d = JSON.parse(ev.data);
                const delta = d.choices?.[0]?.delta?.content || d.choices?.[0]?.message?.content || '';
                if (delta) { acc += delta; log(delta); }
            } catch (_) {} };
            es.addEventListener('done', () => { log('<em>[done]</em>'); es.close(); });
            es.addEventListener('error', () => { log('<em>Stream error</em>'); es.close(); });
        } else {
            // Fallback: POST then append
            const res = await fetch(url, {method: 'POST', headers: {'X-Site-ID': site_id, 'X-Timestamp': String(ts), 'X-Signature': sig, 'Content-Type': 'application/json'}, body});
            const data = await res.json();
            const text = data.choices?.[0]?.message?.content || '[no content]';
            log(text); log('<em>[done]</em>');
        }
    }, true);
})();

Hardening checklist
– WordPress: escape output, sanitize options, restrict settings to manage_options, use nonces everywhere.
– Proxy: validate JSON schema, enforce token limits, redact PII in logs, cap request size (e.g., 256KB), timeouts + retries, 429/503 behavior.
– NGINX: limit_req by IP as outer guard; set client_max_body_size 512k.
– Redis: use ACLs and TLS; set maxmemory with allkeys-lru for cache eviction.
– Keys: rotate provider keys; per-site secrets; revoke on abuse.
– Observability: request_id header, structured JSON logs, latency and token metrics.

Performance notes
– Streaming path: TTFB ~80–150 ms via proxy; throughput limited by provider stream.
– Non-stream with cache: ~5–15 ms from Redis hit.
– Uvicorn workers scale horizontally; keep WordPress PHP-FPM unchanged.

When to extend
– Add /v1/embeddings with response cache and vector store indexing.
– Add model routing policy and quota per site.
– Add document upload pipeline (signed URLs, antivirus, OCR) before LLM.

Production-Ready Streaming AI Chat in WordPress: REST Endpoint, Nonces, and Redis Rate Limits

What we’re building
– A WordPress plugin that streams AI responses to the browser using Server-Sent Events (SSE)
– Front-end shortcode + minimal JS
– Backend REST endpoint with nonce validation, rate limits in Redis, and provider key isolation
– Works with any SSE-capable model provider (OpenAI, Anthropic, etc.)

High-level architecture
– Browser: Shortcode renders chat UI and script. JS posts user messages to /wp-json/ai/v1/chat and reads streaming tokens via EventSource.
– WordPress plugin: Validates nonce, enforces rate limit, proxies request to the AI provider, streams tokens back.
– Redis: Token bucket per user/IP for rate limiting.
– Secrets: API key injected via environment (wp-config.php), never exposed to front-end.
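The Redis token bucket mentioned above refills linearly up to a capacity; a pure-Python mirror makes the semantics concrete (this is an illustration of the algorithm, matching the Lua script shown later, not the production store):

```python
class TokenBucket:
    """Token bucket: `capacity` tokens, refilled linearly over `refill_ms`
    milliseconds. Each allowed request consumes one token."""
    def __init__(self, capacity: float, refill_ms: float):
        self.capacity, self.refill_ms = capacity, refill_ms
        self.tokens, self.ts = capacity, 0.0

    def allow(self, now_ms: float) -> bool:
        delta = max(0.0, now_ms - self.ts)
        # Refill proportionally to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + self.capacity * delta / self.refill_ms)
        self.ts = now_ms
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Unlike a fixed window, short bursts up to `capacity` are allowed while the long-run rate stays bounded.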

Prereqs
– WordPress 6.4+
– PHP 8.1+
– Redis available (phpredis or Predis). Fallback to transients if Redis isn’t available.
– Web server buffering disabled for the streaming route (notes below).

Plugin: file structure
– wp-content/plugins/ai-stream-chat/ai-stream-chat.php
– wp-content/plugins/ai-stream-chat/assets/chat.js

ai-stream-chat.php (core plugin)
<?php
/*
Plugin Name: AI Stream Chat
*/
// Reconstructed file header; constant values are illustrative.
if (!defined('ABSPATH')) exit;

class AI_Stream_Chat {
    const REST_NAMESPACE = 'ai/v1';
    const REST_ROUTE = 'chat';
    const NONCE_ACTION = 'ai_sc_chat';
    const RATE_LIMIT_BUCKET = 'ai_sc_rl';
    const RATE_LIMIT_CAPACITY = 10;      // requests per refill window
    const RATE_LIMIT_REFILL_MS = 60000;  // 60 s

    public function __construct() {
        add_action('rest_api_init', [$this, 'routes']);
        add_shortcode('ai_chat', [$this, 'shortcode']);
        add_action('wp_enqueue_scripts', [$this, 'assets']);
    }

    public function assets() {
        $handle = 'ai-sc-js';
        wp_register_script($handle, plugins_url('assets/chat.js', __FILE__), [], '1.0', true);
        wp_localize_script($handle, 'AI_SC', [
            'restUrl' => esc_url_raw(rest_url(self::REST_NAMESPACE . '/' . self::REST_ROUTE)),
            'nonce' => wp_create_nonce(self::NONCE_ACTION),
        ]);
        wp_enqueue_script($handle);
        wp_register_style('ai-sc-css', false);
        wp_enqueue_style('ai-sc-css');
        $css = '.ai-sc{max-width:720px;margin:1rem auto;padding:1rem;border:1px solid #ddd;border-radius:8px}.ai-sc-log{white-space:pre-wrap;font-family:system-ui, -apple-system, Segoe UI, Roboto, Arial;padding:.5rem;height:320px;overflow:auto;background:#fafafa;border:1px solid #eee;border-radius:6px;margin:.5rem 0}.ai-sc-row{display:flex;gap:.5rem}.ai-sc-row input{flex:1;padding:.6rem}.ai-sc-row button{padding:.6rem 1rem}';
        wp_add_inline_style('ai-sc-css', $css);
    }

    public function shortcode($atts) {
        ob_start(); ?>
        <div class="ai-sc" data-endpoint="<?php echo esc_url(rest_url(self::REST_NAMESPACE . '/' . self::REST_ROUTE)); ?>">
            <div class="ai-sc-log" id="ai-sc-log"></div>
            <div class="ai-sc-row">
                <input id="ai-sc-input" type="text" autocomplete="off">
                <button id="ai-sc-send" type="button">Send</button>
            </div>
        </div>
        <?php return ob_get_clean();
    }

    public function routes() {
        register_rest_route(self::REST_NAMESPACE, '/' . self::REST_ROUTE, [
            'methods' => WP_REST_Server::READABLE,
            'callback' => [$this, 'handle_sse'],
            'permission_callback' => '__return_true',
            'args' => [
                'q' => ['required' => true, 'sanitize_callback' => 'sanitize_text_field'],
                '_wpnonce' => ['required' => true],
            ],
        ]);
    }

    private function get_api_key() {
        // Define in wp-config.php: define('AI_PROVIDER_KEY', 'sk-...');
        return defined('AI_PROVIDER_KEY') ? AI_PROVIDER_KEY : getenv('AI_PROVIDER_KEY');
    }

    private function get_client_id() {
        $ip = $_SERVER['REMOTE_ADDR'] ?? '0.0.0.0';
        $uid = get_current_user_id();
        return $uid ? "u:$uid" : "ip:$ip";
    }

    private function redis() {
        if (class_exists('Redis')) {
            static $r = null;
            if (!$r) {
                $r = new Redis();
                $r->connect(defined('WP_REDIS_HOST') ? WP_REDIS_HOST : '127.0.0.1', defined('WP_REDIS_PORT') ? WP_REDIS_PORT : 6379, 1.0);
                if (defined('WP_REDIS_PASSWORD') && WP_REDIS_PASSWORD) $r->auth(WP_REDIS_PASSWORD);
                if (defined('WP_REDIS_DB')) $r->select(WP_REDIS_DB);
            }
            return $r;
        }
        return null;
    }

    private function token_bucket_allow($key, $capacity, $refill_ms) {
        $now = (int) (microtime(true) * 1000);
        $r = $this->redis();
        if ($r) {
            $lua = "
                local key = KEYS[1]
                local now = tonumber(ARGV[1])
                local capacity = tonumber(ARGV[2])
                local refill_ms = tonumber(ARGV[3])
                local tokens = tonumber(redis.call('HGET', key, 'tokens') or capacity)
                local ts = tonumber(redis.call('HGET', key, 'ts') or now)
                local delta = math.max(0, now - ts)
                local add = (capacity * delta) / refill_ms
                tokens = math.min(capacity, tokens + add)
                local allowed = 0
                if tokens >= 1 then
                    tokens = tokens - 1
                    allowed = 1
                end
                redis.call('HSET', key, 'tokens', tokens, 'ts', now)
                redis.call('PEXPIRE', key, refill_ms)
                return allowed
            ";
            $res = $r->eval($lua, [$key, $now, $capacity, $refill_ms], 1);
            return (bool) $res;
        }
        // Fallback to transients
        $st = get_transient($key);
        if (!$st) $st = ['tokens' => $capacity, 'ts' => $now];
        $delta = max(0, $now - $st['ts']);
        $st['tokens'] = min($capacity, $st['tokens'] + ($capacity * $delta) / $refill_ms);
        $allowed = $st['tokens'] >= 1;
        if ($allowed) $st['tokens'] -= 1;
        $st['ts'] = $now;
        set_transient($key, $st, 60);
        return $allowed;
    }

    public function handle_sse(WP_REST_Request $req) {
        if (!wp_verify_nonce($req->get_param('_wpnonce'), self::NONCE_ACTION)) {
            return new WP_REST_Response(['error' => 'Invalid nonce'], 403);
        }

        $q = trim((string) $req->get_param('q'));
        if ($q === '' || strlen($q) > 500) {
            return new WP_REST_Response(['error' => 'Invalid input'], 400);
        }

        $clientId = $this->get_client_id();
        $bucketKey = self::RATE_LIMIT_BUCKET . ':' . $clientId;
        if (!$this->token_bucket_allow($bucketKey, self::RATE_LIMIT_CAPACITY, self::RATE_LIMIT_REFILL_MS)) {
            return new WP_REST_Response(['error' => 'Rate limited'], 429);
        }

        $apiKey = $this->get_api_key();
        if (!$apiKey) {
            return new WP_REST_Response(['error' => 'Server not configured'], 500);
        }

        // Start SSE stream
        nocache_headers();
        header('Content-Type: text/event-stream');
        header('Cache-Control: no-cache, no-transform');
        header('X-Accel-Buffering: no'); // Nginx
        header('Connection: keep-alive');

        // Disable WP/Apache buffering
        if (function_exists('apache_setenv')) @apache_setenv('no-gzip', '1');
        @ini_set('output_buffering', 'off');
        @ini_set('zlib.output_compression', '0');
        while (ob_get_level() > 0) @ob_end_flush();
        @ob_implicit_flush(1);

        $providerUrl = 'https://api.openai.com/v1/chat/completions';
        $payload = [
            'model' => 'gpt-4o-mini',
            'stream' => true,
            'messages' => [
                ['role' => 'system', 'content' => 'You are a concise assistant.'],
                ['role' => 'user', 'content' => $q]
            ],
            // Optional: 'temperature' => 0.2,
        ];
        $ch = curl_init($providerUrl);
        curl_setopt_array($ch, [
            CURLOPT_HTTPHEADER => [
                'Authorization: Bearer ' . $apiKey,
                'Content-Type: application/json',
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload),
            CURLOPT_WRITEFUNCTION => function ($ch, $data) {
                // Pass provider SSE chunks straight through to the client.
                echo $data;
                flush();
                return strlen($data);
            },
            CURLOPT_TIMEOUT => 60,
            CURLOPT_CONNECTTIMEOUT => 5,
            CURLOPT_RETURNTRANSFER => false,
        ]);

        // Preface event to help client reset UI
        echo "event: start\ndata: {\"ok\":true}\n\n";
        flush();

        curl_exec($ch);
        if (curl_errno($ch)) {
            $err = curl_error($ch);
            echo "event: error\ndata: " . json_encode(['message' => $err]) . "\n\n";
            flush();
        }
        curl_close($ch);

        echo "event: done\ndata: {\"ok\":true}\n\n";
        flush();
        exit; // Important: stop WordPress from adding anything else
    }
}
new AI_Stream_Chat();

assets/chat.js (front-end)
(function(){
    const log = document.getElementById('ai-sc-log');
    const input = document.getElementById('ai-sc-input');
    const btn = document.getElementById('ai-sc-send');
    const endpoint = (window.AI_SC && AI_SC.restUrl) || document.querySelector('.ai-sc')?.dataset?.endpoint;

    function append(text) {
        log.textContent += text;
        log.scrollTop = log.scrollHeight;
    }

    function startStream(q) {
        // Build the REST GET URL with params for SSE
        const url = new URL(endpoint);
        url.searchParams.set('q', q);
        url.searchParams.set('_wpnonce', AI_SC.nonce);

        const es = new EventSource(url.toString());

        es.addEventListener('start', () => {
            append('Assistant: ');
        });

        es.onmessage = (e) => {
            // Provider sends JSON chunks with choices[0].delta.content or similar; we handle plain text fragments too.
            try {
                const data = JSON.parse(e.data);
                // OpenAI stream frames contain: choices[0].delta.content
                const token = data?.choices?.[0]?.delta?.content ?? '';
                if (token) append(token);
            } catch {
                // Some providers send plain text lines
                if (e.data && e.data !== '[DONE]') append(e.data);
            }
        };

        es.addEventListener('done', () => {
            append('\n');
            es.close();
            btn.disabled = false;
            input.disabled = false;
        });

        es.addEventListener('error', (e) => {
            append('\n[Stream error]\n');
            es.close();
            btn.disabled = false;
            input.disabled = false;
        });
    }

    btn?.addEventListener('click', () => {
        const q = (input.value || '').trim();
        if (!q) return;
        btn.disabled = true;
        input.disabled = true;
        append('You: ' + q + '\n');
        startStream(q);
        input.value = '';
    });

    input?.addEventListener('keydown', (e) => {
        if (e.key === 'Enter') btn.click();
    });
})();

wp-config.php secrets
– Add: define('AI_PROVIDER_KEY', 'sk-live-...');
– Optional Redis: define('WP_REDIS_HOST', '127.0.0.1'); define('WP_REDIS_PORT', 6379); define('WP_REDIS_PASSWORD', '...'); define('WP_REDIS_DB', 1);

Nginx/Apache streaming notes
– Nginx: add to location for /wp-json/ai/v1/chat: proxy_buffering off; add_header X-Accel-Buffering no;
– Cloudflare/Proxies: ensure response is not buffered; disable HTML minification for the route.
– PHP-FPM: set fastcgi_buffering off; increase read timeout to 60s if needed.

Security hardening
– Nonce required per request; no provider key in the browser.
– Input sanitized and length-limited.
– Rate limiting per user/IP; tune capacity/refill.
– Consider restricting to logged-in users only, or use an additional secret header from server-rendered page.
– Log abuse via WP_DEBUG_LOG or a dedicated audit table.

Performance considerations
– Streaming prevents large payload waits; flush early, flush often.
– Keep REST handler stateless; no session locks.
– Use short provider timeouts and surface errors.
– If no Redis, transients work but are weaker under load.

Extending to production
– Add system prompt controls in wp-admin.
– Store minimal chat history server-side (post meta or custom table) with TTL.
– Swap providers by moving providerUrl/payload to a strategy class.
– Add billing/quotas if exposing to public users.

Usage
– Activate plugin
– Place [ai_chat] on a page
– Set AI_PROVIDER_KEY
– Test: open page, send a message, verify streaming

Build a Secure AI Inference Proxy for WordPress (Caching, Rate Limits, No Exposed Keys)

Why this pattern
– Keep API keys off the client and out of theme code.
– Centralize validation, rate limits, and model allowlists.
– Add caching and observability for cost and performance control.

High-level architecture
– Client: fetches /wp-json/ai/v1/infer with a nonce.
– WordPress plugin: validates input, enforces rate limits, caches responses, proxies to the LLM vendor with server-side keys.
– Vendor API: OpenAI, Anthropic, or your inference server.
– Optional queue: for long-running generations.

Minimal plugin (secure proxy)
File: wp-content/plugins/ai-inference-proxy/ai-inference-proxy.php

<?php
/*
Plugin Name: AI Inference Proxy
*/
// Reconstructed file header; constant values are illustrative.
if (!defined('ABSPATH')) exit;

class AIGIL_Proxy {
    const RATE_LIMIT = 20;    // requests per window
    const WINDOW_SEC = 300;   // 5 minutes
    const CACHE_TTL = 600;    // 10 minutes

    public function __construct() {
        add_action('rest_api_init', [$this, 'routes']);
    }

    public function routes() {
        register_rest_route('ai/v1', '/infer', [
            'methods' => 'POST',
            'callback' => [$this, 'handle_infer'],
            'permission_callback' => function () {
                // Require an authenticated reader; a catch-all `|| true` here
                // would open the route (and your provider bill) to anonymous traffic.
                return is_user_logged_in() && current_user_can('read');
            },
            'args' => [
                'prompt' => ['type' => 'string', 'required' => true],
                'model' => ['type' => 'string', 'required' => false],
                'max_tokens' => ['type' => 'integer', 'required' => false],
                'temperature' => ['type' => 'number', 'required' => false],
            ],
        ]);
    }
}

    private function get_client_fingerprint(WP_REST_Request $req) {
        $user = get_current_user_id();
        $ip = $_SERVER['REMOTE_ADDR'] ?? '0.0.0.0';
        return $user ? "u:$user" : "ip:$ip";
    }

    private function rate_limit_key($fp) {
        return "aigil_rl_{$fp}";
    }

    private function cache_key($model, $prompt, $params) {
        $h = wp_hash($model . '|' . $prompt . '|' . json_encode($params));
        return "aigil_cache_$h";
    }

    private function hit_rate_limit($key) {
        $now = time();
        $entry = get_transient($key);
        if (!$entry) {
            $entry = ['count' => 1, 'reset' => $now + self::WINDOW_SEC];
            set_transient($key, $entry, self::WINDOW_SEC);
            return false;
        }
        if ($entry['reset'] <= $now) {
            // Window elapsed: start a fresh one.
            $entry = ['count' => 1, 'reset' => $now + self::WINDOW_SEC];
            set_transient($key, $entry, self::WINDOW_SEC);
            return false;
        }
        $entry['count']++;
        set_transient($key, $entry, $entry['reset'] - $now);
        return $entry['count'] > self::RATE_LIMIT;
    }

    private function vendor_request($body) {
        // Read server-side secrets from wp-config.php or environment.
        $api_key = defined('OPENAI_API_KEY') ? OPENAI_API_KEY : getenv('OPENAI_API_KEY');
        if (!$api_key) return new WP_Error('no_key', 'Server not configured', ['status' => 500]);

        // Map to your vendor endpoint. Example: OpenAI Responses API
        $url = 'https://api.openai.com/v1/responses';

        $args = [
            'timeout' => 20,
            'redirection' => 0,
            'blocking' => true,
            'headers' => [
                'Authorization' => 'Bearer ' . $api_key,
                'Content-Type' => 'application/json',
            ],
            'body' => wp_json_encode($body),
        ];

        $resp = wp_remote_post($url, $args);
        if (is_wp_error($resp)) return $resp;

        $code = wp_remote_retrieve_response_code($resp);
        $data = json_decode(wp_remote_retrieve_body($resp), true);

        if ($code >= 400) {
            return new WP_Error('vendor_error', 'Upstream error', [
                'status' => 502,
                'details' => ['code' => $code, 'body' => $data]
            ]);
        }
        return $data;
    }

    public function handle_infer(WP_REST_Request $req) {
        // Input hardening
        $prompt = trim((string) $req->get_param('prompt'));
        if ($prompt === '' || mb_strlen($prompt) > 4000) {
            return new WP_Error('bad_input', 'Invalid prompt', ['status' => 400]);
        }

        $model = (string) ($req->get_param('model') ?: 'gpt-4o-mini');
        $allow = ['gpt-4o-mini', 'gpt-4o', 'o3-mini'];
        if (!in_array($model, $allow, true)) {
            return new WP_Error('model_not_allowed', 'Model not allowed', ['status' => 400]);
        }

        $max_tokens = min(800, max(50, (int) ($req->get_param('max_tokens') ?: 400)));
        $temperature = max(0.0, min(1.0, (float) ($req->get_param('temperature') ?: 0.2)));

        // Rate limiting
        $fp = $this->get_client_fingerprint($req);
        $rl_key = $this->rate_limit_key($fp);
        if ($this->hit_rate_limit($rl_key)) {
            return new WP_Error('rate_limited', 'Too many requests', ['status' => 429]);
        }

        // Cache check
        $cache_params = ['model' => $model, 'max_tokens' => $max_tokens, 'temperature' => $temperature];
        $ckey = $this->cache_key($model, $prompt, $cache_params);
        $cached = wp_cache_get($ckey, 'aigil');
        if ($cached) {
            return rest_ensure_response([
                'cached' => true,
                'model' => $model,
                'output' => $cached,
            ]);
        }

        // Build vendor body (OpenAI Responses format)
        $body = [
            'model' => $model,
            'input' => [
                ['role' => 'system', 'content' => 'Be concise and helpful.'],
                ['role' => 'user', 'content' => $prompt],
            ],
            'max_output_tokens' => $max_tokens,
            'temperature' => $temperature,
        ];

        // Call vendor
        $data = $this->vendor_request($body);
        if (is_wp_error($data)) return $data;

        // Extract text safely (Responses API)
        $text = '';
        if (isset($data['output']) && is_array($data['output'])) {
            foreach ($data['output'] as $item) {
                if (($item['type'] ?? '') === 'message' && isset($item['content'][0]['text'])) {
                    $text .= $item['content'][0]['text'];
                }
            }
        } elseif (isset($data['choices'][0]['message']['content'])) {
            $text = $data['choices'][0]['message']['content'];
        }

        $text = trim((string) $text);

        // Cache store (object cache/Redis-aware)
        if ($text !== '') {
            wp_cache_set($ckey, $text, 'aigil', self::CACHE_TTL);
        }

        // Minimal analytics log (avoid PII)
        error_log(sprintf('[AI_PROXY] model=%s len=%d user=%s', $model, strlen($text), $fp));

        return rest_ensure_response([
            'cached' => false,
            'model' => $model,
            'output' => $text,
        ]);
    }
}

new AIGIL_Proxy();

Server config
– Store keys server-side:
– In wp-config.php: define('OPENAI_API_KEY', 'sk-xxx');
– Or environment: set in Docker/K8s secret, read via getenv.
– Enable persistent object cache (Redis or Memcached) for effective caching.
– Set correct timeouts at PHP-FPM and reverse proxy (Nginx) > plugin timeout.

Front-end usage (nonce + fetch)
1) Enqueue and localize in your theme or plugin:

<?php
// Reconstructed wrapper; the script handle and path are illustrative.
add_action('wp_enqueue_scripts', function () {
    wp_enqueue_script('aigil-client', get_stylesheet_directory_uri() . '/js/ai-client.js', [], '1.0', true);
    wp_localize_script('aigil-client', 'AIGIL', [
        'root' => esc_url_raw(rest_url('ai/v1')),
        'nonce' => wp_create_nonce('wp_rest'),
    ]);
});

2) ai-client.js:

async function askLLM(prompt) {
    const res = await fetch(`${AIGIL.root}/infer`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'X-WP-Nonce': AIGIL.nonce
        },
        body: JSON.stringify({
            prompt,
            model: 'gpt-4o-mini',
            max_tokens: 400,
            temperature: 0.2
        })
    });
    if (!res.ok) {
        const err = await res.json().catch(() => ({}));
        throw new Error(err?.message || `HTTP ${res.status}`);
    }
    return res.json();
}

askLLM("Summarize today's sales KPIs.").then(console.log).catch(console.error);

Production notes
– Validate input length and strip HTML from user content if taking from forms.
– Model allowlist blocks costlier or experimental models by default.
– Rate limits: move to IP + user + UA combo if needed. For high-traffic, use Redis INCR with TTL.
– Caching: hash prompt + params. For authenticated/private use, consider user-scoped keys to avoid data leakage.
– Timeouts/retries: prefer a single attempt with a 20–30s timeout; log upstream latency.
– Logging: ship anonymized logs to a central sink (e.g., CloudWatch, ELK). Never log full prompts with PII.
– Streaming: if you need token streaming, prefer a Node/Python edge worker and forward via Server-Sent Events; WordPress can stream, but proxies and PHP buffers often break it.
– Cost control: apply server-side prompt templates and max token caps. Add a simple quota per user.
– Security: do not expose keys client-side; use HTTPS; audit access to the REST route; consider capability checks for admin-only models.
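The user-scoped cache key mentioned above can be sketched in a few lines (the key layout and function name are illustrative; in the plugin this role is played by `wp_hash` plus a per-user component):

```python
import hashlib, json

def cache_key(user_id: int, model: str, prompt: str, params: dict) -> str:
    """Scope the cache key to the user so one user's cached completion can
    never be served to another; canonical JSON keeps the hash stable
    regardless of dict ordering."""
    canon = json.dumps({"u": user_id, "m": model, "p": prompt, "x": params},
                       sort_keys=True, separators=(",", ":"))
    return "ai_cache_" + hashlib.sha256(canon.encode()).hexdigest()
```

For genuinely public, identical prompts you can drop the `user_id` component and share hits across users; the point is to make that a deliberate choice, not an accident.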

Extending the proxy
– Add tool/function calling with an allowlisted function registry and strict JSON schemas.
– Queue long jobs using Action Scheduler; return a job_id and poll a status route.
– Add vendor adapters (Anthropic, OpenRouter, local) with a small interface for portability.

This proxy pattern keeps your WordPress stack secure, fast, and maintainable while integrating LLM features in production.

Production-Ready AI Chat Endpoint for WordPress: Secure REST API, Token Budgeting, and Queueing

Why this matters
– Most “AI for WordPress” attempts call providers directly from the browser. That leaks keys, invites prompt injection, and breaks at scale.
– This post shows a production-ready pattern: a secure WordPress REST endpoint that enqueues requests, budgets tokens, calls a server-side AI proxy, and returns cached results.

Architecture overview
– Client (WP front end or headless): POST /wp-json/ai/v1/chat with a conversation id and message.
– WordPress plugin:
– Validates nonce/JWT and role capability.
– Enforces per-user and per-route rate limits.
– Computes token budgets and truncates history.
– Enqueues a background job via Action Scheduler.
– Returns a request id; client polls GET /wp-json/ai/v1/chat/{id}.
– AI Proxy (recommended): a small backend (e.g., Django/FastAPI) that holds provider API keys, normalizes providers (OpenAI, Anthropic), handles retries, and redacts PII per policy.
– Storage:
– wp_posts or custom table for ai_requests (status, input hash, output, token usage).
– Transient or object cache for hot responses.
– Optional: SSE/WebSocket via a small Node/Edge worker if you need streaming.
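The token-budgeting step (truncate history, newest-first, until the budget is spent) is the same logic the plugin implements in PHP; a Python sketch of it, with the same word-count heuristic, looks like this:

```python
import json

def estimate_tokens(text: str) -> int:
    # Cheap word-count heuristic, mirroring the plugin's PHP estimate.
    return max(1, int(len(text.split()) * 1.3))

def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the newest messages that fit the budget, preserving order."""
    budget, kept = max_tokens, []
    for msg in reversed(messages):
        cost = estimate_tokens(json.dumps(msg))
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))
```

Walking backwards guarantees the most recent turns survive; the oldest context is what gets dropped first.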

Data model (custom table)
– Table: wp_ai_requests
– id (bigint), user_id, status (pending, running, done, error)
– route (chat, summarize, classify)
– input_hash (sha256 for cache dedupe)
– prompt_json (sanitized, compact)
– result_json
– provider (openai, anthropic)
– tokens_in, tokens_out, cost_usd, created_at, updated_at

Minimal plugin (core pieces)
File: ai-chat-endpoint/ai-chat-endpoint.php
<?php
/*
Plugin Name: AI Chat Endpoint
Description: Secure AI chat REST API with queueing and token budgeting.
Version: 0.1.0
*/

if (!defined('ABSPATH')) exit;

class AIGuyLA_AI_Endpoint {
    const NS = 'ai/v1';

    public function __construct() {
        add_action('rest_api_init', [$this, 'routes']);
        add_action('ai_chat_process_request', [$this, 'process_request'], 10, 1);
    }

    public function routes() {
        register_rest_route(self::NS, '/chat', [
            'methods' => 'POST',
            'callback' => [$this, 'create_request'],
            'permission_callback' => [$this, 'can_use_ai'],
            'args' => [
                'conversation_id' => ['required' => true],
                'message' => ['required' => true],
            ],
        ]);

        register_rest_route(self::NS, '/chat/(?P<id>\d+)', [
            'methods' => 'GET',
            'callback' => [$this, 'get_request'],
            'permission_callback' => [$this, 'can_use_ai'],
        ]);
    }

    public function can_use_ai(WP_REST_Request $req) {
        // Nonce or JWT check; fallback to logged-in capability.
        if (is_user_logged_in() && current_user_can('read')) return true;
        return false;
    }

    private function rate_limited($user_id) {
        $key = 'ai_rl_' . $user_id;
        $hits = (int) get_transient($key);
        if ($hits > 30) return true; // 30 req / 5 min
        set_transient($key, $hits + 1, 5 * MINUTE_IN_SECONDS);
        return false;
    }

    private function tokenize_estimate($text) {
        // Cheap heuristic; replace with tiktoken server-side if needed.
        $wc = str_word_count($text);
        return (int) max(1, $wc * 1.3);
    }

    private function trim_history($messages, $max_tokens) {
        $budget = $max_tokens;
        $out = [];
        for ($i = count($messages) - 1; $i >= 0; $i--) {
            $t = $this->tokenize_estimate(json_encode($messages[$i]));
            if ($t > $budget) break;
            $out[] = $messages[$i];
            $budget -= $t;
        }
        return array_reverse($out);
    }

    public function create_request(WP_REST_Request $req) {
        $user_id = get_current_user_id();
        if ($this->rate_limited($user_id)) {
            return new WP_REST_Response(['error' => 'rate_limited'], 429);
        }

        $conv_id = sanitize_text_field($req['conversation_id']);
        $message = wp_kses_post($req['message']);

        // Build messages (fetch last N from your store).
        $history = []; // TODO: load from your conversation table.
        $messages = array_merge($history, [['role' => 'user', 'content' => $message]]);

        $messages = $this->trim_history($messages, 6000); // leave room for output

        $payload = [
            'provider' => 'openai:gpt-4o-mini',
            'temperature' => 0.2,
            'messages' => $messages,
            'system' => 'You are a concise assistant.',
            'max_output_tokens' => 800,
            'metadata' => ['wp_user' => $user_id, 'conversation_id' => $conv_id],
        ];

        $input_hash = hash('sha256', json_encode($payload));

        global $wpdb;
        $table = $wpdb->prefix . 'ai_requests';
        $wpdb->insert($table, [
            'user_id' => $user_id,
            'status' => 'pending',
            'route' => 'chat',
            'input_hash' => $input_hash,
            'prompt_json' => wp_json_encode($payload),
            'created_at' => current_time('mysql', 1),
            'updated_at' => current_time('mysql', 1),
        ]);
        $id = (int) $wpdb->insert_id;

        if (function_exists('as_enqueue_async_action')) {
            as_enqueue_async_action('ai_chat_process_request', [$id], 'ai');
        } else {
            // Fallback: process inline (not recommended in prod).
            $this->process_request($id);
        }

        return ['id' => $id, 'status' => 'queued'];
    }

    public function get_request(WP_REST_Request $req) {
        global $wpdb;
        $table = $wpdb->prefix . 'ai_requests';
        $row = $wpdb->get_row($wpdb->prepare("SELECT * FROM $table WHERE id=%d", (int) $req['id']), ARRAY_A);
        if (!$row) return new WP_REST_Response(['error' => 'not_found'], 404);

        // Limit data exposure.
        return [
            'id' => (int) $row['id'],
            'status' => $row['status'],
            'result' => $row['result_json'] ? json_decode($row['result_json'], true) : null,
            'tokens' => [
                'in' => (int) $row['tokens_in'],
                'out' => (int) $row['tokens_out'],
            ]
        ];
    }

    public function process_request($id) {
        global $wpdb;
        $table = $wpdb->prefix . 'ai_requests';
        $row = $wpdb->get_row($wpdb->prepare("SELECT * FROM $table WHERE id=%d", (int) $id), ARRAY_A);
        if (!$row || $row['status'] !== 'pending') return;

        $wpdb->update($table, ['status' => 'running', 'updated_at' => current_time('mysql', 1)], ['id' => $id]);

        $payload = json_decode($row['prompt_json'], true);

        // Call your secure proxy instead of provider directly.
        $proxy_url = getenv('AI_PROXY_URL');
        $proxy_key = getenv('AI_PROXY_KEY');

        $resp = wp_remote_post($proxy_url . '/v1/chat', [
            'timeout' => 30,
            'headers' => [
                'Authorization' => 'Bearer ' . $proxy_key,
                'Content-Type' => 'application/json',
            ],
            'body' => wp_json_encode($payload),
        ]);

        if (is_wp_error($resp)) {
            $wpdb->update($table, ['status' => 'error', 'result_json' => wp_json_encode(['error' => $resp->get_error_message()])], ['id' => $id]);
            return;
        }

        $code = wp_remote_retrieve_response_code($resp);
        $body = wp_remote_retrieve_body($resp);

        if ($code !== 200) {
            $wpdb->update($table, ['status' => 'error', 'result_json' => $body], ['id' => $id]);
            return;
        }

        $data = json_decode($body, true);
        $tokens_in = isset($data['usage']['prompt_tokens']) ? (int) $data['usage']['prompt_tokens'] : 0;
        $tokens_out = isset($data['usage']['completion_tokens']) ? (int) $data['usage']['completion_tokens'] : 0;

        $wpdb->update($table, [
            'status' => 'done',
            'result_json' => wp_json_encode($data),
            'tokens_in' => $tokens_in,
            'tokens_out' => $tokens_out,
            'updated_at' => current_time('mysql', 1)
        ], ['id' => $id]);
    }
}
new AIGuyLA_AI_Endpoint();

Register the table on activation
– Create the table using dbDelta.
– Install Action Scheduler (composer or plugin) for reliable background jobs.

Security hardening
– Never embed provider API keys in JS. Use a server-side proxy with IP allowlist and per-tenant keys.
– Validate nonce or JWT on every request. For headless, use short-lived JWT via a login endpoint.
– Enforce:
– Per-user rate limit (transient/object cache).
– Per-route max tokens and max output tokens.
– Allowed roles/capabilities (e.g., manage_options for admin-only routes).
– Sanitize content and strip HTML from user prompts where not needed.
– Log only necessary fields; avoid storing raw PII.
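
The per-user rate limit above is essentially a counter keyed by user and time window. In WordPress you would back it with a transient or the object cache rather than a dict, but the shape of the logic is the same. A minimal sketch (class and parameter names are illustrative):

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` window, per user."""

    def __init__(self, limit=10, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self._counters = {}  # (user_id, window_index) -> request count

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        bucket = (user_id, int(now // self.window))
        count = self._counters.get(bucket, 0)
        if count >= self.limit:
            return False  # over the limit for this window; reject
        self._counters[bucket] = count + 1
        return True

# Three requests per minute allowed; the fourth in the same window is rejected.
limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow(7, now=100.0) for _ in range(4)]
```

A fixed window is simple and cache-friendly; if you need smoother behavior at window boundaries, a sliding window or token bucket is the usual upgrade.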

Performance considerations
– Cache identical requests by input_hash for 5–30 minutes to eliminate repeats.
– Use persistent object cache (Redis) to reduce db hits.
– Set timeouts and retry with exponential backoff in the proxy, not in WordPress.
– Run Action Scheduler on a dedicated queue (group "ai") with a WP-CLI runner instead of relying on WP-Cron.
– Keep payloads compact; remove redundant system prompts and reduce message metadata.
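
The input_hash cache works by canonicalizing the request (sorted keys, no whitespace) before hashing, so logically identical payloads map to the same key. A stdlib sketch of the idea; in WordPress the dict would be `wp_cache_get`/`wp_cache_set` against Redis, and the function names here are illustrative:

```python
import hashlib
import json
import time

def input_hash(payload):
    """SHA-256 over canonical JSON: key order and whitespace don't matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

_cache = {}  # hash -> (stored_at, result); stand-in for the object cache

def cached_result(payload, ttl=900):
    """Return a stored result if it is still inside the TTL, else None."""
    hit = _cache.get(input_hash(payload))
    if hit and time.time() - hit[0] < ttl:
        return hit[1]
    return None

def store_result(payload, result):
    _cache[input_hash(payload)] = (time.time(), result)
```

The same hash doubles as the idempotency check before enqueueing: if a pending job already exists for this input_hash, return its id instead of creating a duplicate.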

Server-side AI proxy (FastAPI example sketch)
– Endpoints: POST /v1/chat
– Responsibilities:
– Map provider models, inject safety/system prompts, enforce token ceilings.
– Retry on 429/5xx with jitter.
– Return normalized JSON with usage and finish_reason.
– Sign results with an HMAC if you need tamper detection.
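
Two of the responsibilities above, retrying on 429/5xx with jitter and HMAC-signing the normalized result, can be sketched with the standard library alone. The request function is injected so the sketch stays provider-agnostic; all names are illustrative:

```python
import hashlib
import hmac
import json
import random
import time

RETRYABLE = {429, 500, 502, 503, 504}

def call_with_retry(do_request, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """do_request() returns (status_code, body). Retry retryable statuses
    with exponential backoff plus a little jitter to avoid thundering herds."""
    for attempt in range(max_attempts):
        status, body = do_request()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
    return status, body

def sign_result(result, secret):
    """HMAC-SHA256 over canonical JSON, hex-encoded. The WordPress side
    recomputes this and compares with hmac.compare_digest-style equality."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()
```

Signing the canonical form (sorted keys, fixed separators) matters: both sides must serialize identically or verification will fail on equivalent payloads.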

Client usage example
– POST /wp-json/ai/v1/chat with:
– conversation_id: "abc123"
– message: "Summarize the last 3 updates in this thread."
– Response: { "id": 42, "status": "queued" }
– Poll GET /wp-json/ai/v1/chat/42 until status = "done", then render the result. For streaming, offload to an SSE service.
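
The polling loop with exponential backoff can be sketched like this; `fetch_status` stands in for the GET /wp-json/ai/v1/chat/{id} call, and the delay parameters are illustrative defaults:

```python
import time

def poll_until_done(fetch_status, max_wait=60.0, base_delay=1.0, cap=8.0, sleep=time.sleep):
    """fetch_status() returns a job dict such as {"status": "queued"}.
    The delay doubles each round (capped) until the job reaches a terminal
    state or the total wait budget is exhausted."""
    waited = 0.0
    delay = base_delay
    while waited < max_wait:
        job = fetch_status()
        if job["status"] in ("done", "error"):
            return job
        sleep(delay)
        waited += delay
        delay = min(delay * 2, cap)
    raise TimeoutError("job did not finish in time")
```

Capping the delay keeps the UI responsive for long jobs while still cutting request volume roughly in half each round early on.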

Observability and cost control
– Store tokens_in/tokens_out and compute cost_usd in a nightly job.
– Add a WP-CLI command: wp ai:stats to print per-user usage.
– Alert when the provider 5xx rate exceeds 2% over 15 minutes, or when p95 latency exceeds 6 s.
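
The nightly cost rollup is plain arithmetic over the stored token counts. The per-million-token rates below are placeholders, not real provider pricing; substitute your provider's current rate card:

```python
# Hypothetical per-million-token rates in USD; replace with real pricing.
RATES = {"gpt-large": {"in": 2.50, "out": 10.00}}

def cost_usd(model, tokens_in, tokens_out):
    """Compute the dollar cost of one request from its token usage."""
    rate = RATES[model]
    return round(tokens_in / 1_000_000 * rate["in"]
                 + tokens_out / 1_000_000 * rate["out"], 6)
```

Run it per row in the nightly job, sum by user_id, and the wp ai:stats command only has to read the aggregates.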

Deployment checklist
– Put AI_PROXY_URL and AI_PROXY_KEY in wp-config.php via environment variables.
– Enforce HTTPS everywhere; HSTS on the proxy.
– Enable Redis object cache and Action Scheduler health checks.
– Back up the ai_requests table; set a 30–60 day retention policy.

What to build next
– SSE streaming endpoint via a tiny Node worker subscribed to a Redis pub/sub channel.
– Vector augmentation: add a retrieval step (pgvector) before the chat call.
– UI block: a Gutenberg block that handles nonce, posting, and polling with exponential backoff.

This pattern keeps secrets off the client, scales with queues, and provides clear control over cost, latency, and reliability—all within a WordPress environment.

Developing Smart WordPress & Web Solutions

WordPress is not just a blogging platform; it’s a robust content management system (CMS) that powers over 40% of the websites on the internet. When businesses invest in smart WordPress and web solutions, they are essentially tailoring the web’s most popular CMS to meet their exact needs. For a growing company, an off-the-shelf theme or plugin can only go so far. To stand out in a crowded digital landscape you need a site that reflects your brand identity, streamlines operations and scales with you. Smart solutions go beyond aesthetics — they fuse design, functionality, marketing and automation into a unified digital strategy. Whether you run a local shop, a membership community or a global e-commerce store, crafting a custom experience shows customers that you care about their journey.

A smart build starts with solid foundations. Use a lightweight, well-maintained theme or create a child theme to safely override styles and functions without risking future updates. Custom themes let you control every pixel, ensuring your site looks polished on desktop and mobile. For dynamic features, build or commission bespoke plugins instead of piling on third-party extensions that slow down your site. A custom plugin can handle a specific business need like calculating shipping rates, managing event registrations or creating a tailored booking workflow. By keeping your code lean and tailored, you avoid bloated features you don’t need and reduce security risks from abandoned plugins. A modular approach also makes future changes easier because you know exactly how each component works.

Integration is where a WordPress site becomes a true business hub. Instead of manually copying data between systems, connect your forms, storefront and membership areas to CRMs, marketing platforms and payment gateways via secure APIs. For instance, your contact form can feed leads directly into HubSpot or Salesforce, triggering follow-up sequences. WooCommerce orders can sync with your inventory management software so stock levels stay accurate in real time. Appointment bookings might update your calendar and send meeting invites automatically. Automations built with tools like Zapier or custom webhook handlers reduce repetitive tasks, improve accuracy and free up your team for higher-value work. When data flows seamlessly across systems you get a holistic view of your customer journey and can make smarter decisions.

Smart web solutions prioritize performance and security from day one. A slow site not only frustrates visitors but also hurts your search rankings. Optimize performance by compressing images, lazy loading media, minifying CSS and JavaScript files and leveraging browser caching and content delivery networks (CDNs). Conduct regular audits to identify plugins or scripts that slow down page load times. Security is equally important. Keep WordPress core, themes and plugins up to date, enforce strong passwords and two-factor authentication and install a reputable security plugin to monitor suspicious activity. Back up your site daily and test your disaster recovery plan. Implement HTTPS everywhere and limit login attempts to deter brute-force attacks. By proactively addressing performance and security you build trust with your visitors and safeguard your business.

Artificial intelligence and machine learning can elevate a WordPress site into a smart digital assistant. AI-powered chatbots built with platforms like Dialogflow or ChatGPT can handle pre-sales questions, schedule appointments, offer product recommendations and even troubleshoot common issues around the clock. Recommendation engines analyze user behavior to suggest content, products or services that increase engagement and conversion rates. Natural language processing can summarize blog posts or generate SEO-friendly meta descriptions automatically. Image recognition tools can tag photos for accessibility and search. These capabilities are no longer reserved for big companies; cloud-based APIs make it affordable to integrate AI into smaller sites. By automating routine interactions, AI frees up human staff for high-touch tasks while delivering a personalized experience for every visitor.

To continuously improve your website, you need data. Built-in analytics tools like Google Analytics or Matomo provide traffic insights, but smart sites go further. Use heatmaps to see where users click and scroll, and session recordings to identify friction points. Implement structured data (schema markup) so search engines understand your content and feature your site in rich snippets. Run A/B tests on headlines, page layouts and call-to-action buttons using tools like Google Optimize to find what resonates most with your audience. Set up goal tracking for sign-ups, purchases and other conversions, and tie that data back to your marketing campaigns. Regularly reviewing analytics helps you refine your content strategy, optimize funnels and allocate resources where they have the greatest impact.

Ultimately, developing smart WordPress and web solutions is an iterative process. Start by outlining your business goals and the user journeys that support them, then translate those into technical requirements. Work with experienced developers who understand both WordPress best practices and broader web standards, and ensure each feature you add has a clear purpose. Keep your site lean, secure and fast, integrate it with the tools that power your business and embrace automation and AI where it makes sense. By treating your website as a living system rather than a static brochure, you’ll create a platform that adapts to changing needs, delivers measurable results and grows alongside your business.

Start by choosing a lightweight theme and enhancing it with custom blocks or child themes. Use plugins like WooCommerce for e‑commerce, then connect them to your CRM so new orders trigger emails and updates. Consider adding AI chatbots or recommendation engines to improve conversion and customer satisfaction.

Performance and security matter, too. Optimize images, enable caching and keep your software up to date. Regularly review analytics and A/B test pages to understand how visitors behave. By building smart WordPress solutions, you’ll deliver better experiences and free up time to focus on growth.