Build a Lead Qualification Autopilot: Django + OpenAI + Slack + CRM

Why this matters
– Most SMBs and mid-market teams lose speed qualifying leads across channels.
– A simple, reliable AI triage layer can cut first-response time to minutes and focus human effort where it matters.

What we’ll build
– A Django-based service that:
  – Ingests leads from web forms, inbound email, and LinkedIn exports.
  – Normalizes into a unified Lead table.
  – Uses a strict scoring rubric with OpenAI and deterministic rules.
  – Generates a reasoned summary and next-action.
  – Routes qualified leads to Slack and your CRM with owner assignment.
  – Monitors performance and drift with weekly review artifacts.

High-level architecture
– Sources: Web form (WordPress), Gmail/Google Workspace, LinkedIn CSV or API.
– Ingest: Django REST endpoints + a Gmail watcher + a CSV importer.
– Processing: Celery worker, Redis queue, OpenAI for extraction + scoring.
– Storage: Postgres (Lead, Company, SourceEvent, Decision, RouteEvent).
– Routing: Slack (webhook/app), CRM (HubSpot/Salesforce API), Email fallback.
– Observability: Django admin + Grafana/Prometheus (Celery/HTTP) + S3/Drive for artifacts.

Data model (Postgres)
– lead (id, company_id, source, raw_payload_json, name, email, title, phone, website, country, product_interest, free_text, utm_source, created_at)
– company (id, domain, name, employee_range, industry, tech_signals_json, first_seen_at)
– decision (id, lead_id, score_int, label_enum: [A,B,C,Drop], reasons_text, risk_flags_json, model_version, decided_at)
– route_event (id, lead_id, channel_enum: [Slack,CRM,Email], target, status, response_json, created_at)
– source_event (id, lead_id, channel_enum, external_id, received_at)

WordPress form integration (server-to-server)
– Use WPForms/Gravity Forms → Webhook to POST JSON to /api/leads/intake.
– Include UTM fields and page URL.

Django endpoints
– POST /api/leads/intake
  – Auth: HMAC header X-Signature: hex-encoded HMAC-SHA256 of the raw request body, keyed with SHARED_SECRET
  – Body: raw form/email payload
  – Action: write source_event, create lead (normalized), enqueue process_lead task
– POST /api/leads/linkedin-import
  – CSV upload → batch create leads, enqueue jobs
– POST /api/leads/email-hook
  – For the Gmail watcher to post new messages (subject, sender, snippet, body, attachments meta)
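
The signature check on the intake endpoint can be sketched with the standard library alone. This is a minimal illustration, not the service's actual code: the function names and the placeholder secret are assumptions, and in a Django view the body would come from request.body and the header from request.headers.

```python
import hashlib
import hmac

# Assumption: in practice the secret comes from settings or the environment.
SHARED_SECRET = b"change-me"


def sign(body: bytes, secret: bytes = SHARED_SECRET) -> str:
    """Return the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(body: bytes, header_value: str,
                     secret: bytes = SHARED_SECRET) -> bool:
    """Check the X-Signature header against the expected digest.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature were correct.
    """
    expected = sign(body, secret).encode()
    return hmac.compare_digest(expected, (header_value or "").encode())
```

Verify against the raw bytes before parsing JSON; re-serializing the payload can change whitespace or key order and invalidate the signature.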

Normalization (cheap, deterministic)
– Extract probable name, email, company, website with regex + domain parse.
– If missing company, derive from email domain (public suffix list).
– Enrich employee_range/industry via Clearbit/People Data Labs or internal lookup (optional, cache by domain).
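
A rough sketch of the email and company-domain steps above. The regex, the free-mail list, and the function names are illustrative assumptions. Note that taking the last two labels of the domain is a deliberate simplification: it is wrong for suffixes such as co.uk, which is why the text recommends the public suffix list for real use.

```python
import re
from typing import Optional

# Pragmatic pattern for finding an address in free text (not RFC-complete).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Illustrative list; personal mailboxes say nothing about the company.
FREE_MAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def extract_email(text: str) -> Optional[str]:
    """Return the first email-looking token in the text, lowercased."""
    match = EMAIL_RE.search(text)
    return match.group(0).lower() if match else None


def company_domain(email: str) -> Optional[str]:
    """Derive a company domain from an email address, or None for free mail."""
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in FREE_MAIL:
        return None
    # Naive registrable-domain guess: keep the last two labels.
    return ".".join(domain.split(".")[-2:])
```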

Scoring rubric (hybrid rules + LLM)
– Hard rules first (fast, free):
  – Drop if disposable email or role-based (info@, sales@) unless site form indicates budget > $X.
  – Country allow/deny list based on coverage.
  – Product fit heuristics on free_text keywords.
– LLM extraction (OpenAI gpt-4o-mini or gpt-4.1-mini):
  – Use JSON mode or function calling to return:
    – {use_case, urgency_days, budget_band, decision_maker_bool, competitor_mentioned, complexity_level, blockers}
– Final score:
  – Start 0
  – +30 fit (use_case in supported list)
  – +20 urgency 2%.
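
The hard-rule gate could look roughly like the sketch below. The domain, role, and country sets are placeholder assumptions, and the budget floor is passed in because the text leaves it as $X. It also assumes the budget exemption applies to both the disposable and the role-based check, which is one reading of the rule.

```python
from typing import Optional

# Placeholder lists (assumptions); real deployments would load these from config.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}
ROLE_LOCAL_PARTS = {"info", "sales"}
ALLOWED_COUNTRIES = {"US", "CA", "GB"}


def hard_rule_drop_reason(email: str, country: str, budget_floor_usd: float,
                          budget_usd: Optional[float] = None) -> Optional[str]:
    """Return a drop reason, or None if the lead passes the hard rules."""
    local, _, domain = email.lower().partition("@")
    if country.upper() not in ALLOWED_COUNTRIES:
        return "country_not_covered"
    # A stated budget above the floor overrides the email-quality rules.
    budget_ok = budget_usd is not None and budget_usd > budget_floor_usd
    if not budget_ok:
        if domain in DISPOSABLE_DOMAINS:
            return "disposable_email"
        if local in ROLE_LOCAL_PARTS:
            return "role_based_email"
    return None
```

Returning a reason string rather than a boolean keeps the decision auditable: the value can be stored in reasons_text without calling the LLM at all.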

Cost controls
– Batch website fetch with 2s timeout, 1000-char cap.
– Prefer mini models; only escalate to larger model if uncertainty threshold triggered (e.g., two key fields null).
– Token accounting: persist per-decision token usage.
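
The escalation trigger can be expressed as a small check on the mini model's JSON output. In this sketch the field names come from the extraction schema above, while the choice of which fields count as "key" and the helper names are assumptions.

```python
import json
from typing import Any, Dict

EXPECTED_FIELDS = (
    "use_case", "urgency_days", "budget_band", "decision_maker_bool",
    "competitor_mentioned", "complexity_level", "blockers",
)
# Assumption: these three drive the score most and count as "key" fields.
KEY_FIELDS = ("use_case", "urgency_days", "budget_band")


def parse_extraction(raw: str) -> Dict[str, Any]:
    """Parse the model's JSON reply, keeping only the expected fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("extraction must be a JSON object")
    return {field: data.get(field) for field in EXPECTED_FIELDS}


def needs_escalation(extraction: Dict[str, Any], null_threshold: int = 2) -> bool:
    """True when at least null_threshold key fields came back null."""
    nulls = sum(1 for field in KEY_FIELDS if extraction.get(field) is None)
    return nulls >= null_threshold
```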

Evaluation loop (weekly)
– Sample 50 decisions (stratified by label).
– Human review in Google Sheet: correct label? reason quality? action appropriateness?
– Compute precision@A and downgrade/upgrade rates.
– Update rubric weights and prompt; bump model_version.
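
One way to compute the weekly numbers from the reviewed sample, assuming each row pairs the model's label with the reviewer's corrected label and that labels rank A > B > C > Drop. The function name and the definition of a downgrade (reviewer label lower than model label) are assumptions.

```python
from typing import Dict, List, Optional, Tuple

RANK = {"A": 3, "B": 2, "C": 1, "Drop": 0}


def review_metrics(pairs: List[Tuple[str, str]]) -> Dict[str, Optional[float]]:
    """pairs holds (model_label, human_label) for each reviewed decision."""
    if not pairs:
        raise ValueError("no reviewed decisions")
    human_for_model_a = [human for model, human in pairs if model == "A"]
    # precision@A: of the leads the model labeled A, how many did reviewers confirm?
    precision_at_a = (
        sum(1 for human in human_for_model_a if human == "A") / len(human_for_model_a)
        if human_for_model_a else None
    )
    total = len(pairs)
    downgrades = sum(1 for model, human in pairs if RANK[human] < RANK[model])
    upgrades = sum(1 for model, human in pairs if RANK[human] > RANK[model])
    return {
        "precision_at_a": precision_at_a,
        "downgrade_rate": downgrades / total,
        "upgrade_rate": upgrades / total,
    }
```

Because the sample is stratified by label, the two rates describe the sample, not the full lead population; reweight by label frequency if a population estimate is needed.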

Security and compliance
– HMAC verification for all intake.
– Gmail watcher via Pub/Sub or Google Workspace push; never store full bodies longer than 30 days.
– DLP: redact credit cards, SSNs via regex before LLM send.
– Data retention policy per region (EU vs US storage).
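
A minimal sketch of the regex redaction step. The patterns and placeholder tokens are assumptions, and the Luhn checksum is an added filter to avoid masking order numbers that merely look like card numbers. Regex DLP of this kind is best-effort and will miss unusual formats.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> str:
    """Mask likely card numbers and US SSNs before text leaves the service."""
    def _mask_card(match: "re.Match[str]") -> str:
        digits = re.sub(r"\D", "", match.group(0))
        return "[REDACTED_CARD]" if luhn_ok(digits) else match.group(0)

    text = CARD_RE.sub(_mask_card, text)
    return SSN_RE.sub("[REDACTED_SSN]", text)
```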

Deployment notes
– Django + Gunicorn behind Nginx.
– Celery + Redis or SQS; schedule nightly health tasks.
– Postgres with column-level encryption for PII columns (e.g., pgcrypto).
– Infrastructure as code (Terraform) and CI/CD (GitHub Actions).
– Feature flags for channel rollouts.

Go-live checklist
– Create Slack channels and app with necessary scopes.
– Connect CRM sandbox first; run shadow mode for 1 week (no routing, just scoring).
– Set owners and round-robin rules.
– Define Drop auto-reply template and disable initially.
– Establish weekly evaluation and a rollback plan.

ROI example (conservative)
– Current: 300 leads/month, 40% touched within 24h, 10% convert to opportunity, 20% win rate. Avg deal $6k.
– After autopilot: 90% touched within 2h, 13% convert to opportunity (lift from faster response and better routing), same win rate.
– Monthly opportunities: 30 → 39; wins: 6.0 → 7.8; incremental 1.8 wins ≈ $10.8k/month.
– Infra + API + build/maint: ≈ $1.5k/month variable + initial build. Payback under 1 month in many cases.
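
The funnel arithmetic can be recomputed directly from the stated inputs (300 leads/month, 10% vs 13% lead-to-opportunity, 20% win rate, $6k average deal). The function name and output keys below are illustrative; swap in your own rates to test sensitivity.

```python
from typing import Dict


def roi_summary(leads: int = 300, base_opp_rate: float = 0.10,
                new_opp_rate: float = 0.13, win_rate: float = 0.20,
                avg_deal_usd: float = 6000.0) -> Dict[str, float]:
    """Monthly opportunities, wins, and incremental revenue before vs after."""
    base_opps = leads * base_opp_rate
    new_opps = leads * new_opp_rate
    base_wins = base_opps * win_rate
    new_wins = new_opps * win_rate
    extra_wins = new_wins - base_wins
    # Round to hide floating-point noise in the report.
    return {
        "base_opps": round(base_opps, 2),
        "new_opps": round(new_opps, 2),
        "base_wins": round(base_wins, 2),
        "new_wins": round(new_wins, 2),
        "extra_wins": round(extra_wins, 2),
        "extra_revenue_usd": round(extra_wins * avg_deal_usd, 2),
    }


if __name__ == "__main__":
    print(roi_summary())
```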

Minimal code pointers (pseudocode only)
– process_lead(lead_id):
  – features = rules_extract(lead)
  – if needs_llm(features): llm = call_openai(schema, context)
  – score = combine(features, llm)
  – save decision
  – enqueue route_outbox(lead, decision)

– route_worker():
  – for pending route_event: send_to_slack(); upsert_crm(); email_fallback()
  – mark delivered with idempotency_key
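
The idempotency step in route_worker could be sketched as follows. The key derivation (a hash of lead, channel, and target) and the in-memory set are assumptions for illustration; in the real service the key would more likely be a unique column on route_event so that retries from several workers cannot double-send.

```python
import hashlib
from typing import Callable, Set


def idempotency_key(lead_id: int, channel: str, target: str) -> str:
    """Stable key: the same lead, channel, and target always hash alike."""
    raw = f"{lead_id}:{channel}:{target}"
    return hashlib.sha256(raw.encode()).hexdigest()


class RouteOutbox:
    """In-memory stand-in for the route_event table (illustration only)."""

    def __init__(self) -> None:
        self._delivered: Set[str] = set()

    def deliver(self, lead_id: int, channel: str, target: str,
                send: Callable[[], None]) -> bool:
        """Call send() at most once per key; return True if it was sent now."""
        key = idempotency_key(lead_id, channel, target)
        if key in self._delivered:
            return False  # already delivered, so a retry is a no-op
        send()  # may raise; the key is recorded only after success
        self._delivered.add(key)
        return True
```

Recording the key only after a successful send means a failed Slack or CRM call stays retryable, at the cost of a possible duplicate if the process dies between the send and the write.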

Where this scales
– Add vertical-specific rubrics (SaaS vs Services).
– Auto-detect duplicates and merge at company level.
– Plug-in calendar availability for instant booking.

AI Guy in LA