# Proposal — Lead Enrichment via Clearbit / Apollo / People Data Labs

**Status:** draft (audit iter 9, 2026-05-16)
**Owner:** TBD
**Effort:** 5 days (3 backend, 1 frontend, 1 docs+QA)
**Plan gate:** Paid tiers only (Pro / Business / Enterprise)

---

## Problem

Today a `Lead` row stores only what the visitor types into the form:
`email`, `phone`, `name`, plus a free-form `fields` JSON. The sales rep
opening the inbox sees the bare minimum and has to manually look the
lead up on LinkedIn / Crunchbase / the company site before deciding
whether the lead is worth chasing.

Buyer feedback (Lucian, batch v3):
> "Half my leads are personal Gmail signups. If I could see the
> company size + title without me Googling each one I'd close 2× the
> qualified ones."

Competitors (Intercom, Drift, HubSpot Chat) ship this as a default —
when a known business email comes in, the inspector pane auto-shows
the company name, industry, headcount band, and the visitor's job
title.

---

## Goals (this proposal)

1. When a Lead is captured, async-fetch enrichment from a configurable
   provider chain and store the result on `leads.enrichment` (new
   JSON column).
2. Display enrichment in the existing operator inbox `LeadDrawer`
   pane — company logo + name + industry + headcount + visitor title.
3. Plan-gate the feature: only Pro / Business / Enterprise workspaces
   call the provider; Free tier sees an "Upgrade to enable" placeholder.
4. Per-workspace toggle (admin can disable enrichment to comply with
   GDPR / data-processing concerns).
5. Cache by domain for 30 days to keep external API spend low.

---

## Non-goals

- **No realtime enrichment on the hot path.** Enrichment fires from
  `LeadCapturedJob` (post-stream side effect), never before first
  token.
- **No automatic CRM push.** That's the existing `RouteLeadJob` path —
  enrichment is separate.
- **No PII enrichment beyond company-side data.** We pull domain →
  company, not email → person's home address / phone. Provider config
  whitelists fields.
- **No bulk enrichment of historical leads.** Net-new leads only.
  Backfill is a separate command for a follow-up release.

---

## Provider chain

Mirror the LLM provider pattern (`OpenAiClient` interface + 3 real
impls + fake). New service contract `LeadEnricher`:

```php
namespace App\Services\Leads\Contracts;

interface LeadEnricher
{
    /**
     * @param string $emailDomain  e.g. "shopify.com"
     * @param ?string $jobTitleHint  visitor-supplied title if available
     * @return array{
     *   provider: string,
     *   fetched_at: string,
     *   company: array{name: string, industry: ?string, size_band: ?string, country: ?string, logo_url: ?string, website: ?string},
     *   person: ?array{title: ?string, seniority: ?string, linkedin: ?string},
     * }|null
     */
    public function enrich(string $emailDomain, ?string $jobTitleHint = null): ?array;
}
```

Real impls (drop-in interchangeable via env):

| Provider          | Pricing (Jan 2026)              | Coverage    | API quality | Choice  |
|-------------------|---------------------------------|-------------|-------------|---------|
| Clearbit          | $99/mo for 250 enrichments      | High (NA/EU)| Best        | Default |
| Apollo            | $49/mo for 600 enrichments      | High        | Good        | Tier 2  |
| People Data Labs  | Pay-as-you-go, $0.05–0.20/lookup| Wide        | Mixed       | Tier 3  |
| FakeEnricher      | free                            | tests only  | n/a         | dev     |
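
To compare the subscription tiers on the same footing as the pay-as-you-go
range, the list prices in the table can be turned into an effective cost per
lookup. This is a rough sketch that assumes the monthly quota is fully used;
`costPerLookup` is a hypothetical helper, not part of the codebase.

```php
// Effective cost per lookup, assuming the whole monthly quota is consumed.
// Prices and quotas are copied from the provider table above.
function costPerLookup(float $monthlyPriceUsd, int $includedLookups): float
{
    if ($includedLookups <= 0) {
        throw new InvalidArgumentException('includedLookups must be positive');
    }

    return round($monthlyPriceUsd / $includedLookups, 4);
}

$clearbit = costPerLookup(99.0, 250); // 0.396
$apollo   = costPerLookup(49.0, 600); // 0.0817
```

At these list prices the default provider is the most expensive per lookup,
so the domain cache matters most while `LEAD_ENRICHER=clearbit`.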

Provider resolved in `AppServiceProvider::register()` like
`OpenAiClient`. Config keys: `LEAD_ENRICHER` (`clearbit|apollo|pdl|none`),
provider-specific `*_API_KEY`.

If `LEAD_ENRICHER=none` or provider returns null, persist
`enrichment = {provider:'none', reason:'…'}` so the UI knows we tried.
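
A minimal sketch of the fake implementation and the "we tried" fallback,
assuming plain PHP 8 without the framework. The interface is redeclared
without its namespace so the snippet stands alone. `enrichmentToPersist` and
the `reason` strings are hypothetical placeholders; the proposal does not fix
their values.

```php
interface LeadEnricher
{
    public function enrich(string $emailDomain, ?string $jobTitleHint = null): ?array;
}

// Test double: returns canned company data for known domains, null otherwise.
final class FakeEnricher implements LeadEnricher
{
    /** @param array<string, array> $companiesByDomain keyed by lowercase domain */
    public function __construct(private array $companiesByDomain = [])
    {
    }

    public function enrich(string $emailDomain, ?string $jobTitleHint = null): ?array
    {
        $company = $this->companiesByDomain[strtolower($emailDomain)] ?? null;
        if ($company === null) {
            return null;
        }

        return [
            'provider'   => 'fake',
            'fetched_at' => gmdate('c'),
            'company'    => $company,
            'person'     => $jobTitleHint === null
                ? null
                : ['title' => $jobTitleHint, 'seniority' => null, 'linkedin' => null],
        ];
    }
}

// Hypothetical helper: the value that would be written to leads.enrichment.
// A null enricher stands in for LEAD_ENRICHER=none.
function enrichmentToPersist(?LeadEnricher $enricher, string $domain, ?string $titleHint = null): array
{
    if ($enricher === null) {
        return ['provider' => 'none', 'reason' => 'disabled']; // placeholder reason
    }

    return $enricher->enrich($domain, $titleHint)
        ?? ['provider' => 'none', 'reason' => 'no_match']; // placeholder reason
}

$fake = new FakeEnricher(['shopify.com' => ['name' => 'Shopify Inc.']]);
$hit  = enrichmentToPersist($fake, 'Shopify.com');  // company data from the fake
$miss = enrichmentToPersist($fake, 'unknown.test'); // provider 'none'
```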

---

## Data model

New migration:

```php
Schema::table('leads', function (Blueprint $table) {
    $table->json('enrichment')->nullable()->after('fields');
    $table->timestampTz('enriched_at')->nullable()->after('enrichment');
});
```

Casts: `'enrichment' => 'array'`, `'enriched_at' => 'datetime'`.

Cache table for domain-level reuse (so 50 leads from `shopify.com` =
1 provider call):

```php
Schema::create('lead_enrichment_cache', function (Blueprint $table) {
    $table->uuid('id')->primary();
    // One row per domain (unique), so repeat leads reuse a single lookup.
    $table->string('domain')->unique();
    $table->string('provider', 32);
    $table->json('payload');
    // Set to fetch time + 30 days.
    $table->timestampTz('expires_at')->index();
    $table->timestampsTz();
});
```

Rows expire 30 days after fetch; the first lookup for that domain after
`expires_at` refetches and overwrites the row.
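
The expiry rule can be sketched with an in-memory stand-in for the
`lead_enrichment_cache` table. `DomainEnrichmentCache` is a hypothetical
class for illustration; the real lookup would go through the table above.

```php
// In-memory stand-in for the lead_enrichment_cache table (illustration only).
final class DomainEnrichmentCache
{
    private const TTL_DAYS = 30;

    /** @var array<string, array{payload: array, expires_at: DateTimeImmutable}> */
    private array $rows = [];

    public function get(string $domain, DateTimeImmutable $now): ?array
    {
        $row = $this->rows[strtolower($domain)] ?? null;

        // A row at or past expires_at counts as a miss, so the caller refetches.
        if ($row === null || $row['expires_at'] <= $now) {
            return null;
        }

        return $row['payload'];
    }

    public function put(string $domain, array $payload, DateTimeImmutable $fetchedAt): void
    {
        // Overwrites any earlier row, matching the unique index on `domain`.
        $this->rows[strtolower($domain)] = [
            'payload'    => $payload,
            'expires_at' => $fetchedAt->modify('+' . self::TTL_DAYS . ' days'),
        ];
    }
}

$cache = new DomainEnrichmentCache();
$t0    = new DateTimeImmutable('2026-05-16T00:00:00Z');
$cache->put('Shopify.com', ['provider' => 'fake'], $t0);

$fresh   = $cache->get('shopify.com', $t0->modify('+29 days')); // payload
$expired = $cache->get('shopify.com', $t0->modify('+30 days')); // null
```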

---

## Job pipeline

1. `LeadCapturedJob` (existing, fires from widget lead-capture
   controller) → `EnrichLeadJob::dispatch($lead->id)`.
2. `EnrichLeadJob`:
   - `$tries = 1`, `$timeout = 30`. Transient failures don't retry —
     enrichment is best-effort, not load-bearing.
   - Skip if workspace plan disables enrichment OR per-workspace
     toggle off OR email is null / personal-domain
     (`gmail.com|yahoo.com|outlook.com|hotmail.com|protonmail.com|icloud.com`
     hardcoded blocklist).
   - Look up cache by domain. Hit → write to lead, done.
   - Miss → call `LeadEnricher::enrich()`. Persist cache row + lead
     row in single transaction.
   - On exception: log, leave `enrichment=null`, no retry.
3. `RouteLeadJob` (existing CRM push) reads `lead->enrichment` if
   present, includes it in the outbound payload (HubSpot accepts
   `company_size_band` natively; Salesforce maps to custom field).
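
The email-based skip rule in step 2 can be sketched as a pure function.
`enrichableDomain` is a hypothetical helper name; the blocklist is copied
from the list above, and lowercasing the domain before matching is an
assumption.

```php
// Personal-mail blocklist, copied from the skip rule in step 2.
const PERSONAL_DOMAINS = [
    'gmail.com', 'yahoo.com', 'outlook.com',
    'hotmail.com', 'protonmail.com', 'icloud.com',
];

// Returns the lowercased domain to enrich, or null when the lead is skipped.
function enrichableDomain(?string $email): ?string
{
    if ($email === null) {
        return null;
    }

    $email = trim($email);
    $at = strrpos($email, '@');
    if ($at === false) {
        return null; // no domain part at all
    }

    $domain = strtolower(substr($email, $at + 1));
    if ($domain === '' || in_array($domain, PERSONAL_DOMAINS, true)) {
        return null;
    }

    return $domain;
}

$business = enrichableDomain('someone@stripe.com'); // 'stripe.com'
$personal = enrichableDomain('Jane@GMAIL.com');     // null
```

Returning the normalised domain rather than a boolean lets the same value
serve as the cache key in the next step.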

Hot-path safety: zero changes to the visitor-message stream. Lead
capture happens in a separate endpoint (`POST /api/v1/widget/leads`)
which is not on the hot path.

---

## Plan gating

Add to `App\Services\Billing\PlanLimits`:

```php
public function leadEnrichmentEnabled(Workspace $workspace): bool
{
    $plan = $this->planFor($workspace);
    return $plan !== null && (bool) ($plan->lead_enrichment_enabled ?? false);
}
```

Schema:

```php
Schema::table('plans', function (Blueprint $table) {
    $table->boolean('lead_enrichment_enabled')->default(false);
});
```

Default ON for Pro / Business / Enterprise (set in seeder), OFF for
Free. Workspace admin can additionally disable per-workspace via a new
`workspaces.lead_enrichment_enabled` boolean toggle (defaults TRUE
when plan supports it — opt-out, not opt-in, to keep the UX
discoverable).
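
How the plan flag and the workspace toggle combine can be sketched as
follows. `Plan`, `Workspace`, and `shouldEnrich` here are hypothetical
plain-PHP stand-ins for illustration, not the Eloquent models.

```php
// Plain-PHP stand-ins for the plans and workspaces rows (illustration only).
final class Plan
{
    public function __construct(public bool $lead_enrichment_enabled = false)
    {
    }
}

final class Workspace
{
    public function __construct(
        public ?Plan $plan,
        public bool $lead_enrichment_enabled = true, // opt-out default
    ) {
    }
}

// Enrichment runs only when the plan allows it AND the admin has not opted out.
function shouldEnrich(Workspace $workspace): bool
{
    $planAllows = $workspace->plan !== null
        && $workspace->plan->lead_enrichment_enabled;

    return $planAllows && $workspace->lead_enrichment_enabled;
}

$pro      = new Workspace(new Plan(true));        // enriched
$optedOut = new Workspace(new Plan(true), false); // admin disabled it
$free     = new Workspace(new Plan(false));       // plan gate blocks it
```

Because the workspace flag defaults to true, it has no effect on Free
workspaces; the plan check always wins.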

---

## Frontend

`resources/js/pages/app/conversations/components/LeadDrawer.tsx` —
new "Company" section below the existing fields panel:

```
┌─ Company ──────────────────────────────────────┐
│ [logo] Shopify Inc.                            │
│        Software · 5,001-10,000 · Canada        │
│        shopify.com  ↗                          │
├─ Visitor role ─────────────────────────────────┤
│ Senior Product Manager · LinkedIn ↗            │
└────────────────────────────────────────────────┘
```

Skeleton state while `lead.enrichment === null && lead.enriched_at === null`
(still pending). Empty state when `enrichment.provider === 'none'`.
Fallback `<Avatar/>` when `logo_url` is missing.

Settings page: `resources/js/pages/app/settings/integrations.tsx` —
add a "Lead enrichment" card with provider name (read-only from env),
workspace toggle, and a sample preview using the workspace's most
recent enriched lead.

---

## Test plan

Pest feature tests:

- `EnrichLeadJob` calls FakeEnricher and writes `enrichment` + `enriched_at`
- `EnrichLeadJob` is a no-op when plan disables (verify no provider call)
- `EnrichLeadJob` is a no-op for personal-domain emails
- Second lead from same domain hits cache (no second provider call)
- Cache expiry refetches after `expires_at`
- LeadDrawer renders empty state when `provider:'none'`
- LeadDrawer renders enrichment fields when present
- Cross-tenant: workspace A's enrichment never appears on workspace B's lead

UI test plan (added to board card on close):
1. Sign in as Pro workspace admin.
2. Open the visitor widget on the demo agent, fill the lead form with
   `someone@stripe.com`.
3. Switch to operator inbox, open the new lead.
4. Confirm Company panel renders within 5 seconds (Clearbit lookup +
   cache write).
5. Capture a second lead from a different visitor on the same
   `stripe.com` domain → confirm the panel renders immediately (cache).
6. Sign in as a Free workspace → capture a lead → confirm panel shows
   "Upgrade to enable enrichment".

---

## Rollout

1. Ship the migration + job + FakeEnricher + tests behind
   `LEAD_ENRICHER=none` (default). No customer impact.
2. Set `LEAD_ENRICHER=clearbit` + `CLEARBIT_API_KEY` in prod for a
   small canary workspace (Pitchbar's own demo workspace).
3. Monitor: enrichment success rate, p95 enrichment latency, provider
   spend per day.
4. Flip plans seeder so Pro/Business/Enterprise default-on.
5. Doc page `troubleshooting-lead-enrichment.blade.php` + nav entry.

---

## Risks / open questions

- **Provider data freshness.** Clearbit refreshes quarterly at best;
  treat enrichment as a hint, not ground truth.
- **GDPR / data-processing addendum.** Sending visitor data to a
  third-party processor (the `LeadEnricher` contract passes the email
  domain and an optional job-title hint) requires updating the DPA.
  The per-workspace toggle exists for this exact reason: once an admin
  disables it, no lead data leaves Pitchbar via enrichment.
- **Provider outages.** If Clearbit returns 5xx, log and skip. We don't
  retry, and nothing is cached on failure, so the next lead from the
  same domain misses the cache and, while the outage lasts, fails too.
  Acceptable; sales reps still see the rest of the lead.
- **Cost overrun.** Cache + 30-day expiry should keep spend bounded.
  Add a `lead_enrichment_calls_per_day` counter to the metrics
  dashboard if a workspace spikes.

---

## Why now

- Buyer ask is explicit and recent (Lucian, mid-May 2026).
- Competitors all ship this; Pitchbar's lead-gen feature loses to
  Intercom precisely on this dimension.
- Engineering scope is bounded (5 days incl. tests + docs); no
  hot-path risk; provider abstraction mirrors existing `OpenAiClient`
  pattern so the codebase model stays consistent.
