# Proposal: Zendesk Help Center Import

Draft. 2026-05-16. Iter 8 of audit loop.

---

## Why

Pitchbar's knowledge ingest has: URL, sitemap, file upload, Notion, Google Docs,
Google Sheets, SQL connector. Conspicuously missing for the help_center vertical:
**Zendesk Help Center article import**.

Buyers running help_center on Pitchbar overwhelmingly come FROM Zendesk (cheaper
alternative pitch). Today they have to either:
1. Manually point a sitemap crawl at their public Zendesk Help Center URL (works,
   but is slow and misses articles behind auth)
2. Export Zendesk articles manually to CSV/MD and upload (tedious for 100+ articles)

Both flows add friction. Competing chatbots (Chatbase, Cust.ai, even Intercom Fin)
ship native Zendesk import as a one-click setup wizard.

## Scope (v1)

**In:**
- New Source type: `zendesk`
- Customer enters Zendesk subdomain + API token in `/app/agents/{id}/sources/new?type=zendesk`
- Pitchbar calls Zendesk Help Center API to enumerate categories + sections + articles
- Per article: title + body + URL → Document row → IndexDocumentJob
- Periodic re-sync via existing `pitchbar:refresh-stale-sources` cron pattern

**Out (v2+):**
- Zendesk Support (ticket) import — different API, different shape
- Multi-brand Zendesk accounts — v1 assumes single subdomain
- Article translations — v1 indexes default locale only
- Diff sync — v1 re-indexes everything on refresh; v2 uses Zendesk's `start_time`
  parameter (incremental articles endpoint) for incremental sync

## Stack

Pure HTTP API. The Zendesk Help Center API is well documented and stable:
- `GET /api/v2/help_center/articles.json` — paginated article list
- `GET /api/v2/help_center/articles/{id}.json` — single article (we don't need; the
  list response includes body)
- Auth: `email/token:api_token` Basic Auth, or OAuth (defer; Basic is enough for
  v1 self-service buyers)
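
A minimal sketch of the Basic Auth scheme and list URL described above. The function names are hypothetical (not existing Pitchbar code), and the host pattern `https://{subdomain}.zendesk.com` assumes the buyer has not host-mapped their Help Center to a custom domain.

```python
import base64


def zendesk_basic_auth_header(email: str, api_token: str) -> str:
    """Build the Authorization header value for Zendesk API-token auth.

    The Basic Auth username is the email with the literal suffix "/token";
    the password is the API token.
    """
    raw = f"{email}/token:{api_token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def articles_list_url(subdomain: str) -> str:
    """URL of the paginated article list for one Help Center (assumed host pattern)."""
    return f"https://{subdomain}.zendesk.com/api/v2/help_center/articles.json"
```

Keeping header construction in one helper means the token only needs to be decrypted at call time and never has to be stored in its Base64 form.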

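Rate limit: plan-dependent and applied per account rather than per endpoint
(700 req/min is the Enterprise-tier figure; lower tiers get less). Pagination
defaults to 30 articles/page (`per_page` can raise it to 100). A 500-article
HC = ~17 API calls at the default page size, well within the limit on any plan
even at burst.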
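
The call-count arithmetic and the page walk can be sketched as follows. This assumes the list response carries an `articles` array and a `next_page` URL that is null on the last page (Zendesk's offset pagination shape). `fetch_json` is a hypothetical injected callable, so the sketch does no network I/O itself.

```python
import math
from typing import Any, Callable, Dict, Iterator, Optional


def pages_needed(article_count: int, per_page: int = 30) -> int:
    """Number of list calls needed to enumerate a Help Center."""
    return math.ceil(article_count / per_page)


def iter_articles(
    first_url: str,
    fetch_json: Callable[[str], Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield every article by following next_page links until they run out."""
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        for article in page.get("articles", []):
            yield article
        url = page.get("next_page")  # None on the last page ends the loop
```

With the default page size, `pages_needed(500)` gives 17; at `per_page=100` it drops to 5.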

## Data model

Reuses existing `Source` + `Document` + `Chunk` chain. New Source type:

```
sources:
  type = 'zendesk'
  config = {
    "subdomain": "acme",
    "api_email": "support@acme.com",
    "api_token_encrypted": "<encrypted>",
    "locale": "en-us",
    "include_section_ids": [123, 456]  // or null = all sections
  }
```

`api_token_encrypted` rides the existing encrypted-cast pattern (Source already
encrypts `credentials_encrypted` JSON for SQL connector).
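
A sketch of how the `config` payload might be validated before save. The class and method names are hypothetical, and it assumes `include_section_ids` is either a list of section ids or null (meaning all sections), which is one reading of the comment in the config above.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Bare subdomain only: "acme", not "acme.zendesk.com" or a full URL.
_SUBDOMAIN_RE = re.compile(r"^[a-z0-9][a-z0-9-]*$")


@dataclass(frozen=True)
class ZendeskSourceConfig:
    subdomain: str
    api_email: str
    api_token_encrypted: str
    locale: str = "en-us"
    include_section_ids: Optional[List[int]] = None  # None = all sections

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ZendeskSourceConfig":
        subdomain = str(raw.get("subdomain", "")).strip().lower()
        if not _SUBDOMAIN_RE.match(subdomain):
            raise ValueError("subdomain must be the bare Zendesk subdomain, e.g. 'acme'")
        for key in ("api_email", "api_token_encrypted"):
            if not raw.get(key):
                raise ValueError(f"missing required field: {key}")
        section_ids = raw.get("include_section_ids")
        if section_ids is not None:
            if not all(isinstance(s, int) and not isinstance(s, bool) for s in section_ids):
                raise ValueError("include_section_ids must be a list of integers or null")
            section_ids = list(section_ids)
        return cls(
            subdomain=subdomain,
            api_email=raw["api_email"],
            api_token_encrypted=raw["api_token_encrypted"],
            locale=raw.get("locale") or "en-us",
            include_section_ids=section_ids,
        )

    def wants_section(self, section_id: int) -> bool:
        """True when the article's section should be indexed."""
        return self.include_section_ids is None or section_id in self.include_section_ids
```

Rejecting a pasted full hostname at this stage gives the buyer a clearer error than a failed credential probe would.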

## Routes

```
POST /api/v1/agents/{agent}/sources           // existing, adds type=zendesk branch
POST /app/agents/{agent}/sources/test-zendesk  // pre-save credential probe
```

The test endpoint hits `GET /api/v2/help_center/locales.json` (cheap, returns
`{"locales":["en-us","de",…]}`) to verify creds + subdomain before saving.
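
One way the probe result could be turned into a form-level message. The status-to-reason mapping and all names here are assumptions, not settled behaviour. It is also worth confirming against a real account that a bad token actually produces a non-200 on this endpoint before relying on it as the credential check.

```python
from typing import Any, Dict, List, NamedTuple


class ProbeResult(NamedTuple):
    ok: bool
    reason: str
    locales: List[str]


def classify_probe(status: int, body: Dict[str, Any]) -> ProbeResult:
    """Map the locales probe response to a result the UI can display."""
    if status == 200:
        locales = [loc for loc in body.get("locales", []) if isinstance(loc, str)]
        return ProbeResult(True, "ok", locales)
    if status in (401, 403):
        return ProbeResult(False, "invalid_credentials", [])
    if status == 404:
        return ProbeResult(False, "unknown_subdomain_or_no_help_center", [])
    if status == 429:
        return ProbeResult(False, "rate_limited", [])
    return ProbeResult(False, f"unexpected_status_{status}", [])
```

The returned `locales` list can also be used to check that the configured `locale` is actually enabled before the first sync runs.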

## Hot path

Zendesk fetch is async via a new `SyncZendeskSourceJob`:
1. Fetch article list (paginated)
2. For each article: create-or-update Document keyed by Zendesk article_id
3. Dispatch IndexDocumentJob per Document

Same retry / failed() handler pattern as `IngestNotionPageJob` (`$tries = 3`,
idempotent re-reads — Zendesk API is read-only here).
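
The three steps above, sketched with an in-memory store standing in for the Document table. Names are hypothetical. The article fields used (`id`, `title`, `body`, `html_url`, `draft`) are assumed to match the list response, and skipping `draft` articles is one way to honour the published-only rule for v1.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Tuple


@dataclass
class Document:
    source_id: int
    external_id: str  # Zendesk article id, the create-or-update key
    title: str
    body: str
    url: str


def sync_articles(
    source_id: int,
    articles: Iterable[Dict[str, Any]],
    store: Dict[Tuple[int, str], Document],
    dispatch_index: Callable[[Document], None],
) -> Tuple[int, int]:
    """Upsert one Document per published article; return (created, updated)."""
    created = updated = 0
    for article in articles:
        if article.get("draft"):
            continue  # v1 indexes published articles only
        key = (source_id, str(article["id"]))
        doc = Document(
            source_id=source_id,
            external_id=key[1],
            title=article["title"],
            body=article.get("body") or "",
            url=article["html_url"],
        )
        if key in store:
            updated += 1
        else:
            created += 1
        store[key] = doc
        dispatch_index(doc)  # stands in for dispatching IndexDocumentJob
    return created, updated
```

Because the key is `(source_id, article_id)`, a retried or repeated run overwrites rows instead of duplicating them, which is what makes the `$tries = 3` retry safe.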

## Multi-tenancy

Source.workspace_id (via agent_id chain) is the tenant gate. No cross-tenant
concerns — Zendesk creds are stored per Source.
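
The tenant gate amounts to resolving Source to Agent to Workspace and comparing against the caller's workspace. A sketch, with plain dicts standing in for the two lookups (hypothetical names, not the actual models):

```python
from typing import Dict


def source_belongs_to_workspace(
    source_id: int,
    workspace_id: int,
    source_agent: Dict[int, int],     # source_id -> agent_id
    agent_workspace: Dict[int, int],  # agent_id -> workspace_id
) -> bool:
    """True only when the source's agent is owned by the given workspace."""
    agent_id = source_agent.get(source_id)
    if agent_id is None:
        return False
    owner = agent_workspace.get(agent_id)
    return owner is not None and owner == workspace_id
```

The same check would apply to the `test-zendesk` probe route, since it is scoped under `{agent}` as well.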

## Effort estimate

| Phase | Days | What ships |
|---|---|---|
| Source type registration + config form | 0.5 | UI for type=zendesk in /sources/new |
| Test-credentials endpoint | 0.5 | API probe before save |
| SyncZendeskSourceJob | 1 | List + reconcile articles |
| IndexDocumentJob integration | 0.5 | re-use existing chain |
| Periodic refresh wiring | 0.5 | hook into refresh-stale-sources cron |
| Tests | 0.5 | mock HTTP, validate Source create + sync |
| Docs | 0.5 | new page under Knowledge → Connectors |
| **Total** | **4 days** | |

## Open questions

1. **Multi-brand Zendesk** — Some buyers have 2+ Zendesk brands under one account. Defer; document workaround = one Source per brand.
2. **Article visibility** — Zendesk has `draft`/`published`/`internal`. v1 indexes `published` only. Document.
3. **Article comments / community posts** — also indexable via separate APIs. v2.
4. **Token storage** — encrypted at rest via Source.credentials_encrypted JSON cast.

## Why this is a fit for the next bet

| Signal | Source |
|---|---|
| help_center vertical buyers are largely ex-Zendesk | CodeCanyon reviews, sales calls |
| Existing Source / Document / IndexDocumentJob chain takes the new connector in stride | low new-surface engineering |
| Competitive parity with Chatbase, Cust.ai, Intercom Fin | market positioning |
| 4-day effort, plan-gateable (or free — cheap differentiator) | low cost, high perceived value |

## Sequencing

| Order | Proposal | Effort | Doc |
|---|---|---|---|
| 1 | Email Channel | 5d | docs/PROPOSAL-EMAIL-CHANNEL.md |
| 2 | WhatsApp Channel | 8d | docs/PROPOSAL-WHATSAPP-CHANNEL.md |
| 3 | Plan-gated White-Label | 4d | docs/PROPOSAL-PLAN-GATED-WHITE-LABEL.md |
| 4 | **Zendesk Import** | **4d** | **this doc** |

Total Q3 push: ~21 days engineering, all plan-gateable or free differentiators.
