> **产品：** Reliable AI Agent Operations
> **版本：** v1
> **生成时间：** 2026-05-11 09:53
> **数据来源：** ai-customer-support-automation__scored-demand__20260511-0918.md（Score: 30/40 BUILD）

# PRD — Reliable AI Agent Operations

## 1. Problem & User

**Target User:** Customer support operations leaders, support directors, and CX managers at SMB/mid-market SaaS companies deploying AI-first support.

**Core Problem:** AI support bots deployed on Intercom, Zendesk, or Freshdesk regularly give wrong or low-confidence answers, and when they fail there's no clean handoff — agents receive conversations with zero context, customers get frustrated, and trust erodes. Existing platforms don't offer a lightweight guardrail layer; teams either over-automate (and break trust) or under-automate (and waste the investment).

**User Quote:** "AI does a shit job of customer service, so humans are the ones that can route the errors the fastest..."

## 2. Target Outcome & KPIs

- **Primary:** Support teams can safely deploy AI automation with confidence-gated escalation, reducing bad AI responses reaching customers
- **KPIs:**
  - AI conversations flagged and reviewed before they cause churn: ≥ 20% reduction in bad-bot incidents
  - Human handoff time to full context: < 30 seconds
  - Activation: first flagged conversation reviewed within 5 minutes of onboarding
  - Free→Paid conversion trigger: 500 AI conversations processed

## 3. MVP Scope (In)

- Workspace setup with knowledge source ingestion (paste URLs or upload FAQ text)
- Test chat widget: ask questions and see AI answers with source citations
- Confidence scoring per response (based on retrieval score + citation coverage)
- Escalation rules: define threshold (e.g., confidence < 0.6 → flag)
- Flagged conversation inbox: review low-confidence conversations
- Human handoff summary: one-click summary with full context for agent takeover
- Basic analytics: total conversations, flag rate, escalation rate
- Stripe checkout: free tier (500 conversations/mo), paid $49/mo (500), $99/mo (2,500)

## 4. Out of Scope

- Native Intercom/Zendesk/Freshdesk integrations (MVP: webhook/CSV import)
- Custom model training or fine-tuning
- Voice support channels
- Multi-workspace/team management
- SLA enforcement or ticket routing automation

## 5. User Flow (Happy Path)

1. **Sign up** (email/password via Supabase Auth) → land on empty workspace dashboard
2. **Add knowledge source**: paste 3 help-center URLs or upload FAQ text → system scrapes and chunks content (< 2 min)
3. **Aha Moment (5 min):** Open test widget → type a support question → see AI answer with highlighted source citation and confidence score badge (green/yellow/red)
4. **Set escalation rule:** Select confidence threshold → save rule
5. **Connect transcript source:** paste a JSON webhook or upload CSV of recent chat transcripts
6. **Flagged inbox populates:** Low-confidence conversations appear with AI summary + suggested escalation reason
7. **Handoff summary:** Click any flagged conversation → copy one-click handoff brief for human agent
8. **Analytics dashboard:** See flag rate trend over last 7 days
9. **Paid wall trigger:** At 500 conversations processed → in-app banner + automated email: "You've hit your free limit — upgrade to keep your AI reliable"

## 6. Functional Requirements (P0)

| ID | Requirement |
|----|-------------|
| F1 | User can add knowledge source by pasting URL(s); system fetches and chunks content |
| F2 | Test chat widget generates answer with confidence score (0.0–1.0) and source citation |
| F3 | User can set confidence threshold rule (slider, 0.1–1.0 increments) |
| F4 | System flags conversations below threshold and surfaces in inbox |
| F5 | Handoff summary generator: produces structured brief (issue, last 5 turns, suggested resolution, source docs) |
| F6 | Analytics page: total convos, flag count, avg confidence, escalation rate |
| F7 | Stripe checkout: free tier auto-enrolled, upgrade prompt at 500 convos |
| F8 | Automated email triggered at paywall event (SendGrid or Resend) |
| F9 | Onboarding: empty states with step-by-step prompts — no human needed |

## 7. Data Model (Minimal)

```
workspaces (id, name, owner_id, plan, conversation_count)
knowledge_sources (id, workspace_id, url, raw_text, chunks[], created_at)
conversations (id, workspace_id, messages[], confidence_score, flagged, escalated_at)
escalation_rules (id, workspace_id, threshold, action)
handoff_summaries (id, conversation_id, content, created_at)
```

## 8. API / Integration Notes

- **LLM:** OpenAI GPT-4o for answer generation; confidence = retrieval cosine similarity score from pgvector
- **Embeddings:** `text-embedding-3-small` for doc chunks and query matching
- **Auth:** Supabase Auth (email/password)
- **DB:** Supabase + pgvector extension for semantic search
- **Storage:** Supabase Storage for uploaded FAQ files
- **Payments:** Stripe Checkout + webhooks for plan state sync
- **Email:** Resend for transactional emails (paywall trigger, weekly digest)
- **Transcript ingestion:** Accept JSON POST to `/api/ingest` or CSV upload (no native helpdesk SDK in MVP)

## 9. Acceptance Criteria

- [ ] User can complete workspace setup + first knowledge source ingestion in < 5 minutes
- [ ] Test widget returns answer with confidence score in < 5 seconds
- [ ] Conversations below threshold appear in flagged inbox within 60 seconds of ingestion
- [ ] Handoff summary generated in < 3 seconds and contains all 4 required sections
- [ ] Stripe checkout completes successfully and plan updates immediately
- [ ] Paywall email fires within 5 minutes of 500th conversation being processed
- [ ] Analytics dashboard loads with < 2s response time

## 10. Delivery Plan

### Milestone 1 — Core AI Loop (Hours 1–8)
**Files:**
- `app/api/ingest/route.ts` — URL scraping + text chunking endpoint
- `app/api/chat/route.ts` — LLM answer generation with pgvector retrieval + confidence scoring
- `lib/embeddings.ts` — OpenAI embedding wrapper
- `supabase/migrations/001_schema.sql` — workspaces, knowledge_sources, conversations tables
- `components/ChatWidget.tsx` — test chat UI with confidence badge

**Rationale:** Core AI functionality must be proven before building any surrounding UX.

**Exit Criteria:** `POST /api/chat` with a test question returns `{answer, confidence, sources[]}` with confidence between 0–1; pgvector search returns top 3 relevant chunks.

### Milestone 2 — Ops Dashboard (Hours 9–15)
**Files:**
- `app/dashboard/page.tsx` — analytics overview
- `app/inbox/page.tsx` — flagged conversation list with filters
- `app/inbox/[id]/page.tsx` — conversation detail + handoff summary generator
- `app/api/rules/route.ts` — CRUD for escalation rules
- `app/api/handoff/route.ts` — summary generation endpoint
- `components/RuleBuilder.tsx` — threshold slider UI

**Rationale:** Dashboard is the product's daily value loop for support ops teams.

**Exit Criteria:** Flagged inbox shows conversations below threshold; clicking "Generate Handoff Brief" returns formatted summary with issue, turns, and source docs within 3 seconds.

### Milestone 3 — Monetization & Launch (Hours 16–20)
**Files:**
- `app/api/stripe/webhook/route.ts` — plan sync on subscription events
- `app/billing/page.tsx` — upgrade prompt and Stripe Checkout redirect
- `lib/email.ts` — Resend integration for paywall and onboarding emails
- `app/onboarding/page.tsx` — step-by-step empty state flow

**Rationale:** Ship paywall before launch to validate willingness to pay immediately.

**Exit Criteria:** Stripe test checkout completes, `subscription.updated` webhook updates workspace plan; paywall email sends within 5 min of 500th conversation; onboarding flow completes without external help.

## 11. Risks & Mitigations

| Risk | Mitigation |
|------|-----------|
| Confidence scoring is unreliable | Start with cosine similarity threshold; add user feedback loop (thumbs up/down) in week 2 |
| Transcript ingestion too manual | Make CSV template downloadable; add webhook endpoint for Zapier bridge |
| LLM costs exceed margin at $49/mo | Cache embeddings; limit free tier to 500 convos; monitor token/convo ratio |
| Competition from native helpdesk AI | Position as "reliability layer on top of" not "replacement for" — works with any tool |

## 12. Chargeability Rationale

**Free:** Up to 500 AI conversations/month — enough for a small support team to experience the full value loop (ingest docs → test questions → review flagged conversations → generate handoff summaries).

**Paid ($49/mo → $99/mo):** Unlocks higher conversation volumes, longer conversation history retention (90 days vs 7 days), and priority email support.

**Paywall Trigger:** 500th conversation processed — user has already experienced the product's value and has ongoing operational dependency.

**Why teams pay:** A single mis-handled AI support ticket that causes churn costs more than $49/mo. The ROI is immediate and tangible.