AI Assistant
The docs site ships an AI-powered chat assistant that answers questions about Convenient Checkout using a Retrieval-Augmented Generation (RAG) pipeline: an explicit AI API server retrieves semantically relevant documentation from Azure AI Search, then synthesises a structured, role-aware response via Azure OpenAI.
This page covers the complete implementation: architecture, components, search strategy, system prompt, and operations.
In this series:
- AI Assistant โ Overview & Architecture (this page)
- AI Assistant โ Prompt Engineering & Caching
- AI Assistant โ Training & Fine-Tuning
- AI Assistant โ Indexing Internals
Architecture
Browser (ChatBot.js)
└─ walletAIService.askExplicitClient(query)
   └─ POST /api/ask { query, session_id, system_prompt, search_config }
      └─ nginx (/api/ proxy → ${AI_API_SERVICE_URL})
         └─ AI API Server (in-cluster: ai-service:8080, container port 8080)
            ├─ Pass 1 → Azure AI Search (category filter: spec only, top 3)
            ├─ Pass 2 → Azure AI Search (all docs, top 8)
            │  └─ wallet-docs-index-v2
            └─ Azure OpenAI (gpt-4-1-kb deployment)
               └─ RAG: context window from merged results → structured Markdown answer
The AI API runs as a separate Kubernetes Deployment (ai-service) in the same namespace and cluster as the docs site. Nginx proxies /api/ browser requests to ai-service:8080 via the AI_API_SERVICE_URL env var – no direct browser-to-backend calls.
RAG pipeline – step by step
- ChatBot calls askAssistant(query) → askExplicitClient(query) in walletAIService.js
- The service POSTs a full payload to /api/ask (nginx proxies to ai-service:8080):
  {
    "query": "...",
    "session_id": "<uuid per browser tab>",
    "system_prompt": "...",
    "search_config": { "index_name": "wallet-docs-index-v2", "top": 5 }
  }
- The explicit server classifies the query: API/integration or business/general
- Runs one or two Azure AI Search passes, then merges and deduplicates the results
- Assembles a context window and sends it to Azure OpenAI with the system prompt
- Returns { answer, docs, sources } to the browser
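The merge-and-deduplicate step can be sketched as follows. The function name and ordering policy here are illustrative assumptions, not the server's actual identifiers; each search hit is assumed to carry the index's `id` key:

```javascript
// Illustrative sketch of the two-pass merge: spec-filtered hits come first
// so they win on duplicates; results are deduplicated by the index `id` key.
// `mergeSearchPasses` is a hypothetical name, not from the real server code.
function mergeSearchPasses(specHits, allHits) {
  const seen = new Set();
  const merged = [];
  for (const hit of [...specHits, ...allHits]) {
    if (!seen.has(hit.id)) {
      seen.add(hit.id);
      merged.push(hit);
    }
  }
  return merged;
}
```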
Components
1. src/components/ChatBot/walletAIService.js – Browser AI service
Callable ES module that is the single integration point between the ChatBot and the explicit AI API server.
| Export | Description |
|---|---|
| askAssistant(query) | High-level call used by ChatBot.js. Handles short-query guard and unavailable fallback. |
| askExplicitClient(query, config?) | Direct POST to /api/ask with full explicit payload. |
| indexDocs(payload?) | Triggers the explicit server to re-index docs via POST /api/index. |
Session management: A unique UUID is generated per browser tab on first call and stored in sessionStorage. The server uses session_id to maintain conversation context across messages in the same tab.
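A minimal sketch of that per-tab logic, assuming a storage key of `ai_session_id` (the helper name and key are illustrative; the real code lives in walletAIService.js). `storage` and `uuidFn` are injected so the logic is testable outside a browser; in the browser they would be `window.sessionStorage` and `crypto.randomUUID`:

```javascript
// Sketch of per-tab session id handling. The key name 'ai_session_id' and
// the function name are assumptions for illustration only.
function getSessionId(storage, uuidFn) {
  let id = storage.getItem('ai_session_id');
  if (!id) {
    id = uuidFn();
    storage.setItem('ai_session_id', id); // persists for the tab's lifetime
  }
  return id;
}
```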
Config defaults:
| Parameter | Value | Description |
|---|---|---|
| DEFAULT_BASE_URL | /api | Nginx-proxied path to the in-cluster AI service |
| search_config.index_name | wallet-docs-index-v2 | Azure AI Search index |
| search_config.top | 5 | Results returned per query |
| MIN_QUERY_LENGTH | 2 | Minimum characters before sending |
2. src/components/ChatBot/aiConfig.js – System prompt
The systemPrompt constant is the single source of truth for all OpenAI response behaviour. It is imported by walletAIService.js and sent with every /api/ask request.
See Prompt Engineering & Caching for the full prompt structure.
3. scripts/ai-search/localAiAssistant.js – Local CLI client
A Node.js CLI for testing the explicit server locally without running the full docs site.
# Trigger index upload
node scripts/ai-search/localAiAssistant.js index
# Ask a question
node scripts/ai-search/localAiAssistant.js ask "How do I initiate a payment?"
Override defaults with environment variables:
AI_API_URL=http://localhost:8000 INDEX_NAME=wallet-docs-index-v2 \
node scripts/ai-search/localAiAssistant.js index
4. dockerfile – Two-stage build
The root dockerfile now contains two stages:
| Stage | Base image | Purpose |
|---|---|---|
| builder | JFrog nginx 1.29-latest-dev | Installs envsubst and busybox |
| (default) | JFrog nginx 1.29-latest | Final docs/nginx image |
The AI API is not built or deployed as part of this image. It runs as a standalone external process.
Deployment
The AI API server (ai-service) runs in the same Kubernetes namespace and cluster as the docs site. Nginx proxies /api/ requests to it via AI_API_SERVICE_URL (set to ai-service:8080 in helm values).
Connecting a different server URL
Update AI_API_SERVICE_URL in the helm values file for the target environment:
deployment:
environment:
- name: AI_API_SERVICE_URL
value: 'my-ai-service:8080'
Index management
After adding or updating documentation, re-index by calling the explicit server:
node scripts/ai-search/localAiAssistant.js index
Or directly via curl:
curl --request POST \
--url http://localhost:8000/api/index \
--header 'Content-Type: application/json' \
--data '{
"docs_dir": ["/path/to/docs"],
"yaml_dir": ["/path/to/build/redocusaurus"],
"index_name": "wallet-docs-index-v2",
"overwrite": true
}'
Credentials are injected from two K8s secrets:
env:
- name: AZURE_SEARCH_ENDPOINT
valueFrom:
secretKeyRef:
name: azure-search-secret
key: endpoint
- name: AZURE_SEARCH_API_KEY
valueFrom:
secretKeyRef:
name: azure-search-secret
key: apiKey
- name: AZURE_OPENAI_ENDPOINT
valueFrom:
secretKeyRef:
name: azure-openai-secret
key: endpoint
- name: AZURE_OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: azure-openai-secret
key: apiKey
- name: AZURE_OPENAI_DEPLOYMENT
value: gpt-4-1-kb
7. nginx/nginx.template – Proxy configuration
The /api/ location block proxies requests to the AI API:
location /api/ {
proxy_pass http://${AI_API_SERVICE_URL}/api/;
proxy_read_timeout 30s;
proxy_connect_timeout 5s;
proxy_intercept_errors on;
error_page 502 503 504 = @ai_unavailable;
}
location @ai_unavailable {
default_type application/json;
return 503 '{"error":"AI API is not available"}';
}
AI_API_SERVICE_URL defaults to localhost:3001 in entrypoint.sh when the env var is not set, so nginx always starts cleanly even when the AI API sidecar is disabled.
8. src/components/ChatBot/aiAssistant.js – Frontend integration
The askAssistant(query) function:
- POSTs { query } to /api/ask
- On success, renders the answer and linked docs
- On fetch failure or non-2xx, falls through to the "unavailable" message (Fuse.js fallback is disabled via ENABLE_FUSE_FALLBACK = false)
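The success/fallback flow can be sketched like this. It is a minimal sketch, not the real implementation: the actual askAssistant also applies the short-query guard, `fetchFn` is injected here for testability (in the browser it is `window.fetch`), and the exact fallback message text is an assumption:

```javascript
// Hypothetical fallback object; the real "unavailable" payload may differ.
const UNAVAILABLE = { answer: 'AI assistant currently unavailable', docs: [], sources: [] };

async function askAssistant(query, fetchFn) {
  try {
    const res = await fetchFn('/api/ask', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query }),
    });
    if (!res.ok) return UNAVAILABLE; // non-2xx falls through to the fallback
    return await res.json();
  } catch {
    return UNAVAILABLE; // network/fetch failure
  }
}
```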
9. src/components/ChatBot/ChatBot.js – Chat widget
The floating chat widget is a React component. Key capabilities:
- MarkdownContent – renders ## / ### headings (styled with colored left border), **bold**, `inline code`, fenced code blocks, bullet lists, numbered lists, and --- dividers
- CodeBlock – fenced code blocks with a language label and copy-to-clipboard button (2s "Copied!" feedback)
- RelatedDocs – groups linked docs by URL pattern into color-coded badge groups:
| Badge | Color | URL pattern |
|---|---|---|
| API Spec | Blue #2563eb | /api-reference* |
| Webhook Spec | Teal #0d9488 | /webhooks-v2/* |
| Integration | Green #059669 | /developers/* |
| Business | Purple #7c3aed | /business/* |
| Internal | Red #dc2626 | /engineering-guide/* |
| Guidelines | Cyan #0891b2 | /contributing-guide/* |
| Documentation | Amber #b45309 | (catch-all) |
Spec entry titles are automatically shortened: [Webhook] paymentFailed – PAYMENT_FAILED → PAYMENT_FAILED.
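One way that shortening could work, shown as a hypothetical sketch (the actual rule in ChatBot.js may be more involved): keep everything after the first " – " separator when the title starts with a bracketed tag.

```javascript
// Hypothetical title-shortening rule; illustrative only.
function shortenSpecTitle(title) {
  const sep = ' – ';
  const idx = title.indexOf(sep);
  if (title.startsWith('[') && idx !== -1) {
    return title.slice(idx + sep.length);
  }
  return title; // non-spec titles pass through unchanged
}
```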
- Prompt suggestions – 12 pre-set chips grouped into Business / Integration / API, toggled by the lightbulb button in the input row. Suggestions and the welcome message are driven from src/components/ChatBot/prompt-suggestions.json – no code changes needed to update them.
- Lightbulb button – golden glow animation when suggestions are open; subtle colour pulse when closed
10. scripts/uploadToAzureSearch.js – Index population
Parses all markdown files from docs/ and OpenAPI specs, then uploads them to the wallet-docs-index Azure AI Search index. Runs in 3 steps:
Step 1 – yarn build (aborts on failure)
Docusaurus bundles and resolves all OpenAPI specs into build/redocusaurus/. These fully-resolved YAML files are used for indexing (no external $ref files remain – they are all inlined by the build).
| Built file | Content |
|---|---|
| build/redocusaurus/plugin-redoc-1.yaml | v2 public API (fully bundled) |
| build/redocusaurus/plugin-redoc-2.yaml | v2 Webhooks spec (fully bundled) |
| build/redocusaurus/plugin-redoc-0.yaml | v1 public API (fully bundled) |
Step 2 – Parse documents
- loadDocEntries() – walks docs/ recursively, strips frontmatter, extracts title and headings
- loadSpecEntries() – iterates each spec's paths: (or webhooks: key for the webhook spec), extracts every operation, resolves request/response schemas, generates a searchable snippet and curl skeleton
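A rough sketch of the per-file work loadDocEntries() performs, assuming standard "---"-delimited YAML frontmatter and "#" markdown headings (the real parser likely handles more edge cases):

```javascript
// Sketch of frontmatter stripping and heading extraction for one file.
// `parseDoc` is an illustrative name, not the script's actual helper.
function parseDoc(markdown) {
  // Strip a leading frontmatter block if present
  const body = markdown.replace(/^---\n[\s\S]*?\n---\n/, '');
  const headings = (body.match(/^#{1,6}\s+.+$/gm) || [])
    .map(h => h.replace(/^#+\s+/, ''));
  return {
    title: headings[0] || '',
    headings: headings.join(' | '), // pipe-separated, matching the index schema
    content: body,
  };
}
```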
Spec title format:
- Webhook operations: [Webhook] {eventName} – {summary}
- API operations: [V2] POST /path – {summary} / [V1] POST /path – {summary}
Step 3 – Upload to Azure AI Search
- Calls ensureIndex() to create or update the index schema
- Clears all existing documents
- Uploads in batches of 100
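The batch-of-100 loop can be sketched as below. `postBatch` stands in for the HTTP call the script actually makes; the `@search.action` field is the standard Azure AI Search indexing-action marker, and the rest is illustrative:

```javascript
// Split documents into batches of `size` (100 in the upload script).
function chunk(items, size = 100) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Sketch of the upload loop; `postBatch` is a stand-in for the real request.
async function uploadAll(docs, postBatch) {
  for (const batch of chunk(docs)) {
    // "upload" replaces any existing document with the same key
    await postBatch(batch.map(d => ({ '@search.action': 'upload', ...d })));
  }
}
```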
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
Requires an admin key (not a query key) – the script creates/updates the index schema and deletes existing documents.
Index document fields:
| Field | Type | Description |
|---|---|---|
| id | string (key) | Base64-encoded URL, URL-safe chars only |
| title | string | Page/operation title |
| file_path | string | Relative URL (e.g. /docs/developers/...) |
| content | string | Full text snippet |
| headings | string | Pipe-separated heading list |
| summary | string | Top 5 keywords joined by comma |
| category | string | First URL segment (e.g. docs, api-reference-v2, webhooks-v2) |
| section | string | Second URL segment |
| last_modified | DateTimeOffset | Upload timestamp |
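A sketch of how an id matching the "Base64-encoded URL, URL-safe chars only" description can be derived (the exact transformation in the script may differ). Azure AI Search document keys permit letters, digits, underscore, dash and equals, so the unsafe base64 characters are swapped out:

```javascript
// Derive a URL-safe document key from a page URL (illustrative helper).
function docId(url) {
  return Buffer.from(url, 'utf8')
    .toString('base64')
    .replace(/\+/g, '-')   // '+' is not URL-safe
    .replace(/\//g, '_')   // '/' is not URL-safe
    .replace(/=+$/, '');   // drop base64 padding
}
```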
The category field is used for the two-pass search filter (category ne 'docs' isolates all spec categories).
Current index stats: ~214 documents – 169 markdown + 45 API spec entries (16 v2 operations, 13 webhooks, 16 v1 operations)
Azure OpenAI Resource
| Property | Value |
|---|---|
| Resource name | CCG-AI-ASSISTANT |
| Endpoint | https://ccg-ai-assistant.openai.azure.com/ |
| Deployment | gpt-4-1-kb |
| Model | GPT-4.1 (1M context) |
| Max output tokens (API queries) | 2500 |
| Max output tokens (business queries) | 800 |
| Temperature | 0.3 |
| API version | 2024-02-01 |
| Subscription | OFS_Common_Checkout (541ef442-1fea-41f8-b23c-5f2cc8ff5246) |
| Resource group | fcc-comn-chkt-rg |
| Location | Central US |
| Private endpoint | ccg-ai-assistant-pe (VNet: fcc-comn-chkt-vnet-dev-standby-centralus/lb-subnet) |
| Target sub-resource | account |
The resource is accessed via a private endpoint – no public internet egress is required from AKS pods. Local development requires VPN access to the private endpoint network.
Token Utilization
Every chat request has two token costs: input tokens (context sent to the model) and output tokens (the model's response). Both are billed.
What drives input token cost
Input tokens = system prompt + context window assembled from Azure AI Search results.
| Component | Approximate tokens | Notes |
|---|---|---|
| System prompt | ~600 tokens | Fixed per request – loaded from api/config.js |
| Context window – API query | ~6 000 tokens | Up to 24 000 chars ÷ ~4 chars/token |
| Context window – business query | ~2 000 tokens | Up to 8 000 chars ÷ ~4 chars/token |
| User question | ~50–125 tokens | Capped at 500 chars |
Total input per API query: ~6 750 tokens
Total input per business query: ~2 750 tokens
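The ~4 chars/token heuristic behind those totals, as a back-of-envelope estimator (illustrative only; real tokenization varies by content). It assumes the ~600-token system prompt is roughly 2 400 characters:

```javascript
// Quick input-token estimate using the ~4 chars/token rule of thumb.
function estimateInputTokens(systemPromptChars, contextChars, questionChars) {
  return Math.ceil((systemPromptChars + contextChars + questionChars) / 4);
}
```

With a 24 000-char API context and a ~400-char question this lands near the ~6 750-token figure quoted above.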
What drives output token cost
max_tokens is the ceiling for the completion – the model is billed only for what it actually generates, not the ceiling. Setting a lower ceiling for business questions prevents runaway completions:
| Query type | max_tokens | Typical actual usage |
|---|---|---|
| API / integration | 2500 | 800–1 800 tokens (curl + field listing) |
| Business / general | 800 | 150โ400 tokens (prose summary) |
Cost estimation (GPT-4.1 pricing)
GPT-4.1 pricing as of 2025: $2.00 per 1M input tokens, $8.00 per 1M output tokens.
| Query type | Input cost | Output cost (typical) | Total per query |
|---|---|---|---|
| API / integration | ~$0.0135 | ~$0.010 | ~$0.024 |
| Business / general | ~$0.0055 | ~$0.002 | ~$0.008 |
These are estimates. Actual costs depend on the subscription tier, private endpoint pricing, and actual tokens generated per conversation.
How the API call is constructed
In api/server.js, callOpenAI() assembles the request body and POSTs directly to the Azure OpenAI REST API – no SDK is used:
// POST /openai/deployments/{deployment}/chat/completions?api-version=2024-02-01
const body = {
messages: [
{ role: 'system', content: systemPrompt }, // ~600 input tokens
{ role: 'user', content:
`Documentation context:\n\n${contextText}\n\n---\n\nQuestion: ${query}` },
],
max_tokens: apiQuery ? 2500 : 800, // output ceiling
temperature: 0.3,
};
The request is sent with an api-key header (not Authorization: Bearer). Response is parsed from json.choices[0].message.content.
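Putting those pieces together, the call could look like the sketch below. The URL shape, api-version query parameter, and api-key header follow the Azure OpenAI chat completions REST API; `openAIUrl` is an illustrative helper, not a name from api/server.js:

```javascript
// Build the deployment-scoped chat completions URL (illustrative helper).
function openAIUrl(endpoint, deployment, apiVersion = '2024-02-01') {
  const base = endpoint.replace(/\/+$/, ''); // tolerate a trailing slash
  return `${base}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`;
}

// Sketch of the raw REST call; env vars match those listed earlier.
async function callOpenAI(body) {
  const url = openAIUrl(process.env.AZURE_OPENAI_ENDPOINT, process.env.AZURE_OPENAI_DEPLOYMENT);
  const res = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'api-key': process.env.AZURE_OPENAI_API_KEY, // not Authorization: Bearer
    },
    body: JSON.stringify(body),
  });
  const json = await res.json();
  return json.choices[0].message.content;
}
```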
Reducing cost further
| Lever | Current setting | Trade-off |
|---|---|---|
| Reduce contextMaxChars | 24 000 (API) / 8 000 (business) | Fewer input tokens but may miss request body fields in long spec entries |
| Lower maxTokens | 2500 (API) / 800 (business) | Cheaper but completions may be cut short |
| Raise temperature to 0.5 | 0.3 now | More varied / creative answers – not ideal for exact API docs |
| Add more apiQueryKeywords | 33 keywords | Better classification – fewer queries incorrectly routed to the expensive API path |
| Azure Search query key vs admin key | Use query key in production | Read-only, lower privilege โ does not affect cost |
Azure AI Search Index
| Property | Value |
|---|---|
| Resource | ccg-docs |
| Endpoint | https://ccg-docs.search.windows.net |
| Index | wallet-docs-index |
| Semantic config | default (title field: title, content field: content, keyword field: summary) |
| API version | 2023-11-01 |
| Query type | semantic with extractive answers (top 3) and extractive captions |
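The table above implies a request body roughly like the sketch below (2023-11-01 REST API). The field names are standard Azure AI Search query parameters; the server's actual query construction may differ:

```javascript
// Approximate semantic-search request body; illustrative, not the server's code.
function buildSemanticQuery(queryText, { top = 8, filter = null } = {}) {
  const body = {
    search: queryText,
    queryType: 'semantic',
    semanticConfiguration: 'default',
    answers: 'extractive|count-3', // extractive answers, top 3
    captions: 'extractive',
    top,
  };
  if (filter) body.filter = filter; // e.g. "category ne 'docs'" for the spec-only pass
  return body;
}
```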
Deploying
Prerequisites
Create both K8s secrets in the target namespace before deploying:
kubectl create secret generic azure-search-secret \
--from-literal=endpoint=https://ccg-docs.search.windows.net \
--from-literal=apiKey=<query-key> \
-n <namespace>
kubectl create secret generic azure-openai-secret \
--from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
--from-literal=apiKey=<openai-api-key> \
-n <namespace>
Key names are case-sensitive: endpoint and apiKey (capital K). The Helm template maps these exact keys.
Enable per environment
In the relevant values-<env>.yaml:
aiApi:
enabled: true
secretName: azure-search-secret
searchIndex: wallet-docs-index
openaiSecretName: azure-openai-secret
openaiDeployment: gpt-4-1-kb
deployment:
environment:
- name: AI_API_SERVICE_URL
value: 'docs-ai-api:3001'
Azure AI Search firewall
The AKS cluster's outbound IP must be added to the AI Search firewall allowlist:
Azure Portal → AI Search (ccg-docs) → Settings → Networking → Firewall → add IP
| Environment | Outbound IP |
|---|---|
| Test | 20.98.179.84 |
The outbound IP can be retrieved from a running pod:
kubectl exec -n <namespace> <any-pod> -c <container> -- \
node -e "require('https').get('https://ifconfig.me',r=>{let d='';r.on('data',c=>d+=c);r.on('end',()=>console.log(d))}).on('error',e=>console.error(e.message))"
If the AKS load balancer is recreated (e.g. cluster upgrade), the outbound IP may change, causing 403 errors from Azure AI Search. The firewall rule must be updated to match the new IP.
Local Development
The Azure OpenAI resource uses a private endpoint – it is only reachable from within the AKS VNet. You must be connected to VPN for local development with OpenAI enabled. Azure AI Search has an IP firewall allowlist; your machine's IP must be added if testing the upload script locally.
Start the API server locally:
lsof -ti :3001 | xargs kill -9 2>/dev/null; sleep 1 && \
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
AZURE_OPENAI_ENDPOINT=https://ccg-ai-assistant.openai.azure.com/ \
AZURE_OPENAI_API_KEY=<openai-api-key> \
AZURE_OPENAI_DEPLOYMENT=gpt-4-1-kb \
node api/server.js
Run the docs site locally (separate terminal):
yarn start
The chat widget in yarn start will call /api/ask, which is proxied to port 3001 via the Docusaurus dev server proxy (or directly if not proxied – configure in docusaurus.config.js).
Re-upload the index after content changes:
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
This automatically runs yarn build first (Step 1) to ensure build/redocusaurus/ contains the latest bundled specs.
Operations
Check pod status
kubectl get pods -n <namespace> | grep docs-ai-api
kubectl logs -n <namespace> -l app=docs-ai-api -c ai-api
Test the endpoint directly
kubectl exec -n <namespace> <docs-pod> -- sh -c \
"echo '{\"query\":\"how do I add a payment method\"}' | /busybox wget -q -O- \
--header='Content-Type: application/json' \
--post-file=- http://docs-ai-api:3001/api/ask"
Reload after secret update
kubectl delete secret azure-search-secret -n <namespace>
kubectl create secret generic azure-search-secret \
--from-literal=endpoint=<endpoint> \
--from-literal=apiKey=<new-key> \
-n <namespace>
kubectl delete secret azure-openai-secret -n <namespace>
kubectl create secret generic azure-openai-secret \
--from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
--from-literal=apiKey=<new-openai-key> \
-n <namespace>
kubectl rollout restart deployment/docs-ai-api -n <namespace>
Re-upload index content
Run after adding/changing documentation or OpenAPI specs. The script builds the site first to ensure bundled spec files are current:
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| nginx fails to start: no host in upstream | AI_API_SERVICE_URL env var not set | Add to deployment.environment in values file |
| Browser shows "AI assistant currently unavailable" | /api/ask returning non-2xx | Check pod logs; see below |
| 403 from Azure Search | Firewall blocking AKS outbound IP | Add outbound IP to AI Search firewall |
| 426 Upgrade Required | Istio mTLS interception | Ensure excludeInboundPorts: "3001" annotation is on the docs-ai-api deployment |
| Pod stuck at 1/2 Running | Istio sidecar initialising | Wait ~30s; if persistent, check kubectl describe pod |
| 500 from /api/ask | Azure Search returned error | Check pod logs for Search error: line |
| base name should not be blank in Docker build | JFROG_SAAS_NODE_DOCKER_IMAGE_URL var not set in repo | Add org variable to repo GitHub Actions settings |
| Answers are plain search content (not AI-generated) | AZURE_OPENAI_ENDPOINT or AZURE_OPENAI_API_KEY not set | Create azure-openai-secret and set openaiSecretName in values file |
| 401 from Azure OpenAI | Wrong or missing apiKey in secret | Recreate azure-openai-secret with the correct key |
| 403 from Azure OpenAI | Private endpoint not reachable from pod subnet | Verify the ccg-ai-assistant-pe private endpoint targets lb-subnet in the AKS VNet |
| OpenAI connection error locally | Not on VPN | Connect to VPN – the OpenAI resource has a private endpoint only |
| Upload script aborts at Step 1 | yarn build failed | Fix the build error first; the upload will not run against stale content |
| API spec entries missing from search results | Bundled spec files are stale | Re-run upload script – it rebuilds and re-bundles specs automatically |
| Spec answers lack field details | External $ref schemas not resolved | This should not happen with built files; verify SPECS points to build/redocusaurus/ not openapi/ |
| Answers not using ## headings | Model ignoring format instructions | Check api/config.js system prompt – the ## Summary template must be present |