AI Assistant
The docs site ships an AI-powered chat assistant that answers questions about Convenient Checkout using a Retrieval-Augmented Generation (RAG) pipeline: an explicit AI API server retrieves semantically relevant documentation from Azure AI Search, then synthesises a structured, role-aware response via Azure OpenAI.
This page covers the complete implementation: architecture, components, search strategy, system prompt, and operations.
In this series:
- AI Assistant โ Overview & Architecture (this page)
- AI Assistant โ Prompt Engineering & Caching
- AI Assistant โ Training & Fine-Tuning
- AI Assistant โ Indexing Internals
Architecture
Browser (ChatBot.js)
└─ walletAIService.askExplicitClient(query)
   └─ POST /api/ask { query, session_id, system_prompt, search_config }
      └─ nginx (/api/ proxy → ${AI_API_SERVICE_URL})
         └─ AI API Server (in-cluster: ai-service:8080, container port 8080)
            ├─ Pass 1 → Azure AI Search (category filter: spec only, top 3)
            ├─ Pass 2 → Azure AI Search (all docs, top 8)
            │  └─ wallet-docs-index-v2
            └─ Azure OpenAI (gpt-4-1-kb deployment)
               └─ RAG: context window from merged results → structured Markdown answer
The AI API runs as a separate Kubernetes Deployment (ai-service) in the same namespace and cluster as the docs site. Nginx proxies /api/ browser requests to ai-service:8080 via the AI_API_SERVICE_URL env var – no direct browser-to-backend calls.
RAG pipeline – step by step
- ChatBot calls askAssistant(query) → askExplicitClient(query) in walletAIService.js
- The service POSTs a full payload to /api/ask (nginx proxies to ai-service:8080):
  {
    "query": "...",
    "session_id": "<uuid per browser tab>",
    "system_prompt": "...",
    "search_config": { "index_name": "wallet-docs-index-v2", "top": 5 }
  }
- The explicit server classifies the query: API/integration or business/general
- Runs one or two Azure AI Search passes, then merges and deduplicates the results
- Assembles a context window and sends it to Azure OpenAI with the system prompt
- Returns { answer, docs, sources } to the browser
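The merge-and-deduplicate step can be sketched as follows. The function name and ordering policy here are illustrative assumptions, not the server's actual identifiers; each search hit is assumed to carry the index's `id` key:

```javascript
// Illustrative sketch of the two-pass merge: spec-filtered hits come first
// so they win on duplicates; results are deduplicated by the index `id` key.
// `mergeSearchPasses` is a hypothetical name, not from the real server code.
function mergeSearchPasses(specHits, allHits) {
  const seen = new Set();
  const merged = [];
  for (const hit of [...specHits, ...allHits]) {
    if (!seen.has(hit.id)) {
      seen.add(hit.id);
      merged.push(hit);
    }
  }
  return merged;
}
```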
Components
1. src/components/ChatBot/walletAIService.js – Browser AI service
Callable ES module that is the single integration point between the ChatBot and the explicit AI API server.
| Export | Description |
|---|---|
| askAssistant(query) | High-level call used by ChatBot.js. Handles short-query guard and unavailable fallback. |
| askExplicitClient(query, config?) | Direct POST to /api/ask with full explicit payload. |
| indexDocs(payload?) | Triggers the explicit server to re-index docs via POST /api/index. |
Session management: A unique UUID is generated per browser tab on first call and stored in sessionStorage. The server uses session_id to maintain conversation context across messages in the same tab.
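A minimal sketch of that per-tab logic, assuming a storage key of `ai_session_id` (the helper name and key are illustrative; the real code lives in walletAIService.js). `storage` and `uuidFn` are injected so the logic is testable outside a browser; in the browser they would be `window.sessionStorage` and `crypto.randomUUID`:

```javascript
// Sketch of per-tab session id handling. The key name 'ai_session_id' and
// the function name are assumptions for illustration only.
function getSessionId(storage, uuidFn) {
  let id = storage.getItem('ai_session_id');
  if (!id) {
    id = uuidFn();
    storage.setItem('ai_session_id', id); // persists for the tab's lifetime
  }
  return id;
}
```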
Config defaults:
| Parameter | Value | Description |
|---|---|---|
| DEFAULT_BASE_URL | /api | Nginx-proxied path to the in-cluster AI service |
| search_config.index_name | wallet-docs-index-v2 | Azure AI Search index |
| search_config.top | 5 | Results returned per query |
| MIN_QUERY_LENGTH | 2 | Minimum characters before sending |
2. src/components/ChatBot/aiConfig.js – System prompt
The systemPrompt constant is the single source of truth for all OpenAI response behaviour. It is imported by walletAIService.js and sent with every /api/ask request.
See Prompt Engineering & Caching for the full prompt structure.
3. scripts/ai-search/localAiAssistant.js – Local CLI client
A Node.js CLI for testing the explicit server locally without running the full docs site.
# Trigger index upload
node scripts/ai-search/localAiAssistant.js index
# Ask a question
node scripts/ai-search/localAiAssistant.js ask "How do I initiate a payment?"
Override defaults with environment variables:
AI_API_URL=http://localhost:8000 INDEX_NAME=wallet-docs-index-v2 \
node scripts/ai-search/localAiAssistant.js index
4. dockerfile – Two-stage build
The root dockerfile now contains two stages:
| Stage | Base image | Purpose |
|---|---|---|
| builder | JFrog nginx 1.29-latest-dev | Installs envsubst and busybox |
| (default) | JFrog nginx 1.29-latest | Final docs/nginx image |
The AI API is not built or deployed as part of this image. It runs as a standalone external process.
Deployment
The AI API server (ai-service) runs in the same Kubernetes namespace and cluster as the docs site. Nginx proxies /api/ requests to it via AI_API_SERVICE_URL (set to ai-service:8080 in helm values).
Connecting a different server URL
Update AI_API_SERVICE_URL in the helm values file for the target environment:
deployment:
environment:
- name: AI_API_SERVICE_URL
value: 'my-ai-service:8080'
Index management
After adding or updating documentation, re-index by calling the explicit server:
node scripts/ai-search/localAiAssistant.js index
Or directly via curl:
curl --request POST \
--url http://localhost:8000/api/index \
--header 'Content-Type: application/json' \
--data '{
"docs_dir": ["/path/to/docs"],
"yaml_dir": ["/path/to/build/redocusaurus"],
"index_name": "wallet-docs-index-v2",
"overwrite": true
}'
Credentials are injected from two K8s secrets:
env:
- name: AZURE_SEARCH_ENDPOINT
valueFrom:
secretKeyRef:
name: azure-search-secret
key: endpoint
- name: AZURE_SEARCH_API_KEY
valueFrom:
secretKeyRef:
name: azure-search-secret
key: apiKey
- name: AZURE_OPENAI_ENDPOINT
valueFrom:
secretKeyRef:
name: azure-openai-secret
key: endpoint
- name: AZURE_OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: azure-openai-secret
key: apiKey
- name: AZURE_OPENAI_DEPLOYMENT
value: gpt-4-1-kb
7. nginx/nginx.template – Proxy configuration
The /api/ location block proxies requests to the AI API:
location /api/ {
proxy_pass http://${AI_API_SERVICE_URL}/api/;
proxy_read_timeout 30s;
proxy_connect_timeout 5s;
proxy_intercept_errors on;
error_page 502 503 504 = @ai_unavailable;
}
location @ai_unavailable {
default_type application/json;
return 503 '{"error":"AI API is not available"}';
}
AI_API_SERVICE_URL defaults to localhost:3001 in entrypoint.sh when the env var is not set, so nginx always starts cleanly even when the AI API sidecar is disabled.
8. src/components/ChatBot/aiAssistant.js – Frontend integration
The askAssistant(query) function:
- POSTs { query } to /api/ask
- On success, renders the answer and linked docs
- On fetch failure or non-2xx, falls through to the "unavailable" message (Fuse.js fallback is disabled via ENABLE_FUSE_FALLBACK = false)
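The success/fallback flow can be sketched like this. It is a minimal sketch, not the real implementation: the actual askAssistant also applies the short-query guard, `fetchFn` is injected here for testability (in the browser it is `window.fetch`), and the exact fallback message text is an assumption:

```javascript
// Hypothetical fallback object; the real "unavailable" payload may differ.
const UNAVAILABLE = { answer: 'AI assistant currently unavailable', docs: [], sources: [] };

async function askAssistant(query, fetchFn) {
  try {
    const res = await fetchFn('/api/ask', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query }),
    });
    if (!res.ok) return UNAVAILABLE; // non-2xx falls through to the fallback
    return await res.json();
  } catch {
    return UNAVAILABLE; // network/fetch failure
  }
}
```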
9. src/components/ChatBot/ChatBot.js – Chat widget
The floating chat widget is a React component. Key capabilities:
- MarkdownContent – renders ## / ### headings (styled with colored left border), **bold**, `inline code`, fenced code blocks, bullet lists, numbered lists, and --- dividers
- CodeBlock – fenced code blocks with a language label and copy-to-clipboard button (2s "Copied!" feedback)
- RelatedDocs – groups linked docs by URL pattern into color-coded badge groups:
| Badge | Color | URL pattern |
|---|---|---|
| API Spec | Blue #2563eb | /api-reference* |
| Webhook Spec | Teal #0d9488 | /webhooks-v2/* |
| Integration | Green #059669 | /developers/* |
| Business | Purple #7c3aed | /business/* |
| Internal | Red #dc2626 | /engineering-guide/* |
| Guidelines | Cyan #0891b2 | /contributing-guide/* |
| Documentation | Amber #b45309 | (catch-all) |
Spec entry titles are automatically shortened: [Webhook] paymentFailed – PAYMENT_FAILED → PAYMENT_FAILED.
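One way that shortening could work, shown as a hypothetical sketch (the actual rule in ChatBot.js may be more involved): keep everything after the first " – " separator when the title starts with a bracketed tag.

```javascript
// Hypothetical title-shortening rule; illustrative only.
function shortenSpecTitle(title) {
  const sep = ' – ';
  const idx = title.indexOf(sep);
  if (title.startsWith('[') && idx !== -1) {
    return title.slice(idx + sep.length);
  }
  return title; // non-spec titles pass through unchanged
}
```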
- Prompt suggestions – 12 pre-set chips grouped into Business / Integration / API, toggled by the lightbulb button in the input row. Suggestions and the welcome message are driven from src/components/ChatBot/prompt-suggestions.json – no code changes needed to update them.
- Lightbulb button – golden glow animation when suggestions are open; subtle colour pulse when closed
10. scripts/uploadToAzureSearch.js – Index population
Parses all markdown files from docs/ and OpenAPI specs, then uploads them to the wallet-docs-index Azure AI Search index. Runs in 3 steps:
Step 1 – yarn build (aborts on failure)
Docusaurus bundles and resolves all OpenAPI specs into build/redocusaurus/. These fully-resolved YAML files are used for indexing (no external $ref files remain – they are all inlined by the build).
| Built file | Content |
|---|---|
| build/redocusaurus/plugin-redoc-1.yaml | v2 public API (fully bundled) |
| build/redocusaurus/plugin-redoc-2.yaml | v2 Webhooks spec (fully bundled) |
| build/redocusaurus/plugin-redoc-0.yaml | v1 public API (fully bundled) |
Step 2 – Parse documents
- loadDocEntries() – walks docs/ recursively, strips frontmatter, extracts title and headings
- loadSpecEntries() – iterates each spec's paths: (or webhooks: key for the webhook spec), extracts every operation, resolves request/response schemas, generates a searchable snippet and curl skeleton
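A rough sketch of the per-file work loadDocEntries() performs, assuming standard "---"-delimited YAML frontmatter and "#" markdown headings (the real parser likely handles more edge cases):

```javascript
// Sketch of frontmatter stripping and heading extraction for one file.
// `parseDoc` is an illustrative name, not the script's actual helper.
function parseDoc(markdown) {
  // Strip a leading frontmatter block if present
  const body = markdown.replace(/^---\n[\s\S]*?\n---\n/, '');
  const headings = (body.match(/^#{1,6}\s+.+$/gm) || [])
    .map(h => h.replace(/^#+\s+/, ''));
  return {
    title: headings[0] || '',
    headings: headings.join(' | '), // pipe-separated, matching the index schema
    content: body,
  };
}
```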
Spec title format:
- Webhook operations: [Webhook] {eventName} – {summary}
- API operations: [V2] POST /path – {summary} / [V1] POST /path – {summary}
Step 3 – Upload to Azure AI Search
- Calls ensureIndex() to create or update the index schema
- Clears all existing documents
- Uploads in batches of 100
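The batch-of-100 loop can be sketched as below. `postBatch` stands in for the HTTP call the script actually makes; the `@search.action` field is the standard Azure AI Search indexing-action marker, and the rest is illustrative:

```javascript
// Split documents into batches of `size` (100 in the upload script).
function chunk(items, size = 100) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Sketch of the upload loop; `postBatch` is a stand-in for the real request.
async function uploadAll(docs, postBatch) {
  for (const batch of chunk(docs)) {
    // "upload" replaces any existing document with the same key
    await postBatch(batch.map(d => ({ '@search.action': 'upload', ...d })));
  }
}
```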
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
Requires an admin key (not a query key) – the script creates/updates the index schema and deletes existing documents.
Index document fields:
| Field | Type | Description |
|---|---|---|
| id | string (key) | Base64-encoded URL, URL-safe chars only |
| title | string | Page/operation title |
| file_path | string | Relative URL (e.g. /docs/developers/...) |
| content | string | Full text snippet |
| headings | string | Pipe-separated heading list |
| summary | string | Top 5 keywords joined by comma |
| category | string | First URL segment (e.g. docs, api-reference-v2, webhooks-v2) |
| section | string | Second URL segment |
| last_modified | DateTimeOffset | Upload timestamp |
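A sketch of how an id matching the "Base64-encoded URL, URL-safe chars only" description can be derived (the exact transformation in the script may differ). Azure AI Search document keys permit letters, digits, underscore, dash and equals, so the unsafe base64 characters are swapped out:

```javascript
// Derive a URL-safe document key from a page URL (illustrative helper).
function docId(url) {
  return Buffer.from(url, 'utf8')
    .toString('base64')
    .replace(/\+/g, '-')   // '+' is not URL-safe
    .replace(/\//g, '_')   // '/' is not URL-safe
    .replace(/=+$/, '');   // drop base64 padding
}
```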
The category field is used for the two-pass search filter (category ne 'docs' isolates all spec categories).
Current index stats: ~214 documents – 169 markdown + 45 API spec entries (16 v2 operations, 13 webhooks, 16 v1 operations)
Azure OpenAI Resource
| Property | Value |
|---|---|
| Resource name | CCG-AI-ASSISTANT |
| Endpoint | https://ccg-ai-assistant.openai.azure.com/ |
| Deployment | gpt-4-1-kb |
| Model | GPT-4.1 (1M context) |
| Max output tokens (API queries) | 2500 |
| Max output tokens (business queries) | 800 |
| Temperature | 0.3 |
| API version | 2024-02-01 |
| Subscription | OFS_Common_Checkout (541ef442-1fea-41f8-b23c-5f2cc8ff5246) |
| Resource group | fcc-comn-chkt-rg |
| Location | Central US |
| Private endpoint | ccg-ai-assistant-pe (VNet: fcc-comn-chkt-vnet-dev-standby-centralus/lb-subnet) |
| Target sub-resource | account |
The resource is accessed via a private endpoint – no public internet egress is required from AKS pods. Local development requires VPN access to the private endpoint network.
Token Utilization
Every chat request has two token costs: input tokens (context sent to the model) and output tokens (the model's response). Both are billed.
What drives input token cost
Input tokens = system prompt + context window assembled from Azure AI Search results.
| Component | Approximate tokens | Notes |
|---|---|---|
| System prompt | ~600 tokens | Fixed per request – loaded from api/config.js |
| Context window – API query | ~6 000 tokens | Up to 24 000 chars ÷ ~4 chars/token |
| Context window – business query | ~2 000 tokens | Up to 8 000 chars ÷ ~4 chars/token |
| User question | ~50–125 tokens | Capped at 500 chars |
Total input per API query: ~6 750 tokens
Total input per business query: ~2 750 tokens
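The ~4 chars/token heuristic behind those totals, as a back-of-envelope estimator (illustrative only; real tokenization varies by content). It assumes the ~600-token system prompt is roughly 2 400 characters:

```javascript
// Quick input-token estimate using the ~4 chars/token rule of thumb.
function estimateInputTokens(systemPromptChars, contextChars, questionChars) {
  return Math.ceil((systemPromptChars + contextChars + questionChars) / 4);
}
```

With a 24 000-char API context and a ~400-char question this lands near the ~6 750-token figure quoted above.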
What drives output token cost
max_tokens is the ceiling for the completion – the model is billed only for what it actually generates, not the ceiling. Setting a lower ceiling for business questions prevents runaway completions:
| Query type | max_tokens | Typical actual usage |
|---|---|---|
| API / integration | 2500 | 800–1 800 tokens (curl + field listing) |
| Business / general | 800 | 150โ400 tokens (prose summary) |
Cost estimation (GPT-4.1 pricing)
GPT-4.1 pricing as of 2025: $2.00 per 1M input tokens, $8.00 per 1M output tokens.
| Query type | Input cost | Output cost (typical) | Total per query |
|---|---|---|---|
| API / integration | ~$0.0135 | ~$0.010 | ~$0.024 |
| Business / general | ~$0.0055 | ~$0.002 | ~$0.008 |
These are estimates. Actual costs depend on the subscription tier, private endpoint pricing, and actual tokens generated per conversation.
How the API call is constructed
In api/server.js, callOpenAI() assembles the request body and POSTs directly to the Azure OpenAI REST API – no SDK is used:
// POST /openai/deployments/{deployment}/chat/completions?api-version=2024-02-01
const body = {
messages: [
{ role: 'system', content: systemPrompt }, // ~600 input tokens
{ role: 'user', content:
`Documentation context:\n\n${contextText}\n\n---\n\nQuestion: ${query}` },
],
max_tokens: apiQuery ? 2500 : 800, // output ceiling
temperature: 0.3,
};
The request is sent with an api-key header (not Authorization: Bearer). Response is parsed from json.choices[0].message.content.
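Putting those pieces together, the call could look like the sketch below. The URL shape, api-version query parameter, and api-key header follow the Azure OpenAI chat completions REST API; `openAIUrl` is an illustrative helper, not a name from api/server.js:

```javascript
// Build the deployment-scoped chat completions URL (illustrative helper).
function openAIUrl(endpoint, deployment, apiVersion = '2024-02-01') {
  const base = endpoint.replace(/\/+$/, ''); // tolerate a trailing slash
  return `${base}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`;
}

// Sketch of the raw REST call; env vars match those listed earlier.
async function callOpenAI(body) {
  const url = openAIUrl(process.env.AZURE_OPENAI_ENDPOINT, process.env.AZURE_OPENAI_DEPLOYMENT);
  const res = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'api-key': process.env.AZURE_OPENAI_API_KEY, // not Authorization: Bearer
    },
    body: JSON.stringify(body),
  });
  const json = await res.json();
  return json.choices[0].message.content;
}
```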
Reducing cost further
| Lever | Current setting | Trade-off |
|---|---|---|
| Reduce contextMaxChars | 24 000 (API) / 8 000 (business) | Fewer input tokens but may miss request body fields in long spec entries |
| Lower maxTokens | 2500 (API) / 800 (business) | Cheaper but completions may be cut short |
| Raise temperature to 0.5 | 0.3 now | More varied / creative answers – not ideal for exact API docs |
| Add more apiQueryKeywords | 33 keywords | Better classification – fewer queries incorrectly routed to the expensive API path |
| Azure Search query key vs admin key | Use query key in production | Read-only, lower privilege โ does not affect cost |
Azure AI Search Index
| Property | Value |
|---|---|
| Resource | ccg-docs |
| Endpoint | https://ccg-docs.search.windows.net |
| Index | wallet-docs-index |
| Semantic config | default (title field: title, content field: content, keyword field: summary) |
| API version | 2023-11-01 |
| Query type | semantic with extractive answers (top 3) and extractive captions |
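The table above implies a request body roughly like the sketch below (2023-11-01 REST API). The field names are standard Azure AI Search query parameters; the server's actual query construction may differ:

```javascript
// Approximate semantic-search request body; illustrative, not the server's code.
function buildSemanticQuery(queryText, { top = 8, filter = null } = {}) {
  const body = {
    search: queryText,
    queryType: 'semantic',
    semanticConfiguration: 'default',
    answers: 'extractive|count-3', // extractive answers, top 3
    captions: 'extractive',
    top,
  };
  if (filter) body.filter = filter; // e.g. "category ne 'docs'" for the spec-only pass
  return body;
}
```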
Deploying
Prerequisites
Create both K8s secrets in the target namespace before deploying:
kubectl create secret generic azure-search-secret \
--from-literal=endpoint=https://ccg-docs.search.windows.net \
--from-literal=apiKey=<query-key> \
-n <namespace>
kubectl create secret generic azure-openai-secret \
--from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
--from-literal=apiKey=<openai-api-key> \
-n <namespace>
Key names are case-sensitive: endpoint and apiKey (capital K). The Helm template maps these exact keys.
Enable per environment
In the relevant values-<env>.yaml:
aiApi:
enabled: true
secretName: azure-search-secret
searchIndex: wallet-docs-index
openaiSecretName: azure-openai-secret
openaiDeployment: gpt-4-1-kb
deployment:
environment:
- name: AI_API_SERVICE_URL
value: 'docs-ai-api:3001'
Azure AI Search firewall
The AKS cluster's outbound IP must be added to the AI Search firewall allowlist:
Azure Portal → AI Search (ccg-docs) → Settings → Networking → Firewall → add IP
| Environment | Outbound IP |
|---|---|
| Test | 20.98.179.84 |
The outbound IP can be retrieved from a running pod:
kubectl exec -n <namespace> <any-pod> -c <container> -- \
node -e "require('https').get('https://ifconfig.me',r=>{let d='';r.on('data',c=>d+=c);r.on('end',()=>console.log(d))}).on('error',e=>console.error(e.message))"
If the AKS load balancer is recreated (e.g. cluster upgrade), the outbound IP may change, causing 403 errors from Azure AI Search. The firewall rule must be updated to match the new IP.
Local Development
The Azure OpenAI resource uses a private endpoint – it is only reachable from within the AKS VNet. You must be connected to VPN for local development with OpenAI enabled. Azure AI Search has an IP firewall allowlist; your machine's IP must be added if testing the upload script locally.
Start the API server locally:
lsof -ti :3001 | xargs kill -9 2>/dev/null; sleep 1 && \
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
AZURE_OPENAI_ENDPOINT=https://ccg-ai-assistant.openai.azure.com/ \
AZURE_OPENAI_API_KEY=<openai-api-key> \
AZURE_OPENAI_DEPLOYMENT=gpt-4-1-kb \
node api/server.js
Run the docs site locally (separate terminal):
yarn start
The chat widget in yarn start will call /api/ask, which is proxied to port 3001 via the Docusaurus dev server proxy (or directly if not proxied – configure in docusaurus.config.js).
Re-upload the index after content changes:
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
This automatically runs yarn build first (Step 1) to ensure build/redocusaurus/ contains the latest bundled specs.
Operations
Check pod status
kubectl get pods -n <namespace> | grep docs-ai-api
kubectl logs -n <namespace> -l app=docs-ai-api -c ai-api
Test the endpoint directly
kubectl exec -n <namespace> <docs-pod> -- sh -c \
"echo '{\"query\":\"how do I add a payment method\"}' | /busybox wget -q -O- \
--header='Content-Type: application/json' \
--post-file=- http://docs-ai-api:3001/api/ask"
Reload after secret update
kubectl delete secret azure-search-secret -n <namespace>
kubectl create secret generic azure-search-secret \
--from-literal=endpoint=<endpoint> \
--from-literal=apiKey=<new-key> \
-n <namespace>
kubectl delete secret azure-openai-secret -n <namespace>
kubectl create secret generic azure-openai-secret \
--from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
--from-literal=apiKey=<new-openai-key> \
-n <namespace>
kubectl rollout restart deployment/docs-ai-api -n <namespace>
Re-upload index content
Run after adding/changing documentation or OpenAPI specs. The script builds the site first to ensure bundled spec files are current:
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| nginx fails to start: no host in upstream | AI_API_SERVICE_URL env var not set | Add to deployment.environment in values file |
| Browser shows "AI assistant currently unavailable" | /api/ask returning non-2xx | Check pod logs; see below |
| 403 from Azure Search | Firewall blocking AKS outbound IP | Add outbound IP to AI Search firewall |
| 426 Upgrade Required | Istio mTLS interception | Ensure excludeInboundPorts: "3001" annotation is on the docs-ai-api deployment |
| Pod stuck at 1/2 Running | Istio sidecar initialising | Wait ~30s; if persistent, check kubectl describe pod |
| 500 from /api/ask | Azure Search returned error | Check pod logs for Search error: line |
| base name should not be blank in Docker build | JFROG_SAAS_NODE_DOCKER_IMAGE_URL var not set in repo | Add org variable to repo GitHub Actions settings |
| Answers are plain search content (not AI-generated) | AZURE_OPENAI_ENDPOINT or AZURE_OPENAI_API_KEY not set | Create azure-openai-secret and set openaiSecretName in values file |
| 401 from Azure OpenAI | Wrong or missing apiKey in secret | Recreate azure-openai-secret with the correct key |
| 403 from Azure OpenAI | Private endpoint not reachable from pod subnet | Verify the ccg-ai-assistant-pe private endpoint targets lb-subnet in the AKS VNet |
| OpenAI connection error locally | Not on VPN | Connect to VPN – the OpenAI resource has a private endpoint only |
| Upload script aborts at Step 1 | yarn build failed | Fix the build error first; the upload will not run against stale content |
| API spec entries missing from search results | Bundled spec files are stale | Re-run upload script – it rebuilds and re-bundles specs automatically |
| Spec answers lack field details | External $ref schemas not resolved | This should not happen with built files; verify SPECS points to build/redocusaurus/ not openapi/ |
| Answers not using ## headings | Model ignoring format instructions | Check api/config.js system prompt – the ## Summary template must be present |