Version: v2

AI Assistant

The docs site ships an AI-powered chat assistant that answers questions about Convenient Checkout using a Retrieval-Augmented Generation (RAG) pipeline: an explicit AI API server retrieves semantically relevant documentation from Azure AI Search, then synthesises a structured, role-aware response via Azure OpenAI.

This page covers the complete implementation: architecture, components, search strategy, system prompt, and operations.

Architecture

Browser (ChatBot.js)
└─ walletAIService.askExplicitClient(query)
   └─ POST /api/ask { query, session_id, system_prompt, search_config }
      └─ nginx (/api/ proxy → ${AI_API_SERVICE_URL})
         └─ AI API Server (in-cluster: ai-service:8080, container port 8080)
            ├─ Pass 1 — Azure AI Search (category filter: spec only, top 3)
            ├─ Pass 2 — Azure AI Search (all docs, top 8)
            │  └─ wallet-docs-index-v2
            └─ Azure OpenAI (gpt-4-1-kb deployment)
               └─ RAG: context window from merged results → structured Markdown answer

The AI API runs as a separate Kubernetes Deployment (ai-service) in the same namespace and cluster as the docs site. Nginx proxies /api/ browser requests to ai-service:8080 via the AI_API_SERVICE_URL env var — no direct browser-to-backend calls.

RAG pipeline — step by step

  1. ChatBot calls askAssistant(query) → askExplicitClient(query) in walletAIService.js
  2. The service POSTs a full payload to /api/ask (nginx proxies it to ai-service:8080):

     {
       "query": "...",
       "session_id": "<uuid per browser tab>",
       "system_prompt": "...",
       "search_config": { "index_name": "wallet-docs-index-v2", "top": 5 }
     }

  3. The explicit server classifies the query as API/integration or business/general
  4. Runs one or two Azure AI Search passes, then merges and deduplicates the results
  5. Assembles a context window and sends it to Azure OpenAI with the system prompt
  6. Returns { answer, docs, sources } to the browser
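A minimal sketch of steps 3 and 4. The keyword list and result shape here are illustrative, not the server's actual code:

```javascript
// Hypothetical classify + two-pass merge logic (steps 3-4 above).
const API_QUERY_KEYWORDS = ['endpoint', 'curl', 'request', 'webhook', 'payload'];

function isApiQuery(query) {
  const q = query.toLowerCase();
  return API_QUERY_KEYWORDS.some((kw) => q.includes(kw));
}

// Merge the spec-only pass and the all-docs pass, deduplicating by document id
// while preserving order (spec hits stay first).
function mergeResults(specPass, allDocsPass) {
  const seen = new Set();
  const merged = [];
  for (const doc of [...specPass, ...allDocsPass]) {
    if (!seen.has(doc.id)) {
      seen.add(doc.id);
      merged.push(doc);
    }
  }
  return merged;
}
```

The real server uses a larger keyword list (33 keywords, per the cost levers table later on); the merge-by-id shape is what matters here.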

Components

1. src/components/ChatBot/walletAIService.js — Browser AI service

An ES module that is the single integration point between the ChatBot and the explicit AI API server.

| Export | Description |
| --- | --- |
| askAssistant(query) | High-level call used by ChatBot.js. Handles short-query guard and unavailable fallback. |
| askExplicitClient(query, config?) | Direct POST to /api/ask with full explicit payload. |
| indexDocs(payload?) | Triggers the explicit server to re-index docs via POST /api/index. |

Session management: A unique UUID is generated per browser tab on first call and stored in sessionStorage. The server uses session_id to maintain conversation context across messages in the same tab.

Config defaults:

| Parameter | Value | Description |
| --- | --- | --- |
| DEFAULT_BASE_URL | /api | Nginx-proxied path to the in-cluster AI service |
| search_config.index_name | wallet-docs-index-v2 | Azure AI Search index |
| search_config.top | 5 | Results returned per query |
| MIN_QUERY_LENGTH | 2 | Minimum characters before sending |
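Putting the guard and fallback together, askAssistant behaves roughly like this sketch (the fallback message text and response handling are assumptions for illustration):

```javascript
// Hypothetical shape of the short-query guard and unavailable fallback.
const MIN_QUERY_LENGTH = 2;
const UNAVAILABLE = { answer: 'AI assistant currently unavailable', docs: [] };

async function askAssistant(query, fetchFn = fetch) {
  if (!query || query.trim().length < MIN_QUERY_LENGTH) return null; // guard: too short to send
  try {
    const res = await fetchFn('/api/ask', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query }),
    });
    if (!res.ok) return UNAVAILABLE; // non-2xx -> fallback message
    return await res.json();
  } catch {
    return UNAVAILABLE; // network failure -> fallback message
  }
}
```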

2. src/components/ChatBot/aiConfig.js — System prompt

The systemPrompt constant is the single source of truth for all OpenAI response behaviour. It is imported by walletAIService.js and sent with every /api/ask request.

See Prompt Engineering & Caching for the full prompt structure.


3. scripts/ai-search/localAiAssistant.js — Local CLI client

A Node.js CLI for testing the explicit server locally without running the full docs site.

# Trigger index upload
node scripts/ai-search/localAiAssistant.js index

# Ask a question
node scripts/ai-search/localAiAssistant.js ask "How do I initiate a payment?"

Override defaults with environment variables:

AI_API_URL=http://localhost:8000 INDEX_NAME=wallet-docs-index-v2 \
node scripts/ai-search/localAiAssistant.js index

4. dockerfile — Two-stage build

The root dockerfile now contains two stages:

| Stage | Base image | Purpose |
| --- | --- | --- |
| builder | JFrog nginx 1.29-latest-dev | Installs envsubst and busybox |
| (default) | JFrog nginx 1.29-latest | Final docs/nginx image |

The AI API is not built or deployed as part of this image; it runs as a separate in-cluster Deployment (see Deployment below).


Deployment

The AI API server (ai-service) runs in the same Kubernetes namespace and cluster as the docs site. Nginx proxies /api/ requests to it via AI_API_SERVICE_URL (set to ai-service:8080 in helm values).

Connecting a different server URL

Update AI_API_SERVICE_URL in the helm values file for the target environment:

deployment:
  environment:
    - name: AI_API_SERVICE_URL
      value: 'my-ai-service:8080'

Index management

After adding or updating documentation, re-index by calling the explicit server:

node scripts/ai-search/localAiAssistant.js index

Or directly via curl:

curl --request POST \
  --url http://localhost:8000/api/index \
  --header 'Content-Type: application/json' \
  --data '{
    "docs_dir": ["/path/to/docs"],
    "yaml_dir": ["/path/to/build/redocusaurus"],
    "index_name": "wallet-docs-index-v2",
    "overwrite": true
  }'

Credentials are injected from two K8s secrets:

env:
  - name: AZURE_SEARCH_ENDPOINT
    valueFrom:
      secretKeyRef:
        name: azure-search-secret
        key: endpoint
  - name: AZURE_SEARCH_API_KEY
    valueFrom:
      secretKeyRef:
        name: azure-search-secret
        key: apiKey
  - name: AZURE_OPENAI_ENDPOINT
    valueFrom:
      secretKeyRef:
        name: azure-openai-secret
        key: endpoint
  - name: AZURE_OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: azure-openai-secret
        key: apiKey
  - name: AZURE_OPENAI_DEPLOYMENT
    value: gpt-4-1-kb

7. nginx/nginx.template — Proxy configuration

The /api/ location block proxies requests to the AI API:

location /api/ {
  proxy_pass http://${AI_API_SERVICE_URL}/api/;
  proxy_read_timeout 30s;
  proxy_connect_timeout 5s;
  proxy_intercept_errors on;
  error_page 502 503 504 = @ai_unavailable;
}

location @ai_unavailable {
  default_type application/json;
  return 503 '{"error":"AI API is not available"}';
}

AI_API_SERVICE_URL defaults to localhost:3001 in entrypoint.sh when the env var is not set, so nginx always starts cleanly even when the AI API sidecar is disabled.
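That default amounts to a couple of lines of entrypoint shell. This is a sketch, not the actual entrypoint.sh:

```shell
# Hypothetical sketch of the entrypoint default: fall back to localhost:3001
# when AI_API_SERVICE_URL is unset, so the rendered proxy_pass always has a
# host and nginx starts cleanly even with the AI API disabled.
AI_API_SERVICE_URL="${AI_API_SERVICE_URL:-localhost:3001}"
export AI_API_SERVICE_URL
echo "proxy_pass http://${AI_API_SERVICE_URL}/api/;"
```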


8. src/components/ChatBot/aiAssistant.js — Frontend integration

The askAssistant(query) function:

  1. POSTs { query } to /api/ask
  2. On success, renders the answer and linked docs
  3. On fetch failure or non-2xx, falls through to the "unavailable" message (Fuse.js fallback is disabled via ENABLE_FUSE_FALLBACK = false)

9. src/components/ChatBot/ChatBot.js — Chat widget

The floating chat widget is a React component. Key capabilities:

  • MarkdownContent — renders ##/### headings (styled with a colored left border), **bold**, `inline code`, fenced code blocks, bullet lists, numbered lists, and --- dividers
  • CodeBlock — fenced code blocks with a language label and copy-to-clipboard button (2s "Copied!" feedback)
  • RelatedDocs — groups linked docs by URL pattern into color-coded badge groups:

    | Badge | Color | URL pattern |
    | --- | --- | --- |
    | API Spec | Blue #2563eb | /api-reference* |
    | Webhook Spec | Teal #0d9488 | /webhooks-v2/* |
    | Integration | Green #059669 | /developers/* |
    | Business | Purple #7c3aed | /business/* |
    | Internal | Red #dc2626 | /engineering-guide/* |
    | Guidelines | Cyan #0891b2 | /contributing-guide/* |
    | Documentation | Amber #b45309 | (catch-all) |

    Spec entry titles are automatically shortened: [Webhook] paymentFailed — PAYMENT_FAILED → PAYMENT_FAILED.

  • Prompt suggestions — 12 pre-set chips grouped into Business / Integration / API, toggled by the lightbulb button in the input row. Suggestions and the welcome message are driven from src/components/ChatBot/prompt-suggestions.json — no code changes needed to update them.
  • Lightbulb button — golden glow animation when suggestions are open; subtle colour pulse when closed
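The badge grouping above reduces to a first-match-wins lookup over URL prefixes. A sketch, assuming the patterns are matched as simple prefixes (the component's actual matching may differ):

```javascript
// Hypothetical RelatedDocs badge lookup. Order matters: first match wins,
// with Documentation as the catch-all.
const BADGES = [
  { label: 'API Spec', color: '#2563eb', test: (u) => u.startsWith('/api-reference') },
  { label: 'Webhook Spec', color: '#0d9488', test: (u) => u.startsWith('/webhooks-v2/') },
  { label: 'Integration', color: '#059669', test: (u) => u.startsWith('/developers/') },
  { label: 'Business', color: '#7c3aed', test: (u) => u.startsWith('/business/') },
  { label: 'Internal', color: '#dc2626', test: (u) => u.startsWith('/engineering-guide/') },
  { label: 'Guidelines', color: '#0891b2', test: (u) => u.startsWith('/contributing-guide/') },
  { label: 'Documentation', color: '#b45309', test: () => true }, // catch-all
];

function badgeFor(url) {
  return BADGES.find((b) => b.test(url)).label;
}
```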

10. scripts/uploadToAzureSearch.js — Index population

Parses all markdown files from docs/ and OpenAPI specs, then uploads them to the wallet-docs-index Azure AI Search index. Runs in 3 steps:

Step 1 — yarn build (aborts on failure)

Docusaurus bundles and resolves all OpenAPI specs into build/redocusaurus/. These fully-resolved YAML files are used for indexing (no external $ref files remain — they are all inlined by the build).

| Built file | Content |
| --- | --- |
| build/redocusaurus/plugin-redoc-1.yaml | v2 public API (fully bundled) |
| build/redocusaurus/plugin-redoc-2.yaml | v2 Webhooks spec (fully bundled) |
| build/redocusaurus/plugin-redoc-0.yaml | v1 public API (fully bundled) |

Step 2 — Parse documents

  • loadDocEntries() — walks docs/ recursively, strips frontmatter, extracts title and headings
  • loadSpecEntries() — iterates each spec's paths: (or webhooks: key for the webhook spec), extracts every operation, resolves request/response schemas, generates a searchable snippet and curl skeleton

Spec title format:

  • Webhook operations: [Webhook] {eventName} — {summary}
  • API operations: [V2] POST /path — {summary} / [V1] POST /path — {summary}
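The title formats above, plus the display shortening used by the ChatBot, fit in a few lines. Helper names are illustrative:

```javascript
// Hypothetical builders for the spec entry title formats described above.
function webhookTitle(eventName, summary) {
  return `[Webhook] ${eventName} — ${summary}`;
}

function apiTitle(version, method, path, summary) {
  return `[${version}] ${method.toUpperCase()} ${path} — ${summary}`;
}

// The ChatBot shortens spec titles to the part after the dash for display.
function shortenSpecTitle(title) {
  const parts = title.split(' — ');
  return parts.length > 1 ? parts[parts.length - 1] : title;
}
```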

Step 3 — Upload to Azure AI Search

  • Calls ensureIndex() to create or update the index schema
  • Clears all existing documents
  • Uploads in batches of 100

AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js
caution

Requires an admin key (not a query key) — the script creates/updates the index schema and deletes existing documents.
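The batching in step 3 is plain slicing; a sketch (Azure AI Search caps a single indexing request at 1,000 documents, so the script's batch size of 100 stays well inside that limit):

```javascript
// Split the parsed documents into upload batches of at most `size` each.
function toBatches(docs, size = 100) {
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}
```

At the current index size (~214 documents) this yields three requests: 100, 100 and 14 documents.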

Index document fields:

| Field | Type | Description |
| --- | --- | --- |
| id | string (key) | Base64-encoded URL, URL-safe chars only |
| title | string | Page/operation title |
| file_path | string | Relative URL (e.g. /docs/developers/...) |
| content | string | Full text snippet |
| headings | string | Pipe-separated heading list |
| summary | string | Top 5 keywords joined by comma |
| category | string | First URL segment (e.g. docs, api-reference-v2, webhooks-v2) |
| section | string | Second URL segment |
| last_modified | DateTimeOffset | Upload timestamp |

The category field is used for the two-pass search filter (category ne 'docs' isolates all spec categories).
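Deriving the id, category and section fields from a page URL can be sketched as below (a Node sketch; the upload script's exact normalisation may differ):

```javascript
// Hypothetical derivation of the index key and URL-segment fields.
function toIndexKey(url) {
  // Base64-encode the URL, then make it key-safe: swap the +/ alphabet for
  // -_ and strip padding, leaving only URL-safe characters.
  return Buffer.from(url).toString('base64')
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

function urlSegments(url) {
  // category = first path segment, section = second.
  const [category = '', section = ''] = url.split('/').filter(Boolean);
  return { category, section };
}
```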

Current index stats: ~214 documents — 169 markdown + 45 API spec entries (16 v2 operations, 13 webhooks, 16 v1 operations)


Azure OpenAI Resource

| Property | Value |
| --- | --- |
| Resource name | CCG-AI-ASSISTANT |
| Endpoint | https://ccg-ai-assistant.openai.azure.com/ |
| Deployment | gpt-4-1-kb |
| Model | GPT-4.1 (1M context) |
| Max output tokens (API queries) | 2500 |
| Max output tokens (business queries) | 800 |
| Temperature | 0.3 |
| API version | 2024-02-01 |
| Subscription | OFS_Common_Checkout (541ef442-1fea-41f8-b23c-5f2cc8ff5246) |
| Resource group | fcc-comn-chkt-rg |
| Location | Central US |
| Private endpoint | ccg-ai-assistant-pe (VNet: fcc-comn-chkt-vnet-dev-standby-centralus/lb-subnet) |
| Target sub-resource | account |

The resource is accessed via a private endpoint — no public internet egress is required from AKS pods. Local development requires VPN access to the private endpoint network.


Token Utilization

Every chat request has two token costs: input tokens (context sent to the model) and output tokens (the model's response). Both are billed.

What drives input token cost

Input tokens = system prompt + context window assembled from Azure AI Search results.

| Component | Approximate tokens | Notes |
| --- | --- | --- |
| System prompt | ~600 | Fixed per request — loaded from api/config.js |
| Context window — API query | ~6,000 | Up to 24,000 chars ÷ ~4 chars/token |
| Context window — business query | ~2,000 | Up to 8,000 chars ÷ ~4 chars/token |
| User question | ~50–125 | Capped at 500 chars |

Total input per API query: ~6,750 tokens
Total input per business query: ~2,750 tokens
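The totals above are back-of-envelope arithmetic at ~4 characters per token:

```javascript
// Rough input-token estimate: fixed system prompt plus chars/4 for the
// retrieved context and the user question.
const SYSTEM_PROMPT_TOKENS = 600;

function estimateInputTokens(contextChars, questionChars) {
  return SYSTEM_PROMPT_TOKENS + Math.ceil(contextChars / 4) + Math.ceil(questionChars / 4);
}

// API query: 24,000-char context + 500-char question
//   -> 600 + 6,000 + 125 = 6,725 tokens (the ~6,750 figure above)
// Business query: 8,000-char context + 500-char question
//   -> 600 + 2,000 + 125 = 2,725 tokens (the ~2,750 figure above)
```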

What drives output token cost

max_tokens is the ceiling for the completion — the model is billed only for what it actually generates, not the ceiling. Setting a lower ceiling for business questions prevents runaway completions:

| Query type | max_tokens | Typical actual usage |
| --- | --- | --- |
| API / integration | 2500 | 800–1,800 tokens (curl + field listing) |
| Business / general | 800 | 150–400 tokens (prose summary) |

Cost estimation (GPT-4.1 pricing)

GPT-4.1 pricing as of 2025: $2.00 per 1M input tokens, $8.00 per 1M output tokens.

| Query type | Input cost | Output cost (typical) | Total per query |
| --- | --- | --- | --- |
| API / integration | ~$0.0135 | ~$0.010 | ~$0.024 |
| Business / general | ~$0.0055 | ~$0.002 | ~$0.008 |

These are estimates. Actual costs depend on the subscription tier, private endpoint pricing, and actual tokens generated per conversation.
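The per-query figures fall out of the pricing directly:

```javascript
// Cost in dollars for one query at GPT-4.1 list pricing:
// $2.00 per 1M input tokens, $8.00 per 1M output tokens.
function queryCost(inputTokens, outputTokens) {
  return (inputTokens * 2 + outputTokens * 8) / 1_000_000;
}

// API query: ~6,750 input + ~1,300 typical output -> ~$0.024
// Business query: ~2,750 input + ~275 typical output -> ~$0.008
```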

How the API call is constructed

In api/server.js, callOpenAI() assembles the request body and POSTs directly to the Azure OpenAI REST API — no SDK is used:

// POST /openai/deployments/{deployment}/chat/completions?api-version=2024-02-01
const body = {
  messages: [
    { role: 'system', content: systemPrompt }, // ~600 input tokens
    { role: 'user',
      content: `Documentation context:\n\n${contextText}\n\n---\n\nQuestion: ${query}` },
  ],
  max_tokens: apiQuery ? 2500 : 800, // output ceiling
  temperature: 0.3,
};

The request is sent with an api-key header (not Authorization: Bearer). Response is parsed from json.choices[0].message.content.
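The surrounding request wiring can be sketched as follows. The URL shape, api-version and api-key header match the values documented on this page; the helper itself is illustrative, not the server's actual code:

```javascript
// Hypothetical request builder around the body shown above.
function buildOpenAIRequest(endpoint, deployment, apiKey, body) {
  const url = `${endpoint.replace(/\/$/, '')}/openai/deployments/${deployment}` +
    `/chat/completions?api-version=2024-02-01`;
  return {
    url,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'api-key': apiKey }, // not Authorization: Bearer
      body: JSON.stringify(body),
    },
  };
}

// Usage sketch:
//   const { url, options } = buildOpenAIRequest(endpoint, deployment, key, body);
//   const res = await fetch(url, options);
//   const answer = (await res.json()).choices[0].message.content;
```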

Reducing cost further

| Lever | Current setting | Trade-off |
| --- | --- | --- |
| Reduce contextMaxChars | 24,000 (API) / 8,000 (business) | Fewer input tokens but may miss request body fields in long spec entries |
| Lower maxTokens | 2500 (API) / 800 (business) | Cheaper but completions may be cut short |
| Raise temperature to 0.5 | 0.3 now | More varied / creative answers — not ideal for exact API docs |
| Add more apiQueryKeywords | 33 keywords | Better classification → fewer queries incorrectly routed to the expensive API path |
| Azure Search query key vs admin key | Use query key in production | Read-only, lower privilege — does not affect cost |

Azure AI Search Index

| Property | Value |
| --- | --- |
| Resource | ccg-docs |
| Endpoint | https://ccg-docs.search.windows.net |
| Index | wallet-docs-index |
| Semantic config | default (title field: title, content field: content, keyword field: summary) |
| API version | 2023-11-01 |
| Query type | semantic with extractive answers (top 3) and extractive captions |

Deploying

Prerequisites

Create both K8s secrets in the target namespace before deploying:

kubectl create secret generic azure-search-secret \
  --from-literal=endpoint=https://ccg-docs.search.windows.net \
  --from-literal=apiKey=<query-key> \
  -n <namespace>

kubectl create secret generic azure-openai-secret \
  --from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
  --from-literal=apiKey=<openai-api-key> \
  -n <namespace>
caution

Key names are case-sensitive: endpoint and apiKey (capital K). The Helm template maps these exact keys.

Enable per environment

In the relevant values-<env>.yaml:

aiApi:
  enabled: true
  secretName: azure-search-secret
  searchIndex: wallet-docs-index
  openaiSecretName: azure-openai-secret
  openaiDeployment: gpt-4-1-kb

deployment:
  environment:
    - name: AI_API_SERVICE_URL
      value: 'docs-ai-api:3001'

Azure AI Search firewall

The AKS cluster's outbound IP must be added to the AI Search firewall allowlist:

Azure Portal → AI Search (ccg-docs) → Settings → Networking → Firewall → add IP

| Environment | Outbound IP |
| --- | --- |
| Test | 20.98.179.84 |

The outbound IP can be retrieved from a running pod:

kubectl exec -n <namespace> <any-pod> -c <container> -- \
  node -e "require('https').get('https://ifconfig.me',r=>{let d='';r.on('data',c=>d+=c);r.on('end',()=>console.log(d))}).on('error',e=>console.error(e.message))"
caution

If the AKS load balancer is recreated (e.g. cluster upgrade), the outbound IP may change, causing 403 errors from Azure AI Search. The firewall rule must be updated to match the new IP.


Local Development

info

The Azure OpenAI resource uses a private endpoint — it is only reachable from within the AKS VNet. You must be connected to VPN for local development with OpenAI enabled. Azure AI Search has an IP firewall allowlist; your machine's IP must be added if testing the upload script locally.

Start the API server locally:

lsof -ti :3001 | xargs kill -9 2>/dev/null; sleep 1 && \
AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
AZURE_OPENAI_ENDPOINT=https://ccg-ai-assistant.openai.azure.com/ \
AZURE_OPENAI_API_KEY=<openai-api-key> \
AZURE_OPENAI_DEPLOYMENT=gpt-4-1-kb \
node api/server.js

Run the docs site locally (separate terminal):

yarn start

The chat widget in yarn start will call /api/ask, which is proxied to port 3001 via the Docusaurus dev server proxy (or directly if not proxied — configure this in docusaurus.config.js).
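One possible way to wire that dev proxy is a tiny inline plugin in docusaurus.config.js that sets the webpack-dev-server proxy option. This is a sketch under the assumption that the repo has no existing proxy wiring; adapt it to however the project actually configures its dev server:

```javascript
// docusaurus.config.js — hypothetical dev-only proxy for /api -> localhost:3001
module.exports = {
  // ...existing config...
  plugins: [
    () => ({
      name: 'dev-api-proxy',
      configureWebpack: () => ({
        devServer: {
          // Forward /api/* requests from `yarn start` to the local AI server.
          proxy: [{ context: ['/api'], target: 'http://localhost:3001' }],
        },
      }),
    }),
  ],
};
```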

Re-upload the index after content changes:

AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js

This automatically runs yarn build first (Step 1) to ensure build/redocusaurus/ contains the latest bundled specs.


Operations

Check pod status

kubectl get pods -n <namespace> | grep docs-ai-api
kubectl logs -n <namespace> -l app=docs-ai-api -c ai-api

Test the endpoint directly

kubectl exec -n <namespace> <docs-pod> -- sh -c \
  "echo '{\"query\":\"how do I add a payment method\"}' | /busybox wget -q -O- \
  --header='Content-Type: application/json' \
  --post-file=- http://docs-ai-api:3001/api/ask"

Reload after secret update

kubectl delete secret azure-search-secret -n <namespace>
kubectl create secret generic azure-search-secret \
--from-literal=endpoint=<endpoint> \
--from-literal=apiKey=<new-key> \
-n <namespace>

kubectl delete secret azure-openai-secret -n <namespace>
kubectl create secret generic azure-openai-secret \
--from-literal=endpoint=https://ccg-ai-assistant.openai.azure.com/ \
--from-literal=apiKey=<new-openai-key> \
-n <namespace>

kubectl rollout restart deployment/docs-ai-api -n <namespace>

Re-upload index content

Run after adding/changing documentation or OpenAPI specs. The script builds the site first to ensure bundled spec files are current:

AZURE_SEARCH_ENDPOINT=https://ccg-docs.search.windows.net \
AZURE_SEARCH_API_KEY=<admin-key> \
AZURE_SEARCH_INDEX=wallet-docs-index \
node scripts/uploadToAzureSearch.js

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| nginx fails to start: no host in upstream | AI_API_SERVICE_URL env var not set | Add to deployment.environment in values file |
| Browser shows "AI assistant currently unavailable" | /api/ask returning non-2xx | Check pod logs; see below |
| 403 from Azure Search | Firewall blocking AKS outbound IP | Add outbound IP to AI Search firewall |
| 426 Upgrade Required | Istio mTLS interception | Ensure excludeInboundPorts: "3001" annotation is on the docs-ai-api deployment |
| Pod stuck at 1/2 Running | Istio sidecar initialising | Wait ~30s; if persistent, check kubectl describe pod |
| 500 from /api/ask | Azure Search returned error | Check pod logs for Search error: line |
| base name should not be blank in Docker build | JFROG_SAAS_NODE_DOCKER_IMAGE_URL var not set in repo | Add org variable to repo GitHub Actions settings |
| Answers are plain search content (not AI-generated) | AZURE_OPENAI_ENDPOINT or AZURE_OPENAI_API_KEY not set | Create azure-openai-secret and set openaiSecretName in values file |
| 401 from Azure OpenAI | Wrong or missing apiKey in secret | Recreate azure-openai-secret with the correct key |
| 403 from Azure OpenAI | Private endpoint not reachable from pod subnet | Verify the ccg-ai-assistant-pe private endpoint targets lb-subnet in the AKS VNet |
| OpenAI connection error locally | Not on VPN | Connect to VPN — the OpenAI resource has a private endpoint only |
| Upload script aborts at Step 1 | yarn build failed | Fix the build error first; the upload will not run against stale content |
| API spec entries missing from search results | Bundled spec files are stale | Re-run upload script — it rebuilds and re-bundles specs automatically |
| Spec answers lack field details | External $ref schemas not resolved | Should not happen with built files; verify SPECS points to build/redocusaurus/, not openapi/ |
| Answers not using ## headings | Model ignoring format instructions | Check api/config.js system prompt — the ## Summary template must be present |