Selection Chatbot Service - Complete Developer Documentation¶
Service: Chatbot Configuration & Selection Management
Port: 8004
Purpose: Configure chatbot type, avatar, voice, model, and purpose
Technology: FastAPI (Python 3.9+), Azure TTS, Rhubarb Lip-Sync
Code Location: /selection-chatbot-service/src/main.py (952 lines, 20+ endpoints)
Owner: Backend Team
Last Updated: 2025-12-26
Table of Contents¶
- Service Overview
- Complete Endpoints
- Avatar & Voice System
- Model Selection & System Prompts
- Azure TTS Integration
- Rhubarb Lip-Sync Generation
- Greeting Generation System
- Database Operations
- Security Analysis
- Deployment
Service Overview¶
The Selection Chatbot Service manages all chatbot configuration after creation. Users select chatbot type, avatar, voice, LLM model, and purpose. The service generates greeting audio with lip-sync data for 3D avatars.
Key Responsibilities¶
✅ Chatbot Type Selection - 3D, Text, Voice
✅ Avatar Selection - 7 avatars (Eva, Shayla, Myra, Chris, Jack, Anu, Emma)
✅ Voice Selection - 10 Azure Neural voices
✅ Model Selection - 11 LLM models
✅ Purpose Selection - Sales Bot, Service Bot, Custom Bot
✅ TTS Generation - Azure Speech SDK
✅ Lip-Sync Generation - Rhubarb for 3D avatars
✅ Greeting Audio - Auto-generated on avatar change
Statistics¶
- Total Lines: 952
- Endpoints: 20+
- Azure TTS: ⚠️ Hardcoded subscription key
- Rhubarb: Platform-specific (Windows/macOS/Linux)
Complete Endpoints¶
1. POST /v2/select-chatbot¶
Purpose: Select chatbot type (3D, Text, or Voice)
Code Location: Lines 67-88
Request:
async def select_chatbot(
user_id: str = Form(...),
project_id: str = Form(...),
chatbot_type: str = Form(...) # "3D-chatbot", "text-chatbot", "voice-chatbot"
)
Validation:
if chatbot_type not in ["text-chatbot", "voice-chatbot", "3D-chatbot"]:
raise HTTPException(status_code=400, detail="Invalid chatbot type selected")
Database Operation:
db.selection_history.updateOne(
{ user_id: "User-123456", project_id: "User-123456_Project_1" },
{
$set: {
chatbot_type: "3D-chatbot",
timestamp: ISODate("2025-01-15T14:00:00Z"),
},
},
{ upsert: true }
);
Response:
{
"message": "Chatbot selection saved successfully",
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
"chatbot_type": "3D-chatbot"
}
2. POST /v2/select-purpose¶
Purpose: Set chatbot purpose (determines system prompt template)
Code Location: Lines 90-111
Valid Purposes:
- Sales Bot - Persuasive, product-focused
- Service Bot - Helpful, problem-solving
- Custom Bot - User-defined
Request:
POST /v2/select-purpose
Content-Type: multipart/form-data
user_id=User-123456
project_id=User-123456_Project_1
chatbot_purpose=Service Bot
Response:
{
"message": "Chatbot purpose selected successfully",
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
"chatbot_purpose": "Service Bot"
}
3. GET /v2/select-purpose¶
Purpose: Retrieve current purpose
Code Location: Lines 113-131
4. POST /v2/select-voice¶
Purpose: Select voice for chatbot (updates both selection & chatbot collections)
Code Location: Lines 157-200
Valid Voices:
- Male_1 (Eric) - en-US
- Male_2 (Guy) - en-US
- Male_3 (Liam) - en-CA
- Male_IND (Prabhat) - en-IN
- Female_1 (Ava) - en-US Multilingual
- Female_2 (Jenny) - en-US
- Female_3 (Aria) - en-US
- Female_4 (Sara) - en-US
- Female_IND (Neerja) - en-IN
- Female_IND2 (Neerja Expressive) - en-IN
Code:
valid_voices = [
"Male_1", "Male_2", "Male_3", "Male_IND",
"Female_1", "Female_2", "Female_3", "Female_4", "Female_IND", "Female_IND2"
]
if selection_voice not in valid_voices:
raise HTTPException(status_code=400, detail="Invalid voice type selected")
# Update selection_history
selection_collection.update_one(query, {
"$set": {
"selection_voice": selection_voice,
"timestamp": datetime.utcnow()
}
}, upsert=True)
# Update chatbot_collection if exists
existing = chatbot_collection.find_one(query)
if existing:
chatbot_collection.update_one(query, {"$set": {"voice": selection_voice}})
5. POST /v2/select-model¶
Purpose: Select LLM model and auto-configure system prompt
Code Location: Lines 226-301
Supported Models:
if selection_model not in [
"openai-4", "openai-4o", "openai-35", "mistral", "deepseek",
"llama", "phi", "openai-o1mini", "gemini-flash-25",
"claude-sonnet-4", "grok-3"
]:
raise HTTPException(status_code=400, detail="Invalid model type selected")
System Prompt Assignment Process:
- Look up the current selection to get the purpose:
selection_doc = selection_collection.find_one({"user_id": user_id, "project_id": project_id})
chatbot_purpose = selection_doc.get("chatbot_purpose")  # e.g., "Service Bot"
- Fetch default system prompt:
# system_prompts_default collection
system_prompt = system_prompt_collection.find_one({
"chatbot_purpose": chatbot_purpose, # "Service Bot"
"model": selection_model.lower() # "openai-35"
})
content = system_prompt.get("content", "")
- Delete old prompts & insert new:
# Delete ALL existing prompts for this user/project
user_system_prompt_collection.delete_many({
"user_id": user_id,
"project_id": project_id
})
# Insert new prompt
user_system_prompt_collection.insert_one({
"user_id": user_id,
"project_id": project_id,
"chatbot_purpose": chatbot_purpose,
"model": selection_model,
"system_prompt": content, # From default
"created_at": datetime.utcnow().isoformat(),
"sys_prompt": []
})
Response:
{
"message": "Chatbot model selected successfully (new prompt inserted)",
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
"selection_model": "openai-35",
"system_prompt": "You are a helpful service bot..."
}
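The three-step flow above can be sketched against in-memory stand-ins for the Mongo collections. This is a hypothetical helper for illustration only (the real service queries pymongo collections directly):

```python
def resolve_system_prompt(selection_doc, default_prompts, selection_model):
    """Pick the default prompt content for the project's purpose and the
    chosen model, mirroring the lookup in /v2/select-model."""
    chatbot_purpose = selection_doc.get("chatbot_purpose")
    # The defaults collection stores model names in lowercase (see schema below)
    for prompt in default_prompts:
        if (prompt["chatbot_purpose"] == chatbot_purpose
                and prompt["model"] == selection_model.lower()):
            return prompt.get("content", "")
    return ""

defaults = [
    {"chatbot_purpose": "Service Bot", "model": "openai-35",
     "content": "You are a helpful service bot..."},
]
selection = {"chatbot_purpose": "Service Bot"}
prompt = resolve_system_prompt(selection, defaults, "openai-35")
```

An unknown purpose/model pair yields an empty prompt, which matches the fallback behavior of `system_prompt.get("content", "")` above.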
6. POST /v2/reset-system-prompt¶
Purpose: Reset system prompt to default for current model/purpose
Code Location: Lines 304-348
7. POST /v2/select-avatar¶
Purpose: Select 3D avatar (auto-regenerates greeting with TTS & lip-sync)
Code Location: Lines 502-721 (220 lines!)
This is the most complex endpoint in the service: it updates the selection, resets the voice when the avatar's gender changes, and regenerates the greeting audio with lip-sync, as detailed in the following sections.
Avatar & Voice System¶
Available Avatars¶
Code Location: Lines 352-360
AVATAR_TYPES = {
"Eva": "Seo-optimization-service",
"Shayla": "Ai-ml-services",
"Myra": "Ai-chatbot-services",
"Chris": "Ai-Portfolio",
"Jack": "computer-vision",
"Anu": "Avatar_remote_physio",
"Emma": "Avatar_Emma"
}
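Presumably the avatar endpoint validates the requested avatar against this mapping before resolving its `avatar_type`; a minimal sketch of that lookup (the helper name is an assumption):

```python
AVATAR_TYPES = {
    "Eva": "Seo-optimization-service",
    "Shayla": "Ai-ml-services",
    "Myra": "Ai-chatbot-services",
    "Chris": "Ai-Portfolio",
    "Jack": "computer-vision",
    "Anu": "Avatar_remote_physio",
    "Emma": "Avatar_Emma",
}

def resolve_avatar_type(selection_avatar: str) -> str:
    # Reject anything outside the seven known avatars
    if selection_avatar not in AVATAR_TYPES:
        raise ValueError("Invalid avatar selected")
    return AVATAR_TYPES[selection_avatar]
```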
Gender Detection¶
Code Location: Lines 481-491
def get_avatar_gender(avatar_name: str) -> str:
"""Determine avatar gender based on avatar name"""
male_avatars = {"Chris", "Jack"}
female_avatars = {"Eva", "Shayla", "Myra", "Anu", "Emma"}
if avatar_name in male_avatars:
return "male"
elif avatar_name in female_avatars:
return "female"
else:
return "unknown"
Default Voice Assignment¶
Code Location: Lines 493-500
def get_default_voice_for_gender(gender: str) -> str:
"""Get default voice based on gender"""
if gender == "male":
return "Male_2" # Default male voice (Guy)
elif gender == "female":
return "Female_2" # Default female voice (Jenny)
else:
return "Female_2" # Default fallback
Avatar Selection Logic¶
Code (Lines 515-543):
# Get current selection
current_selection = selection_collection.find_one(query)
current_voice = current_selection.get("selection_voice") if current_selection else None
current_avatar = current_selection.get("selection_avatar") if current_selection else None
# Check if avatar actually changed
avatar_changed = (current_avatar != selection_avatar) if current_avatar else True
# Determine new avatar gender and default voice
new_avatar_gender = get_avatar_gender(selection_avatar)
default_voice_for_new_avatar = get_default_voice_for_gender(new_avatar_gender)
# Check if avatar GENDER is changing (important!)
avatar_gender_changed = False
if current_avatar and current_avatar != selection_avatar:
current_avatar_gender = get_avatar_gender(current_avatar)
avatar_gender_changed = current_avatar_gender != new_avatar_gender
# Voice logic: When avatar GENDER changes, reset voice to default
if avatar_gender_changed or not current_voice:
final_voice = default_voice_for_new_avatar
logger.info(f"Voice set to default for {new_avatar_gender} avatar: {final_voice}")
else:
# Keep current voice if avatar gender hasn't changed
final_voice = current_voice
logger.info(f"Keeping current voice: {final_voice}")
Examples:
| Current Avatar | Current Voice | New Avatar | Gender Change? | Final Voice | Reason |
|---|---|---|---|---|---|
| Eva (Female) | Female_1 | Shayla (Female) | No | Female_1 | Keep user's choice |
| Eva (Female) | Female_3 | Chris (Male) | Yes | Male_2 | Gender changed, reset to male default |
| None | None | Emma (Female) | N/A | Female_2 | First selection, use default |
| Jack (Male) | Male_IND | Chris (Male) | No | Male_IND | Keep user's choice |
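The decision table above can be condensed into one pure function built on the two helpers from this section (reproduced here so the sketch is self-contained):

```python
def get_avatar_gender(avatar_name):
    male = {"Chris", "Jack"}
    female = {"Eva", "Shayla", "Myra", "Anu", "Emma"}
    return "male" if avatar_name in male else "female" if avatar_name in female else "unknown"

def get_default_voice_for_gender(gender):
    # Female_2 (Jenny) doubles as the "unknown" fallback
    return "Male_2" if gender == "male" else "Female_2"

def choose_final_voice(current_avatar, current_voice, new_avatar):
    """Reset the voice to the gender default only on first selection or when
    the avatar's gender changes; otherwise keep the user's chosen voice."""
    new_gender = get_avatar_gender(new_avatar)
    gender_changed = (
        current_avatar is not None
        and current_avatar != new_avatar
        and get_avatar_gender(current_avatar) != new_gender
    )
    if gender_changed or not current_voice:
        return get_default_voice_for_gender(new_gender)
    return current_voice
```

Each row of the table maps to one call, e.g. `choose_final_voice("Eva", "Female_3", "Chris")` returns the male default because the gender changed.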
Model Selection & System Prompts¶
System Prompt Collections¶
1. system_prompts_default - Template prompts
Schema:
{
"_id": ObjectId("..."),
"chatbot_purpose": "Service Bot",
"model": "openai-35", // Lowercase
"content": "You are a helpful service bot. Your primary goal is to assist customers with their inquiries...",
"created_at": "2024-01-01T00:00:00Z"
}
2. system_prompts_user - User/project-specific prompts
Schema:
{
"_id": ObjectId("..."),
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
"chatbot_purpose": "Service Bot",
"model": "openai-35",
"system_prompt": "You are a helpful service bot...", // Can be customized
"created_at": "2025-01-15T14:00:00Z",
"sys_prompt": [] // Additional prompts?
}
Azure TTS Integration¶
⚠️ HARDCODED CREDENTIALS¶
Location: Lines 409-410, 881-882
speech_config = speechsdk.SpeechConfig(
    subscription="<REDACTED_AZURE_SPEECH_KEY>",  # ⚠️ HARDCODED in main.py (literal key redacted in this doc)
    region="centralindia"
)
Should Be:
subscription = os.getenv("AZURE_SPEECH_KEY")
region = os.getenv("AZURE_SPEECH_REGION", "centralindia")
Text-to-Speech Function¶
Code Location: Lines 398-438
async def text_to_speech_greeting(text, session_id, voice):
"""Convert text to speech using Azure Speech SDK"""
# Clean the input text
cleaned_text = remove_contact_numbers(text)
# Define the output file path
wav_file = os.path.join(OUTPUT_DIR, f"{session_id}.wav")
# Configure Azure Speech Config
speech_config = speechsdk.SpeechConfig(
subscription="...",
region="centralindia"
)
speech_config.speech_synthesis_voice_name = voice # e.g., "en-US-JennyNeural"
# Configure audio output
audio_config = speechsdk.audio.AudioOutputConfig(filename=wav_file)
# Create the speech synthesizer
speech_synthesizer = speechsdk.SpeechSynthesizer(
speech_config=speech_config,
audio_config=audio_config
)
# Synthesize the text to speech synchronously
result = speech_synthesizer.speak_text_async(cleaned_text).get()
# Check the result
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
print("Speech synthesis completed successfully.")
else:
error_details = result.cancellation_details
raise Exception(f"Speech synthesis failed: {error_details.reason}")
return wav_file
Voice Mapping¶
Code Location: Lines 366-377, 855-866
SUPPORTED_VOICES = {
"Male_1": "en-US-EricNeural",
"Male_2": "en-US-GuyNeural",
"Male_3": "en-CA-LiamNeural",
"Male_IND": "en-IN-PrabhatNeural",
"Female_1": "en-US-AvaMultilingualNeural",
"Female_2": "en-US-JennyNeural",
"Female_3": "en-US-AriaNeural",
"Female_4": "en-US-SaraNeural",
"Female_IND": "en-IN-NeerjaNeural",
"Female_IND2": "en-IN-NeerjaExpressiveNeural"
}
Usage: the frontend voice ID stored in `selection_voice` (e.g. `Female_2`) is mapped to the Azure Neural voice name before synthesis.
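A minimal sketch of that lookup (the fallback voice is an assumption; the source may instead raise on an unknown ID):

```python
SUPPORTED_VOICES = {
    "Male_1": "en-US-EricNeural",
    "Male_2": "en-US-GuyNeural",
    "Male_3": "en-CA-LiamNeural",
    "Male_IND": "en-IN-PrabhatNeural",
    "Female_1": "en-US-AvaMultilingualNeural",
    "Female_2": "en-US-JennyNeural",
    "Female_3": "en-US-AriaNeural",
    "Female_4": "en-US-SaraNeural",
    "Female_IND": "en-IN-NeerjaNeural",
    "Female_IND2": "en-IN-NeerjaExpressiveNeural",
}

def azure_voice_for(selection_voice: str) -> str:
    # Fall back to Jenny when the stored selection is missing or unknown (assumption)
    return SUPPORTED_VOICES.get(selection_voice, "en-US-JennyNeural")
```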
Rhubarb Lip-Sync Generation¶
Platform Detection¶
Code Location: Lines 380-395
system = platform.system().lower()
current_dir = os.path.dirname(os.path.abspath(__file__))
if system == "windows":
rhubarbExePath = os.path.join(current_dir, "Rhubarb-Lip-Sync-1.13.0-Windows", "rhubarb.exe")
elif system == "darwin": # macOS
rhubarbExePath = os.path.join(current_dir, "Rhubarb-Lip-Sync-1.13.0-macOS", "rhubarb")
elif system == "linux":
rhubarbExePath = os.path.join(current_dir, "Rhubarb", "rhubarb")
else:
raise RuntimeError(f"Unsupported platform: {system}")
# Verify Rhubarb executable exists
if not os.path.exists(rhubarbExePath):
print(f"Warning: Rhubarb executable not found at: {rhubarbExePath}")
rhubarbExePath = None
Lip-Sync Generation¶
Code Location: Lines 454-471
def generate_lip_sync_greeting(wav_file, session_id):
"""Generate lip-sync JSON using Rhubarb"""
if not rhubarbExePath:
print("Rhubarb not available, skipping lip-sync generation")
return None
json_file = os.path.join(OUTPUT_DIR, f"{session_id}.json")
result = subprocess.run(
[rhubarbExePath, "-f", "json", "-o", json_file, wav_file, "-r", "phonetic"],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
return json_file if os.path.exists(json_file) else None
Command:
rhubarb.exe -f json -o avatar_greeting_1735214400.json avatar_greeting_1735214400_pcm.wav -r phonetic
Lip-Sync JSON Format¶
Code Location: Lines 473-479
def parse_lip_sync_greeting(json_file, sound_file):
"""Parse lip-sync JSON and add sound file path"""
with open(json_file, "r") as file:
lip_sync_data = json.load(file)
lip_sync_data["metadata"]["soundFile"] = sound_file
return lip_sync_data
Example Rhubarb Output:
{
"metadata": {
"soundFile": "/path/to/audio.wav",
"duration": 3.5
},
"mouthCues": [
{"start": 0.0, "end": 0.1, "value": "X"},
{"start": 0.1, "end": 0.3, "value": "B"},
{"start": 0.3, "end": 0.5, "value": "C"},
{"start": 0.5, "end": 0.7, "value": "D"},
...
]
}
Mouth Shapes (Visemes):
- A - relaxed
- B - lips together
- C - rounded lips
- D - tongue up
- E - smile
- F - lower lip/teeth
- G - tongue back
- H - wide open
- X - closed/silent
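A renderer consuming this JSON needs the viseme that is active at the current playback time. A sketch of that lookup (linear scan for clarity; a real client would binary-search the sorted cues):

```python
def viseme_at(mouth_cues, t):
    """Return the mouth shape active at time t (seconds); 'X' (closed) otherwise."""
    for cue in mouth_cues:
        if cue["start"] <= t < cue["end"]:
            return cue["value"]
    return "X"

# Sample cues matching the Rhubarb output format above
cues = [
    {"start": 0.0, "end": 0.1, "value": "X"},
    {"start": 0.1, "end": 0.3, "value": "B"},
    {"start": 0.3, "end": 0.5, "value": "C"},
]
```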
Greeting Generation System¶
Automatic Greeting Regeneration¶
Triggered When: Avatar changes (Lines 584-686)
Process:
- Determine greeting text:
sel_doc = selection_collection.find_one({"user_id": user_id, "project_id": project_id}, {"_id": 0, "hidden_name": 1}) or {}
display_name = sel_doc.get("hidden_name") or selection_avatar # Prefer custom name
greeting_text = f"Hello, I'm {display_name}, your virtual chatbot. How can I help you?"
- Generate TTS audio:
timestamp = int(time.time())
session_id = f"avatar_greeting_{timestamp}" # e.g., "avatar_greeting_1735214400"
azure_voice = SUPPORTED_VOICES.get(final_voice) # "en-US-JennyNeural"
wav_file = await text_to_speech_greeting(greeting_text, session_id, azure_voice)
- Convert to PCM format:
pcm_wav_file = os.path.join(OUTPUT_DIR, f"{session_id}_pcm.wav")
converted_file = convert_wav_to_pcm_greeting(wav_file, pcm_wav_file)
# Uses ffmpeg: ffmpeg -i input.wav -acodec pcm_s16le output_pcm.wav
- Generate lip-sync:
json_file = generate_lip_sync_greeting(converted_file, session_id)
lip_sync_data = parse_lip_sync_greeting(json_file, wav_file)
- Encode audio to base64:
with open(wav_file, "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode("utf-8")
- Update greeting collections:
for greeting_type in ["initial_greeting", "form_greeting"]:
update_data = {
"avatar_name": selection_avatar,
"avatar_gender": new_avatar_gender,
"voice_selection": final_voice,
"voice": final_voice,
"text": greeting_text,
"facialExpression": "smiling",
"animation": "Idle",
"audio": audio_base64, # Base64 WAV
"lipsync": lip_sync_data, # Rhubarb JSON
"timestamp": datetime.utcnow()
}
# Update BOTH collections for compatibility
generate_greeting_collection.update_one(...)
generate_greetings_collection.update_one(...)
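The `convert_wav_to_pcm_greeting` helper referenced in step 3 is not reproduced in this document; a minimal sketch matching the ffmpeg command in the comment (the `-y` flag and error handling are assumptions):

```python
import subprocess

def build_pcm_convert_cmd(src: str, dst: str) -> list:
    # Mirrors: ffmpeg -i input.wav -acodec pcm_s16le output_pcm.wav
    # -y overwrites any stale output file from a previous run (assumption)
    return ["ffmpeg", "-y", "-i", src, "-acodec", "pcm_s16le", dst]

def convert_wav_to_pcm_greeting(src: str, dst: str) -> str:
    """Transcode the Azure TTS WAV to 16-bit PCM, which Rhubarb expects."""
    result = subprocess.run(build_pcm_convert_cmd(src, dst),
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"ffmpeg failed: {result.stderr}")
    return dst
```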
Greeting Collections¶
1. generate_greeting (singular)
2. generate_greetings (plural)
Both updated for backwards compatibility!
Schema:
{
"_id": ObjectId("..."),
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
"greeting_type": "initial_greeting", // or "form_greeting"
// Avatar info
"avatar_name": "Emma",
"avatar_gender": "female",
// Voice info
"voice_selection": "Female_2",
"voice": "Female_2",
"voice_changed_due_to_avatar": false,
// Content
"text": "Hello, I'm Emma, your virtual chatbot. How can I help you?",
"audio": "UklGRoA8AABXQVZFZm10IBAAAAABAAEAgD4AAAB9AAACABAA...", // Base64 WAV
"lipsync": {
"metadata": {"soundFile": "...", "duration": 3.2},
"mouthCues": [...]
},
// Animation
"facialExpression": "smiling",
"animation": "Idle",
// Metadata
"greeting_updated_by_avatar_selection": true,
"timestamp": ISODate("2025-01-15T14:30:00Z")
}
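Clients reading a greeting document must base64-decode the `audio` field before playback; a round-trip sketch:

```python
import base64

def decode_greeting_audio(greeting_doc: dict) -> bytes:
    """Return the raw WAV bytes stored in a greeting document's audio field."""
    return base64.b64decode(greeting_doc["audio"])
```

The resulting bytes can be written to a `.wav` file or fed straight to an audio element on the frontend.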
Database Operations¶
Collections Used (11+)¶
- selection_history - Main configuration state
- chatbot_selections - Deprecated but still updated
- files - Uploaded data
- organisation_data - Organization info
- system_prompts_user - Custom prompts
- system_prompts_default - Template prompts
- chatbot_guardrails - Safety rules
- projectid_creation - Projects
- generate_greeting - Greeting data (singular)
- generate_greetings - Greeting data (plural)
- files_secondary - Secondary data
selection_history Schema¶
Complete Document:
{
"_id": ObjectId("..."),
"user_id": "User-123456",
"project_id": "User-123456_Project_1",
// Chatbot type
"chatbot_type": "3D-chatbot", // or "text-chatbot", "voice-chatbot"
"chatbot_purpose": "Service Bot",
// Avatar (3D only)
"selection_avatar": "Emma",
"avatar_type": "Avatar_Emma",
"avatar_gender": "female",
// Voice
"selection_voice": "Female_2",
"voice_changed_due_to_avatar": false,
// Model
"selection_model": "openai-35",
// Timestamps
"timestamp": ISODate("2025-01-15T14:30:00Z"),
// Optional
"hidden_name": "Support Assistant", // Custom display name
"sitemap_urls": ["https://example.com/page1"],
"database_type": "milvus"
}
Security Analysis¶
Critical Issues¶
1. ⚠️ HARDCODED AZURE SPEECH KEY
Locations: Lines 410, 882
Impact: Anyone with code access can use Azure Speech API (costs money)
Fix:
AZURE_SPEECH_KEY = os.getenv("AZURE_SPEECH_KEY")
AZURE_SPEECH_REGION = os.getenv("AZURE_SPEECH_REGION", "centralindia")
if not AZURE_SPEECH_KEY:
raise ValueError("AZURE_SPEECH_KEY must be set")
2. ⚠️ No Input Sanitization
Problem: greeting_text uses user input without sanitization
Code (Line 594):
display_name = sel_doc.get("hidden_name") or selection_avatar
greeting_text = f"Hello, I'm {display_name}, your virtual chatbot..."
Risk: `speak_text_async` treats its input as plain text, so direct SSML injection is unlikely today, but unsanitized hidden_name still allows arbitrary-length spoken content (inflating Azure TTS cost) and becomes a real SSML injection vector if the service ever switches to `speak_ssml_async`.
Fix:
import re
def sanitize_display_name(name):
# Remove SSML tags, limit length
name = re.sub(r'<[^>]+>', '', name)
return name[:50] # Max 50 characters
display_name = sanitize_display_name(sel_doc.get("hidden_name") or selection_avatar)
3. ⚠️ File System Race Conditions
Problem: Temp files not properly cleaned on errors
Lines 648-654:
# Cleanup temporary files
for file in [wav_file, pcm_wav_file, json_file]:
if file and os.path.exists(file):
try:
os.remove(file)
except Exception:
pass # Silently fails
Better:
import atexit
import glob
import os
@atexit.register
def cleanup_temp_files():
    for file in glob.glob(os.path.join(OUTPUT_DIR, "avatar_greeting_*")):
        try:
            os.remove(file)
        except OSError:
            pass
Note: atexit only fires at process shutdown. For a long-running service, wrap greeting generation in try/finally (or a background task) so temp files are removed per request, and keep the atexit hook as a safety net.
4. ⚠️ CORS Allows All Origins (same as other services)
Deployment¶
Docker Configuration¶
Dockerfile:
FROM python:3.9-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# Copy Rhubarb executable
COPY Rhubarb/ ./Rhubarb/
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy source code
COPY src/ .
EXPOSE 8004
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8004"]
Requirements.txt¶
fastapi>=0.95.0
uvicorn[standard]>=0.22.0
pymongo>=4.3.3
python-multipart>=0.0.6
python-dotenv>=1.0.0
# Azure TTS
azure-cognitiveservices-speech>=1.31.0
# Monitoring
ddtrace>=1.19.0
Environment Variables¶
# Database
MONGO_URI=mongodb://...
MONGO_DB_NAME=Machine_agent_demo
# Azure Speech (should be added!)
AZURE_SPEECH_KEY=your_key_here
AZURE_SPEECH_REGION=centralindia
# DataDog
DD_SERVICE=selection-chatbot-service
DD_ENV=production
Performance Metrics¶
| Operation | Latency | Notes |
|---|---|---|
| Select chatbot type | 20-50ms | Simple DB update |
| Select voice | 30-60ms | 2 DB updates |
| Select model | 100-200ms | Fetch prompt + update |
| Select avatar (no greeting) | 50-100ms | DB updates only |
| Select avatar (with greeting) | 8-15 seconds | TTS + Rhubarb! |
Greeting Generation Breakdown:
- TTS generation: 2-3 seconds
- FFmpeg conversion: 0.5-1 second
- Rhubarb lip-sync: 3-8 seconds
- Base64 encoding: 0.5 second
- DB updates: 0.5 second
- Total: 8-15 seconds
Related Documentation¶
- Create Chatbot Service - Creates projects first
- System Prompt Service - Manages prompts
- Response 3D Chatbot Service - Uses these settings
- System Architecture
Recommendations¶
Critical¶
- ⚠️ Move Azure Speech Key to Environment
- ⚠️ Add Input Sanitization for display names
- ⚠️ Improve File Cleanup with atexit or background tasks
- ⚠️ Restrict CORS
Improvements¶
- Cache TTS Audio - Don't regenerate same greeting
- Async Greeting Generation - Use background tasks
- Webhook for Greeting Ready - Notify client when complete
- Voice Preview - Generate 5-second sample
- Rhubarb Fallback - Work without lip-sync if Rhubarb missing
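The "Async Greeting Generation" item could be prototyped by responding immediately and running the slow TTS/Rhubarb pipeline in the background; a minimal asyncio sketch (all names hypothetical):

```python
import asyncio

GREETING_STATUS = {}  # (user_id, project_id) -> "pending" | "ready"

async def slow_greeting_pipeline(key):
    await asyncio.sleep(0.01)  # stands in for the 8-15s TTS + Rhubarb work
    GREETING_STATUS[key] = "ready"

async def select_avatar_async(key):
    """Mark the greeting pending and schedule regeneration without blocking."""
    GREETING_STATUS[key] = "pending"
    return asyncio.create_task(slow_greeting_pipeline(key))

async def demo():
    key = ("User-123456", "User-123456_Project_1")
    task = await select_avatar_async(key)
    assert GREETING_STATUS[key] == "pending"  # the HTTP response would go out here
    await task
    return GREETING_STATUS[key]
```

In FastAPI the same effect comes from `BackgroundTasks` or `asyncio.create_task` inside the endpoint; the client then polls for readiness or receives the webhook suggested above.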
Code Quality¶
- Extract Greeting Logic - Separate function/module
- Reduce Duplicate Collections - Use one greeting collection
- Add Type Hints
- Unit Tests - Test gender detection, voice mapping
Last Updated: 2025-12-26
Code Version: selection-chatbot-service/src/main.py (952 lines)
Total Endpoints: 20+
Review Cycle: Monthly
"Where chatbots get their personality."