
Selection Chatbot Service - Complete Developer Documentation

Service: Chatbot Configuration & Selection Management
Port: 8004
Purpose: Configure chatbot type, avatar, voice, model, and purpose
Technology: FastAPI (Python 3.9+), Azure TTS, Rhubarb Lip-Sync
Code Location: /selection-chatbot-service/src/main.py (952 lines, 20+ endpoints)
Owner: Backend Team
Last Updated: 2025-12-26


Table of Contents

  1. Service Overview
  2. Complete Endpoints
  3. Avatar & Voice System
  4. Model Selection & System Prompts
  5. Azure TTS Integration
  6. Rhubarb Lip-Sync Generation
  7. Greeting Generation System
  8. Database Operations
  9. Security Analysis
  10. Deployment

Service Overview

The Selection Chatbot Service manages all chatbot configuration after creation. Users select chatbot type, avatar, voice, LLM model, and purpose. The service generates greeting audio with lip-sync data for 3D avatars.

Key Responsibilities

Chatbot Type Selection - 3D, Text, Voice
Avatar Selection - 7 avatars (Eva, Shayla, Myra, Chris, Jack, Anu, Emma)
Voice Selection - 10 Azure Neural voices
Model Selection - 11 LLM models
Purpose Selection - Sales Bot, Service Bot, Custom Bot
TTS Generation - Azure Speech SDK
Lip-Sync Generation - Rhubarb for 3D avatars
Greeting Audio - Auto-generated on avatar change

Statistics

  • Total Lines: 952
  • Endpoints: 20+
  • Azure TTS: ⚠️ Hardcoded subscription key
  • Rhubarb: Platform-specific (Windows/macOS/Linux)

Complete Endpoints

1. POST /v2/select-chatbot

Purpose: Select chatbot type (3D, Text, or Voice)

Code Location: Lines 67-88

Request:

async def select_chatbot(
    user_id: str = Form(...),
    project_id: str = Form(...),
    chatbot_type: str = Form(...)  # "3D-chatbot", "text-chatbot", "voice-chatbot"
)

Validation:

if chatbot_type not in ["text-chatbot", "voice-chatbot", "3D-chatbot"]:
    raise HTTPException(status_code=400, detail="Invalid chatbot type selected")

Database Operation:

db.selection_history.updateOne(
  { user_id: "User-123456", project_id: "User-123456_Project_1" },
  {
    $set: {
      chatbot_type: "3D-chatbot",
      timestamp: ISODate("2025-01-15T14:00:00Z"),
    },
  },
  { upsert: true }
);

Response:

{
  "message": "Chatbot selection saved successfully",
  "user_id": "User-123456",
  "project_id": "User-123456_Project_1",
  "chatbot_type": "3D-chatbot"
}
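The validate-then-upsert flow above can be sketched as a small pure helper. This is an illustrative sketch, not the service's actual code: the function name `build_selection_update` is hypothetical, but the valid types, filter keys, and `$set` payload mirror the documented behavior.

```python
from datetime import datetime, timezone

VALID_CHATBOT_TYPES = {"text-chatbot", "voice-chatbot", "3D-chatbot"}

def build_selection_update(user_id: str, project_id: str, chatbot_type: str):
    """Validate the chatbot type and build the (filter, update) pair
    intended for selection_history.update_one(..., upsert=True)."""
    if chatbot_type not in VALID_CHATBOT_TYPES:
        raise ValueError("Invalid chatbot type selected")
    query = {"user_id": user_id, "project_id": project_id}
    update = {"$set": {
        "chatbot_type": chatbot_type,
        "timestamp": datetime.now(timezone.utc),
    }}
    return query, update
```

Keeping validation in a pure function like this also makes the 400-vs-200 behavior trivially unit-testable without a database.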

2. POST /v2/select-purpose

Purpose: Set chatbot purpose (determines system prompt template)

Code Location: Lines 90-111

Valid Purposes:

  • Sales Bot - Persuasive, product-focused
  • Service Bot - Helpful, problem-solving
  • Custom Bot - User-defined

Request:

POST /v2/select-purpose
Content-Type: multipart/form-data

user_id=User-123456
project_id=User-123456_Project_1
chatbot_purpose=Service Bot

Response:

{
  "message": "Chatbot purpose selected successfully",
  "user_id": "User-123456",
  "project_id": "User-123456_Project_1",
  "chatbot_purpose": "Service Bot"
}

3. GET /v2/select-purpose

Purpose: Retrieve current purpose

Code Location: Lines 113-131


4. POST /v2/select-voice

Purpose: Select voice for chatbot (updates both selection & chatbot collections)

Code Location: Lines 157-200

Valid Voices:

  • Male_1 (Eric) - en-US
  • Male_2 (Guy) - en-US
  • Male_3 (Liam) - en-CA
  • Male_IND (Prabhat) - en-IN
  • Female_1 (Ava) - en-US Multilingual
  • Female_2 (Jenny) - en-US
  • Female_3 (Aria) - en-US
  • Female_4 (Sara) - en-US
  • Female_IND (Neerja) - en-IN
  • Female_IND2 (Neerja Expressive) - en-IN

Code:

valid_voices = [
    "Male_1", "Male_2", "Male_3", "Male_IND",
    "Female_1", "Female_2", "Female_3", "Female_4", "Female_IND", "Female_IND2"
]
if selection_voice not in valid_voices:
    raise HTTPException(status_code=400, detail="Invalid voice type selected")

# Update selection_history
selection_collection.update_one(query, {
    "$set": {
        "selection_voice": selection_voice,
        "timestamp": datetime.utcnow()
    }
}, upsert=True)

# Update chatbot_collection if exists
existing = chatbot_collection.find_one(query)
if existing:
    chatbot_collection.update_one(query, {"$set": {"voice": selection_voice}})

5. POST /v2/select-model

Purpose: Select LLM model and auto-configure system prompt

Code Location: Lines 226-301

Supported Models:

if selection_model not in [
    "openai-4", "openai-4o", "openai-35", "mistral", "deepseek",
    "llama", "phi", "openai-o1mini", "gemini-flash-25",
    "claude-sonnet-4", "grok-3"
]:
    raise HTTPException(status_code=400, detail="Invalid model type selected")

System Prompt Assignment Process:

  1. Check chatbot exists:
selection_doc = selection_collection.find_one({"user_id": user_id, "project_id": project_id})
chatbot_purpose = selection_doc.get("chatbot_purpose")  # e.g., "Service Bot"
  2. Fetch default system prompt:
# system_prompts_default collection
system_prompt = system_prompt_collection.find_one({
    "chatbot_purpose": chatbot_purpose,  # "Service Bot"
    "model": selection_model.lower()      # "openai-35"
})
content = system_prompt.get("content", "")
  3. Delete old prompts & insert new:
# Delete ALL existing prompts for this user/project
user_system_prompt_collection.delete_many({
    "user_id": user_id,
    "project_id": project_id
})

# Insert new prompt
user_system_prompt_collection.insert_one({
    "user_id": user_id,
    "project_id": project_id,
    "chatbot_purpose": chatbot_purpose,
    "model": selection_model,
    "system_prompt": content,  # From default
    "created_at": datetime.utcnow().isoformat(),
    "sys_prompt": []
})

Response:

{
  "message": "Chatbot model selected successfully (new prompt inserted)",
  "user_id": "User-123456",
  "project_id": "User-123456_Project_1",
  "selection_model": "openai-35",
  "system_prompt": "You are a helpful service bot..."
}
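The delete-then-insert semantics of steps 1–3 can be sketched without MongoDB by letting a plain list stand in for the `system_prompts_user` collection. The function name `assign_system_prompt` and the `DEFAULT_PROMPTS` dict are hypothetical stand-ins; the document shape matches the schema shown above.

```python
from datetime import datetime, timezone

# Stand-in for the system_prompts_default collection (illustrative content)
DEFAULT_PROMPTS = {
    ("Service Bot", "openai-35"): "You are a helpful service bot...",
}

def assign_system_prompt(user_prompts: list, user_id: str, project_id: str,
                         chatbot_purpose: str, selection_model: str) -> dict:
    """Replicate /v2/select-model's delete_many + insert_one on a list."""
    content = DEFAULT_PROMPTS.get((chatbot_purpose, selection_model.lower()), "")
    # Equivalent of delete_many({"user_id": ..., "project_id": ...})
    user_prompts[:] = [p for p in user_prompts
                       if not (p["user_id"] == user_id and p["project_id"] == project_id)]
    doc = {
        "user_id": user_id,
        "project_id": project_id,
        "chatbot_purpose": chatbot_purpose,
        "model": selection_model,
        "system_prompt": content,  # From default
        "created_at": datetime.now(timezone.utc).isoformat(),
        "sys_prompt": [],
    }
    user_prompts.append(doc)
    return doc
```

Note the consequence of the delete-first design: re-selecting a model discards any prompt customizations the user made for the previous model.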

6. POST /v2/reset-system-prompt

Purpose: Reset system prompt to default for current model/purpose

Code Location: Lines 304-348


7. POST /v2/select-avatar

Purpose: Select 3D avatar (auto-regenerates greeting with TTS & lip-sync)

Code Location: Lines 502-721 (220 lines!)

This is the MOST COMPLEX endpoint!


Avatar & Voice System

Available Avatars

Code Location: Lines 352-360

AVATAR_TYPES = {
    "Eva": "Seo-optimization-service",
    "Shayla": "Ai-ml-services",
    "Myra": "Ai-chatbot-services",
    "Chris": "Ai-Portfolio",
    "Jack": "computer-vision",
    "Anu": "Avatar_remote_physio",
    "Emma": "Avatar_Emma"
}

Gender Detection

Code Location: Lines 481-491

def get_avatar_gender(avatar_name: str) -> str:
    """Determine avatar gender based on avatar name"""
    male_avatars = {"Chris", "Jack"}
    female_avatars = {"Eva", "Shayla", "Myra", "Anu", "Emma"}

    if avatar_name in male_avatars:
        return "male"
    elif avatar_name in female_avatars:
        return "female"
    else:
        return "unknown"

Default Voice Assignment

Code Location: Lines 493-500

def get_default_voice_for_gender(gender: str) -> str:
    """Get default voice based on gender"""
    if gender == "male":
        return "Male_2"  # Default male voice (Guy)
    elif gender == "female":
        return "Female_2"  # Default female voice (Jenny)
    else:
        return "Female_2"  # Default fallback

Avatar Selection Logic

Code (Lines 515-543):

# Get current selection
current_selection = selection_collection.find_one(query)
current_voice = current_selection.get("selection_voice") if current_selection else None
current_avatar = current_selection.get("selection_avatar") if current_selection else None

# Check if avatar actually changed
avatar_changed = (current_avatar != selection_avatar) if current_avatar else True

# Determine new avatar gender and default voice
new_avatar_gender = get_avatar_gender(selection_avatar)
default_voice_for_new_avatar = get_default_voice_for_gender(new_avatar_gender)

# Check if avatar GENDER is changing (important!)
avatar_gender_changed = False
if current_avatar and current_avatar != selection_avatar:
    current_avatar_gender = get_avatar_gender(current_avatar)
    avatar_gender_changed = current_avatar_gender != new_avatar_gender

# Voice logic: When avatar GENDER changes, reset voice to default
if avatar_gender_changed or not current_voice:
    final_voice = default_voice_for_new_avatar
    logger.info(f"Voice set to default for {new_avatar_gender} avatar: {final_voice}")
else:
    # Keep current voice if avatar gender hasn't changed
    final_voice = current_voice
    logger.info(f"Keeping current voice: {final_voice}")

Examples:

| Current Avatar | Current Voice | New Avatar | Gender Change? | Final Voice | Reason |
| --- | --- | --- | --- | --- | --- |
| Eva (Female) | Female_1 | Shayla (Female) | No | Female_1 | Keep user's choice |
| Eva (Female) | Female_3 | Chris (Male) | Yes | Male_2 | Gender changed, reset to male default |
| None | None | Emma (Female) | N/A | Female_2 | First selection, use default |
| Jack (Male) | Male_IND | Chris (Male) | No | Male_IND | Keep user's choice |
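The voice-reset rule can be condensed into one pure function that reproduces every row of the table above. This is a sketch of the documented logic, not the service's code; `resolve_voice` and `avatar_gender` are hypothetical names.

```python
MALE_AVATARS = {"Chris", "Jack"}
FEMALE_AVATARS = {"Eva", "Shayla", "Myra", "Anu", "Emma"}
DEFAULT_VOICE = {"male": "Male_2", "female": "Female_2", "unknown": "Female_2"}

def avatar_gender(name):
    if name in MALE_AVATARS:
        return "male"
    if name in FEMALE_AVATARS:
        return "female"
    return "unknown"

def resolve_voice(current_avatar, current_voice, new_avatar):
    """Keep the user's voice unless the avatar's gender changes
    or no voice has been selected yet."""
    gender = avatar_gender(new_avatar)
    gender_changed = (current_avatar is not None
                      and current_avatar != new_avatar
                      and avatar_gender(current_avatar) != gender)
    if gender_changed or not current_voice:
        return DEFAULT_VOICE[gender]
    return current_voice
```

For example, `resolve_voice("Eva", "Female_3", "Chris")` returns `"Male_2"` (gender changed), while `resolve_voice("Jack", "Male_IND", "Chris")` keeps `"Male_IND"`.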

Model Selection & System Prompts

System Prompt Collections

1. system_prompts_default - Template prompts

Schema:

{
    "_id": ObjectId("..."),
    "chatbot_purpose": "Service Bot",
    "model": "openai-35",  // Lowercase
    "content": "You are a helpful service bot. Your primary goal is to assist customers with their inquiries...",
    "created_at": "2024-01-01T00:00:00Z"
}

2. system_prompts_user - User/project-specific prompts

Schema:

{
    "_id": ObjectId("..."),
    "user_id": "User-123456",
    "project_id": "User-123456_Project_1",
    "chatbot_purpose": "Service Bot",
    "model": "openai-35",
    "system_prompt": "You are a helpful service bot...",  // Can be customized
    "created_at": "2025-01-15T14:00:00Z",
    "sys_prompt": []  // Additional prompts?
}

Azure TTS Integration

⚠️ HARDCODED CREDENTIALS

Location: Lines 409-410, 881-882

speech_config = speechsdk.SpeechConfig(
    subscription="DnG6...VsLJ",  # ⚠️ HARDCODED! (full key in source; truncated here — it is exposed and must be rotated)
    region="centralindia"
)

Should Be:

subscription = os.getenv("AZURE_SPEECH_KEY")
region = os.getenv("AZURE_SPEECH_REGION", "centralindia")

Text-to-Speech Function

Code Location: Lines 398-438

async def text_to_speech_greeting(text, session_id, voice):
    """Convert text to speech using Azure Speech SDK"""
    # Clean the input text
    cleaned_text = remove_contact_numbers(text)

    # Define the output file path
    wav_file = os.path.join(OUTPUT_DIR, f"{session_id}.wav")

    # Configure Azure Speech Config
    speech_config = speechsdk.SpeechConfig(
        subscription="...",
        region="centralindia"
    )
    speech_config.speech_synthesis_voice_name = voice  # e.g., "en-US-JennyNeural"

    # Configure audio output
    audio_config = speechsdk.audio.AudioOutputConfig(filename=wav_file)

    # Create the speech synthesizer
    speech_synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config,
        audio_config=audio_config
    )

    # Synthesize the text to speech synchronously
    result = speech_synthesizer.speak_text_async(cleaned_text).get()

    # Check the result
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Speech synthesis completed successfully.")
    else:
        error_details = result.cancellation_details
        raise Exception(f"Speech synthesis failed: {error_details.reason}")

    return wav_file

Voice Mapping

Code Location: Lines 366-377, 855-866

SUPPORTED_VOICES = {
    "Male_1": "en-US-EricNeural",
    "Male_2": "en-US-GuyNeural",
    "Male_3": "en-CA-LiamNeural",
    "Male_IND": "en-IN-PrabhatNeural",
    "Female_1": "en-US-AvaMultilingualNeural",
    "Female_2": "en-US-JennyNeural",
    "Female_3": "en-US-AriaNeural",
    "Female_4": "en-US-SaraNeural",
    "Female_IND": "en-IN-NeerjaNeural",
    "Female_IND2": "en-IN-NeerjaExpressiveNeural"
}

Usage:

user_selects = "Female_2"
azure_voice = SUPPORTED_VOICES["Female_2"]  # "en-US-JennyNeural"

Rhubarb Lip-Sync Generation

Platform Detection

Code Location: Lines 380-395

system = platform.system().lower()
current_dir = os.path.dirname(os.path.abspath(__file__))

if system == "windows":
    rhubarbExePath = os.path.join(current_dir, "Rhubarb-Lip-Sync-1.13.0-Windows", "rhubarb.exe")
elif system == "darwin":  # macOS
    rhubarbExePath = os.path.join(current_dir, "Rhubarb-Lip-Sync-1.13.0-macOS", "rhubarb")
elif system == "linux":
    rhubarbExePath = os.path.join(current_dir, "Rhubarb", "rhubarb")
else:
    raise RuntimeError(f"Unsupported platform: {system}")

# Verify Rhubarb executable exists
if not os.path.exists(rhubarbExePath):
    print(f"Warning: Rhubarb executable not found at: {rhubarbExePath}")
    rhubarbExePath = None

Lip-Sync Generation

Code Location: Lines 454-471

def generate_lip_sync_greeting(wav_file, session_id):
    """Generate lip-sync JSON using Rhubarb"""
    if not rhubarbExePath:
        print("Rhubarb not available, skipping lip-sync generation")
        return None

    json_file = os.path.join(OUTPUT_DIR, f"{session_id}.json")
    result = subprocess.run(
        [rhubarbExePath, "-f", "json", "-o", json_file, wav_file, "-r", "phonetic"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return json_file if os.path.exists(json_file) else None

Command:

rhubarb.exe -f json -o avatar_greeting_1735214400.json avatar_greeting_1735214400_pcm.wav -r phonetic

Lip-Sync JSON Format

Code Location: Lines 473-479

def parse_lip_sync_greeting(json_file, sound_file):
    """Parse lip-sync JSON and add sound file path"""
    with open(json_file, "r") as file:
        lip_sync_data = json.load(file)

    lip_sync_data["metadata"]["soundFile"] = sound_file
    return lip_sync_data

Example Rhubarb Output:

{
    "metadata": {
        "soundFile": "/path/to/audio.wav",
        "duration": 3.5
    },
    "mouthCues": [
        {"start": 0.0, "end": 0.1, "value": "X"},
        {"start": 0.1, "end": 0.3, "value": "B"},
        {"start": 0.3, "end": 0.5, "value": "C"},
        {"start": 0.5, "end": 0.7, "value": "D"},
        ...
    ]
}

Mouth Shapes (Visemes):

  • A - relaxed
  • B - lips together
  • C - rounded lips
  • D - tongue up
  • E - smile
  • F - lower lip/teeth
  • G - tongue back
  • H - wide open
  • X - closed/silent
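A 3D-avatar frontend consumes `mouthCues` by looking up the viseme active at the current audio playback time. A minimal lookup sketch (assuming the cue list is sorted and non-overlapping, as Rhubarb emits it; `viseme_at` is a hypothetical helper name):

```python
def viseme_at(mouth_cues, t):
    """Return the Rhubarb mouth shape active at playback time t (seconds);
    fall back to "X" (closed/silent) outside or between cues."""
    for cue in mouth_cues:
        if cue["start"] <= t < cue["end"]:
            return cue["value"]
    return "X"

# Sample cues in the shape shown in the example output above
cues = [
    {"start": 0.0, "end": 0.1, "value": "X"},
    {"start": 0.1, "end": 0.3, "value": "B"},
    {"start": 0.3, "end": 0.5, "value": "C"},
]
```

For long cue lists a binary search on `start` would be preferable; linear scan is fine for a short greeting.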

Greeting Generation System

Automatic Greeting Regeneration

Triggered When: Avatar changes (Lines 584-686)

Process:

  1. Determine greeting text:
sel_doc = selection_collection.find_one({"user_id": user_id, "project_id": project_id}, {"_id": 0, "hidden_name": 1}) or {}
display_name = sel_doc.get("hidden_name") or selection_avatar  # Prefer custom name
greeting_text = f"Hello, I'm {display_name}, your virtual chatbot. How can I help you?"
  2. Generate TTS audio:
timestamp = int(time.time())
session_id = f"avatar_greeting_{timestamp}"  # e.g., "avatar_greeting_1735214400"
azure_voice = SUPPORTED_VOICES.get(final_voice)  # "en-US-JennyNeural"
wav_file = await text_to_speech_greeting(greeting_text, session_id, azure_voice)
  3. Convert to PCM format:
pcm_wav_file = os.path.join(OUTPUT_DIR, f"{session_id}_pcm.wav")
converted_file = convert_wav_to_pcm_greeting(wav_file, pcm_wav_file)
# Uses ffmpeg: ffmpeg -i input.wav -acodec pcm_s16le output_pcm.wav
  4. Generate lip-sync:
json_file = generate_lip_sync_greeting(converted_file, session_id)
lip_sync_data = parse_lip_sync_greeting(json_file, wav_file)
  5. Encode audio to base64:
with open(wav_file, "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode("utf-8")
  6. Update greeting collections:
for greeting_type in ["initial_greeting", "form_greeting"]:
    update_data = {
        "avatar_name": selection_avatar,
        "avatar_gender": new_avatar_gender,
        "voice_selection": final_voice,
        "voice": final_voice,
        "text": greeting_text,
        "facialExpression": "smiling",
        "animation": "Idle",
        "audio": audio_base64,  # Base64 WAV
        "lipsync": lip_sync_data,  # Rhubarb JSON
        "timestamp": datetime.utcnow()
    }

    # Update BOTH collections for compatibility
    generate_greeting_collection.update_one(...)
    generate_greetings_collection.update_one(...)
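Two small pieces of the pipeline above can be captured as testable helpers: the timestamp-based artifact naming and the argument list for the ffmpeg PCM conversion. Both function names are hypothetical; the `-y` flag (overwrite existing output) is an addition here, not shown in the documented command.

```python
import time

def make_session_id(now=None):
    """Unix-timestamp naming used for greeting artifacts,
    e.g. avatar_greeting_1735214400.wav / .json."""
    ts = int(now if now is not None else time.time())
    return f"avatar_greeting_{ts}"

def ffmpeg_pcm_command(src_wav, dst_wav):
    """Argument list for the PCM conversion in step 3 (run via subprocess).
    -y overwrites an existing output file (assumption, not in the doc's command)."""
    return ["ffmpeg", "-y", "-i", src_wav, "-acodec", "pcm_s16le", dst_wav]
```

Building the command as a list (rather than a shell string) also avoids shell-injection issues if a path ever contains user input.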

Greeting Collections

1. generate_greeting (singular)

2. generate_greetings (plural)

Both updated for backwards compatibility!

Schema:

{
    "_id": ObjectId("..."),
    "user_id": "User-123456",
    "project_id": "User-123456_Project_1",
    "greeting_type": "initial_greeting",  // or "form_greeting"

    // Avatar info
    "avatar_name": "Emma",
    "avatar_gender": "female",

    // Voice info
    "voice_selection": "Female_2",
    "voice": "Female_2",
    "voice_changed_due_to_avatar": false,

    // Content
    "text": "Hello, I'm Emma, your virtual chatbot. How can I help you?",
    "audio": "UklGRoA8AABXQVZFZm10IBAAAAABAAEAgD4AAAB9AAACABAA...",  // Base64 WAV
    "lipsync": {
        "metadata": {"soundFile": "...", "duration": 3.2},
        "mouthCues": [...]
    },

    // Animation
    "facialExpression": "smiling",
    "animation": "Idle",

    // Metadata
    "greeting_updated_by_avatar_selection": true,
    "timestamp": ISODate("2025-01-15T14:30:00Z")
}

Database Operations

Collections Used (11+)

  1. selection_history - Main configuration state
  2. chatbot_selections - Deprecated but still updated
  3. files - Uploaded data
  4. organisation_data - Organization info
  5. system_prompts_user - Custom prompts
  6. system_prompts_default - Template prompts
  7. chatbot_guardrails - Safety rules
  8. projectid_creation - Projects
  9. generate_greeting - Greeting data (singular)
  10. generate_greetings - Greeting data (plural)
  11. files_secondary - Secondary data

selection_history Schema

Complete Document:

{
    "_id": ObjectId("..."),
    "user_id": "User-123456",
    "project_id": "User-123456_Project_1",

    // Chatbot type
    "chatbot_type": "3D-chatbot",  // or "text-chatbot", "voice-chatbot"
    "chatbot_purpose": "Service Bot",

    // Avatar (3D only)
    "selection_avatar": "Emma",
    "avatar_type": "Avatar_Emma",
    "avatar_gender": "female",

    // Voice
    "selection_voice": "Female_2",
    "voice_changed_due_to_avatar": false,

    // Model
    "selection_model": "openai-35",

    // Timestamps
    "timestamp": ISODate("2025-01-15T14:30:00Z"),

    // Optional
    "hidden_name": "Support Assistant",  // Custom display name
    "sitemap_urls": ["https://example.com/page1"],
    "database_type": "milvus"
}

Security Analysis

Critical Issues

1. ⚠️ HARDCODED AZURE SPEECH KEY

Locations: Lines 410, 882

subscription="DnG6...VsLJ"  # full key hardcoded in source; truncated here — exposed, rotate it

Impact: Anyone with code access can use Azure Speech API (costs money)

Fix:

AZURE_SPEECH_KEY = os.getenv("AZURE_SPEECH_KEY")
AZURE_SPEECH_REGION = os.getenv("AZURE_SPEECH_REGION", "centralindia")

if not AZURE_SPEECH_KEY:
    raise ValueError("AZURE_SPEECH_KEY must be set")

2. ⚠️ No Input Sanitization

Problem: greeting_text uses user input without sanitization

Code (Line 594):

display_name = sel_doc.get("hidden_name") or selection_avatar
greeting_text = f"Hello, I'm {display_name}, your virtual chatbot..."

If hidden_name contains SSML: Could inject Azure TTS commands

Fix:

import re

def sanitize_display_name(name):
    # Remove SSML tags, limit length
    name = re.sub(r'<[^>]+>', '', name)
    return name[:50]  # Max 50 characters

display_name = sanitize_display_name(sel_doc.get("hidden_name") or selection_avatar)
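As a quick check that the proposed sanitizer neutralizes tag-shaped (e.g. SSML) input, here is the same function with a demonstration. The regex strips anything between `<` and `>`, which is a blunt but effective guard for a display name; it is a sketch, not a full SSML escaper.

```python
import re

def sanitize_display_name(name: str) -> str:
    """Strip anything tag-shaped (covers SSML tags) and cap the length."""
    name = re.sub(r'<[^>]+>', '', name)
    return name[:50]  # Max 50 characters

# An injected SSML voice tag is removed, leaving only the plain name
cleaned = sanitize_display_name("<voice name='en-US-GuyNeural'>Emma</voice>")
```

A stricter alternative would be an allowlist (letters, digits, spaces, a few punctuation marks), which also blocks stray `<`/`>` characters the regex leaves behind.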

3. ⚠️ File System Race Conditions

Problem: Temp files not properly cleaned on errors

Lines 648-654:

# Cleanup temporary files
for file in [wav_file, pcm_wav_file, json_file]:
    if file and os.path.exists(file):
        try:
            os.remove(file)
        except Exception:
            pass  # Silently fails

Better (note: atexit only runs at process exit, so per-request cleanup should still use try/finally or FastAPI background tasks; this sweep catches leftovers):

import atexit
import glob
import os

@atexit.register
def cleanup_temp_files():
    for path in glob.glob(os.path.join(OUTPUT_DIR, "avatar_greeting_*")):
        try:
            os.remove(path)
        except OSError:
            pass

4. ⚠️ CORS Allows All Origins (same as other services)


Deployment

Docker Configuration

Dockerfile:

FROM python:3.9-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Copy Rhubarb executable
COPY Rhubarb/ ./Rhubarb/

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy source code
COPY src/ .

EXPOSE 8004

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8004"]

Requirements.txt

fastapi>=0.95.0
uvicorn[standard]>=0.22.0
pymongo>=4.3.3
python-multipart>=0.0.6
python-dotenv>=1.0.0

# Azure TTS
azure-cognitiveservices-speech>=1.31.0

# Monitoring
ddtrace>=1.19.0

Environment Variables

# Database
MONGO_URI=mongodb://...
MONGO_DB_NAME=Machine_agent_demo

# Azure Speech (should be added!)
AZURE_SPEECH_KEY=your_key_here
AZURE_SPEECH_REGION=centralindia

# DataDog
DD_SERVICE=selection-chatbot-service
DD_ENV=production

Performance Metrics

| Operation | Latency | Notes |
| --- | --- | --- |
| Select chatbot type | 20-50ms | Simple DB update |
| Select voice | 30-60ms | 2 DB updates |
| Select model | 100-200ms | Fetch prompt + update |
| Select avatar (no greeting) | 50-100ms | DB updates only |
| Select avatar (with greeting) | 8-15 seconds | TTS + Rhubarb! |

Greeting Generation Breakdown:

  • TTS generation: 2-3 seconds
  • FFmpeg conversion: 0.5-1 second
  • Rhubarb lip-sync: 3-8 seconds
  • Base64 encoding: 0.5 second
  • DB updates: 0.5 second
  • Total: 8-15 seconds


Recommendations

Critical

  1. ⚠️ Move Azure Speech Key to Environment
  2. ⚠️ Add Input Sanitization for display names
  3. ⚠️ Improve File Cleanup with atexit or background tasks
  4. ⚠️ Restrict CORS

Improvements

  1. Cache TTS Audio - Don't regenerate same greeting
  2. Async Greeting Generation - Use background tasks
  3. Webhook for Greeting Ready - Notify client when complete
  4. Voice Preview - Generate 5-second sample
  5. Rhubarb Fallback - Work without lip-sync if Rhubarb missing

Code Quality

  1. Extract Greeting Logic - Separate function/module
  2. Reduce Duplicate Collections - Use one greeting collection
  3. Add Type Hints
  4. Unit Tests - Test gender detection, voice mapping

Review Cycle: Monthly


"Where chatbots get their personality."