Encryption

Section: 5-security-architecture
Document: Data Encryption
Status: Comprehensive Encryption Documentation
Audience: Security teams, compliance officers, infrastructure engineers


🎯 Overview

MachineAvatars encrypts data at rest and in transit using industry-standard algorithms and protocols. Known gaps (password hashing, TLS for internal services and Milvus, hardcoded secrets) are called out in the relevant sections.

Encryption Standards:

  • Data at Rest: AES-256 (Advanced Encryption Standard)
  • Data in Transit: TLS 1.2 minimum, TLS 1.3 preferred (Transport Layer Security)
  • Key Management: Environment variables (migrating to Azure Key Vault)
  • Customer-Managed Keys: Enterprise only (planned)

🔒 Data at Rest Encryption

MongoDB Database Encryption

Provider: Azure Cosmos DB for MongoDB
Encryption: Automatic Azure-managed encryption

Encryption Details

Algorithm: AES-256
Key Management: Microsoft-managed keys
Coverage: All data in MongoDB collections

Collections Encrypted:

  1. users_multichatbot_v2 - User accounts, credentials
  2. chatbot_selections - Chatbot configurations
  3. chatbot_history - All conversation data
  4. files / files_secondary - Uploaded documents metadata
  5. system_prompts_user - Custom system prompts
  6. projectid_creation - Project metadata
  7. organisation_data - Enterprise organization info
  8. trash_collection_name - Soft-deleted chatbots
  9. features_per_user - Subscription features

Azure Cosmos DB Encryption:

┌─────────────────────────────────────┐
│   Application Layer                 │
│   (Backend Services)                │
└─────────────────┬───────────────────┘
                  │ HTTPS/TLS 1.3
                  ▼
┌─────────────────────────────────────┐
│   Azure Cosmos DB API               │
│   (MongoDB-compatible)              │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│   Encryption Layer                  │
│   AES-256 (Microsoft-managed keys)  │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│   Physical Storage                  │
│   (Azure Data Centers - India)      │
└─────────────────────────────────────┘

Benefits:

  • ✅ Automatic encryption (no configuration needed)
  • ✅ Transparent to applications
  • ✅ Negligible performance impact (see Benchmarks)
  • ✅ Supports encryption-at-rest requirements under GDPR, DPDPA, HIPAA
  • ✅ Azure manages key rotation

Verification:

# Azure Portal: Cosmos DB > Settings > Encryption
# Status: "Enabled (Microsoft-managed keys)"

Milvus Vector Database Encryption

Deployment: Azure Container Instance
Encryption: Storage-level encryption

What's Encrypted:

  • Vector embeddings (1536 dimensions)
  • Metadata fields
  • Index structures
  • Collection partitions

Encryption Method:

┌─────────────────────────────────────┐
│   Milvus Application                │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│   Azure Blob Storage                │
│   (Persistent volume for Milvus)    │
│   AES-256 encryption (automatic)    │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│   Azure Physical Storage            │
└─────────────────────────────────────┘

Storage Encryption:

  • Azure Blob Storage automatically encrypts all data
  • AES-256 encryption
  • Microsoft-managed keys
  • No configuration required

In-Memory Data:

  • ⚠️ Not encrypted (standard for vector databases)
  • Data encrypted when written to disk
  • Recommendation: Use network isolation for additional security

File Storage Encryption (GridFS)

Storage: MongoDB GridFS
Encryption: Inherited from MongoDB encryption

File Types Stored:

  • PDF documents
  • Word files (.docx)
  • Excel spreadsheets (.xlsx)
  • PowerPoint presentations (.pptx)
  • Text files (.txt)
  • User-uploaded content

Encryption Flow:

User Upload → Backend Service → GridFS → MongoDB → AES-256 Encryption → Azure Storage

File Metadata Collection:

{
    "_id": ObjectId("..."),
    "filename": "product_catalog.pdf",
    "contentType": "application/pdf",
    "length": 2048576,  // bytes
    "uploadDate": "2025-01-15T10:00:00Z",
    "metadata": {
        "user_id": "User-123456",
        "project_id": "chatbot_abc",
        "original_name": "Product Catalog.pdf"
    }
    // File contents encrypted at rest by MongoDB
}

File Security:

  • ✅ Encrypted at rest (AES-256)
  • ✅ Access control via user_id/project_id
  • ✅ No public URLs
  • ⚠️ No client-side encryption (files decrypted in backend)
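The user_id/project_id access control listed above can be sketched as follows. This is a minimal illustration rather than the service's actual code: the helper names `owner_filter` and `can_access_file` are hypothetical, and the document shape is assumed to match the File Metadata Collection example.

```python
from typing import Any


def owner_filter(file_id: Any, user_id: str, project_id: str) -> dict:
    """Build a query filter that only matches a file owned by this user and project."""
    return {
        "_id": file_id,
        "metadata.user_id": user_id,
        "metadata.project_id": project_id,
    }


def can_access_file(file_doc: dict, user_id: str, project_id: str) -> bool:
    """Check ownership on a metadata document that has already been loaded."""
    meta = file_doc.get("metadata") or {}
    return meta.get("user_id") == user_id and meta.get("project_id") == project_id
```

Scoping the lookup with a filter like this means a file owned by someone else is simply not found, so the backend never has to load it before rejecting the request.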

Backup Encryption

MongoDB Backups:

  • Frequency: Daily automated backups
  • Retention: 35 days (configurable)
  • Encryption: AES-256 (same as primary data)
  • Location: Azure Backup (geo-redundant)

Backup Process:

Primary DB (AES-256) → Azure Backup Service → AES-256 Encrypted Backup → Geo-replicated Storage

Disaster Recovery:

  • Encrypted backups can be restored
  • Point-in-time recovery (PITR) available
  • Encryption keys managed by Azure
  • No key management needed for restore

🌐 Data in Transit Encryption

TLS 1.3 Implementation

Protocol: Transport Layer Security 1.3
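Coverage: All external network communication (internal service-to-service and Milvus traffic do not yet enforce TLS; see the Current Status notes below)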

Frontend to Backend

HTTPS Enforcement:

// Frontend API calls
const API_URL = process.env.NEXT_PUBLIC_BACKEND_URL;
// Example: "https://api.machineavatars.com"

fetch(`${API_URL}/v2/chatbots`, {
  method: "GET",
  headers: {
    Authorization: `Bearer ${token}`,
  },
});

Requirements:

  • ✅ HTTPS only (no HTTP allowed)
  • ✅ TLS 1.2 minimum, TLS 1.3 preferred
  • ✅ Valid SSL certificate
  • ❌ Self-signed certificates rejected

TLS Configuration (Azure App Service):

Minimum TLS Version: 1.2
Preferred: TLS 1.3
Cipher Suites: Modern, secure ciphers only
HSTS: Enabled (HTTP Strict Transport Security)
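The settings above are applied on the Azure App Service side. For a Python client or service that opens its own TLS connections, the same policy (TLS 1.2 floor, certificate validation, HSTS header) might look like the sketch below. The function names and the one-year HSTS lifetime are assumptions for illustration, not values taken from the deployment.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS context: TLS 1.2 floor, certificate and hostname checks on."""
    ctx = ssl.create_default_context()            # validates against the system CA store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.0 and 1.1
    return ctx


def hsts_header(max_age_days: int = 365) -> tuple:
    """Build the Strict-Transport-Security response header as a (name, value) pair."""
    seconds = max_age_days * 86400
    return ("Strict-Transport-Security", f"max-age={seconds}; includeSubDomains")
```

TLS 1.3 is negotiated automatically when both peers support it, so only the minimum version needs to be pinned.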

Backend to Database

MongoDB Connection:

# Backend services
import os

from pymongo import MongoClient

# Read the connection string from the environment rather than source code
mongo_uri = os.getenv("MONGO_URI")
# Format: "mongodb+srv://username:password@cluster.mongodb.net/?ssl=true"

client = MongoClient(mongo_uri)

Connection Security:

  • ✅ SSL/TLS enforced (ssl=true parameter)
  • ✅ Certificate validation
  • ✅ Credentials in the connection string are protected in transit by TLS
  • ⚠️ Connection string hardcoded in some services (migrating to env vars)
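As a sketch of the env-var migration mentioned above, a service could load the connection string at startup, refuse to start if TLS is not enabled, and redact credentials before logging. The helper names `load_mongo_uri` and `redact_uri` are hypothetical. The check assumes the usual MongoDB URI rules: `mongodb+srv://` URIs default to TLS, and `tls=` or `ssl=` options override that default.

```python
import os
from typing import Mapping
from urllib.parse import parse_qs, urlsplit


def load_mongo_uri(env: Mapping[str, str] = os.environ, var: str = "MONGO_URI") -> str:
    """Return the connection string, or raise if it is missing or TLS is off."""
    uri = env.get(var)
    if not uri:
        raise RuntimeError(f"{var} is not set")
    parts = urlsplit(uri)
    options = {k.lower(): v[-1].lower() for k, v in parse_qs(parts.query).items()}
    explicit = options.get("tls", options.get("ssl"))
    # An explicit tls/ssl option wins; otherwise only mongodb+srv implies TLS.
    tls_on = (explicit == "true") if explicit is not None else parts.scheme == "mongodb+srv"
    if not tls_on:
        raise ValueError(f"{var} does not enable TLS")
    return uri


def redact_uri(uri: str) -> str:
    """Replace the username:password part of a URI so it is safe to log."""
    parts = urlsplit(uri)
    host = parts.netloc.rsplit("@", 1)[-1]
    netloc = f"***@{host}" if "@" in parts.netloc else host
    return parts._replace(netloc=netloc).geturl()
```

Passing `env` as a parameter keeps the function testable without touching the real process environment.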

TLS Handshake:

Backend Service → TLS 1.2/1.3 Handshake → Azure Cosmos DB
                 ↓
          Certificate Validation
                 ↓
          Encrypted Connection Established
                 ↓
          All Data Encrypted in Transit (negotiated cipher suite, e.g. AES-256-GCM)

Backend to Milvus

Vector Database Connection:

from pymilvus import connections

connections.connect(
    alias="default",
    host="milvus.example.com",
    port="19530",
    secure=True  # Enable TLS
)

Current Status:

  • ⚠️ TLS not enforced (internal network)
  • ✅ Network isolation (Azure VNet)
  • ⏳ TLS planned for production

Recommendation:

connections.connect(
    alias="default",
    host="milvus.example.com",
    port="19530",
    secure=True,
    server_pem_path="/path/to/server.pem",  # TLS certificate
    server_name="milvus.example.com"
)

Backend to Azure Services

1. Azure Communication Email:

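from azure.communication.email import EmailClient
from azure.core.credentials import AzureKeyCredential

email_client = EmailClient(
    endpoint="https://mailing-sevice.india.communication.azure.com/",
    credential=AzureKeyCredential("...")  # Access key; load from a secret store, not source
)
# All requests over HTTPS (TLS 1.2+)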

2. Azure OpenAI:

import openai

openai.api_base = "https://YOUR_RESOURCE.openai.azure.com/"
openai.api_key = "..."  # ⚠️ Hardcoded in many services
# All API calls over HTTPS

3. Azure TTS (Text-to-Speech):

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription="...",
    region="centralindia"
)
# WebSocket secure (WSS) for streaming

API Communication Encryption

External API Calls:

  • ✅ HTTPS for all external APIs
  • ⏳ Certificate pinning (recommended, not implemented)
  • ✅ Timeout enforcement
  • ⚠️ No mutual TLS (mTLS) yet
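Certificate pinning is listed as recommended but not implemented. One way it could work is to compare the SHA-256 fingerprint of the server's DER-encoded certificate against an allow-list. The function names and the pin set below are hypothetical. This sketch pins the whole certificate, which is simpler but more brittle across renewals than pinning the public key.

```python
import hashlib
import hmac
from typing import Iterable


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned_fingerprints: Iterable[str]) -> bool:
    """True if the certificate's fingerprint matches one of the pinned values."""
    actual = cert_fingerprint(der_cert)
    return any(hmac.compare_digest(actual, pin.lower()) for pin in pinned_fingerprints)
```

With the standard library, the DER bytes of the peer certificate are available from `ssl.SSLSocket.getpeercert(binary_form=True)` after the handshake. Keeping at least two pins (current and next certificate) avoids an outage at renewal time.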

Internal Microservice Communication:

Service A → Gateway (Port 8000) → Service B
   ↓                ↓                  ↓
 HTTPS            HTTPS              HTTPS

Current:

  • ⚠️ HTTP used for internal communication (Docker network)
  • ✅ Network isolation prevents external access
  • ⏳ HTTPS for internal services (planned for production)

Recommended (Production):

Service A → Gateway → Service B
   ↓          ↓          ↓
  TLS 1.3   TLS 1.3   TLS 1.3

WebSocket Encryption

Analytics Real-Time Updates:

// Frontend WebSocket connection
const ws = new WebSocket("wss://api.machineavatars.com/ws/analytics");

Security:

  • ✅ WSS (WebSocket Secure) protocol
  • ✅ TLS 1.3 encryption
  • ✅ Authentication token validation
  • ⏳ Connection rate limiting (planned)
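Connection rate limiting is still planned. One possible shape for it is a sliding-window limiter keyed by client, sketched below. The class name, the limits, and the choice of client identifier (user ID or IP address) are all assumptions, not the planned design.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class ConnectionRateLimiter:
    """Allow at most `max_connections` new connections per client per window."""

    def __init__(self, max_connections: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_connections
        self._window = window_seconds
        self._clock = clock
        self._events = defaultdict(deque)  # client_id -> timestamps of accepted connections

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        events = self._events[client_id]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._max:
            return False
        events.append(now)
        return True
```

A WebSocket handler would call `allow(client_id)` before accepting the upgrade and close the connection when it returns False. This version keeps state in process memory, so it only limits connections per service instance.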

🔑 Key Management

Current Implementation (Environment Variables)

Storage: Docker environment variables, .env files

Services Using Env Vars:

# .env file structure
MONGO_URI=mongodb+srv://...
JWT_SECRET=your_jwt_secret_256_bit
AZURE_OPENAI_KEY=sk-...
AZURE_TTS_KEY=...
GROQ_API_KEY=gsk_...
TOGETHER_API_KEY=...

Benefits:

  • ✅ Not committed to git (.env in .gitignore)
  • ✅ Easy to rotate (update env and restart)
  • ✅ Per environment (dev, staging, prod)

Limitations:

  • ⚠️ Secrets visible in process environment
  • ⚠️ No automatic rotation
  • ⚠️ No audit trail
  • ⚠️ Risk of accidental exposure
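Some of the exposure risk above can be reduced while secrets still live in environment variables. The sketch below fails fast at startup when a secret is missing, rejects a JWT secret shorter than 256 bits, and masks values before they reach logs. The required-secret list and the helper names are illustrative assumptions. Note that length is only a lower bound on strength, since it says nothing about how random the value is.

```python
import os
from typing import Mapping

REQUIRED_SECRETS = ("MONGO_URI", "JWT_SECRET", "AZURE_OPENAI_KEY")  # illustrative subset
MIN_JWT_SECRET_BYTES = 32  # 256 bits, in line with HMAC-SHA256 signing


def load_secrets(env: Mapping[str, str] = os.environ,
                 required: tuple = REQUIRED_SECRETS) -> dict:
    """Return the required secrets, or raise before the service starts serving."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    if "JWT_SECRET" in required:
        if len(env["JWT_SECRET"].encode("utf-8")) < MIN_JWT_SECRET_BYTES:
            raise ValueError("JWT_SECRET is shorter than 256 bits")
    return {name: env[name] for name in required}


def mask(value: str, visible: int = 4) -> str:
    """Mask a secret for log output, keeping only the last few characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

A suitably random JWT secret can be generated with `secrets.token_urlsafe(32)` from the standard library.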

Azure Key Vault Integration (PLANNED)

Status: Q1 2025 implementation

Architecture:

graph TB
    subgraph "Azure Key Vault"
        KV[Secrets Storage]
    end

    subgraph "Backend Services"
        S1[Auth Service]
        S2[User Service]
        S3[Response Services]
    end

    subgraph "Managed Identity"
        MI[Azure Managed Identity]
    end

    S1 --> MI
    S2 --> MI
    S3 --> MI
    MI --> KV

    style KV fill:#E3F2FD
    style MI fill:#FFF3E0

Implementation Plan:

import logging

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

logger = logging.getLogger(__name__)

# Initialize Key Vault client
key_vault_url = "https://machineavatars-kv.vault.azure.net/"
credential = DefaultAzureCredential()
secret_client = SecretClient(vault_url=key_vault_url, credential=credential)

# Retrieve secrets
def get_secret(secret_name):
    try:
        secret = secret_client.get_secret(secret_name)
        return secret.value
    except Exception as e:
        logger.error(f"Failed to retrieve secret {secret_name}: {e}")
        raise

# Usage in services
MONGO_URI = get_secret("mongo-connection-string")
JWT_SECRET = get_secret("jwt-secret-key")
AZURE_OPENAI_KEY = get_secret("azure-openai-api-key")

Secrets to Migrate:

Secret Name      | Current Storage     | Target Key Vault Name
-----------------|---------------------|------------------------
MongoDB URI      | .env                | mongo-connection-string
JWT Secret       | .env (weak default) | jwt-secret-key
Azure OpenAI Key | Hardcoded!          | azure-openai-api-key
Azure TTS Key    | Hardcoded!          | azure-tts-api-key
Azure Email Key  | Hardcoded!          | azure-email-access-key
Groq API Key     | Hardcoded!          | groq-api-key
Together AI Key  | Hardcoded!          | together-api-key
Llama API Key    | Hardcoded!          | llama-api-key
reCAPTCHA Secret | .env                | recaptcha-secret-key
Razorpay Key     | .env                | razorpay-api-key
Razorpay Secret  | .env                | razorpay-api-secret

Benefits:

  • ✅ Centralized secret management
  • ✅ Automatic secret rotation
  • ✅ Access audit logs
  • ✅ RBAC for secret access
  • ✅ Encryption at rest (FIPS 140-2 Level 2)
  • ✅ Managed identities (no credentials in code)

Timeline: Q1 2025 (CRITICAL PRIORITY)


Secret Rotation Policy

Current: Manual rotation (when compromised)

Planned:

  • JWT Secret: 90 days
  • API Keys: 180 days (or as required by provider)
  • Database Credentials: 365 days
  • Emergency Rotation: Immediate on detection of compromise
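The planned intervals could be encoded as data so that a scheduled job can flag secrets that are due. This is a sketch under assumptions: the category keys and the `rotation_due` helper are hypothetical, and only the day counts come from the policy above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Day counts from the planned policy; the dictionary keys are illustrative.
ROTATION_DAYS = {
    "jwt_secret": 90,
    "api_key": 180,
    "database_credential": 365,
}


def rotation_due(kind: str, last_rotated: datetime,
                 now: Optional[datetime] = None) -> bool:
    """True once the secret's age has reached the interval for its category."""
    now = now or datetime.now(timezone.utc)
    return now - last_rotated >= timedelta(days=ROTATION_DAYS[kind])
```

Emergency rotation bypasses this check entirely: a suspected compromise triggers rotation regardless of the secret's age.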

Rotation Process:

  1. Generate new secret in Key Vault
  2. Update secret reference (no code changes)
  3. Restart affected services
  4. Verify functionality
  5. Revoke old secret after 24-hour grace period
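Step 5 implies that, for 24 hours after a rotation, signatures made with the old secret should still verify. A minimal sketch of that grace-period check, using HMAC-SHA256 from the standard library, is shown below. The function names and the way the previous secret and rotation time are passed in are assumptions for illustration.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(hours=24)


def sign(message: bytes, secret: bytes) -> str:
    """HMAC-SHA256 signature of the message, as hex."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str, current: bytes,
           previous: Optional[bytes] = None,
           rotated_at: Optional[datetime] = None,
           now: Optional[datetime] = None) -> bool:
    """Accept the current secret always, and the previous one only during the grace period."""
    if hmac.compare_digest(sign(message, current), signature):
        return True
    if previous is None or rotated_at is None:
        return False
    now = now or datetime.now(timezone.utc)
    if now - rotated_at >= GRACE_PERIOD:
        return False  # old secret is past its grace period and counts as revoked
    return hmac.compare_digest(sign(message, previous), signature)
```

After the grace period, the previous secret can be dropped from configuration entirely, which corresponds to revoking it.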

🏢 Customer-Managed Encryption Keys (Enterprise)

Availability: Premium Plan only
Status: Planned (Q3 2025)

Bring Your Own Key (BYOK):

Customers can provide their own encryption keys for:

  • Database encryption (Cosmos DB)
  • File storage encryption
  • Backup encryption

Process:

graph LR
    A[Customer] -->|1. Generate Key| B[Customer Key Vault]
    B -->|2. Grant Access| C[MachineAvatars<br/>Managed Identity]
    C -->|3. Use Key| D[Azure Cosmos DB]
    D -->|4. Encrypt Data| E[Encrypted Storage]

    style B fill:#FFE082
    style E fill:#E3F2FD

Requirements:

  • Customer must have Azure subscription
  • Customer creates and manages their own Key Vault
  • Grant wrap and unwrap permissions to MachineAvatars Managed Identity
  • Customer retains full control over key

Benefits:

  • ✅ Customer controls encryption keys
  • ✅ Can revoke access at any time
  • ✅ Compliance with data sovereignty requirements
  • ✅ Additional layer of security

Limitations:

  • ⚠️ Customer responsible for key management
  • ⚠️ Key loss = permanent data loss
  • ⚠️ Additional complexity and cost

🔐 Encryption Standards & Compliance

Algorithms Used

Purpose          | Algorithm   | Key Length    | Standard
-----------------|-------------|---------------|--------------
Data at Rest     | AES-256     | 256-bit       | FIPS 197
Data in Transit  | TLS 1.3     | 256-bit (min) | RFC 8446
Password Hashing | None (⚠️)    | N/A           | NON-COMPLIANT
JWT Signing      | HMAC-SHA256 | 256-bit       | RFC 7519
⚠️ CRITICAL ISSUE: Passwords stored in plain text (no hashing)!

Required: bcrypt with 12 rounds minimum
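Moving from plain-text passwords to hashed ones can be done lazily, upgrading each account on its next successful login. The sketch below shows that flow. It is an illustration under stated assumptions: bcrypt is not in the Python standard library, so PBKDF2-HMAC-SHA256 from `hashlib` stands in for it here to keep the example dependency-free. The storage format, iteration count, and function names are hypothetical. In the real migration the two hashing helpers would be replaced by bcrypt with a cost of 12.

```python
import hashlib
import hmac
import os

PREFIX = "pbkdf2_sha256"   # marks rows that are already hashed (stand-in for bcrypt's "$2b$")
ITERATIONS = 600_000       # assumed work factor for the stand-in algorithm


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Hash with a fresh random salt; returns 'prefix$iterations$salt$digest'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{PREFIX}${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    try:
        prefix, iterations, salt_hex, digest_hex = stored.split("$")
        salt, rounds = bytes.fromhex(salt_hex), int(iterations)
    except ValueError:
        return False
    if prefix != PREFIX:
        return False
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return hmac.compare_digest(candidate.hex(), digest_hex)


def login_and_migrate(password: str, stored: str) -> tuple:
    """Return (authenticated, value_to_store); legacy plain-text rows are upgraded on success."""
    if stored.startswith(PREFIX + "$"):
        return verify_password(password, stored), stored
    ok = hmac.compare_digest(password.encode("utf-8"), stored.encode("utf-8"))
    return ok, (hash_password(password) if ok else stored)
```

Accounts that never log in again stay in plain text under this scheme, so a cutoff date followed by a forced password reset for the remaining rows is usually needed to finish the migration.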


Compliance Mapping

GDPR (Article 32 - Security of Processing):

  • ✅ Encryption of personal data at rest
  • ✅ Encryption in transit
  • ⚠️ Password hashing (not implemented)

DPDPA 2023 (Section 8 - Data Security):

  • ✅ Reasonable security practices
  • ✅ Encryption for sensitive data
  • ⚠️ Password protection inadequate

HIPAA (if applicable for on-premise):

  • ✅ Encryption at rest (§ 164.312(a)(2)(iv))
  • ✅ Encryption in transit (§ 164.312(e)(1))
  • ⚠️ Access controls need improvement

PCI DSS (via Razorpay):

  • ✅ No card data stored (handled by Razorpay)
  • ✅ TLS for payment API calls
  • ✅ Razorpay is PCI DSS Level 1 certified

🔧 Implementation Best Practices

Encryption Checklist

Data at Rest:

  • MongoDB encryption enabled
  • Milvus storage encrypted
  • File uploads encrypted via GridFS
  • Backups encrypted
  • Customer-managed keys (Enterprise, planned)

Data in Transit:

  • HTTPS enforced for frontend-backend
  • TLS for backend-database
  • TLS for internal microservices (planned)
  • WSS for WebSocket connections
  • Certificate pinning (recommended)

Key Management:

  • Secrets in environment variables
  • Azure Key Vault integration (Q1 2025)
  • Automatic key rotation (Q1 2025)
  • Audit logs for key access (Q1 2025)

Password Security:

  • bcrypt hashing (CRITICAL - Q1 2025)
  • 12 rounds minimum
  • Password migration plan

Security Recommendations

Immediate (Q1 2025):

  1. Implement bcrypt for passwords - CRITICAL
  2. Migrate to Azure Key Vault - HIGH
  3. Remove hardcoded secrets - HIGH
  4. Enable TLS for Milvus - MEDIUM

Short-term (Q2 2025):

  5. TLS for internal microservices - MEDIUM
  6. Certificate pinning - MEDIUM
  7. Automatic key rotation - LOW

Long-term (Q3-Q4 2025):

  8. Customer-managed keys - Enterprise feature
  9. End-to-end chat encryption - Optional feature
  10. Hardware Security Modules (HSM) - For key storage


📊 Encryption Performance Impact

Benchmarks

MongoDB Encryption:

  • Read Latency: +0-5ms (negligible)
  • Write Latency: +0-10ms (negligible)
  • Throughput: No significant impact

TLS 1.3:

  • Handshake: ~30-50ms (initial connection)
  • Data Transfer: +0-2% overhead
  • Resume Session: <10ms (cached)

Azure Key Vault:

  • Secret Retrieval: 50-200ms (first request)
  • Caching: <1ms (subsequent requests)
  • Recommendation: Cache secrets in memory
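The caching recommendation can be sketched as a small in-memory cache with a time-to-live, wrapped around whatever function fetches a secret. The class name and the 5-minute default are assumptions. The `fetch` callable stands in for a Key Vault lookup such as the `get_secret` function from the implementation plan.

```python
import time
from typing import Callable, Dict, Tuple


class CachedSecrets:
    """Cache secret values in memory and refetch them after `ttl_seconds`."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (value, expiry time)

    def get(self, name: str) -> str:
        now = self._clock()
        hit = self._cache.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]                 # served from memory, no network call
        value = self._fetch(name)         # slow path: remote lookup
        self._cache[name] = (value, now + self._ttl)
        return value
```

The TTL bounds how long a service keeps using a secret after it has been rotated, so it should be shorter than the rotation grace period.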

"Encryption is not optional. It's a requirement." 🔐