EncryptionΒΆ
Section: 5-security-architecture
Document: Data Encryption
Status: Comprehensive Encryption Documentation
Audience: Security teams, compliance officers, infrastructure engineers
π― OverviewΒΆ
MachineAvatars implements comprehensive encryption for data at rest and in transit, ensuring all sensitive information is protected using industry-standard encryption algorithms and protocols.
Encryption Standards:
- Data at Rest: AES-256 (Advanced Encryption Standard)
- Data in Transit: TLS 1.3 (Transport Layer Security)
- Key Management: Environment variables (migrating to Azure Key Vault)
- Customer-Managed Keys: Enterprise only (planned)
π Data at Rest EncryptionΒΆ
MongoDB Database EncryptionΒΆ
Provider: Azure Cosmos DB for MongoDB
Encryption: Automatic Azure-managed encryption
Encryption DetailsΒΆ
Algorithm: AES-256
Key Management: Microsoft-managed keys
Coverage: All data in MongoDB collections
Collections Encrypted:
users_multichatbot_v2- User accounts, credentialschatbot_selections- Chatbot configurationschatbot_history- All conversation datafiles/files_secondary- Uploaded documents metadatasystem_prompts_user- Custom system promptsprojectid_creation- Project metadataorganisation_data- Enterprise organization infotrash_collection_name- Soft-deleted chatbotsfeatures_per_user- Subscription features
Azure Cosmos DB Encryption:
βββββββββββββββββββββββββββββββββββββββ
β Application Layer β
β (Backend Services) β
βββββββββββββββββββ¬ββββββββββββββββββββ
β HTTPS/TLS 1.3
βΌ
βββββββββββββββββββββββββββββββββββββββ
β Azure Cosmos DB API β
β (MongoDB-compatible) β
βββββββββββββββββββ¬ββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββ
β Encryption Layer β
β AES-256 (Microsoft-managed keys) β
βββββββββββββββββββ¬ββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββ
β Physical Storage β
β (Azure Data Centers - India) β
βββββββββββββββββββββββββββββββββββββββ
Benefits:
- β Automatic encryption (no configuration needed)
- β Transparent to applications
- β Zero performance impact
- β Compliance with GDPR, DPDPA, HIPAA
- β Azure manages key rotation
Verification:
Milvus Vector Database EncryptionΒΆ
Deployment: Azure Container Instance
Encryption: Storage-level encryption
What's Encrypted:
- Vector embeddings (1536 dimensions)
- Metadata fields
- Index structures
- Collection partitions
Encryption Method:
βββββββββββββββββββββββββββββββββββββββ
β Milvus Application β
βββββββββββββββββββ¬ββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββ
β Azure Blob Storage β
β (Persistent volume for Milvus) β
β AES-256 encryption (automatic) β
βββββββββββββββββββ¬ββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββ
β Azure Physical Storage β
βββββββββββββββββββββββββββββββββββββββ
Storage Encryption:
- Azure Blob Storage automatically encrypts all data
- AES-256 encryption
- Microsoft-managed keys
- No configuration required
In-Memory Data:
- β οΈ Not encrypted (standard for vector databases)
- Data encrypted when written to disk
- Recommendation: Use network isolation for additional security
File Storage Encryption (GridFS)ΒΆ
Storage: MongoDB GridFS
Encryption: Inherited from MongoDB encryption
File Types Stored:
- PDF documents
- Word files (.docx)
- Excel spreadsheets (.xlsx)
- PowerPoint presentations (.pptx)
- Text files (.txt)
- User-uploaded content
Encryption Flow:
File Metadata Collection:
{
"_id": ObjectId("..."),
"filename": "product_catalog.pdf",
"contentType": "application/pdf",
"length": 2048576, // bytes
"uploadDate": "2025-01-15T10:00:00Z",
"metadata": {
"user_id": "User-123456",
"project_id": "chatbot_abc",
"original_name": "Product Catalog.pdf"
}
// File contents encrypted at rest by MongoDB
}
File Security:
- β Encrypted at rest (AES-256)
- β Access control via user_id/project_id
- β No public URLs
- β οΈ No client-side encryption (files decrypted in backend)
Backup EncryptionΒΆ
MongoDB Backups:
- Frequency: Daily automated backups
- Retention: 35 days (configurable)
- Encryption: AES-256 (same as primary data)
- Location: Azure Backup (geo-redundant)
Backup Process:
Primary DB (AES-256) β Azure Backup Service β AES-256 Encrypted Backup β Geo-replicated Storage
Disaster Recovery:
- Encrypted backups can be restored
- Point-in-time recovery (PITR) available
- Encryption keys managed by Azure
- No key management needed for restore
π Data in Transit EncryptionΒΆ
TLS 1.3 ImplementationΒΆ
Protocol: Transport Layer Security 1.3
Coverage: All network communication
Frontend to BackendΒΆ
HTTPS Enforcement:
// Frontend API calls
const API_URL = process.env.NEXT_PUBLIC_BACKEND_URL;
// Example: "https://api.machineavatars.com"
fetch(`${API_URL}/v2/chatbots`, {
method: "GET",
headers: {
Authorization: `Bearer ${token}`,
},
});
Requirements:
- β HTTPS only (no HTTP allowed)
- β TLS 1.3 or TLS 1.2 minimum
- β Valid SSL certificate
- β Self-signed certificates rejected
TLS Configuration (Azure App Service):
Minimum TLS Version: 1.2
Preferred: TLS 1.3
Cipher Suites: Modern, secure ciphers only
HSTS: Enabled (HTTP Strict Transport Security)
Backend to DatabaseΒΆ
MongoDB Connection:
# Backend services
from pymongo import MongoClient
mongo_uri = os.getenv("MONGO_URI")
# Format: "mongodb+srv://username:password@cluster.mongodb.net/?ssl=true"
client = MongoClient(mongo_uri)
Connection Security:
- β
SSL/TLS enforced (
ssl=trueparameter) - β Certificate validation
- β Encrypted credentials in connection string
- β οΈ Connection string hardcoded in some services (migrating to env vars)
TLS Handshake:
Backend Service β TLS 1.2/1.3 Handshake β Azure Cosmos DB
β
Certificate Validation
β
Encrypted Connection Established
β
All Data Encrypted (AES-256 in transit)
Backend to MilvusΒΆ
Vector Database Connection:
from pymilvus import connections
connections.connect(
alias="default",
host="milvus.example.com",
port="19530",
secure=True # Enable TLS
)
Current Status:
- β οΈ TLS not enforced (internal network)
- β Network isolation (Azure VNet)
- β³ TLS planned for production
Recommendation:
connections.connect(
alias="default",
host="milvus.example.com",
port="19530",
secure=True,
server_pem_path="/path/to/server.pem", # TLS certificate
server_name="milvus.example.com"
)
Backend to Azure ServicesΒΆ
1. Azure Communication Email:
from azure.communication.email import EmailClient
email_client = EmailClient(
endpoint="https://mailing-sevice.india.communication.azure.com/",
credential="..." # Access key
)
# All requests over HTTPS (TLS 1.2+)
2. Azure OpenAI:
import openai
openai.api_base = "https://YOUR_RESOURCE.openai.azure.com/"
openai.api_key = "..." # β οΈ Hardcoded in many services
# All API calls over HTTPS
3. Azure TTS (Text-to-Speech):
import azure.cognitiveservices.speech as speechsdk
speech_config = speechsdk.SpeechConfig(
subscription="...",
region="centralindia"
)
# WebSocket secure (WSS) for streaming
API Communication EncryptionΒΆ
External API Calls:
- β HTTPS for all external APIs
- β Certificate pinning (recommended, not implemented)
- β Timeout enforcement
- β οΈ No mutual TLS (mTLS) yet
Internal Microservice Communication:
Current:
- β οΈ HTTP used for internal communication (Docker network)
- β Network isolation prevents external access
- β³ HTTPS for internal services (planned for production)
Recommended (Production):
WebSocket EncryptionΒΆ
Analytics Real-Time Updates:
// Frontend WebSocket connection
const ws = new WebSocket("wss://api.machineavatars.com/ws/analytics");
Security:
- β WSS (WebSocket Secure) protocol
- β TLS 1.3 encryption
- β Authentication token validation
- β³ Connection rate limiting (planned)
π Key ManagementΒΆ
Current Implementation (Environment Variables)ΒΆ
Storage: Docker environment variables, .env files
Services Using Env Vars:
# .env file structure
MONGO_URI=mongodb+srv://...
JWT_SECRET=your_jwt_secret_256_bit
AZURE_OPENAI_KEY=sk-...
AZURE_TTS_KEY=...
GROQ_API_KEY=gsk_...
TOGETHER_API_KEY=...
Benefits:
- β
Not committed to git (
.envin.gitignore) - β Easy to rotate (update env and restart)
- β Per environment (dev, staging, prod)
Limitations:
- β οΈ Secrets visible in process environment
- β οΈ No automatic rotation
- β οΈ No audit trail
- β οΈ Risk of accidental exposure
Azure Key Vault Integration (PLANNED)ΒΆ
Status: Q1 2025 implementation
Architecture:
graph TB
subgraph "Azure Key Vault"
KV[Secrets Storage]
end
subgraph "Backend Services"
S1[Auth Service]
S2[User Service]
S3[Response Services]
end
subgraph "Managed Identity"
MI[Azure Managed Identity]
end
S1 --> MI
S2 --> MI
S3 --> MI
MI --> KV
style KV fill:#E3F2FD
style MI fill:#FFF3E0
Implementation Plan:
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
# Initialize Key Vault client
key_vault_url = "https://machineavatars-kv.vault.azure.net/"
credential = DefaultAzureCredential()
secret_client = SecretClient(vault_url=key_vault_url, credential=credential)
# Retrieve secrets
def get_secret(secret_name):
try:
secret = secret_client.get_secret(secret_name)
return secret.value
except Exception as e:
logger.error(f"Failed to retrieve secret {secret_name}: {e}")
raise
# Usage in services
MONGO_URI = get_secret("mongo-connection-string")
JWT_SECRET = get_secret("jwt-secret-key")
AZURE_OPENAI_KEY = get_secret("azure-openai-api-key")
Secrets to Migrate:
| Secret Name | Current Storage | Target Key Vault Name |
|---|---|---|
| MongoDB URI | .env |
mongo-connection-string |
| JWT Secret | .env (weak default) |
jwt-secret-key |
| Azure OpenAI Key | Hardcoded! | azure-openai-api-key |
| Azure TTS Key | Hardcoded! | azure-tts-api-key |
| Azure Email Key | Hardcoded! | azure-email-access-key |
| Groq API Key | Hardcoded! | groq-api-key |
| Together AI Key | Hardcoded! | together-api-key |
| Llama API Key | Hardcoded! | llama-api-key |
| reCAPTCHA Secret | .env |
recaptcha-secret-key |
| Razorpay Key | .env |
razorpay-api-key |
| Razorpay Secret | .env |
razorpay-api-secret |
Benefits:
- β Centralized secret management
- β Automatic secret rotation
- β Access audit logs
- β RBAC for secret access
- β Encryption at rest (FIPS 140-2 Level 2)
- β Managed identities (no credentials in code)
Timeline: Q1 2025 (CRITICAL PRIORITY)
Secret Rotation PolicyΒΆ
Current: Manual rotation (when compromised)
Planned:
- JWT Secret: 90 days
- API Keys: 180 days (or as required by provider)
- Database Credentials: 365 days
- Emergency Rotation: Immediate on detection of compromise
Rotation Process:
- Generate new secret in Key Vault
- Update secret reference (no code changes)
- Restart affected services
- Verify functionality
- Revoke old secret after 24-hour grace period
π’ Customer-Managed Encryption Keys (Enterprise)ΒΆ
Availability: Premium Plan only
Status: Planned (Q3 2025)
Bring Your Own Key (BYOK):
Customers can provide their own encryption keys for:
- Database encryption (Cosmos DB)
- File storage encryption
- Backup encryption
Process:
graph LR
A[Customer] -->|1. Generate Key| B[Customer Key Vault]
B -->|2. Grant Access| C[MachineAvatars<br/>Managed Identity]
C -->|3. Use Key| D[Azure Cosmos DB]
D -->|4. Encrypt Data| E[Encrypted Storage]
style B fill:#FFE082
style E fill:#E3F2FD
Requirements:
- Customer must have Azure subscription
- Customer creates and manages their own Key Vault
- Grant
wrapandunwrappermissions to MachineAvatars Managed Identity - Customer retains full control over key
Benefits:
- β Customer controls encryption keys
- β Can revoke access at any time
- β Compliance with data sovereignty requirements
- β Additional layer of security
Limitations:
- β οΈ Customer responsible for key management
- β οΈ Key loss = permanent data loss
- β οΈ Additional complexity and cost
π Encryption Standards & ComplianceΒΆ
Algorithms UsedΒΆ
| Purpose | Algorithm | Key Length | Standard |
|---|---|---|---|
| Data at Rest | AES-256 | 256-bit | FIPS 197 |
| Data in Transit | TLS 1.3 | 256-bit (min) | RFC 8446 |
| Password Hashing | None (β οΈ) | N/A | NON-COMPLIANT |
| JWT Signing | HMAC-SHA256 | 256-bit | RFC 7519 |
β οΈ CRITICAL ISSUE: Passwords stored in plain text (no hashing)!
Required: bcrypt with 12 rounds minimum
Compliance MappingΒΆ
GDPR (Article 32 - Security of Processing):
- β Encryption of personal data at rest
- β Encryption in transit
- β οΈ Password hashing (not implemented)
DPDPA 2023 (Section 8 - Data Security):
- β Reasonable security practices
- β Encryption for sensitive data
- β οΈ Password protection inadequate
HIPAA (if applicable for on-premise):
- β Encryption at rest (Β§ 164.312(a)(2)(iv))
- β Encryption in transit (Β§ 164.312(e)(1))
- β οΈ Access controls need improvement
PCI DSS (via Razorpay):
- β No card data stored (handled by Razorpay)
- β TLS for payment API calls
- β Razorpay is PCI DSS Level 1 certified
π§ Implementation Best PracticesΒΆ
Encryption ChecklistΒΆ
Data at Rest:
- MongoDB encryption enabled
- Milvus storage encrypted
- File uploads encrypted via GridFS
- Backups encrypted
- Customer-managed keys (Enterprise, planned)
Data in Transit:
- HTTPS enforced for frontend-backend
- TLS for backend-database
- TLS for internal microservices (planned)
- WSS for WebSocket connections
- Certificate pinning (recommended)
Key Management:
- Secrets in environment variables
- Azure Key Vault integration (Q1 2025)
- Automatic key rotation (Q1 2025)
- Audit logs for key access (Q1 2025)
Password Security:
- bcrypt hashing (CRITICAL - Q1 2025)
- 12 rounds minimum
- Password migration plan
Security RecommendationsΒΆ
Immediate (Q1 2025):
- Implement bcrypt for passwords - CRITICAL
- Migrate to Azure Key Vault - HIGH
- Remove hardcoded secrets - HIGH
- Enable TLS for Milvus - MEDIUM
Short-term (Q2 2025): 5. TLS for internal microservices - MEDIUM 6. Certificate pinning - MEDIUM 7. Automatic key rotation - LOW
Long-term (Q3-Q4 2025): 8. Customer-managed keys - Enterprise feature 9. End-to-end chat encryption - Optional feature 10. Hardware Security Modules (HSM) - For key storage
π Encryption Performance ImpactΒΆ
BenchmarksΒΆ
MongoDB Encryption:
- Read Latency: +0-5ms (negligible)
- Write Latency: +0-10ms (negligible)
- Throughput: No significant impact
TLS 1.3:
- Handshake: ~30-50ms (initial connection)
- Data Transfer: +0-2% overhead
- Resume Session: <10ms (cached)
Azure Key Vault:
- Secret Retrieval: 50-200ms (first request)
- Caching: <1ms (subsequent requests)
- Recommendation: Cache secrets in memory
π Related DocumentationΒΆ
Security:
- Authentication & Authorization - JWT, session management
- Secret Management - API key management, migration plan
- Network Security - Firewall, VNet isolation
Infrastructure:
- Azure Services - Cloud infrastructure
- Database Architecture - MongoDB, Milvus setup
Compliance:
- Compliance Controls - GDPR, DPDPA, HIPAA requirements
"Encryption is not optional. It's a requirement." πβ