Database Operations

Section: 8-deployment-operations
Document: MongoDB & Milvus Operations Guide
Audience: Database Administrators, DevOps Engineers
Last Updated: 2025-12-30


🎯 Overview

A complete guide to MongoDB (Azure Cosmos DB) and Milvus database operations, covering migrations, backups, restores, and routine maintenance.


💾 MongoDB Operations

Database Migrations

Migration Framework: Custom Python migration scripts

Location: migrations/ directory in backend services

Migration File Naming:

migrations/
├── 001_add_password_hash.py
├── 002_add_subscription_fields.py
├── 003_create_organisation_collection.py
└── ...

Migration Template:

"""
Migration: [Description]
Date: 2025-12-30
Author: DevOps Team
"""

def up(db):
    """Apply migration"""
    # Forward migration logic
    pass

def down(db):
    """Rollback migration"""
    # Rollback logic
    pass

def validate(db):
    """Validate migration applied correctly"""
    # Validation logic
    return True

Example Migration - Add password_hash field:

import bcrypt

def up(db):
    """Add password_hash field to all users"""
    users = db.users_multichatbot_v2

    # Find users without password_hash
    cursor = users.find({"password_hash": {"$exists": False}})

    count = 0
    for user in cursor:
        # Hash existing password
        if "password" in user:
            hashed = bcrypt.hashpw(
                user["password"].encode('utf-8'),
                bcrypt.gensalt(rounds=12)
            )

            users.update_one(
                {"_id": user["_id"]},
                {
                    "$set": {"password_hash": hashed.decode('utf-8')},
                    "$unset": {"password": ""}  # Remove plain text
                }
            )
            count += 1

    print(f"Updated {count} users with password_hash")

def down(db):
    """Remove password_hash field"""
    db.users_multichatbot_v2.update_many(
        {},
        {"$unset": {"password_hash": ""}}
    )

def validate(db):
    """Verify all users have password_hash"""
    total = db.users_multichatbot_v2.count_documents({})
    with_hash = db.users_multichatbot_v2.count_documents(
        {"password_hash": {"$exists": True}}
    )
    return total == with_hash

Run Migration:

# Run single migration
python run_migration.py --migration 001_add_password_hash

# Run all pending migrations
python run_migration.py --all

# Rollback last migration
python run_migration.py --rollback

# Dry run (no changes)
python run_migration.py --migration 001_add_password_hash --dry-run
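
The runner itself is not reproduced in this guide. Below is a minimal sketch of what run_migration.py might look like, assuming applied migrations are tracked in a schema_migrations bookkeeping collection and each migration module exposes up, down, and validate as in the template above; the collection name and module-loading approach are assumptions, not the actual implementation.

"""Minimal migration runner sketch (assumed structure, not the real run_migration.py)."""
import argparse
import importlib
from datetime import datetime, timezone

from pymongo import MongoClient


def run(migration_name: str, mongo_uri: str, db_name: str, dry_run: bool = False) -> None:
    db = MongoClient(mongo_uri)[db_name]
    applied = db.schema_migrations  # assumed bookkeeping collection

    if applied.find_one({"name": migration_name}):
        print(f"{migration_name} already applied, skipping")
        return

    # Load migrations/<migration_name>.py as a module
    module = importlib.import_module(f"migrations.{migration_name}")

    if dry_run:
        print(f"[dry-run] would apply {migration_name}")
        return

    module.up(db)
    if not module.validate(db):
        # Roll back immediately if validation fails
        module.down(db)
        raise RuntimeError(f"Validation failed for {migration_name}; rolled back")

    applied.insert_one({"name": migration_name, "applied_at": datetime.now(timezone.utc)})
    print(f"Applied {migration_name}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--migration", required=True)
    parser.add_argument("--uri", required=True)
    parser.add_argument("--db", default="Machine_agent_prod")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()
    run(args.migration, args.uri, args.db, args.dry_run)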

Backup Strategies

Automated Backups (Cosmos DB):

  • Frequency: Continuous
  • Retention: 30 days (point-in-time restore)
  • Type: Automatic, no configuration needed
  • Cost: Included in Cosmos DB pricing

Manual Backups:

# Export single collection
mongoexport --uri="$MONGO_URI" \
  --db=Machine_agent_prod \
  --collection=users_multichatbot_v2 \
  --out=users_$(date +%Y%m%d_%H%M%S).json

# Export all collections with script
#!/bin/bash
BACKUP_DIR="backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p $BACKUP_DIR

collections=(
  "users_multichatbot_v2"
  "chatbot_selections"
  "chatbot_history"
  "files"
  "files_secondary"
  "system_prompts_user"
  "projectid_creation"
  "organisation_data"
  "trash_collection_name"
)

for collection in "${collections[@]}"; do
  echo "Backing up $collection..."
  mongoexport --uri="$MONGO_URI" \
    --db=Machine_agent_prod \
    --collection=$collection \
    --out=$BACKUP_DIR/${collection}.json
done

# Upload to Azure Blob
az storage blob upload-batch \
  --destination backups \
  --source $BACKUP_DIR \
  --account-name qablobmachineagents

Restore Procedures

Point-in-Time Restore (Last 30 days):

# Via Azure CLI
az cosmosdb mongodb database restore \
  --account-name machineagents-cosmosdb-prod \
  --resource-group machineagents-data-rg \
  --database-name Machine_agent_prod \
  --restore-timestamp "2025-12-30T10:00:00Z" \
  --target-database-name Machine_agent_prod_restored

# Verify restored database
mongosh "$MONGO_URI/Machine_agent_prod_restored" \
  --eval "db.getCollectionNames()"

# Switch applications to restored database
# Update Key Vault secret with new database name
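
Updating the Key Vault secret can be scripted with azure-identity and azure-keyvault-secrets. The sketch below is a hedged example: the vault URL and the secret name are assumptions, so substitute whatever names your services actually read.

# Sketch: point services at the restored database by updating a Key Vault secret.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
client = SecretClient(
    vault_url="https://machineagents-kv-prod.vault.azure.net",  # assumed vault name
    credential=credential,
)

# Assumed secret name that backend services read for the database name
client.set_secret("MONGO-DATABASE-NAME", "Machine_agent_prod_restored")
print("Secret updated; restart or recycle app instances to pick it up.")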

Import from Manual Backup:

# Download backup from Azure Blob
az storage blob download-batch \
  --destination ./restore \
  --source backups/20251230_100000 \
  --account-name qablobmachineagents

# Import collections
for file in restore/*.json; do
  collection=$(basename $file .json)
  echo "Importing $collection..."

  mongoimport --uri="$MONGO_URI" \
    --db=Machine_agent_prod \
    --collection=$collection \
    --file=$file \
    --mode=upsert
done
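
Before switching traffic to the restored data, compare document counts against the backup files. A minimal sketch, assuming the directory layout produced by the export script above and mongoexport's default one-document-per-line output:

# Sketch: verify each restored collection holds at least as many documents
# as there are lines in the corresponding mongoexport JSON file.
import glob
import os

from pymongo import MongoClient

db = MongoClient(os.environ["MONGO_URI"])["Machine_agent_prod"]

for path in glob.glob("restore/*.json"):
    collection = os.path.splitext(os.path.basename(path))[0]
    with open(path) as f:
        exported = sum(1 for _ in f)
    live = db[collection].count_documents({})
    status = "OK" if live >= exported else "MISSING DOCUMENTS"
    print(f"{collection}: backup={exported} live={live} -> {status}")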

Performance Tuning

Monitor RU/s Consumption:

# Check current throughput
az cosmosdb mongodb collection throughput show \
  --account-name machineagents-cosmosdb-prod \
  --database-name Machine_agent_prod \
  --name chatbot_history \
  --resource-group machineagents-data-rg

# Check if throttled
az monitor metrics list \
  --resource /subscriptions/{sub}/resourceGroups/machineagents-data-rg/providers/Microsoft.DocumentDB/databaseAccounts/machineagents-cosmosdb-prod \
  --metric "TotalRequestUnits" \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  | jq '.value[].timeseries[].data[] | select(.maximum > 4000)'
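
If the metric shows sustained throttling, either raise provisioned RU/s or add retries in the application. Cosmos DB's MongoDB API reports rate limiting as error code 16500; a minimal retry wrapper sketch (the attempt count and backoff values are assumptions):

# Sketch: retry a PyMongo operation when Cosmos DB throttles (error code 16500).
import time

from pymongo.errors import OperationFailure


def with_retries(operation, max_attempts=5, base_delay=0.5):
    """Run `operation` (a zero-arg callable), backing off on RU throttling."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OperationFailure as exc:
            if exc.code != 16500 or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff


# Usage: wrap hot-path writes
# with_retries(lambda: db.chatbot_history.insert_one(doc))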

Optimize Queries:

// Before: Slow query (full collection scan, COLLSCAN)
db.chatbot_history.find({ user_id: "User-123" });

// After: Use index
db.chatbot_history.createIndex({ user_id: 1, created_at: -1 });
db.chatbot_history.find({ user_id: "User-123" }).sort({ created_at: -1 });

// Check query performance
db.chatbot_history.find({ user_id: "User-123" }).explain("executionStats");
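
The same index can be created from application code so it exists in every environment. A minimal PyMongo sketch (create_index is a no-op if the index already exists):

# Sketch: ensure the chatbot_history compound index exists.
import os

from pymongo import MongoClient, ASCENDING, DESCENDING

db = MongoClient(os.environ["MONGO_URI"])["Machine_agent_prod"]

db.chatbot_history.create_index(
    [("user_id", ASCENDING), ("created_at", DESCENDING)]
)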

🔍 Milvus Operations

Collection Management

Create Collection:

from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

# Connect first (host/port depend on the environment)
connections.connect(alias="default", host="localhost", port="19530")

# Define schema
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=384),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=2000),
    FieldSchema(name="user_id", dtype=DataType.VARCHAR, max_length=100),
    FieldSchema(name="project_id", dtype=DataType.VARCHAR, max_length=100),
]

schema = CollectionSchema(fields=fields, description="Chatbot vectors")

# Create collection
collection = Collection(name="chatbot_vectors_User123_Project1", schema=schema)

# Create IVF_FLAT index
index_params = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 128}
}
collection.create_index(field_name="embedding", index_params=index_params)
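
Once data has been inserted and the index is built, load the collection and run a quick test query. A short sketch; the random query vector and the nprobe value are placeholders for a smoke test, not tuned settings:

# Sketch: smoke-test the collection with a random query vector.
import random

# Load into memory before searching
collection.load()

query_vector = [random.random() for _ in range(384)]
results = collection.search(
    data=[query_vector],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"nprobe": 16}},  # nprobe is an assumption
    limit=5,
    output_fields=["text"],
)
for hit in results[0]:
    print(hit.id, hit.distance)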

Drop Collection:

from pymilvus import utility

# Drop collection (PERMANENT!)
utility.drop_collection("chatbot_vectors_User123_Project1")
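
Since drops are irreversible, it is worth confirming what the collection contains first. A small sketch:

# Sketch: confirm a collection exists and inspect its size before dropping it.
from pymilvus import Collection, utility

name = "chatbot_vectors_User123_Project1"

if utility.has_collection(name):
    print(f"{name}: {Collection(name).num_entities} entities")
    # utility.drop_collection(name)  # uncomment only after confirming
else:
    print(f"{name} does not exist")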

Backup & Restore

Backup Milvus Data:

# Stop writes (set read-only mode)
# Via application feature flag

# Create snapshot
docker exec milvus-standalone bash -c "
  cd /var/lib/milvus &&
  tar -czf /tmp/milvus_snapshot_$(date +%Y%m%d).tar.gz db/ wal/
"

# Copy to host
docker cp milvus-standalone:/tmp/milvus_snapshot_$(date +%Y%m%d).tar.gz ./

# Upload to Azure Blob
az storage blob upload \
  --container-name milvus-backups \
  --file milvus_snapshot_$(date +%Y%m%d).tar.gz \
  --name $(date +%Y%m%d)/milvus_snapshot.tar.gz \
  --account-name qablobmachineagents
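
Recording what was in Milvus at snapshot time makes the restore verifiable later. The sketch below writes a manifest of collection names and entity counts; run it again after a restore and diff the output. The host/port are assumptions for a standalone deployment.

# Sketch: write a manifest of collections and entity counts alongside the snapshot.
import json
from datetime import datetime, timezone

from pymilvus import connections, utility, Collection

connections.connect(host="localhost", port="19530")

manifest = {
    "taken_at": datetime.now(timezone.utc).isoformat(),
    "collections": {
        name: Collection(name).num_entities for name in utility.list_collections()
    },
}

with open("milvus_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
print(json.dumps(manifest, indent=2))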

Restore from Backup:

# Download backup
az storage blob download \
  --container-name milvus-backups \
  --name 20251230/milvus_snapshot.tar.gz \
  --file milvus_restore.tar.gz \
  --account-name qablobmachineagents

# Copy the backup into the container
docker cp milvus_restore.tar.gz milvus-standalone:/tmp/

# Quiesce application writes first; docker exec requires a running container,
# so clear the old data and unpack the snapshot before restarting
docker exec milvus-standalone bash -c "
  rm -rf /var/lib/milvus/db /var/lib/milvus/wal &&
  tar -xzf /tmp/milvus_restore.tar.gz -C /var/lib/milvus/
"

# Restart Milvus so it starts from the restored data
docker restart milvus-standalone

# Verify collections
python -c "
from pymilvus import connections, utility
connections.connect(host='localhost', port='19530')
print(utility.list_collections())
"

Performance Optimization

Index Tuning:

# IVF_FLAT (current) - Good for small-medium datasets
index_params = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 128}  # Number of clusters
}

# HNSW - Better for large datasets, faster search
index_params = {
    "index_type": "HNSW",
    "metric_type": "COSINE",
    "params": {
        "M": 16,        # Number of bi-directional links
        "efConstruction": 200  # Build-time search scope
    }
}

# Search-time parameters (for HNSW)
search_params = {
    "metric_type": "COSINE",
    "params": {"ef": 100}  # Search-time scope (higher = more accurate, slower)
}
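
Switching an existing collection from IVF_FLAT to HNSW means dropping and rebuilding the index. A sketch of the steps, best run in a low-traffic window since the collection is released while rebuilding (host/port are assumptions):

# Sketch: rebuild an existing collection's index as HNSW.
from pymilvus import connections, Collection

connections.connect(host="localhost", port="19530")
collection = Collection("chatbot_vectors_User123_Project1")

collection.release()      # unload before touching the index
collection.drop_index()   # remove the old IVF_FLAT index

collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "HNSW",
        "metric_type": "COSINE",
        "params": {"M": 16, "efConstruction": 200},
    },
)
collection.load()         # reload so searches can resume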

Connection Pooling:

from pymilvus import connections

# Configure connection pool
connections.connect(
    alias="default",
    host="milvus-prod.eastus.azurecontainer.io",
    port="19530",
    pool_size=10,  # Connection pool size
    timeout=30
)


"Backups are only good if you test restores." 💾✅