
Superadmin Service (Port 8020)

Overview

The Superadmin Service is a backend API service that provides administrative capabilities for the MachineAgents platform. It serves the superadmin dashboard with user management, subscription plan configuration, legal document distribution (Terms & Conditions, Privacy Policy), and admin authentication features.

Port: 8020
Path: machineagents-be/superadmin-service/src/main.py
Lines of Code: 570
Framework: FastAPI
Database: MongoDB (Cosmos DB)
Storage: Azure Blob Storage (for legal documents)


Core Functionality

The service exposes 8 endpoints (7 application endpoints plus a health check):

  1. GET /v2/get-latest-terms-conditions - Fetch latest Terms & Conditions PDF URL
  2. GET /v2/get-latest-privacy-policy - Fetch latest Privacy Policy PDF URL
  3. GET /v2/subscriptions/plans - Dynamically generate subscription plans from 4 collections
  4. POST /v2/admin-login - Admin authentication (⚠️ CRITICAL: Plain-text passwords!)
  5. GET /v2/admin/get-user-history - Fetch chat history for a specific user
  6. GET /v2/admin/users - List all admin users
  7. GET /v2/get-user-data/{user_id} - Get detailed user data by user_id or _id
  8. GET /healthcheck - Health check endpoint

Architecture

graph TB
    subgraph "Superadmin Service (Port 8020)"
        API[FastAPI App]

        subgraph "Endpoints"
            Legal[Legal Docs Endpoints]
            Sub[Subscription Plans]
            Auth[Admin Auth]
            Users[User Management]
        end

        subgraph "Data Sources"
            Mongo[(MongoDB<br/>8 Collections)]
            Blob[Azure Blob Storage<br/>Legal PDFs]
        end
    end

    Dashboard[Superadmin Dashboard] -->|HTTPS| API

    API --> Legal
    API --> Sub
    API --> Auth
    API --> Users

    Legal -->|Fetch PDFs| Blob
    Sub -->|Query| Mongo
    Auth -->|Query| Mongo
    Users -->|Query| Mongo

    Blob -->|machineagents/<br/>termandcondition/| LegalDocs[terms_conditions_TIMESTAMP.pdf]
    Blob -->|machineagents/<br/>privacypolicy/| PrivacyDocs[privacy_policy_TIMESTAMP.pdf]

    Mongo -->|users_multichatbot_v2| UserData[Admin Users]
    Mongo -->|subscriptions| SubData[Plan Pricing]
    Mongo -->|subscriptionID| SubID[Plan Metadata]
    Mongo -->|baseFeatures| BaseF[Quotas]
    Mongo -->|featuresGlobal| FeatG[Feature Definitions]
    Mongo -->|chatbot_history| History[Chat History]
    Mongo -->|projectid_creation| Projects[Projects]
    Mongo -->|chatbot_selections| Chatbots[Chatbots]

    style API fill:#2196F3,color:#fff
    style Blob fill:#FFA726,color:#fff
    style Mongo fill:#4CAF50,color:#fff

Environment Variables

Loaded from .env or Docker Compose:

| Variable | Purpose | Example Value |
|---|---|---|
| MONGO_URI | MongoDB connection string | mongodb://... |
| MONGO_DB_NAME | Database name | Machine_agent_dev |
| AZURE_STORAGE_CONNECTION_STRING | Azure Blob Storage connection | DefaultEndpointsProtocol=https;... |
| AZURE_CONTAINER_NAME | Blob container name | machineagents |

Environment Loading (Lines 21-28):

load_dotenv(dotenv_path=Path(".env"))

MONGO_URI = os.getenv("MONGO_URI")
MONGO_DB_NAME = os.getenv("MONGO_DB_NAME")
AZURE_STORAGE_CONNECTION_STRING = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
AZURE_CONTAINER_NAME = os.getenv("AZURE_CONTAINER_NAME")

Database Schema

8 MongoDB Collections Used

1. users_multichatbot_v2 (admin users):

{
    "_id": ObjectId("..."),
    "user_id": "User_123456_Project_1",
    "email": "admin@machineagents.ai",
    "password": "plaintext_password_here",  // ⚠️ SECURITY ISSUE!
    "name": "Admin User",
    "created_at": "2024-01-15T10:30:00Z",
    "subscription_plan": "enterprise",
    "billing_cycle": "yearly"
}

2. subscriptions (plan pricing):

{
    "_id": ObjectId("..."),
    "subscription_id": "SUB_001",
    "pricing": 999,              // or "Custom"
    "offers": 899,               // discounted price
    "baseFeatureID": "BASE_001",
    "featureIDs": ["FT_001", "FT_002", "FT_003"]
}

3. subscriptionID (plan metadata):

{
    "_id": ObjectId("..."),
    "subscription_id": "SUB_001",
    "subscription_plan": "starter",   // free, starter, professional, enterprise
    "billing_cycle": "monthly"        // monthly, yearly
}

4. baseFeatures (quotas):

{
    "_id": ObjectId("..."),
    "baseFeatureID": "BASE_001",
    "no_of_chatbots": 3,              // or "Custom"
    "no_of_chat_sessions": 1000,     // per billing cycle
    "no_of_linkcrawls": 500,          // website pages crawl cap
    "grace_period": 14                // days
}

5. featuresGlobal (feature definitions):

{
    "_id": ObjectId("..."),
    "featureID": "FT_001",
    "feature": "Custom Branding"      // Converted to "custom_branding" key
}

6. chatbot_history (chat logs):

{
    "_id": ObjectId("..."),
    "user_id": "User_123456_Project_1",
    "project_id": "Project_123456",
    "session_id": "session_abc123",
    "messages": [...],
    "timestamp": "2024-01-15T10:30:00Z"
}

7. projectid_creation (projects):

Used for user project tracking (not directly queried by this service, but collection referenced).

8. chatbot_selections (chatbot configs):

Used for chatbot configuration (not directly queried by this service, but collection referenced).


Endpoint Details

1. GET /v2/get-latest-terms-conditions

Purpose: Fetch the URL of the latest Terms & Conditions PDF from Azure Blob Storage.

Request:

GET /v2/get-latest-terms-conditions

Response (Success - 200):

{
  "message": "Latest terms and conditions PDF URL retrieved successfully",
  "pdf_url": "https://machineagentsstoragedev.blob.core.windows.net/machineagents/machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
  "blob_name": "machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
  "timestamp": "20240315_143022",
  "size_bytes": 245632,
  "last_modified": "2024-03-15T14:30:22Z",
  "original_filename": "terms_and_conditions.pdf"
}

Response (Error - 404):

{
  "detail": "No terms and conditions PDF found"
}

Implementation (Lines 117-178):

  1. Connect to Azure Blob Storage (Line 126)
  2. List all blobs with prefix machineagents/termandcondition/ (Line 132)
  3. Find latest blob by comparing timestamp strings in filename (Lines 138-146)
  4. Extract timestamp from filename pattern terms_conditions_TIMESTAMP.pdf (Line 143)
  5. Get blob properties for metadata (Line 156)
  6. Return blob URL and metadata (Lines 163-171)

Filename Pattern:

machineagents/termandcondition/terms_conditions_20240315_143022.pdf
                                                 └─ YYYYMMDD_HHMMSS

Timestamp Comparison: String-based lexicographic comparison (works because format is YYYYMMDD_HHMMSS)
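
The lexicographic selection described above can be illustrated with a minimal sketch. It is not the service's code: `latest_blob_name` and `TIMESTAMP_RE` are hypothetical names, and the regular expression assumes the filename pattern shown above.

```python
import re
from typing import Iterable, Optional

# Assumed filename shape, taken from the pattern documented above.
TIMESTAMP_RE = re.compile(r"terms_conditions_(\d{8}_\d{6})\.pdf$")


def latest_blob_name(blob_names: Iterable[str]) -> Optional[str]:
    """Return the blob name with the greatest embedded timestamp, or None."""
    latest_name, latest_ts = None, ""
    for name in blob_names:
        match = TIMESTAMP_RE.search(name)
        if match is None:
            continue  # skip blobs that do not follow the naming pattern
        ts = match.group(1)
        # Fixed-width, zero-padded YYYYMMDD_HHMMSS strings sort in date order,
        # so a plain string comparison is enough.
        if ts > latest_ts:
            latest_name, latest_ts = name, ts
    return latest_name


names = [
    "machineagents/termandcondition/terms_conditions_20231201_090000.pdf",
    "machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
    "machineagents/termandcondition/readme.txt",
]
print(latest_blob_name(names))
```

The comparison only stays correct while every filename uses the same fixed-width format; a timestamp without zero padding would break the ordering.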

Commented Out Code (Lines 83-114):

Original implementation returned PDF binary data via StreamingResponse. This was changed to return PDF URL only for better performance and separation of concerns.


2. GET /v2/get-latest-privacy-policy

Purpose: Fetch the URL of the latest Privacy Policy PDF from Azure Blob Storage.

Implementation: Identical to Terms & Conditions endpoint, but with different blob prefix.

Request:

GET /v2/get-latest-privacy-policy

Response (Success - 200):

{
  "message": "Latest privacy policy PDF URL retrieved successfully",
  "pdf_url": "https://machineagentsstoragedev.blob.core.windows.net/machineagents/machineagents/privacypolicy/privacy_policy_20240315_143022.pdf",
  "blob_name": "machineagents/privacypolicy/privacy_policy_20240315_143022.pdf",
  "timestamp": "20240315_143022",
  "size_bytes": 198456,
  "last_modified": "2024-03-15T14:30:22Z",
  "original_filename": "privacy_policy.pdf"
}

Implementation (Lines 214-275):

Same logic as Terms & Conditions, but uses:

  • Blob prefix: machineagents/privacypolicy/
  • Filename pattern: privacy_policy_TIMESTAMP.pdf

Commented Out Code (Lines 180-211):

Original implementation returned PDF binary data via StreamingResponse.


3. GET /v2/subscriptions/plans

Purpose: Dynamically generate subscription plans by aggregating data from 4 collections (subscriptions, subscriptionID, baseFeatures, featuresGlobal).

⚠️ CRITICAL: This is the MOST COMPLEX endpoint (192 lines, Lines 279-470).

Request:

GET /v2/subscriptions/plans

Response (Success - 200):

{
  "plans": [
    {
      "_id": { "$oid": "5e6f8b4a3d2c1a0012345678" },
      "category": "free",
      "pricing": {
        "monthly": 0,
        "yearly": 0
      },
      "offers": {
        "monthly": 0,
        "yearly": 0
      },
      "booster_plan": {
        "monthly": 0,
        "yearly": 0
      },
      "grace_plann": {
        "monthly": 14,
        "yearly": 14
      },
      "no_of_chatbots": 1,
      "no_of_chat_sessions": {
        "monthly": 100,
        "yearly": 1200
      },
      "website_pages_crawl_cap": {
        "monthly": 50,
        "yearly": 600
      },
      "features": {
        "custom_branding": false,
        "priority_support": false,
        "advanced_analytics": false,
        "white_labeling": false,
        "api_access": false,
        "multi_language_support": true
      }
    },
    {
      "_id": { "$oid": "a1b2c3d4e5f6789012345678" },
      "category": "starter",
      "pricing": {
        "monthly": 999,
        "yearly": 9999
      },
      "offers": {
        "monthly": 899,
        "yearly": 8999
      },
      "booster_plan": {
        "monthly": 500, // 50% of monthly pricing
        "yearly": 4500 // 45% of yearly pricing
      },
      "grace_plann": {
        "monthly": 14,
        "yearly": 14
      },
      "no_of_chatbots": 3,
      "no_of_chat_sessions": {
        "monthly": 1000,
        "yearly": 12000
      },
      "website_pages_crawl_cap": {
        "monthly": 500,
        "yearly": 6000
      },
      "features": {
        "custom_branding": true,
        "priority_support": false,
        "advanced_analytics": true,
        "white_labeling": false,
        "api_access": false,
        "multi_language_support": true
      }
    },
    {
      "_id": { "$oid": "f9e8d7c6b5a4321098765432" },
      "category": "enterprise",
      "pricing": {
        "monthly": "Custom",
        "yearly": "Custom"
      },
      "offers": {
        "monthly": "Custom",
        "yearly": "Custom"
      },
      "booster_plan": {
        "monthly": "Custom",
        "yearly": "Custom"
      },
      "grace_plann": {
        "monthly": 30,
        "yearly": 30
      },
      "no_of_chatbots": null, // null = unlimited/custom
      "no_of_chat_sessions": {
        "monthly": null,
        "yearly": null
      },
      "website_pages_crawl_cap": {
        "monthly": null,
        "yearly": null
      },
      "features": {
        "custom_branding": true,
        "priority_support": true,
        "advanced_analytics": true,
        "white_labeling": true,
        "api_access": true,
        "multi_language_support": true
      }
    }
  ]
}

Dynamic Plan Generation Algorithm:

Step 1: Fetch All Data (Lines 284-287)

subscriptions = list(subscriptions_collection.find({}))
subscription_ids = list(subscription_id_collection.find({}))
base_features = list(base_features_collection.find({}))
features_global = list(features_global_collection.find({}))

Step 2: Create Lookup Dictionaries (Lines 293-295)

subscription_id_lookup = {item["subscription_id"]: item for item in subscription_ids}
base_features_lookup = {item["baseFeatureID"]: item for item in base_features}
features_global_lookup = {item["featureID"]: item["feature"] for item in features_global}

Step 3: Extract All Unique Feature Names (Lines 298-306)

all_feature_names = set()
for subscription in subscriptions:
    feature_ids = subscription.get("featureIDs", [])
    for feature_id in feature_ids:
        if feature_id in features_global_lookup:
            feature_name = features_global_lookup[feature_id]
            # Convert "Custom Branding" → "custom_branding"
            key = feature_name.lower().replace(" ", "_").replace("-", "_")
            all_feature_names.add(key)
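
To make the key conversion in Step 3 concrete, here is a small self-contained sketch run against made-up data. The helper name `feature_key`, the feature IDs, and the feature names are illustrative assumptions, not values from the real collections.

```python
def feature_key(feature_name: str) -> str:
    """Mirror the Step 3 conversion: lowercase, then spaces and hyphens to underscores."""
    return feature_name.lower().replace(" ", "_").replace("-", "_")


# Illustrative data only; these IDs and names are invented for the example.
features_global_lookup = {
    "FT_001": "Custom Branding",
    "FT_002": "Multi-Language Support",
}
subscriptions = [
    {"featureIDs": ["FT_001", "FT_999"]},  # FT_999 has no definition
    {"featureIDs": ["FT_002"]},
]

all_feature_names = {
    feature_key(features_global_lookup[fid])
    for sub in subscriptions
    for fid in sub.get("featureIDs", [])
    if fid in features_global_lookup  # unknown IDs are silently ignored
}
print(sorted(all_feature_names))
```

Note that a feature ID missing from featuresGlobal is dropped without any warning, so a typo in featureIDs would simply make the feature disappear from every plan.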

Step 4: Group Subscriptions by Plan Category (Lines 309-432)

For each subscription document:

  1. Get category and billing cycle from subscriptionID lookup (Lines 313-321)
  2. Initialize plan structure if not exists (Lines 328-339)
  3. Set pricing and offers for this billing cycle (Lines 342-350)
       • Convert to int if numeric, keep as "Custom" if custom
  4. Fetch base features from baseFeatures lookup (Lines 353-397)
       • no_of_chatbots: Set once per plan (not per billing cycle)
       • no_of_chat_sessions: Set per billing cycle
       • website_pages_crawl_cap: Set per billing cycle
       • grace_plann: Set per billing cycle (default 14 days)
  5. Calculate booster_plan dynamically (Lines 399-415)
       • Monthly: 50% of monthly pricing
       • Yearly: 45% of yearly pricing
       • Free plans: 0
       • Custom pricing: "Custom"
  6. Build features object from featureIDs (Lines 417-431)
       • Reset all features to False
       • Set matched features to True
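
The booster_plan rules listed above can be sketched as a single function. This is an illustration under assumptions: `booster_price` and `BOOSTER_RATES` are hypothetical names, and the rounding rule is a guess chosen to reproduce the sample response (999 → 500, 9999 → 4500); the service may round differently.

```python
from typing import Union

Price = Union[int, str]

# Rates as stated in the steps above.
BOOSTER_RATES = {"monthly": 0.50, "yearly": 0.45}


def booster_price(pricing: Price, billing_cycle: str) -> Price:
    """Derive the booster price from an already-normalized plan price."""
    if isinstance(pricing, str):
        return "Custom"  # custom-priced plans get a custom booster price
    if pricing == 0:
        return 0  # free plans have no booster charge
    # Rounding to the nearest integer is an assumption, not confirmed by the source.
    return round(pricing * BOOSTER_RATES[billing_cycle])


print(booster_price(999, "monthly"))
print(booster_price(9999, "yearly"))
print(booster_price("Custom", "monthly"))
```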

Step 5: Convert to List with Ordering (Lines 433-462)

  1. Add "free" plan first if it exists (Lines 438-447)
  2. Add other plans in database order (Lines 450-462)
  3. Generate consistent _id using MD5 hash of category name (Lines 440-441, 455-456)
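
A stable identifier derived from the category name could look like the sketch below. The function name `plan_id_for` is hypothetical, and truncating the digest to 24 hex characters (the length of an ObjectId string, matching the sample response) is an assumption about how the service shapes the value.

```python
import hashlib


def plan_id_for(category: str) -> str:
    """Return a deterministic 24-character hex id for a plan category."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()  # 32 hex characters
    return digest[:24]  # truncation length is an assumption


print(plan_id_for("free"))
print(plan_id_for("starter"))
```

Because the id depends only on the category name, it stays the same across requests and deployments, which lets the dashboard use it as a stable key. MD5 is acceptable here because the hash is an identifier, not a security control.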

Step 6: Serialize and Return (Lines 464-466)

serialized_plans = [serialize_datetime(plan) for plan in plans_list]
return {"plans": serialized_plans}

Custom vs Numeric Handling:

# Pricing
if pricing_value in ["Custom", "custom"]:
    pricing = "Custom"
else:
    pricing = int(pricing_value)

# Quotas
if no_of_chatbots in ["custom", "Custom"]:
    no_of_chatbots = None     # Rendered as unlimited in UI
else:
    no_of_chatbots = int(no_of_chatbots)

Plan Ordering:

  1. Free plan always first
  2. Other plans in order they appear in subscriptions collection
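
A minimal sketch of this ordering rule, assuming the grouped plans are held in a dict whose insertion order reflects the order of documents in the subscriptions collection (`order_plans` is a hypothetical name):

```python
from typing import Dict, List


def order_plans(plans_by_category: Dict[str, dict]) -> List[dict]:
    """Put the free plan first, then keep the remaining plans in insertion order."""
    ordered = []
    if "free" in plans_by_category:
        ordered.append(plans_by_category["free"])
    ordered.extend(
        plan for category, plan in plans_by_category.items() if category != "free"
    )
    return ordered


plans = {
    "starter": {"category": "starter"},
    "free": {"category": "free"},
    "enterprise": {"category": "enterprise"},
}
print([plan["category"] for plan in order_plans(plans)])
```

Since only "free" is pinned, the position of every other plan depends on document order in the collection, which MongoDB does not guarantee without an explicit sort.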

⚠️ Key Observation: This endpoint does NOT cache results. Every request re-aggregates all 4 collections. For better performance, consider:

  • Redis caching with TTL
  • Materialized views
  • Invalidation only on plan updates

4. POST /v2/admin-login

Purpose: Admin authentication for superadmin dashboard access.

⚠️ CAUTION: CRITICAL SECURITY VULNERABILITY: This endpoint authenticates by plain-text password comparison. Passwords are stored in MongoDB without hashing, so anyone who can read the collection obtains working credentials for every admin account.

Request:

POST /v2/admin-login
Content-Type: application/json

{
    "email": "admin@machineagents.ai",
    "password": "mysecretpassword"
}

Response (Success - 200):

{
  "message": "Login successful",
  "user_id": "64a1b2c3d4e5f6789012345",
  "email": "admin@machineagents.ai"
}

Response (Error - 401):

{
  "detail": "Invalid credentials"
}

Response (Error - 400):

{
  "detail": "Email and password required"
}

Implementation (Lines 475-498):

@app.post("/v2/admin-login")
def admin_login(data: dict = Body(...)):
    email = data.get("email")
    password = data.get("password")

    if not email or not password:
        raise HTTPException(status_code=400, detail="Email and password required")

    # ⚠️ CRITICAL: Plain-text password comparison!
    user = users_collection.find_one({"email": email, "password": password})

    if not user:
        raise HTTPException(status_code=401, detail="Invalid credentials")

    return {"message": "Login successful", "user_id": str(user["_id"]), "email": user["email"]}

Security Issues:

  1. No password hashing - Passwords stored in plain text
  2. No JWT tokens - No session management
  3. No rate limiting - Vulnerable to brute force attacks
  4. No MFA - Single factor authentication only

MongoDB Query:

db.users_multichatbot_v2.findOne({
  email: "admin@machineagents.ai",
  password: "mysecretpassword", // ⚠️ Plain text!
});

5. GET /v2/admin/get-user-history

Purpose: Fetch chat history for a specific user from chatbot_history collection.

Request:

GET /v2/admin/get-user-history?user_id=User_123456_Project_1

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| user_id | string | Yes | User ID to fetch history for |

Response (Success - 200):

{
  "history": [
    {
      "_id": "64a1b2c3d4e5f6789012345",
      "user_id": "User_123456_Project_1",
      "project_id": "Project_123456",
      "session_id": "session_abc123",
      "messages": [
        {
          "role": "user",
          "content": "What are your pricing plans?"
        },
        {
          "role": "assistant",
          "content": "We offer 4 plans: Free, Starter, Professional, and Enterprise..."
        }
      ],
      "timestamp": "2024-01-15T10:30:00Z"
    }
  ]
}

Response (Empty - 200):

{
  "history": []
}

Implementation (Lines 502-516):

@app.get("/v2/admin/get-user-history")
def get_user_history(user_id: str = Query(..., description="User ID to fetch history for")):
    history = list(history_collection.find({"user_id": user_id}))
    if not history:
        return {"history": []}

    return {"history": [serialize_datetime(entry) for entry in history]}

MongoDB Query:

db.chatbot_history.find({ user_id: "User_123456_Project_1" });

6. GET /v2/admin/users

Purpose: List all admin users from users_multichatbot_v2 collection.

Request:

GET /v2/admin/users

Response (Success - 200):

{
  "users": [
    {
      "_id": "64a1b2c3d4e5f6789012345",
      "user_id": "User_123456_Project_1",
      "email": "admin@machineagents.ai",
      "password": "plaintext_password", // ⚠️ Exposed in response!
      "name": "Admin User",
      "created_at": "2024-01-15T10:30:00Z",
      "subscription_plan": "enterprise",
      "billing_cycle": "yearly"
    },
    {
      "_id": "64b2c3d4e5f6789012346",
      "user_id": "User_234567_Project_2",
      "email": "admin2@machineagents.ai",
      "password": "another_plaintext_password", // ⚠️ Exposed!
      "name": "Admin User 2",
      "created_at": "2024-02-20T14:45:00Z",
      "subscription_plan": "professional",
      "billing_cycle": "monthly"
    }
  ]
}

Implementation (Lines 519-529):

@app.get("/v2/admin/users")
def get_all_admin_users():
    users = list(users_collection.find({}))
    return {"users": [serialize_datetime(user) for user in users]}

⚠️ CRITICAL SECURITY ISSUE:

This endpoint returns ALL user data including plain-text passwords! This is a severe security vulnerability. The response should:

  1. Exclude password field entirely
  2. Only return necessary fields
  3. Implement proper access control

Recommended Fix:

users = list(users_collection.find({}, {"password": 0}))  # Exclude password

7. GET /v2/get-user-data/{user_id}

Purpose: Get detailed user data for a specific user by user_id or _id.

Request:

GET /v2/get-user-data/User_123456_Project_1

Response (Success - 200):

{
  "user_data": {
    "_id": "64a1b2c3d4e5f6789012345",
    "user_id": "User_123456_Project_1",
    "email": "admin@machineagents.ai",
    "password": "plaintext_password", // ⚠️ Exposed!
    "name": "Admin User",
    "created_at": "2024-01-15T10:30:00Z",
    "subscription_plan": "enterprise",
    "billing_cycle": "yearly",
    "subscription_status": "active",
    "last_login": "2024-03-15T14:30:00Z"
  }
}

Response (Error - 404):

{
  "detail": "User not found"
}

Implementation (Lines 532-561):

@app.get("/v2/get-user-data/{user_id}")
def get_user_data(user_id: str):
    # Try finding by user_id field first
    user_data = users_collection.find_one({"user_id": user_id})

    if not user_data:
        # Fallback: Try finding by _id (ObjectId)
        try:
            user_data = users_collection.find_one({"_id": ObjectId(user_id)})
        except:
            pass

    if not user_data:
        raise HTTPException(status_code=404, detail="User not found")

    # Convert ObjectId to string
    if "_id" in user_data:
        user_data["_id"] = str(user_data["_id"])

    serialized_data = serialize_datetime(user_data)
    return {"user_data": serialized_data}

Dual Lookup Logic:

  1. First attempt: Find by user_id field (Line 537)
  2. Fallback: If not found, try _id as ObjectId (Lines 541-543)
  3. Error handling: Silent catch if ObjectId conversion fails

⚠️ SECURITY ISSUE: Same as endpoint #6 - password is exposed in response.


8. GET /healthcheck

Purpose: Basic health check for container orchestration (Docker, Kubernetes).

Request:

GET /healthcheck

Response (Success - 200):

{
  "status": "running",
  "service": "superadmin-service"
}

Implementation (Lines 564-567):

@app.get("/healthcheck")
def healthcheck():
    """Basic health check endpoint for container status."""
    return {"status": "running", "service": "superadmin-service"}

⚠️ LIMITATION: This health check does NOT verify:

  • MongoDB connectivity
  • Azure Blob Storage connectivity
  • Actual service functionality

Recommended Enhancement:

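from fastapi.responses import JSONResponse

@app.get("/healthcheck")
def healthcheck():
    try:
        # Test MongoDB with a cheap query
        users_collection.find_one({}, {"_id": 1})

        # Test Azure Blob (optional): forces a real request to the container
        get_blob_client().get_container_properties()

        return {"status": "healthy", "service": "superadmin-service"}
    except Exception as e:
        logger.error(f"Health check failed: {e}")
        # Return 503 so orchestrators see the failure; keep error details in the logs
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "service": "superadmin-service"},
        )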

Utility Functions

1. get_blob_client() (Lines 59-65)

Purpose: Initialize Azure Blob Storage client for legal document retrieval.

Implementation:

def get_blob_client():
    if not AZURE_STORAGE_CONNECTION_STRING or not AZURE_CONTAINER_NAME:
        logger.error("Azure Storage configuration is missing")
        raise ValueError("Azure Storage configuration is missing.")
    blob_service = BlobServiceClient.from_connection_string(AZURE_STORAGE_CONNECTION_STRING)
    logger.debug(f"Azure Blob client initialized for container: {AZURE_CONTAINER_NAME}")
    return blob_service.get_container_client(AZURE_CONTAINER_NAME)

Returns: ContainerClient for the specified container

Error Handling: Raises ValueError if environment variables are missing


2. serialize_datetime(obj) (Lines 68-77)

Purpose: Recursively convert datetime and ObjectId objects to JSON-serializable formats.

Implementation:

def serialize_datetime(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()              # "2024-01-15T10:30:00" (no "Z" for naive datetimes)
    elif isinstance(obj, ObjectId):
        return str(obj)                     # "64a1b2c3d4e5f6789012345"
    elif isinstance(obj, dict):
        return {key: serialize_datetime(value) for key, value in obj.items()}
    elif isinstance(obj, list):
        return [serialize_datetime(item) for item in obj]
    return obj

Transformations:

| Input Type | Output Type | Example |
|---|---|---|
| datetime(2024, 1, 15, 10, 30) | str (ISO 8601) | "2024-01-15T10:30:00" |
| ObjectId("64a1b2...") | str | "64a1b2c3d4e5f6789012345" |
| {"_id": ObjectId(...)} | {"_id": "64a1..."} | Recursive dict processing |
| [ObjectId(...), ...] | ["64a1...", ...] | Recursive list processing |

Used In:

  • All user-facing endpoints to ensure MongoDB data is JSON-compatible
  • Subscription plans serialization (Line 464)
  • User history serialization (Line 512)
  • User data serialization (Line 555)

CORS Configuration

Lines 32-38:

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],        # ⚠️ SECURITY ISSUE!
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

⚠️ WARNING: SECURITY VULNERABILITY: Permissive CORS allows requests from ANY origin (allow_origins=["*"]). This should be restricted to:

  • Superadmin dashboard domain (e.g., https://admin.machineagents.ai)
  • Local development (http://localhost:3000)

Recommended Fix:

ALLOWED_ORIGINS = os.getenv("ALLOWED_ORIGINS", "https://admin.machineagents.ai").split(",")

app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

Security Issues Summary

🔴 CRITICAL Security Vulnerabilities

1. Plain-Text Password Storage & Authentication (Lines 487, 523)

Issue: Passwords are stored in MongoDB without hashing and compared directly.

Code:

user = users_collection.find_one({"email": email, "password": password})

Database:

{
  "email": "admin@machineagents.ai",
  "password": "mysecretpassword" // ⚠️ Plain text!
}

Impact:

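  • If the database is compromised, all passwords are exposed immediately
  • Contradicts OWASP password storage guidance and the security expectations of PCI DSS and GDPR
  • An attacker needs no cracking effort at all, and the passwords can be reused on other services where admins share credentials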

Recommended Fix:

import bcrypt

# During signup/password change: store only the salted hash
hashed_password = bcrypt.hashpw(password.encode('utf-8'), bcrypt.gensalt())
users_collection.insert_one({"email": email, "password": hashed_password})

# During login: look up by email only, then verify against the stored hash
user = users_collection.find_one({"email": email})
if not user or not bcrypt.checkpw(password.encode('utf-8'), user["password"]):
    raise HTTPException(status_code=401, detail="Invalid credentials")
# Login successful

2. Password Exposure in API Responses (Lines 523-525)

Issue: /v2/admin/users endpoint returns ALL user data including plain-text passwords.

Code:

users = list(users_collection.find({}))  # Includes password field!
return {"users": [serialize_datetime(user) for user in users]}

Response:

{
  "users": [
    {
      "email": "admin@machineagents.ai",
      "password": "mysecretpassword" // ⚠️ Exposed to frontend!
    }
  ]
}

Impact:

  • Any caller who can reach the endpoint can read ALL admin passwords (the handler shown performs no authentication check)
  • Even with HTTPS, this is a severe vulnerability
  • Passwords could be logged by proxies, CDNs, or browser extensions

Recommended Fix:

users = list(users_collection.find({}, {"password": 0}))  # Exclude password

3. No Session Management (Lines 475-498)

Issue: No JWT tokens, session IDs, or any form of session tracking after login.

Impact:

  • No way to track authenticated users
  • No logout functionality
  • No session expiration
  • Other endpoints have no way to verify that the caller has logged in

Recommended Fix:

import os
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT

SECRET_KEY = os.getenv("JWT_SECRET_KEY")

@app.post("/v2/admin-login")
def admin_login(data: dict = Body(...)):
    # ... authentication logic ...

    # Generate a JWT token that expires after 24 hours
    payload = {
        "user_id": str(user["_id"]),
        "email": user["email"],
        "exp": datetime.now(timezone.utc) + timedelta(hours=24)
    }
    token = jwt.encode(payload, SECRET_KEY, algorithm="HS256")

    return {"message": "Login successful", "token": token}

🟡 Medium Severity Issues

4. Overly Permissive CORS (Lines 32-38)

Issue: allow_origins=["*"] allows requests from any domain.

Recommended Fix: Whitelist specific domains (covered in CORS Configuration section).


5. No Rate Limiting

Issue: No rate limiting on login endpoint - vulnerable to brute force attacks.

Recommended Fix:

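from fastapi import Body, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/v2/admin-login")
@limiter.limit("5/minute")  # 5 attempts per minute per client IP
def admin_login(request: Request, data: dict = Body(...)):  # slowapi requires the Request argument
    ...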

6. No Input Validation

Issue: No validation of email format, password strength, or user_id format.

Recommended Fix:

from pydantic import BaseModel, EmailStr, field_validator  # EmailStr needs the email-validator package

class AdminLoginRequest(BaseModel):
    email: EmailStr
    password: str

    @field_validator('password')  # Pydantic v2 API; on Pydantic v1 use @validator instead
    @classmethod
    def password_min_length(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        return v

@app.post("/v2/admin-login")
def admin_login(data: AdminLoginRequest):
    ...

Code Quality Issues

1. Inconsistent Error Handling

Some endpoints return detailed error messages, others return generic ones:

Detailed (Good):

raise HTTPException(status_code=404, detail=f"Latest PDF URL retrieval failed: {str(e)}")

Generic (Bad):

raise HTTPException(status_code=500, detail="Login failed")

Recommendation: Standardize error responses across all endpoints.


2. No Pagination

Endpoints like /v2/admin/users and /v2/admin/get-user-history return ALL results without pagination.

Issue: Could cause memory issues or slow responses with large datasets.

Recommended Fix:

@app.get("/v2/admin/users")
def get_all_admin_users(skip: int = 0, limit: int = 50):
    users = list(users_collection.find({}, {"password": 0}).skip(skip).limit(limit))  # also exclude password
    total_count = users_collection.count_documents({})
    return {
        "users": [serialize_datetime(user) for user in users],
        "total": total_count,
        "skip": skip,
        "limit": limit
    }

3. Silent Exception Handling (Lines 541-544)

Code:

if not user_data:
    try:
        user_data = users_collection.find_one({"_id": ObjectId(user_id)})
    except:
        pass  # ⚠️ Silent catch!

Issue: Swallows all exceptions without logging, making debugging difficult.

Recommended Fix:

if not user_data:
    try:
        user_data = users_collection.find_one({"_id": ObjectId(user_id)})
    except Exception as e:
        logger.warning(f"Failed to parse user_id as ObjectId: {e}")
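
An alternative to catching the exception is to check the string before attempting the conversion. The sketch below uses a hypothetical pure-Python helper, `looks_like_object_id`, which assumes the usual 24-character hexadecimal string form of an ObjectId; the bson package also ships `ObjectId.is_valid` for the same purpose.

```python
import re

# An ObjectId string is 24 hexadecimal characters.
_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def looks_like_object_id(value: str) -> bool:
    """Return True if value has the shape of an ObjectId string."""
    return _OBJECT_ID_RE.fullmatch(value) is not None


print(looks_like_object_id("64a1b2c3d4e5f67890123456"))
print(looks_like_object_id("User_123456_Project_1"))
```

With such a check, the fallback query only runs for plausible ids, and any exception that still occurs is a genuine database error worth surfacing rather than an expected parse failure.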

4. No Caching for Subscription Plans

The /v2/subscriptions/plans endpoint re-aggregates 4 collections on EVERY request.

Performance Impact:

  • 4 database queries per request
  • Complex aggregation logic (192 lines)
  • No caching layer

Recommended Fix:

from functools import lru_cache
from datetime import datetime, timedelta

# Cache for 5 minutes
@lru_cache(maxsize=1)
def get_subscription_plans_cached(cache_time: int):
    # ... existing logic ...
    return plans

@app.get("/v2/subscriptions/plans")
def get_subscription_plans():
    # Cache key changes every 5 minutes
    cache_key = int(datetime.now().timestamp() / 300)
    return get_subscription_plans_cached(cache_key)

Or use Redis:

import redis
import json

redis_client = redis.Redis(host='localhost', port=6379, db=0)

@app.get("/v2/subscriptions/plans")
def get_subscription_plans():
    # Try cache first
    cached = redis_client.get("subscription_plans")
    if cached:
        return json.loads(cached)

    # Generate plans
    plans = ... # existing logic

    # Cache for 5 minutes
    redis_client.setex("subscription_plans", 300, json.dumps(plans))

    return plans

Deployment Configuration

Dockerfile (Lines 1-12)

Path: superadmin-service/Dockerfile

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY src/ /app/src/

EXPOSE 8020

CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8020"]

Key Points:

  • Base image: python:3.9-slim (minimal image)
  • Working directory: /app
  • Port exposed: 8020
  • No build-time dependencies (no FFmpeg, etc.)

Requirements.txt (Lines 1-12)

Path: superadmin-service/requirements.txt

fastapi
pymongo
pytz
tiktoken
uvicorn
python-dotenv
python-multipart
pydantic
httpx
pydantic-settings
azure-storage-blob==12.18.2
pymilvus==2.3.4

Observations:

  1. tiktoken - Not used in this service (likely copy-paste from other services)
  2. pymilvus - Not used (service only uses MongoDB, not Milvus)
  3. azure-storage-blob - ✅ Used for legal document retrieval
  4. httpx - Not used in current code

Recommended Cleanup:

fastapi
pymongo
uvicorn
python-dotenv
pydantic
pydantic-settings
azure-storage-blob==12.18.2

Remove: tiktoken, pymilvus, httpx, pytz, python-multipart (unless needed for future features)


Docker Compose Configuration

Path: machineagents-be/docker-compose_dev.yaml (Lines 338-350)

superadmin-service:
  build: ./superadmin-service
  container_name: superadmin-service
  ports:
    - "8020:8020"
  networks:
    - app-network
  environment:
    - MONGO_URI=mongodb://dev-machineagent-rc:...
    - MONGO_DB_NAME=Machine_agent_dev
    - DEFAULT_DATABASE=milvus # ⚠️ Not used by this service
    - MILVUS_HOST=milvus-standalone # ⚠️ Not used
    - MILVUS_PORT=19530 # ⚠️ Not used

Missing Environment Variables:

- AZURE_STORAGE_CONNECTION_STRING=...
- AZURE_CONTAINER_NAME=machineagents

⚠️ CRITICAL: Without these, the legal document endpoints will fail!
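
One way to surface this kind of misconfiguration early is a startup check that reports which variables are unset. The following is a sketch, not part of the service: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the variable list is taken from the Environment Variables section.

```python
from typing import List, Mapping, Sequence

# Variables the service reads, per the Environment Variables section.
REQUIRED_VARS = (
    "MONGO_URI",
    "MONGO_DB_NAME",
    "AZURE_STORAGE_CONNECTION_STRING",
    "AZURE_CONTAINER_NAME",
)


def missing_env_vars(
    environ: Mapping[str, str], required: Sequence[str] = REQUIRED_VARS
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Simulates the Docker Compose environment shown above (values are placeholders).
compose_env = {"MONGO_URI": "mongodb://example", "MONGO_DB_NAME": "Machine_agent_dev"}
print(missing_env_vars(compose_env))
```

Called with `os.environ` at startup, the function would let the service log or fail fast instead of raising ValueError on the first legal-document request.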


Data Flow Diagrams

Legal Document Retrieval Flow

sequenceDiagram
    participant Dashboard as Superadmin Dashboard
    participant API as Superadmin Service
    participant Blob as Azure Blob Storage

    Dashboard->>+API: GET /v2/get-latest-terms-conditions
    API->>+Blob: List blobs with prefix<br/>"machineagents/termandcondition/"
    Blob-->>-API: Blob list

    Note over API: Find latest by timestamp<br/>(lexicographic comparison)

    API->>+Blob: Get blob properties<br/>for "terms_conditions_20240315_143022.pdf"
    Blob-->>-API: Blob metadata (size, last_modified, URL)

    API-->>-Dashboard: JSON response with PDF URL

    Dashboard->>Blob: Direct download from blob URL
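
The "find latest by timestamp" step in the diagram above can be sketched in plain Python. This is an illustration under assumptions, not the service's code: the function name `latest_blob_name` is hypothetical, and the `_YYYYMMDD_HHMMSS.pdf` suffix is inferred from the example blob names in this document.

```python
import re

# Assumed naming scheme, inferred from the example blob names in this document:
# <prefix>_YYYYMMDD_HHMMSS.pdf
TIMESTAMPED_PDF = re.compile(r"_(\d{8}_\d{6})\.pdf$")


def latest_blob_name(blob_names):
    """Return the name with the greatest embedded timestamp, or None."""
    stamped = []
    for name in blob_names:
        match = TIMESTAMPED_PDF.search(name)
        if match:
            stamped.append((match.group(1), name))
    if not stamped:
        return None
    # Fixed-width, zero-padded timestamps sort chronologically as strings.
    return max(stamped)[1]
```

Lexicographic comparison is only safe because every timestamp field is zero-padded to a fixed width. Names that do not match the pattern are skipped, so they cannot win the comparison.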

Subscription Plans Generation Flow

sequenceDiagram
    participant Dashboard as Superadmin Dashboard
    participant API as Superadmin Service
    participant DB as MongoDB (4 collections)

    Dashboard->>+API: GET /v2/subscriptions/plans

    par Fetch All Collections
        API->>+DB: Find all in "subscriptions"
        DB-->>-API: Subscription documents
        API->>+DB: Find all in "subscriptionID"
        DB-->>-API: Subscription metadata
        API->>+DB: Find all in "baseFeatures"
        DB-->>-API: Quota data
        API->>+DB: Find all in "featuresGlobal"
        DB-->>-API: Feature definitions
    end

    Note over API: Create lookup dictionaries<br/>(subscription_id, baseFeatureID, featureID)

    Note over API: Extract all unique feature names<br/>(convert to snake_case)

    loop For each subscription
        Note over API: Get category + billing cycle<br/>from subscriptionID lookup
        Note over API: Set pricing[billing_cycle]<br/>and offers[billing_cycle]
        Note over API: Get quotas from baseFeatures<br/>(chatbots, sessions, crawl_cap)
        Note over API: Calculate booster_plan<br/>(50% monthly, 45% yearly)
        Note over API: Build features object<br/>from featureIDs
    end

    Note over API: Sort: "free" first,<br/>then database order

    Note over API: Generate _id hashes<br/>using MD5 of category

    Note over API: Serialize all datetimes<br/>and ObjectIds

    API-->>-Dashboard: JSON with plans array
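
Three of the steps in the diagram above (snake_case feature names, "free"-first ordering, and MD5-based `_id` values) are small enough to sketch directly. The helper names and the `category` field name are assumptions for illustration, and the service's real conversion rules may differ, for example in how it treats camelCase names.

```python
import hashlib
import re


def to_snake_case(name):
    """Lowercase a feature name and join its words with underscores."""
    # Splits on any non-alphanumeric run; camelCase is NOT split here.
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


def plan_id(category):
    """Derive a stable _id from the plan category via MD5 (an ID, not a secret)."""
    return hashlib.md5(category.encode("utf-8")).hexdigest()


def sort_plans(plans):
    """Put the "free" plan first; sorted() is stable, so others keep DB order."""
    return sorted(plans, key=lambda plan: plan["category"] != "free")
```

Because the `_id` is a hash of the category alone, two plans that share a category would receive the same `_id` under this sketch.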

Admin Login Flow

sequenceDiagram
    participant Dashboard as Superadmin Dashboard
    participant API as Superadmin Service
    participant DB as MongoDB (users_multichatbot_v2)

    Dashboard->>+API: POST /v2/admin-login<br/>{email, password}

    alt Missing credentials
        API-->>Dashboard: 400 Bad Request<br/>"Email and password required"
    end

    API->>+DB: Find user with email + password<br/>(plain-text comparison!)
    DB-->>-API: User document or null

    alt User not found
        API-->>Dashboard: 401 Unauthorized<br/>"Invalid credentials"
    else User found
        Note over API: Extract user_id and email
        API-->>-Dashboard: 200 OK<br/>{message, user_id, email}
    end

    Note over Dashboard: ⚠️ No JWT token returned!<br/>No session tracking!
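
The plain-text comparison in the flow above can be replaced by storing a salted hash and verifying against it. The sketch below uses PBKDF2 from the Python standard library as a stand-in for bcrypt so that it runs without extra packages. The function names and the iteration count are assumptions, not values taken from the service.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for the deployment hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) to store in place of the raw password."""
    salt = os.urandom(16) if salt is None else salt
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

With this approach the login query would look up the user by email only, then call `verify_password` on the stored salt and key, instead of matching email and password together in the database.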

Integration Points

1. Superadmin Dashboard (Frontend)

Consumers:

  • Superadmin web dashboard (likely React/Next.js)
  • Mobile admin app (if exists)

Authentication Flow:

  1. Dashboard calls POST /v2/admin-login with email/password
  2. On success, stores user_id and email locally
  3. ⚠️ No token-based auth - unclear how subsequent requests are authenticated

Subscription Management:

  1. Dashboard calls GET /v2/subscriptions/plans
  2. Displays plans in UI
  3. Users can select plans, but subscription creation is not handled by this service

User Management:

  1. Dashboard calls GET /v2/admin/users to list all admins
  2. Calls GET /v2/get-user-data/{user_id} for details
  3. Calls GET /v2/admin/get-user-history?user_id=... for chat logs

2. Azure Blob Storage

Container: machineagents

Directory Structure:

machineagents/
├── termandcondition/
│   ├── terms_conditions_20240115_103000.pdf
│   ├── terms_conditions_20240220_144500.pdf
│   └── terms_conditions_20240315_143022.pdf   ← Latest (selected)
├── privacypolicy/
│   ├── privacy_policy_20240115_103000.pdf
│   ├── privacy_policy_20240220_144500.pdf
│   └── privacy_policy_20240315_143022.pdf     ← Latest (selected)
└── (other files...)

Blob Metadata:

Each PDF blob can have custom metadata:

{
  "original_filename": "terms_and_conditions_v2.pdf",
  "uploaded_by": "admin@machineagents.ai",
  "version": "2.0"
}

3. MongoDB Collections Usage

| Collection | Used By Endpoints | Purpose |
|---|---|---|
| users_multichatbot_v2 | /v2/admin-login, /v2/admin/users, /v2/get-user-data/{user_id} | Admin user authentication and management |
| subscriptions | /v2/subscriptions/plans | Plan pricing and feature IDs |
| subscriptionID | /v2/subscriptions/plans | Plan metadata (category, billing cycle) |
| baseFeatures | /v2/subscriptions/plans | Quotas (chatbots, sessions, crawl cap, grace period) |
| featuresGlobal | /v2/subscriptions/plans | Feature definitions (names mapped to IDs) |
| chatbot_history | /v2/admin/get-user-history | User chat logs for debugging/support |
| projectid_creation | (Referenced but not queried) | User projects |
| chatbot_selections | (Referenced but not queried) | Chatbot configurations |

Summary

Service Statistics

  • Total Lines: 570
  • Total Functions: 8 endpoints + 2 utilities = 10 functions
  • Active Endpoints: 8 (7 main + 1 healthcheck)
  • Commented Out Code: 66 lines (original PDF streaming implementations)
  • Database Collections Used: 8 (6 queried, 2 referenced but not queried)
  • External Dependencies: Azure Blob Storage

Key Features

Admin Authentication - Login endpoint (⚠️ with plain-text passwords)
User Management - List users, get user data, fetch chat history
Subscription Plans - Dynamic plan generation from 4 collections
Legal Documents - Fetch latest Terms & Conditions and Privacy Policy PDFs
Health Check - Container status endpoint

Critical Issues

🔴 CRITICAL - Plain-Text Passwords - No password hashing
🔴 CRITICAL - Password Exposure - Passwords returned in API responses
🔴 CRITICAL - No Session Management - No JWT tokens or session tracking
🟡 MEDIUM - Permissive CORS - Allows requests from any origin
🟡 MEDIUM - No Rate Limiting - Vulnerable to brute force
🟡 MEDIUM - No Caching - Subscription plans re-aggregated on every request

Code Quality Issues

⚠️ Unused Dependencies - tiktoken, pymilvus, httpx not used
⚠️ Silent Exception Handling - Try/except with pass (Lines 541-544)
⚠️ No Pagination - User list endpoint returns all results
⚠️ No Input Validation - Email/password validation missing
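
For the pagination issue listed above, the core of a fix is translating page parameters into bounded skip and limit values. This is a minimal sketch: the function name `page_bounds` and the cap of 100 are assumptions chosen for illustration.

```python
def page_bounds(page, page_size, max_page_size=100):
    """Translate 1-based page parameters into (skip, limit) for a query."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if page_size < 1:
        raise ValueError("page_size must be >= 1")
    # Cap the page size so a client cannot request an unbounded result set.
    limit = min(page_size, max_page_size)
    return (page - 1) * limit, limit
```

The resulting pair maps onto a PyMongo cursor as `collection.find(query).skip(skip).limit(limit)`. Adding a sort on a stable field keeps pages consistent between requests.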

Performance Characteristics

  • Subscription Plans Generation: ~100-500ms (4 DB queries + aggregation)
  • Legal Document Retrieval: ~200-800ms (Blob Storage listing + metadata fetch)
  • Admin Login: ~50-200ms (Single DB query)
  • User List/History: ~50-300ms (Single DB query, unbounded)

Recommendations

Immediate Actions (Security):

  1. ✅ Implement bcrypt password hashing
  2. ✅ Exclude passwords from API responses
  3. ✅ Add JWT token-based authentication
  4. ✅ Restrict CORS to specific domains
  5. ✅ Add rate limiting to login endpoint
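
As one possible shape for the login rate limit in item 5, the sketch below keeps a sliding window of recent attempts per key (for example, per email or client IP). The class name, the defaults, and the in-memory storage are all assumptions. A multi-instance deployment would need a shared store instead.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most max_attempts per key within a sliding time window."""

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._attempts = defaultdict(deque)

    def allow(self, key):
        """Return True and record the attempt if key is under the limit."""
        now = self.clock()
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

In the login handler, a `False` result would typically map to an HTTP 429 response before any database lookup takes place.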

Performance Improvements:

  1. ✅ Add Redis caching for subscription plans
  2. ✅ Implement pagination for user lists
  3. ✅ Add database indexes on frequently queried fields
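
The caching idea in item 1 can be illustrated without Redis. The sketch below is an in-process time-to-live cache standing in for Redis. The class name `TTLCache` and the 300-second default are assumptions, and the cache is neither shared between service instances nor thread-safe.

```python
import time


class TTLCache:
    """Cache one computed value and recompute it after ttl_seconds."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self.loader = loader  # zero-argument callable that builds the value
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._value = None
        self._expires_at = None

    def get(self):
        """Return the cached value, reloading it if missing or expired."""
        now = self.clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self.loader()
            self._expires_at = now + self.ttl_seconds
        return self._value
```

Wrapping the plan-generation function as the `loader` would limit the four collection queries to one run per TTL period, at the cost of serving plans that can be up to `ttl_seconds` stale.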

Code Quality:

  1. ✅ Remove unused dependencies from requirements.txt
  2. ✅ Add input validation with Pydantic models
  3. ✅ Improve error handling with proper logging
  4. ✅ Add unit tests for subscription plan generation logic
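
For item 3, the usual replacement for a bare `pass` is to log the exception and continue with an explicit fallback. What the service's silent try/except actually wraps is not shown in this section, so the sketch below uses field serialization as a stand-in. All names in it are hypothetical.

```python
import logging
from datetime import datetime

logger = logging.getLogger("superadmin")


def serialize_value(value):
    """Convert a datetime to ISO 8601; stringify other non-primitive values."""
    if isinstance(value, datetime):
        return value.isoformat()
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    return str(value)


def serialize_document(doc):
    """Serialize each field, logging any failure instead of hiding it."""
    out = {}
    for key, value in doc.items():
        try:
            out[key] = serialize_value(value)
        except Exception:
            # logger.exception records the traceback; the field falls back to None.
            logger.exception("Could not serialize field %r", key)
            out[key] = None
    return out
```

The behavior for callers is unchanged (a failing field does not abort the response), but each failure now leaves a traceback in the logs.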

This service integrates with:

  • User Service (Port 8004) - User account creation and management
  • Auth Service (Port 8005) - (Should be used for authentication instead of custom logic)
  • Gateway Service (Port 8002) - API gateway for routing requests
  • All other services - Manages users who own chatbots across all services
