Superadmin Service (Port 8020)¶
Overview¶
The Superadmin Service is a backend API service that provides administrative capabilities for the MachineAgents platform. It serves the superadmin dashboard with user management, subscription plan configuration, legal document distribution (Terms & Conditions, Privacy Policy), and admin authentication features.
Port: 8020
Path: machineagents-be/superadmin-service/src/main.py
Lines of Code: 570
Framework: FastAPI
Database: MongoDB (Cosmos DB)
Storage: Azure Blob Storage (for legal documents)
Core Functionality¶
The service provides 7 main endpoints plus a health check:
- GET /v2/get-latest-terms-conditions - Fetch latest Terms & Conditions PDF URL
- GET /v2/get-latest-privacy-policy - Fetch latest Privacy Policy PDF URL
- GET /v2/subscriptions/plans - Dynamically generate subscription plans from 4 collections
- POST /v2/admin-login - Admin authentication (⚠️ CRITICAL: Plain-text passwords!)
- GET /v2/admin/get-user-history - Fetch chat history for a specific user
- GET /v2/admin/users - List all admin users
- GET /v2/get-user-data/{user_id} - Get detailed user data by user_id or _id
- GET /healthcheck - Health check endpoint
Architecture¶
graph TB
subgraph "Superadmin Service (Port 8020)"
API[FastAPI App]
subgraph "Endpoints"
Legal[Legal Docs Endpoints]
Sub[Subscription Plans]
Auth[Admin Auth]
Users[User Management]
end
subgraph "Data Sources"
Mongo[(MongoDB<br/>8 Collections)]
Blob[Azure Blob Storage<br/>Legal PDFs]
end
end
Dashboard[Superadmin Dashboard] -->|HTTPS| API
API --> Legal
API --> Sub
API --> Auth
API --> Users
Legal -->|Fetch PDFs| Blob
Sub -->|Query| Mongo
Auth -->|Query| Mongo
Users -->|Query| Mongo
Blob -->|machineagents/<br/>termandcondition/| LegalDocs[terms_conditions_TIMESTAMP.pdf]
Blob -->|machineagents/<br/>privacypolicy/| PrivacyDocs[privacy_policy_TIMESTAMP.pdf]
Mongo -->|users_multichatbot_v2| UserData[Admin Users]
Mongo -->|subscriptions| SubData[Plan Pricing]
Mongo -->|subscriptionID| SubID[Plan Metadata]
Mongo -->|baseFeatures| BaseF[Quotas]
Mongo -->|featuresGlobal| FeatG[Feature Definitions]
Mongo -->|chatbot_history| History[Chat History]
Mongo -->|projectid_creation| Projects[Projects]
Mongo -->|chatbot_selections| Chatbots[Chatbots]
style API fill:#2196F3,color:#fff
style Blob fill:#FFA726,color:#fff
style Mongo fill:#4CAF50,color:#fff
Environment Variables¶
Loaded from .env or Docker Compose:
| Variable | Purpose | Example Value |
|---|---|---|
| MONGO_URI | MongoDB connection string | mongodb://... |
| MONGO_DB_NAME | Database name | Machine_agent_dev |
| AZURE_STORAGE_CONNECTION_STRING | Azure Blob Storage connection | DefaultEndpointsProtocol=https;... |
| AZURE_CONTAINER_NAME | Blob container name | machineagents |
Environment Loading (Lines 21-28):
load_dotenv(dotenv_path=Path(".env"))
MONGO_URI = os.getenv("MONGO_URI")
MONGO_DB_NAME = os.getenv("MONGO_DB_NAME")
AZURE_STORAGE_CONNECTION_STRING = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
AZURE_CONTAINER_NAME = os.getenv("AZURE_CONTAINER_NAME")
Database Schema¶
8 MongoDB Collections Used¶
1. users_multichatbot_v2 (admin users):
{
"_id": ObjectId("..."),
"user_id": "User_123456_Project_1",
"email": "admin@machineagents.ai",
"password": "plaintext_password_here", // ⚠️ SECURITY ISSUE!
"name": "Admin User",
"created_at": "2024-01-15T10:30:00Z",
"subscription_plan": "enterprise",
"billing_cycle": "yearly"
}
2. subscriptions (plan pricing):
{
"_id": ObjectId("..."),
"subscription_id": "SUB_001",
"pricing": 999, // or "Custom"
"offers": 899, // discounted price
"baseFeatureID": "BASE_001",
"featureIDs": ["FT_001", "FT_002", "FT_003"]
}
3. subscriptionID (plan metadata):
{
"_id": ObjectId("..."),
"subscription_id": "SUB_001",
"subscription_plan": "starter", // free, starter, professional, enterprise
"billing_cycle": "monthly" // monthly, yearly
}
4. baseFeatures (quotas):
{
"_id": ObjectId("..."),
"baseFeatureID": "BASE_001",
"no_of_chatbots": 3, // or "Custom"
"no_of_chat_sessions": 1000, // per billing cycle
"no_of_linkcrawls": 500, // website pages crawl cap
"grace_period": 14 // days
}
5. featuresGlobal (feature definitions):
{
"_id": ObjectId("..."),
"featureID": "FT_001",
"feature": "Custom Branding" // Converted to "custom_branding" key
}
6. chatbot_history (chat logs):
{
"_id": ObjectId("..."),
"user_id": "User_123456_Project_1",
"project_id": "Project_123456",
"session_id": "session_abc123",
"messages": [...],
"timestamp": "2024-01-15T10:30:00Z"
}
7. projectid_creation (projects):
Used for user project tracking (not directly queried by this service, but collection referenced).
8. chatbot_selections (chatbot configs):
Used for chatbot configuration (not directly queried by this service, but collection referenced).
Endpoint Details¶
1. GET /v2/get-latest-terms-conditions¶
Purpose: Fetch the URL of the latest Terms & Conditions PDF from Azure Blob Storage.
Request:
GET /v2/get-latest-terms-conditions
Response (Success - 200):
{
"message": "Latest terms and conditions PDF URL retrieved successfully",
"pdf_url": "https://machineagentsstoragedev.blob.core.windows.net/machineagents/machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
"blob_name": "machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
"timestamp": "20240315_143022",
"size_bytes": 245632,
"last_modified": "2024-03-15T14:30:22Z",
"original_filename": "terms_and_conditions.pdf"
}
Response (Error - 404):
Implementation (Lines 117-178):
- Connect to Azure Blob Storage (Line 126)
- List all blobs with prefix machineagents/termandcondition/ (Line 132)
- Find latest blob by comparing timestamp strings in filename (Lines 138-146)
- Extract timestamp from filename pattern terms_conditions_TIMESTAMP.pdf (Line 143)
- Get blob properties for metadata (Line 156)
- Return blob URL and metadata (Lines 163-171)
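Filename Pattern: terms_conditions_YYYYMMDD_HHMMSS.pdf
Timestamp Comparison: String-based lexicographic comparison. This works because the YYYYMMDD_HHMMSS format is fixed-width and ordered from most to least significant field, so string order matches chronological order.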
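The selection step can be sketched in plain Python. This is a minimal illustration, not the service's code: the function name `latest_blob_name` and its `stem` parameter are hypothetical, and it assumes every relevant blob name ends in `<stem>_YYYYMMDD_HHMMSS.pdf`.

```python
import re
from typing import Iterable, Optional


def latest_blob_name(blob_names: Iterable[str], stem: str = "terms_conditions") -> Optional[str]:
    """Return the blob name with the greatest embedded timestamp, or None.

    Hypothetical helper: assumes names end in "<stem>_YYYYMMDD_HHMMSS.pdf".
    """
    pattern = re.compile(re.escape(stem) + r"_(\d{8}_\d{6})\.pdf$")
    latest_name, latest_ts = None, ""
    for name in blob_names:
        match = pattern.search(name)
        if match is None:
            continue  # ignore blobs that do not follow the naming pattern
        ts = match.group(1)
        if ts > latest_ts:  # string order equals time order for this format
            latest_name, latest_ts = name, ts
    return latest_name


names = [
    "machineagents/termandcondition/terms_conditions_20240115_103000.pdf",
    "machineagents/termandcondition/terms_conditions_20240315_143022.pdf",
    "machineagents/termandcondition/terms_conditions_20240220_144500.pdf",
]
print(latest_blob_name(names))
```

Passing `stem="privacy_policy"` would cover the Privacy Policy naming pattern in the same way.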
Commented Out Code (Lines 83-114):
Original implementation returned PDF binary data via StreamingResponse. This was changed to return PDF URL only for better performance and separation of concerns.
2. GET /v2/get-latest-privacy-policy¶
Purpose: Fetch the URL of the latest Privacy Policy PDF from Azure Blob Storage.
Implementation: Identical to Terms & Conditions endpoint, but with different blob prefix.
Request:
GET /v2/get-latest-privacy-policy
Response (Success - 200):
{
"message": "Latest privacy policy PDF URL retrieved successfully",
"pdf_url": "https://machineagentsstoragedev.blob.core.windows.net/machineagents/machineagents/privacypolicy/privacy_policy_20240315_143022.pdf",
"blob_name": "machineagents/privacypolicy/privacy_policy_20240315_143022.pdf",
"timestamp": "20240315_143022",
"size_bytes": 198456,
"last_modified": "2024-03-15T14:30:22Z",
"original_filename": "privacy_policy.pdf"
}
Implementation (Lines 214-275):
Same logic as Terms & Conditions, but uses:
- Blob prefix: machineagents/privacypolicy/
- Filename pattern: privacy_policy_TIMESTAMP.pdf
Commented Out Code (Lines 180-211):
Original implementation returned PDF binary data via StreamingResponse.
3. GET /v2/subscriptions/plans¶
Purpose: Dynamically generate subscription plans by aggregating data from 4 collections (subscriptions, subscriptionID, baseFeatures, featuresGlobal).
⚠️ NOTE: This is the most complex endpoint in the service (192 lines, Lines 279-470).
Request:
GET /v2/subscriptions/plans
Response (Success - 200):
{
"plans": [
{
"_id": { "$oid": "5e6f8b4a3d2c1a0012345678" },
"category": "free",
"pricing": {
"monthly": 0,
"yearly": 0
},
"offers": {
"monthly": 0,
"yearly": 0
},
"booster_plan": {
"monthly": 0,
"yearly": 0
},
"grace_plann": {
"monthly": 14,
"yearly": 14
},
"no_of_chatbots": 1,
"no_of_chat_sessions": {
"monthly": 100,
"yearly": 1200
},
"website_pages_crawl_cap": {
"monthly": 50,
"yearly": 600
},
"features": {
"custom_branding": false,
"priority_support": false,
"advanced_analytics": false,
"white_labeling": false,
"api_access": false,
"multi_language_support": true
}
},
{
"_id": { "$oid": "a1b2c3d4e5f6789012345678" },
"category": "starter",
"pricing": {
"monthly": 999,
"yearly": 9999
},
"offers": {
"monthly": 899,
"yearly": 8999
},
"booster_plan": {
"monthly": 500, // 50% of monthly pricing
"yearly": 4500 // 45% of yearly pricing
},
"grace_plann": {
"monthly": 14,
"yearly": 14
},
"no_of_chatbots": 3,
"no_of_chat_sessions": {
"monthly": 1000,
"yearly": 12000
},
"website_pages_crawl_cap": {
"monthly": 500,
"yearly": 6000
},
"features": {
"custom_branding": true,
"priority_support": false,
"advanced_analytics": true,
"white_labeling": false,
"api_access": false,
"multi_language_support": true
}
},
{
"_id": { "$oid": "f9e8d7c6b5a4321098765432" },
"category": "enterprise",
"pricing": {
"monthly": "Custom",
"yearly": "Custom"
},
"offers": {
"monthly": "Custom",
"yearly": "Custom"
},
"booster_plan": {
"monthly": "Custom",
"yearly": "Custom"
},
"grace_plann": {
"monthly": 30,
"yearly": 30
},
"no_of_chatbots": null, // null = unlimited/custom
"no_of_chat_sessions": {
"monthly": null,
"yearly": null
},
"website_pages_crawl_cap": {
"monthly": null,
"yearly": null
},
"features": {
"custom_branding": true,
"priority_support": true,
"advanced_analytics": true,
"white_labeling": true,
"api_access": true,
"multi_language_support": true
}
}
]
}
Dynamic Plan Generation Algorithm:
Step 1: Fetch All Data (Lines 284-287)
subscriptions = list(subscriptions_collection.find({}))
subscription_ids = list(subscription_id_collection.find({}))
base_features = list(base_features_collection.find({}))
features_global = list(features_global_collection.find({}))
Step 2: Create Lookup Dictionaries (Lines 293-295)
subscription_id_lookup = {item["subscription_id"]: item for item in subscription_ids}
base_features_lookup = {item["baseFeatureID"]: item for item in base_features}
features_global_lookup = {item["featureID"]: item["feature"] for item in features_global}
Step 3: Extract All Unique Feature Names (Lines 298-306)
all_feature_names = set()
for subscription in subscriptions:
    feature_ids = subscription.get("featureIDs", [])
    for feature_id in feature_ids:
        if feature_id in features_global_lookup:
            feature_name = features_global_lookup[feature_id]
            # Convert "Custom Branding" → "custom_branding"
            key = feature_name.lower().replace(" ", "_").replace("-", "_")
            all_feature_names.add(key)
Step 4: Group Subscriptions by Plan Category (Lines 309-432)
For each subscription document:
- Get category and billing cycle from subscriptionID lookup (Lines 313-321)
- Initialize plan structure if not exists (Lines 328-339)
- Set pricing and offers for this billing cycle (Lines 342-350)
  - Convert to int if numeric, keep as "Custom" if custom
- Fetch base features from baseFeatures lookup (Lines 353-397)
  - no_of_chatbots: Set once per plan (not per billing cycle)
  - no_of_chat_sessions: Set per billing cycle
  - website_pages_crawl_cap: Set per billing cycle
  - grace_plann: Set per billing cycle (default 14 days)
- Calculate booster_plan dynamically (Lines 399-415)
  - Monthly: 50% of monthly pricing
  - Yearly: 45% of yearly pricing
  - Free plans: 0
  - Custom pricing: "Custom"
- Build features object from featureIDs (Lines 417-431)
  - Reset all features to False
  - Set matched features to True
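The booster_plan rules can be illustrated with a small standalone function. This is a sketch under stated assumptions: the name `booster_price` and the `BOOSTER_RATES` table are hypothetical, and rounding to the nearest integer is assumed because the service's actual rounding is not shown here.

```python
from typing import Union

Price = Union[int, str]

# Rates from the rules above; the rounding behaviour is an assumption.
BOOSTER_RATES = {"monthly": 0.50, "yearly": 0.45}


def booster_price(pricing: Price, billing_cycle: str) -> Price:
    """Derive the booster price for one billing cycle from the plan price."""
    if isinstance(pricing, str):
        if pricing.lower() == "custom":
            return "Custom"  # custom-priced plans keep a custom booster
        pricing = int(pricing)  # numeric values stored as strings
    if pricing == 0:
        return 0  # free plans have no booster charge
    return round(pricing * BOOSTER_RATES[billing_cycle])


print(booster_price(999, "monthly"), booster_price(9999, "yearly"))
```

Keeping the rates in one table makes the 50%/45% split easy to audit and change without touching the aggregation loop.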
Step 5: Convert to List with Ordering (Lines 433-462)
- Add "free" plan first if it exists (Lines 438-447)
- Add other plans in database order (Lines 450-462)
- Generate consistent _id using MD5 hash of category name (Lines 440-441, 455-456)
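The ordering and _id rules above might look roughly like this. The helper names `plan_id_for` and `order_plans` are hypothetical. The exact hashing code is not shown in this document, so truncating the MD5 hex digest to 24 characters is an assumption chosen to match the 24-character $oid values in the sample response.

```python
import hashlib
from typing import Dict, List


def plan_id_for(category: str) -> Dict[str, str]:
    """Derive a stable pseudo-ObjectId from the category name (assumed scheme)."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return {"$oid": digest[:24]}  # 24-char truncation is an assumption


def order_plans(plans_by_category: Dict[str, dict]) -> List[dict]:
    """Put "free" first, then keep the remaining plans in insertion order."""
    ordered = ["free"] if "free" in plans_by_category else []
    ordered.extend(c for c in plans_by_category if c != "free")
    return [
        {"_id": plan_id_for(c), "category": c, **plans_by_category[c]}
        for c in ordered
    ]


plans = order_plans({"starter": {}, "free": {}, "enterprise": {}})
print([p["category"] for p in plans])
```

Because the _id depends only on the category name, the same plan receives the same identifier on every request, which keeps dashboard list keys stable even though the plans are regenerated each time.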
Step 6: Serialize and Return (Lines 464-466)
serialized_plans = [serialize_datetime(plan) for plan in plans_list]
return {"plans": serialized_plans}
Custom vs Numeric Handling:
# Pricing
if pricing_value in ["Custom", "custom"]:
    pricing = "Custom"
else:
    pricing = int(pricing_value)
# Quotas
if no_of_chatbots in ["custom", "Custom"]:
    no_of_chatbots = None  # Rendered as unlimited in UI
else:
    no_of_chatbots = int(no_of_chatbots)
Plan Ordering:
- Free plan always first
- Other plans in the order they appear in the subscriptions collection
⚠️ Key Observation: This endpoint does NOT cache results. Every request re-aggregates all 4 collections. For better performance, consider:
- Redis caching with TTL
- Materialized views
- Invalidation only on plan updates
4. POST /v2/admin-login¶
Purpose: Admin authentication for superadmin dashboard access.
[!CAUTION] CRITICAL SECURITY VULNERABILITY: This endpoint authenticates by comparing plain-text passwords. Passwords are stored in MongoDB without hashing, so a database leak, or any API response that includes the user document, exposes every credential directly.
Request:
POST /v2/admin-login
Content-Type: application/json
{
"email": "admin@machineagents.ai",
"password": "mysecretpassword"
}
Response (Success - 200):
{
"message": "Login successful",
"user_id": "64a1b2c3d4e5f6789012345",
"email": "admin@machineagents.ai"
}
Response (Error - 401):
{ "detail": "Invalid credentials" }
Response (Error - 400):
{ "detail": "Email and password required" }
Implementation (Lines 475-498):
@app.post("/v2/admin-login")
def admin_login(data: dict = Body(...)):
    email = data.get("email")
    password = data.get("password")
    if not email or not password:
        raise HTTPException(status_code=400, detail="Email and password required")
    # ⚠️ CRITICAL: Plain-text password comparison!
    user = users_collection.find_one({"email": email, "password": password})
    if not user:
        raise HTTPException(status_code=401, detail="Invalid credentials")
    return {"message": "Login successful", "user_id": str(user["_id"]), "email": user["email"]}
Security Issues:
- No password hashing - Passwords stored in plain text
- No JWT tokens - No session management
- No rate limiting - Vulnerable to brute force attacks
- No MFA - Single factor authentication only
MongoDB Query:
db.users_multichatbot_v2.findOne({
email: "admin@machineagents.ai",
password: "mysecretpassword", // ⚠️ Plain text!
});
5. GET /v2/admin/get-user-history¶
Purpose: Fetch chat history for a specific user from chatbot_history collection.
Request:
GET /v2/admin/get-user-history?user_id=User_123456_Project_1
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| user_id | string | ✅ | User ID to fetch history for |
Response (Success - 200):
{
"history": [
{
"_id": "64a1b2c3d4e5f6789012345",
"user_id": "User_123456_Project_1",
"project_id": "Project_123456",
"session_id": "session_abc123",
"messages": [
{
"role": "user",
"content": "What are your pricing plans?"
},
{
"role": "assistant",
"content": "We offer 4 plans: Free, Starter, Professional, and Enterprise..."
}
],
"timestamp": "2024-01-15T10:30:00Z"
}
]
}
Response (Empty - 200):
{ "history": [] }
Implementation (Lines 502-516):
@app.get("/v2/admin/get-user-history")
def get_user_history(user_id: str = Query(..., description="User ID to fetch history for")):
    history = list(history_collection.find({"user_id": user_id}))
    if not history:
        return {"history": []}
    return {"history": [serialize_datetime(entry) for entry in history]}
MongoDB Query:
db.chatbot_history.find({ user_id: "User_123456_Project_1" });
6. GET /v2/admin/users¶
Purpose: List all admin users from users_multichatbot_v2 collection.
Request:
GET /v2/admin/users
Response (Success - 200):
{
"users": [
{
"_id": "64a1b2c3d4e5f6789012345",
"user_id": "User_123456_Project_1",
"email": "admin@machineagents.ai",
"password": "plaintext_password", // ⚠️ Exposed in response!
"name": "Admin User",
"created_at": "2024-01-15T10:30:00Z",
"subscription_plan": "enterprise",
"billing_cycle": "yearly"
},
{
"_id": "64b2c3d4e5f6789012346",
"user_id": "User_234567_Project_2",
"email": "admin2@machineagents.ai",
"password": "another_plaintext_password", // ⚠️ Exposed!
"name": "Admin User 2",
"created_at": "2024-02-20T14:45:00Z",
"subscription_plan": "professional",
"billing_cycle": "monthly"
}
]
}
Implementation (Lines 519-529):
@app.get("/v2/admin/users")
def get_all_admin_users():
    users = list(users_collection.find({}))
    return {"users": [serialize_datetime(user) for user in users]}
⚠️ CRITICAL SECURITY ISSUE:
This endpoint returns ALL user data including plain-text passwords! This is a severe security vulnerability. The response should:
- Exclude the password field entirely
- Only return necessary fields
- Implement proper access control
Recommended Fix:
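One way to keep passwords out of responses is to strip sensitive fields before serialization. The sketch below is illustrative only: `SENSITIVE_FIELDS`, `sanitize_user`, and `sanitize_users` are hypothetical names, and the field list is an assumption.

```python
from typing import Any, Dict, Iterable, List

# Fields that should never leave the service (assumed list; extend as needed).
SENSITIVE_FIELDS = frozenset({"password"})


def sanitize_user(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a user document without sensitive fields."""
    return {k: v for k, v in doc.items() if k not in SENSITIVE_FIELDS}


def sanitize_users(docs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply sanitize_user to every document in a result set."""
    return [sanitize_user(d) for d in docs]


# With PyMongo the same effect can be pushed into the query via a projection:
#   users_collection.find({}, {"password": 0})
sample = {"email": "admin@machineagents.ai", "password": "secret", "name": "Admin User"}
print(sanitize_user(sample))
```

The projection variant is usually preferable because the password never leaves the database, while the helper can act as a second line of defence for documents fetched elsewhere.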
7. GET /v2/get-user-data/{user_id}¶
Purpose: Get detailed user data for a specific user by user_id or _id.
Request:
GET /v2/get-user-data/{user_id}
Response (Success - 200):
{
"user_data": {
"_id": "64a1b2c3d4e5f6789012345",
"user_id": "User_123456_Project_1",
"email": "admin@machineagents.ai",
"password": "plaintext_password", // ⚠️ Exposed!
"name": "Admin User",
"created_at": "2024-01-15T10:30:00Z",
"subscription_plan": "enterprise",
"billing_cycle": "yearly",
"subscription_status": "active",
"last_login": "2024-03-15T14:30:00Z"
}
}
Response (Error - 404):
{ "detail": "User not found" }
Implementation (Lines 532-561):
@app.get("/v2/get-user-data/{user_id}")
def get_user_data(user_id: str):
    # Try finding by user_id field first
    user_data = users_collection.find_one({"user_id": user_id})
    if not user_data:
        # Fallback: Try finding by _id (ObjectId)
        try:
            user_data = users_collection.find_one({"_id": ObjectId(user_id)})
        except:
            pass
    if not user_data:
        raise HTTPException(status_code=404, detail="User not found")
    # Convert ObjectId to string
    if "_id" in user_data:
        user_data["_id"] = str(user_data["_id"])
    serialized_data = serialize_datetime(user_data)
    return {"user_data": serialized_data}
Dual Lookup Logic:
- First attempt: Find by user_id field (Line 537)
- Fallback: If not found, try _id as ObjectId (Lines 541-543)
- Error handling: Silent catch if ObjectId conversion fails
⚠️ SECURITY ISSUE: Same as endpoint #6 - password is exposed in response.
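The dual lookup could avoid the try/except entirely by checking the identifier's shape first. The following is a sketch, not the service's code: `looks_like_object_id` and `build_user_queries` are hypothetical helpers, and the ObjectId check is approximated with a regular expression so the example runs without PyMongo.

```python
import re
from typing import Any, Dict, List

_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def looks_like_object_id(value: str) -> bool:
    """True if value is a 24-character hex string, the textual form of an ObjectId."""
    return _OBJECT_ID_RE.fullmatch(value) is not None


def build_user_queries(user_id: str) -> List[Dict[str, Any]]:
    """Return the lookup filters to try, in order."""
    queries: List[Dict[str, Any]] = [{"user_id": user_id}]
    if looks_like_object_id(user_id):
        # The real service would wrap the value: {"_id": ObjectId(user_id)}
        queries.append({"_id": user_id})
    return queries


print(build_user_queries("User_123456_Project_1"))
print(build_user_queries("5e6f8b4a3d2c1a0012345678"))
```

In the service itself, `bson.ObjectId.is_valid(user_id)` performs a comparable check, so the fallback query is only attempted when the conversion cannot fail.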
8. GET /healthcheck¶
Purpose: Basic health check for container orchestration (Docker, Kubernetes).
Request:
GET /healthcheck
Response (Success - 200):
{ "status": "running", "service": "superadmin-service" }
Implementation (Lines 564-567):
@app.get("/healthcheck")
def healthcheck():
    """Basic health check endpoint for container status."""
    return {"status": "running", "service": "superadmin-service"}
⚠️ LIMITATION: This health check does NOT verify:
- MongoDB connectivity
- Azure Blob Storage connectivity
- Actual service functionality
Recommended Enhancement:
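from fastapi.responses import JSONResponse

@app.get("/healthcheck")
def healthcheck():
    try:
        # Test MongoDB (forces a round trip to the server)
        users_collection.find_one({})
        # Test Azure Blob (optional); list_blobs is lazy, so pull one item
        next(iter(get_blob_client().list_blobs(results_per_page=1)), None)
        return {"status": "healthy", "service": "superadmin-service"}
    except Exception as e:
        logger.error(f"Health check failed: {e}")
        # Return 503 so Docker/Kubernetes probes register the failure
        return JSONResponse(status_code=503, content={"status": "unhealthy", "error": str(e)})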
Utility Functions¶
1. get_blob_client() (Lines 59-65)¶
Purpose: Initialize Azure Blob Storage client for legal document retrieval.
Implementation:
def get_blob_client():
    if not AZURE_STORAGE_CONNECTION_STRING or not AZURE_CONTAINER_NAME:
        logger.error("Azure Storage configuration is missing")
        raise ValueError("Azure Storage configuration is missing.")
    blob_service = BlobServiceClient.from_connection_string(AZURE_STORAGE_CONNECTION_STRING)
    logger.debug(f"Azure Blob client initialized for container: {AZURE_CONTAINER_NAME}")
    return blob_service.get_container_client(AZURE_CONTAINER_NAME)
Returns: ContainerClient for the specified container
Error Handling: Raises ValueError if environment variables are missing
2. serialize_datetime(obj) (Lines 68-77)¶
Purpose: Recursively convert datetime and ObjectId objects to JSON-serializable formats.
Implementation:
def serialize_datetime(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()  # e.g. "2024-01-15T10:30:00" (no "Z" for naive datetimes)
    elif isinstance(obj, ObjectId):
        return str(obj)  # 24-character hex string
    elif isinstance(obj, dict):
        return {key: serialize_datetime(value) for key, value in obj.items()}
    elif isinstance(obj, list):
        return [serialize_datetime(item) for item in obj]
    return obj
Transformations:
| Input Type | Output Type | Example / Notes |
|---|---|---|
| datetime(2024, 1, 15, 10, 30) | str (ISO 8601) | "2024-01-15T10:30:00" |
| ObjectId("64a1b2...") | str | "64a1b2..." |
| {"_id": ObjectId(...)} | {"_id": "64a1..."} | Recursive dict processing |
| [ObjectId(...), ...] | ["64a1...", ...] | Recursive list processing |
Used In:
- All user-facing endpoints to ensure MongoDB data is JSON-compatible
- Subscription plans serialization (Line 464)
- User history serialization (Line 512)
- User data serialization (Line 555)
CORS Configuration¶
Lines 32-38:
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # ⚠️ SECURITY ISSUE!
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
[!WARNING] SECURITY VULNERABILITY: Permissive CORS allows requests from ANY origin (allow_origins=["*"]). This should be restricted to:
- Superadmin dashboard domain (e.g., https://admin.machineagents.ai)
- Local development (http://localhost:3000)
Recommended Fix:
ALLOWED_ORIGINS = os.getenv("ALLOWED_ORIGINS", "https://admin.machineagents.ai").split(",")
app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
Security Issues Summary¶
🔴 CRITICAL Security Vulnerabilities¶
1. Plain-Text Password Storage & Authentication (Lines 487, 523)
Issue: Passwords are stored in MongoDB without hashing and compared directly.
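Code:
user = users_collection.find_one({"email": email, "password": password})
Database:
{ "email": "admin@machineagents.ai", "password": "plaintext_password_here" }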
Impact:
- If database is compromised, all passwords are exposed
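- Conflicts with OWASP password storage guidance and with the credential-protection requirements of standards such as PCI DSS and GDPR
- An attacker needs no cracking step at all (not even rainbow tables); reused passwords also put the same users' other accounts at risk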
Recommended Fix:
import bcrypt

# During signup/password change
hashed_password = bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt())
users_collection.insert_one({"email": email, "password": hashed_password})

# During login: look up by email only, then verify the hash
user = users_collection.find_one({"email": email})
if not user or not bcrypt.checkpw(password.encode("utf-8"), user["password"]):
    raise HTTPException(status_code=401, detail="Invalid credentials")
# Login successful
2. Password Exposure in API Responses (Lines 523-525)
Issue: /v2/admin/users endpoint returns ALL user data including plain-text passwords.
Code:
users = list(users_collection.find({})) # Includes password field!
return {"users": [serialize_datetime(user) for user in users]}
Response:
{
"users": [
{
"email": "admin@machineagents.ai",
"password": "mysecretpassword" // ⚠️ Exposed to frontend!
}
]
}
Impact:
- Any client that can reach the endpoint can read ALL admin passwords (the handler shown performs no authentication check)
- Even with HTTPS, this is a severe vulnerability
- Passwords could be logged by proxies, CDNs, or browser extensions
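Recommended Fix: Exclude the password field at query time (for example with a MongoDB projection) and return only the fields the dashboard needs, as listed under endpoint 6.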
3. No Session Management (Lines 475-498)
Issue: No JWT tokens, session IDs, or any form of session tracking after login.
Impact:
- No way to track authenticated users
- No logout functionality
- No session expiration
- Subsequent requests carry no proof of login, so other endpoints cannot tell an authenticated admin from any other caller
Recommended Fix:
import jwt
from datetime import datetime, timedelta, timezone

SECRET_KEY = os.getenv("JWT_SECRET_KEY")

@app.post("/v2/admin-login")
def admin_login(data: dict = Body(...)):
    # ... authentication logic ...
    # Generate JWT token (timezone-aware expiry; datetime.utcnow() is deprecated)
    payload = {
        "user_id": str(user["_id"]),
        "email": user["email"],
        "exp": datetime.now(timezone.utc) + timedelta(hours=24)
    }
    token = jwt.encode(payload, SECRET_KEY, algorithm="HS256")
    return {"message": "Login successful", "token": token}
🟡 Medium Severity Issues¶
4. Overly Permissive CORS (Lines 32-38)
Issue: allow_origins=["*"] allows requests from any domain.
Recommended Fix: Whitelist specific domains (covered in CORS Configuration section).
5. No Rate Limiting
Issue: No rate limiting on login endpoint - vulnerable to brute force attacks.
Recommended Fix:
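from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # slowapi looks the limiter up on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/v2/admin-login")
@limiter.limit("5/minute")  # 5 attempts per minute per client IP
def admin_login(request: Request, data: dict = Body(...)):  # slowapi needs the request argument
    ...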
6. No Input Validation
Issue: No validation of email format, password strength, or user_id format.
Recommended Fix:
# Pydantic v2 style (on v1, use @validator instead); EmailStr needs the email-validator package
from pydantic import BaseModel, EmailStr, field_validator

class AdminLoginRequest(BaseModel):
    email: EmailStr
    password: str

    @field_validator("password")
    @classmethod
    def password_min_length(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        return v

@app.post("/v2/admin-login")
def admin_login(data: AdminLoginRequest):
    ...
Code Quality Issues¶
1. Inconsistent Error Handling¶
Some endpoints return detailed error messages, others return generic ones:
Detailed (Good):
Generic (Bad):
Recommendation: Standardize error responses across all endpoints.
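A standardized error response could be built by a single helper that every endpoint shares. The schema below (`error.code`, `error.message`, `error.details`) and the name `error_body` are hypothetical; the point is only that all endpoints emit the same shape.

```python
from typing import Any, Dict, Optional


def error_body(code: str, message: str, details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build one shared error shape (hypothetical schema)."""
    body: Dict[str, Any] = {"error": {"code": code, "message": message}}
    if details:
        body["error"]["details"] = details
    return body


# In FastAPI this dict could be passed as the `detail` of an HTTPException.
print(error_body("user_not_found", "User not found", {"user_id": "User_123456_Project_1"}))
```

A machine-readable `code` lets the dashboard branch on error type without parsing message text.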
2. No Pagination¶
Endpoints like /v2/admin/users and /v2/admin/get-user-history return ALL results without pagination.
Issue: Could cause memory issues or slow responses with large datasets.
Recommended Fix:
@app.get("/v2/admin/users")
def get_all_admin_users(skip: int = 0, limit: int = 50):
    users = list(users_collection.find({}).skip(skip).limit(limit))
    total_count = users_collection.count_documents({})
    return {
        "users": [serialize_datetime(user) for user in users],
        "total": total_count,
        "skip": skip,
        "limit": limit
    }
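Pagination only bounds the response if the caller cannot request an arbitrarily large page. A small guard such as the following could normalise the inputs first; `clamp_page` and the `MAX_LIMIT` ceiling of 200 are hypothetical choices for illustration.

```python
from typing import Tuple

MAX_LIMIT = 200  # hypothetical ceiling; tune to the dashboard's needs


def clamp_page(skip: int, limit: int, max_limit: int = MAX_LIMIT) -> Tuple[int, int]:
    """Normalise pagination inputs so that skip >= 0 and 1 <= limit <= max_limit."""
    return max(skip, 0), min(max(limit, 1), max_limit)


print(clamp_page(-5, 10_000))
```

FastAPI can also enforce the same bounds declaratively with `Query(ge=...)` and `Query(le=...)` constraints on the parameters, which rejects out-of-range values with a validation error instead of silently clamping them.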
3. Silent Exception Handling (Lines 541-544)¶
Code:
if not user_data:
    try:
        user_data = users_collection.find_one({"_id": ObjectId(user_id)})
    except:
        pass  # ⚠️ Silent catch!
Issue: Swallows all exceptions without logging, making debugging difficult.
Recommended Fix:
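from bson.errors import InvalidId

if not user_data:
    try:
        user_data = users_collection.find_one({"_id": ObjectId(user_id)})
    except (InvalidId, TypeError) as e:
        # Only swallow ID-parsing failures; let database errors propagate
        logger.warning(f"Failed to parse user_id as ObjectId: {e}")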
4. No Caching for Subscription Plans¶
The /v2/subscriptions/plans endpoint re-aggregates 4 collections on EVERY request.
Performance Impact:
- 4 database queries per request
- Complex aggregation logic (192 lines)
- No caching layer
Recommended Fix:
from functools import lru_cache
from datetime import datetime, timedelta

# Cache for 5 minutes
@lru_cache(maxsize=1)
def get_subscription_plans_cached(cache_time: int):
    # ... existing logic ...
    return plans

@app.get("/v2/subscriptions/plans")
def get_subscription_plans():
    # Cache key changes every 5 minutes
    cache_key = int(datetime.now().timestamp() / 300)
    return get_subscription_plans_cached(cache_key)
Or use Redis:
import redis
import json

redis_client = redis.Redis(host='localhost', port=6379, db=0)

@app.get("/v2/subscriptions/plans")
def get_subscription_plans():
    # Try cache first
    cached = redis_client.get("subscription_plans")
    if cached:
        return json.loads(cached)
    # Generate plans
    plans = ...  # existing logic
    # Cache for 5 minutes
    redis_client.setex("subscription_plans", 300, json.dumps(plans))
    return plans
Deployment Configuration¶
Dockerfile (Lines 1-12)¶
Path: superadmin-service/Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ /app/src/
EXPOSE 8020
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8020"]
Key Points:
- Base image: python:3.9-slim (minimal image)
- Working directory: /app
- Port exposed: 8020
- No build-time dependencies (no FFmpeg, etc.)
Requirements.txt (Lines 1-12)¶
Path: superadmin-service/requirements.txt
fastapi
pymongo
pytz
tiktoken
uvicorn
python-dotenv
python-multipart
pydantic
httpx
pydantic-settings
azure-storage-blob==12.18.2
pymilvus==2.3.4
Observations:
- tiktoken - Not used in this service (likely copy-paste from other services)
- pymilvus - Not used (service only uses MongoDB, not Milvus)
- azure-storage-blob - ✅ Used for legal document retrieval
- httpx - Not used in current code
Recommended Cleanup:
Remove: tiktoken, pymilvus, httpx, pytz, python-multipart (unless needed for future features)
Docker Compose Configuration¶
Path: machineagents-be/docker-compose_dev.yaml (Lines 338-350)
superadmin-service:
  build: ./superadmin-service
  container_name: superadmin-service
  ports:
    - "8020:8020"
  networks:
    - app-network
  environment:
    - MONGO_URI=mongodb://dev-machineagent-rc:...
    - MONGO_DB_NAME=Machine_agent_dev
    - DEFAULT_DATABASE=milvus # ⚠️ Not used by this service
    - MILVUS_HOST=milvus-standalone # ⚠️ Not used
    - MILVUS_PORT=19530 # ⚠️ Not used
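Missing Environment Variables: AZURE_STORAGE_CONNECTION_STRING and AZURE_CONTAINER_NAME do not appear in the environment block shown above.
⚠️ CRITICAL: Without these (unless they are supplied through a .env file), the legal document endpoints will fail, because get_blob_client() raises ValueError when either value is unset.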
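A startup check can surface missing configuration before the first request arrives. This is a minimal sketch: `missing_env` and `REQUIRED_VARS` are hypothetical names, and the variable list is taken from the Environment Variables table earlier in this page.

```python
import os
from typing import Iterable, List, Mapping

# Variable names from the Environment Variables table in this document.
REQUIRED_VARS = (
    "MONGO_URI",
    "MONGO_DB_NAME",
    "AZURE_STORAGE_CONNECTION_STRING",
    "AZURE_CONTAINER_NAME",
)


def missing_env(required: Iterable[str] = REQUIRED_VARS,
                environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Example with an explicit mapping instead of the real process environment.
print(missing_env(environ={"MONGO_URI": "mongodb://...", "MONGO_DB_NAME": "Machine_agent_dev"}))
```

Calling `missing_env()` during application startup and aborting when the list is non-empty would turn a runtime failure of the legal document endpoints into an immediate, visible deployment error.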
Data Flow Diagrams¶
Legal Document Retrieval Flow¶
sequenceDiagram
participant Dashboard as Superadmin Dashboard
participant API as Superadmin Service
participant Blob as Azure Blob Storage
Dashboard->>+API: GET /v2/get-latest-terms-conditions
API->>+Blob: List blobs with prefix<br/>"machineagents/termandcondition/"
Blob-->>-API: Blob list
Note over API: Find latest by timestamp<br/>(lexicographic comparison)
API->>+Blob: Get blob properties<br/>for "terms_conditions_20240315_143022.pdf"
Blob-->>-API: Blob metadata (size, last_modified, URL)
API-->>-Dashboard: JSON response with PDF URL
Dashboard->>Blob: Direct download from blob URL
Subscription Plans Generation Flow¶
sequenceDiagram
participant Dashboard as Superadmin Dashboard
participant API as Superadmin Service
participant DB as MongoDB (4 collections)
Dashboard->>+API: GET /v2/subscriptions/plans
par Fetch All Collections
API->>+DB: Find all in "subscriptions"
DB-->>-API: Subscription documents
API->>+DB: Find all in "subscriptionID"
DB-->>-API: Subscription metadata
API->>+DB: Find all in "baseFeatures"
DB-->>-API: Quota data
API->>+DB: Find all in "featuresGlobal"
DB-->>-API: Feature definitions
end
Note over API: Create lookup dictionaries<br/>(subscription_id, baseFeatureID, featureID)
Note over API: Extract all unique feature names<br/>(convert to snake_case)
loop For each subscription
Note over API: Get category + billing cycle<br/>from subscriptionID lookup
Note over API: Set pricing[billing_cycle]<br/>and offers[billing_cycle]
Note over API: Get quotas from baseFeatures<br/>(chatbots, sessions, crawl_cap)
Note over API: Calculate booster_plan<br/>(50% monthly, 45% yearly)
Note over API: Build features object<br/>from featureIDs
end
Note over API: Sort: "free" first,<br/>then database order
Note over API: Generate _id hashes<br/>using MD5 of category
Note over API: Serialize all datetimes<br/>and ObjectIds
API-->>-Dashboard: JSON with plans array
Admin Login Flow¶
sequenceDiagram
participant Dashboard as Superadmin Dashboard
participant API as Superadmin Service
participant DB as MongoDB (users_multichatbot_v2)
Dashboard->>+API: POST /v2/admin-login<br/>{email, password}
alt Missing credentials
API-->>Dashboard: 400 Bad Request<br/>"Email and password required"
end
API->>+DB: Find user with email + password<br/>(plain-text comparison!)
DB-->>-API: User document or null
alt User not found
API-->>Dashboard: 401 Unauthorized<br/>"Invalid credentials"
else User found
Note over API: Extract user_id and email
API-->>-Dashboard: 200 OK<br/>{message, user_id, email}
end
Note over Dashboard: ⚠️ No JWT token returned!<br/>No session tracking!
Integration Points¶
1. Superadmin Dashboard (Frontend)¶
Consumers:
- Superadmin web dashboard (likely React/Next.js)
- Mobile admin app (if exists)
Authentication Flow:
- Dashboard calls POST /v2/admin-login with email/password
- On success, stores user_id and email locally
- ⚠️ No token-based auth - unclear how subsequent requests are authenticated
Subscription Management:
- Dashboard calls GET /v2/subscriptions/plans
- Displays plans in UI
- Users can select plans (but subscription creation not in this service)
User Management:
- Dashboard calls GET /v2/admin/users to list all admins
- Calls GET /v2/get-user-data/{user_id} for details
- Calls GET /v2/admin/get-user-history?user_id=... for chat logs
2. Azure Blob Storage¶
Container: machineagents
Directory Structure:
machineagents/
├── termandcondition/
│ ├── terms_conditions_20240115_103000.pdf
│ ├── terms_conditions_20240220_144500.pdf
│ └── terms_conditions_20240315_143022.pdf ← Latest (selected)
├── privacypolicy/
│ ├── privacy_policy_20240115_103000.pdf
│ ├── privacy_policy_20240220_144500.pdf
│ └── privacy_policy_20240315_143022.pdf ← Latest (selected)
└── (other files...)
Blob Metadata:
Each PDF blob can have custom metadata:
{
"original_filename": "terms_and_conditions_v2.pdf",
"uploaded_by": "admin@machineagents.ai",
"version": "2.0"
}
3. MongoDB Collections Usage¶
| Collection | Used By Endpoints | Purpose |
|---|---|---|
| users_multichatbot_v2 | /v2/admin-login, /v2/admin/users, /v2/get-user-data/{user_id} | Admin user authentication and management |
| subscriptions | /v2/subscriptions/plans | Plan pricing and feature IDs |
| subscriptionID | /v2/subscriptions/plans | Plan metadata (category, billing cycle) |
| baseFeatures | /v2/subscriptions/plans | Quotas (chatbots, sessions, crawl cap, grace period) |
| featuresGlobal | /v2/subscriptions/plans | Feature definitions (names mapped to IDs) |
| chatbot_history | /v2/admin/get-user-history | User chat logs for debugging/support |
| projectid_creation | (Referenced but not queried) | User projects |
| chatbot_selections | (Referenced but not queried) | Chatbot configurations |
Summary¶
Service Statistics¶
- Total Lines: 570
- Total Functions: 8 endpoints + 2 utilities = 10 functions
- Active Endpoints: 8 (7 main + 1 healthcheck)
- Commented Out Code: 64 lines (Lines 83-114 and 180-211, the original PDF streaming implementations)
- Database Collections Used: 8
- External Dependencies: Azure Blob Storage
Key Features¶
✅ Admin Authentication - Login endpoint (⚠️ with plain-text passwords)
✅ User Management - List users, get user data, fetch chat history
✅ Subscription Plans - Dynamic plan generation from 4 collections
✅ Legal Documents - Fetch latest Terms & Conditions and Privacy Policy PDFs
✅ Health Check - Container status endpoint
Critical Issues¶
🔴 CRITICAL - Plain-Text Passwords - No password hashing
🔴 CRITICAL - Password Exposure - Passwords returned in API responses
🔴 CRITICAL - No Session Management - No JWT tokens or session tracking
🟡 MEDIUM - Permissive CORS - Allows requests from any origin
🟡 MEDIUM - No Rate Limiting - Vulnerable to brute force
🟡 MEDIUM - No Caching - Subscription plans re-aggregated on every request
Code Quality Issues¶
⚠️ Unused Dependencies - tiktoken, pymilvus, httpx not used
⚠️ Silent Exception Handling - Try/except with pass (Lines 541-544)
⚠️ No Pagination - User list endpoint returns all results
⚠️ No Input Validation - Email/password validation missing
Performance Characteristics¶
- Subscription Plans Generation: ~100-500ms (4 DB queries + aggregation)
- Legal Document Retrieval: ~200-800ms (Blob Storage listing + metadata fetch)
- Admin Login: ~50-200ms (Single DB query)
- User List/History: ~50-300ms (Single DB query, unbounded)
Recommendations¶
Immediate Actions (Security):
- ✅ Implement bcrypt password hashing
- ✅ Exclude passwords from API responses
- ✅ Add JWT token-based authentication
- ✅ Restrict CORS to specific domains
- ✅ Add rate limiting to login endpoint
Performance Improvements:
- ✅ Add Redis caching for subscription plans
- ✅ Implement pagination for user lists
- ✅ Add database indexes on frequently queried fields
Code Quality:
- ✅ Remove unused dependencies from requirements.txt
- ✅ Add input validation with Pydantic models
- ✅ Improve error handling with proper logging
- ✅ Add unit tests for subscription plan generation logic
Related Services¶
This service integrates with:
- User Service (Port 8004) - User account creation and management
- Auth Service (Port 8005) - (Should be used for authentication instead of custom logic)
- Gateway Service (Port 8002) - API gateway for routing requests
- All other services - Manages users who own chatbots across all services
Documentation Complete: Superadmin Service (Port 8020) ✅