Employee Management System (EMS)

A comprehensive enterprise-grade HR platform that achieved 95% API performance improvement, 85%+ cache hit rate, and automated workforce management for thousands of employees with real-time tracking, data-driven performance evaluation, and GDPR compliance

Role: Full-Stack Developer / System Architect
Year: 2024-2026
Next.js 16.0.7, React 19.2.1, React Native 0.81.5, TypeScript 5.9.2, Node.js, Express 4.18.2, MongoDB, Redis, Socket.io 4.8.1, AWS S3, Firebase Cloud Messaging, Docker, Tailwind CSS 4

Problem

The Challenge

Context

Organizations were facing significant challenges in managing their workforce efficiently. Traditional HR processes relied heavily on manual paperwork, spreadsheets, and disconnected systems, creating operational inefficiencies and security vulnerabilities. Manual attendance tracking was error-prone, leave management through emails made balance tracking difficult, performance evaluation was subjective, and employee documents were scattered across physical files and multiple systems.

User Pain Points

  1. Manual attendance tracking was error-prone and lacked real-time visibility into employee presence
  2. Leave management through emails and paper forms made balance tracking and conflict detection difficult
  3. Performance evaluation was subjective without data-driven metrics or historical comparisons
  4. Employee documents were scattered across physical files and multiple systems with no centralized access
  5. Task coordination relied on ad-hoc communication channels, leading to missed deadlines
  6. No centralized audit logging or security policy enforcement for sensitive HR data
  7. Lack of a real-time communication platform for HR updates and notifications
  8. No mobile support for on-the-go employee access to HR services

Why Existing Solutions Failed

Generic HR software lacked customization for specific organizational needs, and monolithic legacy solutions were difficult to scale or modify. Enterprise solutions carried high licensing costs and offered limited integration with biometric devices. Their outdated interfaces gave a poor user experience, their security features were insufficient for sensitive employee data, and they did not support role-based access with granular permissions.

Goals & Metrics

What We Set Out to Achieve

Objectives

  • 01 Automate core HR processes including attendance tracking, leave management, and approval workflows
  • 02 Provide role-based access through separate portals (Employee, Admin, SuperAdmin) with granular permissions
  • 03 Implement data-driven performance evaluation system with configurable metrics and transparency
  • 04 Enable real-time communication via WebSockets and push notifications for instant updates
  • 05 Ensure security through IP geofencing, device management, CSRF protection, and comprehensive audit logging
  • 06 Support biometric device integration for accurate attendance tracking and deduplication
  • 07 Achieve GDPR compliance with privacy consent management and automated data retention policies
  • 08 Deliver mobile accessibility for employees through React Native mobile application

Success Metrics

  • 01 95% improvement in API response times (reduced from 2-3 seconds to 50-200ms)
  • 02 85%+ cache hit rate for frequently accessed data, reducing database load
  • 03 60% reduction in database queries through aggregation pipeline optimization
  • 04 30-40% storage cost reduction via automated data retention and cleanup
  • 05 95%+ anomaly detection accuracy in activity tracking using Z-score analysis
  • 06 Support for 4 user roles (SuperAdmin, Admin, SemiAdmin, Employee) with distinct permissions
  • 07 Real-time WebSocket updates delivered within milliseconds across all connected clients
  • 08 ZKTeco biometric device integration with automatic employee PIN mapping

User Flow

User Journey

The system handles seven primary user flows: authentication with multi-layer security, employee check-in/out with geofencing, leave request submission and approval, task assignment and verification, automated performance calculation, document management with S3 storage, and real-time admin dashboards with analytics.

  1. User Login (start)
  2. JWT Validation (action)
  3. IP Geofencing Check (decision)
  4. Device Validation (action)
  5. Portal Access Granted (action)
  6. Attendance Check-In (action)
  7. Leave Request Submission (action)
  8. Admin Approval (decision)
  9. Task Assignment (action)
  10. Performance Calculation (action)
  11. Real-time Dashboard (end)

Architecture

System Design

Micro-frontend architecture with three separate Next.js portals (Employee, Admin, SuperAdmin) plus React Native mobile app, all connected to a centralized Express backend API. This enables independent deployment cycles, role-specific optimization, security isolation, and team scalability while maintaining consistent business logic through shared services.

What It Represents

This diagram shows the complete EMS system at the highest level, illustrating how four separate frontend applications (Employee Portal, Admin Portal, SuperAdmin Portal, and Mobile App) communicate with a centralized Express backend, which in turn interacts with databases, caches, and external services.

How to Read the Flow

Horizontal Layers (Left to Right):
  1. Frontend Layer: Four independent Next.js/React Native applications
  2. Backend Layer: Express API Gateway + Service modules (Performance, Attendance, Notification, ZKTeco, WebSocket)
  3. Data Layer: MongoDB (primary database), Redis (cache), Bull (job queue)
  4. External Services: AWS S3, Firebase Cloud Messaging, IPData API, ZKTeco devices

Key Connections:
  • All frontend applications connect to the Express API via HTTP/WebSocket
  • Express API delegates to specialized service modules
  • Services query MongoDB for persistent data and Redis for cached data
  • Notification Service enqueues jobs in Bull queue for background FCM processing
  • WebSocket Service broadcasts real-time events back to connected frontend clients
  • External devices (ZKTeco biometric) push data to backend endpoints

Engineering Decision Highlighted

Micro-Frontend + Centralized Backend: This hybrid architecture provides deployment flexibility (frontends can ship independently) while maintaining data consistency and business logic centralization in the backend. The single backend avoids microservices complexity while the separate frontends enable role-specific optimization.

Why Redis Sits Between Services and MongoDB: Redis serves as both a cache layer (85%+ hit rate reducing MongoDB load) and a job queue backend (Bull), demonstrating the architectural pattern of using in-memory storage for performance-critical hot paths.

Layer Breakdown

Frontend Layer

  • Employee Portal (Next.js 16, Port 3000)
  • Admin Portal (Next.js 16, Port 3001)
  • SuperAdmin Portal (Next.js 16, Port 3002)
  • Mobile App (React Native 0.81.5 with Expo)
  • Redux Toolkit + TanStack Query + Zustand
  • Socket.io Client for real-time updates
  • Tailwind CSS 4 / NativeWind 4.2.1

Backend Layer

  • Express 4.18.2 REST API (Port 5000)
  • TypeScript 5.1.6 for type safety
  • Socket.io 4.8.1 WebSocket server
  • JWT Authentication + CSRF Protection
  • IP Geofencing middleware
  • Device validation middleware
  • Bull 4.12.0 job queue
  • node-cron scheduled tasks

Data Layer

  • MongoDB (25+ collections)
  • Mongoose 9.0.1 ODM
  • Redis (in-memory cache)
  • AWS S3 (document storage)
  • Compound indexes on frequent queries
  • Aggregation pipelines for analytics

External Integrations

  • Firebase Cloud Messaging (push notifications)
  • ZKTeco Biometric Devices (iClock/ADMS API)
  • IPData API (geolocation for audit)
  • AWS S3 SDK v3 (presigned URLs)

Security

Authentication & Authorization

Multi-layer security architecture with JWT authentication, CSRF protection, IP geofencing, and device validation ensuring defense in depth.

What It Represents

This diagram traces a complete authentication request through the Express middleware chain, showing how a user's login attempt passes through a sequence of security layers (eight are listed below) before reaching the route handler. Each middleware node represents a security checkpoint that validates, enriches, or potentially rejects the request.

How to Read the Flow

Vertical Flow (Top to Bottom):
  1. User enters credentials in frontend
  2. Frontend sends POST /api/auth/login
  3. Request enters Express and flows through middleware chain sequentially (order matters!)
  4. Each middleware either passes request to next layer or returns error response

Middleware Chain Order (Critical):
  1. Helmet: Sets security headers (CSP, X-Frame-Options)
  2. CORS: Validates origin against whitelist
  3. Rate Limiting: Checks IP request count in Redis (1000/15min general, 100/15min auth)
  4. CSRF: Validates token from header against stored token
  5. JWT Authentication: Verifies JWT from httpOnly cookie, queries User in MongoDB
  6. Role Authorization: Checks if user role matches required role for endpoint
  7. IP Geofencing: Validates client IP against office CIDR ranges (employees only)
  8. Device Validation: Checks if device ID is in trusted devices (sensitive operations only)

Database Interactions:
  • JWT middleware queries MongoDB to fetch User document (verify account is active)
  • Geofencing middleware queries Redis for SystemSettings (cached office IP ranges)
  • Geofencing middleware calls IPData API for IP geolocation (audit logging)
  • Device validation queries MongoDB to check Employee.trustedDevice array

Engineering Decision Highlighted

Defense in Depth: Instead of a single authentication check, requests pass through 8+ validation layers. This provides security redundancy: if one layer fails (e.g., CSRF tokens disabled for mobile app), other layers (JWT + device validation) still protect the system.

IP Geofencing Trade-off: The diagram shows geofencing querying IPData API for geolocation. This external dependency could slow requests or fail, but the system is designed to fail-open (allow access with warning), prioritizing availability over strict security for non-critical operations. Audit logs capture all geofence violations for post-incident review.

Trust Proxy Configuration: The diagram shows multiple IP sources (X-Forwarded-For, X-Real-IP, req.ip). This highlights the engineering challenge of accurate IP detection behind proxies/load balancers, solved by configuring Express to trust proxy headers and implementing fallback logic.
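The CIDR comparison at the heart of the geofencing layer can be sketched in a few lines. This is a minimal, IPv4-only illustration written by hand; the project itself reports using the ip-cidr library with IPv6 support. The function names and the exact meaning of the hard/soft modes (soft lets the request through but flags a violation) are assumptions made for the example.

```typescript
// Parse a dotted-quad IPv4 address into an unsigned 32-bit number (null if malformed).
function ipv4ToNumber(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside the IPv4 CIDR block, e.g. "192.168.1.0/24".
function ipv4InCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  if (slash < 0) return false;
  const prefixText = cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  const ipValue = ipv4ToNumber(ip);
  const baseValue = ipv4ToNumber(cidr.slice(0, slash));
  if (ipValue === null || baseValue === null || prefix > 32) return false;
  // Division instead of bit shifts avoids JavaScript's signed 32-bit shift behaviour.
  const blockSize = Math.pow(2, 32 - prefix);
  return Math.floor(ipValue / blockSize) === Math.floor(baseValue / blockSize);
}

type GeofenceMode = "hard" | "soft";

interface GeofenceDecision {
  allowed: boolean;
  violation: boolean; // true means the request should be written to the audit log
}

// Hypothetical decision helper: hard mode blocks, soft mode allows but records a violation.
function evaluateGeofence(ip: string, officeRanges: string[], mode: GeofenceMode): GeofenceDecision {
  const inside = officeRanges.some((range) => ipv4InCidr(ip, range));
  if (inside) return { allowed: true, violation: false };
  return { allowed: mode === "soft", violation: true };
}
```

In a middleware, the `violation` flag would be the trigger for the audit log entry, independent of whether the request is ultimately allowed.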

Core Feature

Attendance Tracking

Real-time attendance management with IP geofencing, biometric integration, and automatic cache invalidation for instant dashboard updates.

What It Represents

This diagram shows the complete journey of a single attendance check-in action, from the moment an employee clicks the "Check-In" button through validation, database storage, cache invalidation, real-time WebSocket updates, background FCM job queuing, and final UI confirmation. It illustrates both the synchronous response path (user gets confirmation) and asynchronous background paths (admin dashboard updates, push notifications sent).

How to Read the Flow

Main Synchronous Path (Left Side, ~80-170ms total):
  1. Employee clicks Check-In button
  2. Client validates (already checked in?)
  3. API request with JWT + Device ID
  4. Middleware chain validates (JWT + CSRF + Geofence + Device)
  5. Route handler calls Attendance Service
  6. Service fetches SystemSettings from Redis (work hours, grace period)
  7. Service creates Attendance record in MongoDB
  8. MongoDB returns created record
  9. JSON response returned to client
  10. UI updates (button disabled, confirmation shown)

Parallel Asynchronous Paths (Right Side):

Path A: Cache Invalidation
  • MongoDB post-save hook triggers
  • Redis cache key deleted: attendance:summary:{userId}
  • Next request will fetch fresh data

Path B: Real-Time WebSocket Update
  • Attendance Service calls WebSocket Service
  • WebSocket emits event to admin:onboarding room
  • All connected admin dashboards receive event instantly
  • Admin dashboard shows new check-in without page refresh

Path C: Push Notification (Background)
  • Attendance Service calls Notification Service
  • Notification stored in MongoDB
  • FCM job enqueued in Bull queue
  • Background worker picks up job
  • Firebase Cloud Messaging sends push notification (batched, up to 500 tokens)
  • Admin mobile devices receive push notification

Engineering Decision Highlighted

Synchronous vs. Asynchronous Processing: The diagram clearly separates the critical path (user confirmation must be fast) from nice-to-have features (real-time dashboard updates, push notifications). By making cache invalidation, WebSocket events, and FCM jobs non-blocking, the user receives confirmation in 80-170ms even if Redis is slow, WebSocket clients are disconnected, or the FCM API is down.

Cache Invalidation Placement: Notice that cache invalidation happens after the database write (post-save hook). Invalidating only once the write has succeeded keeps the cache from diverging from the database for long. The trade-off is a brief window where cached data is stale, but this is acceptable for attendance data (not mission-critical for immediate consistency).

Idempotent Design: The diagram shows an "Already checked in?" validation step in the route handler. If the user clicks Check-In twice (network retry, user impatience), the second request returns the same attendance record without creating a duplicate. This idempotent design prevents data corruption from retries.
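The idempotent check-in described above can be illustrated with an in-memory stand-in for the attendance collection. This is a sketch under stated assumptions: the class, field names, and the "one record per employee per UTC calendar day" key are hypothetical, not the project's actual schema.

```typescript
interface AttendanceRecord {
  employeeId: string;
  date: string; // UTC calendar day, YYYY-MM-DD
  checkInAt: Date;
}

interface CheckInResult {
  record: AttendanceRecord;
  created: boolean; // false when an existing record was returned
}

// In-memory stand-in for the attendance collection (hypothetical).
class InMemoryAttendanceStore {
  private records: Map<string, AttendanceRecord> = new Map();

  checkIn(employeeId: string, now: Date): CheckInResult {
    const date = now.toISOString().slice(0, 10);
    const key = employeeId + ":" + date;
    const existing = this.records.get(key);
    // A repeated request returns the original record instead of creating a duplicate.
    if (existing !== undefined) return { record: existing, created: false };
    const record: AttendanceRecord = { employeeId, date, checkInAt: now };
    this.records.set(key, record);
    return { record, created: true };
  }
}
```

With a real database, a check-then-insert like this is only safe under concurrent requests if the storage layer also enforces uniqueness, for example through a unique compound index on the employee and date fields.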
  • Response Time (80-170ms): User confirmation delivered in under 200ms
  • WebSocket Updates (Real-Time): Admin dashboard updates instantly without refresh
  • Background Jobs (Async): FCM notifications queued for batch processing

Workflow

Task Management

Complete bidirectional workflow from task creation through employee completion and admin verification with S3-based proof file storage.

What It Represents

This diagram shows the complete multi-actor workflow of task management, involving three participants (Admin, Backend, Employee) across three phases: Task Creation (including notification of assignees), Employee Work, and Admin Verification. The diagram emphasizes the bidirectional communication (admin ↔ backend ↔ employee) and state transitions of the Task record.

How to Read the Flow

Phase 1: Task Creation (Top Section)
  1. Admin fills create task form (title, deadline, priority, assignees)
  2. API request: POST /api/tasks
  3. Task Service processes assignment logic (individual → specific employees, department → all in department, global → all active employees)
  4. Task record created with completedBy array (one entry per assignee)
  5. MongoDB stores Task document
  6. Notification Service triggers WebSocket events + FCM push to all assignees

Phase 2: Employee Work (Middle Section)
  7. Employee views task in task list (receives via WebSocket or polls API)
  8. Employee clicks "Start Task" → status changes to active in MongoDB
  9. Employee works on task (external to system)
  10. Employee uploads proof files via presigned S3 URL
  11. Employee clicks "Mark Complete" → POST /api/tasks/:id/complete
  12. Backend updates completedBy array entry with proof files and verification status

Phase 3: Admin Verification (Bottom Section)
  13. Admin receives notification: "Task completed by Employee X"
  14. Admin reviews proof files (downloads from S3)
  15. Admin approves or rejects → PATCH /api/tasks/:id/verify/:employeeId
  16. Backend updates verificationStatus
  17. If verified, performance calculation cache invalidated (completion rate changes)
  18. Employee receives notification of verification result

Engineering Decision Highlighted

Array-Based Completion Tracking: The diagram shows the Task model uses a completedBy array with one entry per assignee. This design enables independent tracking (each employee's completion tracked separately), partial completion visibility (admins see which employees completed vs. pending), and re-submission support (if rejected, employee can re-upload proof without creating new task).

Why S3 for Proof Files: Proof files flow directly from employee to S3 (via presigned URL), then S3 URLs are stored in MongoDB. This design avoids large file uploads through the backend (saves bandwidth, reduces latency), file storage in MongoDB (documents limited to 16MB, file storage not an optimal use case), and complicated file serving logic (S3 handles range requests, CDN distribution, durability).

Performance Integration: Notice the diagram shows task completion triggering performance calculation cache invalidation. This illustrates how the performance system is event-driven: every task completion potentially changes the Completion (C) and Timeliness (T) metrics, requiring leaderboard recalculation on next request.
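A small sketch may help make the completedBy design concrete. The `completedBy` and `verificationStatus` names come from the description above; every other field, the status values, and the helper functions are assumptions for illustration rather than the project's actual model.

```typescript
type VerificationStatus = "pending" | "verified" | "rejected";

interface CompletionEntry {
  employeeId: string;
  completedAt: Date | null; // null until the employee marks the task complete
  proofFiles: string[];     // S3 object URLs or keys (assumed representation)
  verificationStatus: VerificationStatus;
}

interface TaskRecord {
  title: string;
  completedBy: CompletionEntry[];
}

// One entry per assignee, so each employee's progress is tracked independently.
function createTask(title: string, assigneeIds: string[]): TaskRecord {
  return {
    title,
    completedBy: assigneeIds.map((employeeId): CompletionEntry => ({
      employeeId,
      completedAt: null,
      proofFiles: [],
      verificationStatus: "pending",
    })),
  };
}

// Marking complete (or re-submitting after a rejection) resets the entry to pending review.
function markComplete(task: TaskRecord, employeeId: string, proofFiles: string[], now: Date): TaskRecord {
  return {
    title: task.title,
    completedBy: task.completedBy.map((entry): CompletionEntry =>
      entry.employeeId === employeeId
        ? { employeeId, completedAt: now, proofFiles, verificationStatus: "pending" }
        : entry
    ),
  };
}

// Admin decision applies to a single assignee's entry only.
function verifyCompletion(task: TaskRecord, employeeId: string, approved: boolean): TaskRecord {
  return {
    title: task.title,
    completedBy: task.completedBy.map((entry): CompletionEntry =>
      entry.employeeId === employeeId
        ? { ...entry, verificationStatus: approved ? "verified" : "rejected" }
        : entry
    ),
  };
}
```

Because a rejection only touches one array entry, the employee can re-submit against the same task while other assignees' entries stay untouched.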

Assignment Types

  • Individual: Assign to specific employees with independent tracking
  • Department: Assign to all employees in selected departments
  • Global: Assign to all active employees organization-wide

Algorithm

Performance Evaluation

Data-driven four-metric scoring system (A, P, C, T) with configurable weights and strict_zero policy preventing score manipulation.

What It Represents

This diagram shows the algorithmic flow of calculating employee performance scores using the four-metric system (A, P, C, T) with configurable weights. It traces both the cache hit path (fast, ~50ms) and the cache miss path (compute-intensive, ~200-500ms) that fetches data from MongoDB, calculates metrics, applies the strict_zero policy, and generates the final leaderboard.

How to Read the Flow

Cache Hit Path (Fast Path, Left Side):
  1. Admin dashboard loads → GET /api/dashboard/leaderboard
  2. Check Redis cache for key leaderboard:all
  3. Cache hit: Return cached leaderboard (TTL: 3 minutes)
  4. Response to frontend in ~50ms

Cache Miss Path (Compute Path, Right Side):
  1. Cache miss: Calculate fresh leaderboard
  2. Fetch all active employees from MongoDB
  3. For each employee (loop):
     - Aggregate attendance data
     - Aggregate task data
     - Calculate A, P, C, T metrics
     - Apply strict_zero policy (missing data = 0%)
     - Fetch configurable weights from SystemSettings
     - Calculate final score with weights
     - Clamp score to 0-100%
     - Assign performance band
  4. Sort employees by score descending
  5. Store leaderboard in Redis cache (TTL: 3 minutes)
  6. Return leaderboard to frontend (~200-500ms)

Engineering Decision Highlighted

strict_zero Policy: This is the most critical engineering decision in the performance system. If attendance data is missing (new employee), A = 0% and P = 0% (not 100%, and not excluded). If task data is missing (no tasks assigned), C = 0% and T = 0%. This prevents gaming. Without the policy, new employees with no data would score 100% (unfair advantage), employees could delete assigned tasks to inflate their completion rate, and excluded metrics would rescale the weights (inconsistent scoring).

Cache TTL Trade-off: The diagram shows the Redis cache with a 3-minute TTL. This means the leaderboard refreshes at least every 3 minutes (near real-time) and MongoDB query load drops by ~95%, but the data shown may be up to 3 minutes stale. The cache is also invalidated explicitly when attendance or task records are saved (post-save hooks).

Serial vs. Parallel Calculation: The diagram shows a loop indicating serial calculation (one employee at a time). This is a known performance bottleneck (100 employees = 100 sequential aggregations), but it is acceptable because the leaderboard is cached for 3 minutes and individual aggregations are fast (~2-5ms with proper indexes). A future optimization can use Promise.all() for parallel calculation.
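The cache hit and miss paths follow the cache-aside pattern, which can be sketched with an in-memory map standing in for Redis. The class and function names are hypothetical, the compute step is synchronous only to keep the example short, and the injectable clock exists purely so the TTL behaviour can be exercised without waiting.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for Redis with a fixed TTL (hypothetical).
class TtlCache<T> {
  private entries: Map<string, CacheEntry<T>> = new Map();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired entries behave like a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Called from write paths (e.g. after an attendance or task record is saved).
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Cache-aside: return the cached value on a hit, otherwise compute, store, and return.
function getOrCompute<T>(cache: TtlCache<T>, key: string, compute: () => T): T {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = compute();
  cache.set(key, fresh);
  return fresh;
}
```

A leaderboard endpoint would call `getOrCompute(cache, "leaderboard:all", buildLeaderboard)` with a three-minute TTL, while the save hooks call `invalidate` so that the next request recomputes.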

Four-Metric System

  • A (Attendance): (presentDays / scheduledDays) × 100. Weight: 30%
  • P (Punctuality): max(0, 100 - avgLateMinutes). Weight: 20%
  • C (Completion): (completedTasks / assignedTasks) × 100. Weight: 30%
  • T (Timeliness): (onTimeTasks / completedTasks) × 100. Weight: 20%

strict_zero Policy

Missing metrics are treated as 0% (not excluded from calculation). This prevents score manipulation where employees could delete tasks to improve completion rate, or new employees with no data would score 100% unfairly. Ensures consistent, transparent scoring across all employees.
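The four formulas, the default weights, the strict_zero policy, and the performance bands can be combined into a short scoring sketch. The formulas, weights, and band thresholds are the ones stated in this document; the interface and function names are hypothetical, and treating "no scheduled days" and "no assigned tasks" as the missing-data conditions is an assumption.

```typescript
interface EmployeeStats {
  scheduledDays: number;
  presentDays: number;
  avgLateMinutes: number;
  assignedTasks: number;
  completedTasks: number;
  onTimeTasks: number;
}

interface MetricWeights {
  attendance: number;  // percentages expected to sum to 100
  punctuality: number;
  completion: number;
  timeliness: number;
}

const DEFAULT_WEIGHTS: MetricWeights = { attendance: 30, punctuality: 20, completion: 30, timeliness: 20 };

function computeMetrics(s: EmployeeStats): { A: number; P: number; C: number; T: number } {
  const hasAttendance = s.scheduledDays > 0;
  const hasTasks = s.assignedTasks > 0;
  return {
    // strict_zero: a missing metric contributes 0 and is never dropped from the formula.
    A: hasAttendance ? (s.presentDays * 100) / s.scheduledDays : 0,
    P: hasAttendance ? Math.max(0, 100 - s.avgLateMinutes) : 0,
    C: hasTasks ? (s.completedTasks * 100) / s.assignedTasks : 0,
    T: hasTasks && s.completedTasks > 0 ? (s.onTimeTasks * 100) / s.completedTasks : 0,
  };
}

function performanceScore(s: EmployeeStats, w: MetricWeights = DEFAULT_WEIGHTS): number {
  const m = computeMetrics(s);
  // Original weights are always used; they are not rescaled when a metric is missing.
  const weighted = (m.A * w.attendance + m.P * w.punctuality + m.C * w.completion + m.T * w.timeliness) / 100;
  return Math.min(100, Math.max(0, weighted));
}

function performanceBand(score: number): string {
  if (score >= 95) return "Outstanding";
  if (score >= 85) return "Excellent";
  if (score >= 70) return "Good";
  if (score >= 50) return "Needs Improvement";
  return "Unsatisfactory";
}
```

As a worked case: 18 of 20 days present, 10 average late minutes, 8 of 10 tasks completed, and 6 of those on time gives A = 90, P = 90, C = 80, T = 75, so the score is 0.3 × 90 + 0.2 × 90 + 0.3 × 80 + 0.2 × 75 = 84, which falls in the Good band. An employee with perfect attendance but no assigned tasks scores 50 under strict_zero.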

Data Flow

How Data Moves

Data flows through multiple layers with security validation at each step. Client requests pass through authentication, authorization, and geofencing middleware before reaching service layer. Services interact with MongoDB for persistence, Redis for caching, and trigger WebSocket events for real-time updates. External integrations handle push notifications (FCM) and file storage (S3).

  1. Client Request: User submits HTTP request with JWT cookie and CSRF token from web portal or mobile app
  2. Security Middleware: Express middleware validates JWT, CSRF token, IP geofencing (for employees), and device trust
  3. Route Handler: Authenticated request routed to appropriate controller based on endpoint and role permissions
  4. Service Layer: Business logic executed (attendance calculation, performance scoring, leave validation)
  5. Database Operations: MongoDB CRUD operations via Mongoose with indexes and aggregation pipelines
  6. Cache Layer: Redis stores frequently accessed data (user sessions, dashboard stats) with automatic invalidation
  7. Real-time Events: WebSocket server broadcasts updates to connected clients in role-specific rooms
  8. Push Notifications: Firebase Cloud Messaging sends notifications to web and mobile devices in batches
  9. Client Response: JSON response returned to client with updated data and success/error status

Core Features

Key Functionality

01

Multi-Role Authentication System

What it does

JWT-based authentication with httpOnly cookies, CSRF protection, IP geofencing (hard/soft modes), device management, and role-based authorization for 4 user roles (SuperAdmin, Admin, SemiAdmin, Employee)

Why it matters

Prevents unauthorized access to sensitive HR data, ensures employees access from approved locations only, protects against XSS and CSRF attacks, and enforces principle of least privilege through granular permissions

Implementation

JWT tokens in httpOnly cookies with 30-day expiry for employees and 1-year for admins. Geofencing middleware validates employee IPs against CIDR ranges. Device fingerprinting with trusted device registration. CSRF tokens on state-changing requests.

02

Attendance Management with Biometric Integration

What it does

IP-geofenced attendance tracking with check-in/out, breaks, late calculation, auto-checkout scheduler, and ZKTeco biometric device integration via push API

Why it matters

Eliminates manual attendance tracking errors, prevents buddy punching through geofencing, provides real-time attendance visibility, and supports legacy biometric hardware already deployed in organizations

Implementation

Attendance records track timestamps with late minute calculation based on configurable grace period. ZKTeco devices push attendance via /iclock endpoint with payload parsing and employee PIN mapping. Auto-checkout scheduler runs daily.
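The late-minute calculation with a grace period might look like the sketch below. The function name is hypothetical, and one detail is an assumption: once the grace period is exceeded, lateness is counted from the shift start rather than from the end of the grace window. The text above does not say which convention the system uses.

```typescript
// Minutes late for a check-in, or 0 when the employee arrives within the grace period.
function calculateLateMinutes(checkIn: Date, shiftStart: Date, graceMinutes: number): number {
  const minutesAfterStart = Math.floor((checkIn.getTime() - shiftStart.getTime()) / 60000);
  // Early arrivals and arrivals inside the grace window are not late.
  if (minutesAfterStart <= graceMinutes) return 0;
  // Assumption: lateness is measured from the shift start, not from the end of the grace window.
  return minutesAfterStart;
}
```

With a 09:00 shift start and a 10-minute grace period, a 09:07 check-in yields 0 late minutes and a 09:25 check-in yields 25.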

03

Four-Metric Performance Evaluation

What it does

Calculates employee performance using Attendance (A), Punctuality (P), Task Completion (C), and Timeliness (T) metrics with configurable weights, generates performance bands, and creates leaderboards

Why it matters

Replaces subjective performance reviews with data-driven metrics, prevents manipulation through strict_zero policy (missing metrics = 0%), provides transparent scoring for employees, and enables fair comparisons across departments

Implementation

Performance service (477 lines) aggregates attendance data and task completions. Weights configurable per metric. Scores clamped to 0-100% with bands: Outstanding (≥95%), Excellent (≥85%), Good (≥70%), Needs Improvement (≥50%), Unsatisfactory (<50%).

04

Leave Management System

What it does

Leave request submission with full/half-day options, approval workflow, balance tracking, conflict detection, leave type customization, and integration with attendance records

Why it matters

Eliminates email-based leave requests, automatically validates balance availability, prevents scheduling conflicts, maintains accurate balance tracking with annual resets, and provides audit trail for compliance

Implementation

Supports sick, casual, annual, and unpaid leave types. Leave balances tracked per employee with scheduled annual reset. Conflict detection checks overlapping dates. Approved leaves automatically marked in attendance records.
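Conflict detection on overlapping dates reduces to an interval-overlap test. The sketch below assumes inclusive date ranges stored as ISO `YYYY-MM-DD` strings, which compare correctly as plain strings; the type and function names are hypothetical.

```typescript
interface LeaveRange {
  start: string; // inclusive, ISO date YYYY-MM-DD
  end: string;   // inclusive, ISO date YYYY-MM-DD
}

// Two inclusive ranges overlap when each one starts on or before the other ends.
function rangesOverlap(a: LeaveRange, b: LeaveRange): boolean {
  return a.start <= b.end && b.start <= a.end;
}

// Returns every existing leave that collides with the requested range.
function findLeaveConflicts(request: LeaveRange, existing: LeaveRange[]): LeaveRange[] {
  return existing.filter((leave) => rangesOverlap(request, leave));
}
```

A request is conflict-free when `findLeaveConflicts` returns an empty array; a single shared day at either boundary counts as a conflict because both ends are inclusive.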

05

Task Assignment and Verification

What it does

Task creation with individual/department/global assignment, priority levels, deadlines, file attachments, proof file uploads, and admin verification workflow with completion tracking

Why it matters

Replaces ad-hoc task coordination, ensures accountability through proof uploads, enables department-wide or company-wide task distribution, and feeds into performance calculation for objective evaluation

Implementation

Tasks support multiple assignment types with notification to relevant employees. Completion records track per-employee status with verification by admin. Task templates enable quick creation of recurring tasks. File storage in S3.

06

Real-Time Notification System

What it does

WebSocket events via Socket.io for instant browser updates plus Firebase Cloud Messaging for push notifications to web and mobile devices with role-based filtering

Why it matters

Ensures employees immediately know about approvals, task assignments, and system events without page refresh, reduces email overload, and maintains engagement through mobile push notifications

Implementation

WebSocket service maintains room-based connections (admin:onboarding, employee:{id}, user:{id}). FCM notifications sent in batches up to 500 tokens. Failed notifications logged for retry. Invalid tokens automatically deactivated.

07

Document Management with S3 Storage

What it does

Secure document storage in AWS S3 with presigned URLs, approval workflows for contracts/certifications/IDs, document change request system, and version tracking

Why it matters

Centralizes employee documents with secure access, replaces physical file storage, enables document approval workflow for compliance, reduces storage costs through S3, and maintains document history

Implementation

Documents uploaded to S3 with presigned URLs (15-minute expiry). Each document type has pending/approved/rejected status. Change requests allow employees to request updates with admin approval. Metadata in MongoDB.

08

Activity Tracking with Anomaly Detection

What it does

Employee session tracking with active/idle time measurement, hourly activity breakdown, and statistical anomaly detection using Z-score analysis to identify unusual patterns

Why it matters

Provides insights into actual working hours vs. logged time, detects time manipulation attempts, identifies productivity patterns, and alerts admins to potential policy violations

Implementation

Sessions track total active and idle time with hourly breakdowns. Anomaly detection identifies 6 types: excessive active time, excessive idle time, unusual login times, session manipulation, idle anomalies, time fabrication. 95%+ confidence for serious anomalies.
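Z-score analysis flags values that sit unusually far from the mean of a sample, measured in standard deviations. The sketch below shows the basic technique on daily active hours. The function names, the use of the population standard deviation, and the 2.5 threshold are assumptions for illustration; the thresholds the system actually uses are not stated here.

```typescript
// Z-score of each value: (value - mean) / standard deviation (population form).
function zScores(values: number[]): number[] {
  if (values.length === 0) return [];
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  const stdDev = Math.sqrt(variance);
  // With no variation there is nothing to flag, so every score is 0.
  if (stdDev === 0) return values.map(() => 0);
  return values.map((v) => (v - mean) / stdDev);
}

// Indices of values whose absolute z-score exceeds the threshold.
function flagAnomalies(values: number[], threshold: number): number[] {
  const flagged: number[] = [];
  zScores(values).forEach((z, index) => {
    if (Math.abs(z) > threshold) flagged.push(index);
  });
  return flagged;
}
```

For the sample `[8, 8, 7, 8, 9, 8, 7, 8, 8, 16]` the mean is 8.7 and the standard deviation is about 2.49, so the 16-hour day has a z-score near 2.93 and is the only value flagged at a threshold of 2.5. Note that with small samples the largest possible z-score is bounded, so the threshold has to be chosen with the sample size in mind.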

09

Data Retention and GDPR Compliance

What it does

Automated data retention cleanup scheduler, privacy consent management with versioning, GDPR Right to Erasure implementation, and configurable retention periods per data type

Why it matters

Ensures compliance with GDPR regulations, reduces storage costs by removing stale data, maintains audit trail for legal requirements, and respects employee privacy rights

Implementation

Automated cleanup runs daily at 2 AM. Sessions retained 90 days, legal records 7 years. Privacy consent tracked per type with consent date and IP address. Right to Erasure endpoint anonymizes employee data while preserving legal records.

10

Reporting and Analytics

What it does

Comprehensive dashboards with real-time statistics, attendance heatmaps, performance leaderboards, department analytics, and PDF/Excel export functionality for reports

Why it matters

Enables data-driven HR decisions, identifies attendance trends, highlights top performers, provides visual insights through charts, and supports compliance reporting through exports

Implementation

Dashboard service aggregates statistics per department with Redis caching. PDFKit generates PDF reports. ExcelJS creates Excel exports. Recharts and ECharts for visualizations. Attendance charts service provides heatmap data.

Technical Challenges

Problems We Solved

Why This Was Hard

Activity tracking system was generating 2-3 second API response times due to complex MongoDB aggregations on large datasets (hourly activity records), lack of caching strategy, and inefficient query patterns retrieving unnecessary fields

Our Solution

Implemented Redis caching with 85%+ hit rate and automatic cache invalidation on mutations. Optimized MongoDB aggregation pipelines using $match early in pipeline. Added compound indexes on frequently queried fields (employeeId + date). Implemented pagination with offset/limit for large datasets. Used selective field projection with Mongoose .select(). Result: 95% improvement (2-3s to 50-200ms), 60% reduction in database queries.

Why This Was Hard

IP detection was inaccurate behind reverse proxies, CDNs (like Cloudflare), and load balancers, causing false geofence violations for legitimate office users. Express req.ip returned proxy IP instead of actual client IP, breaking the entire geofencing system

Our Solution

Configured Express trust proxy setting to properly handle X-Forwarded-For headers. Implemented custom IP detection utility checking multiple headers (X-Forwarded-For, X-Real-IP, CF-Connecting-IP) in priority order. Used ip-cidr library for CIDR range validation supporting both IPv4 and IPv6. Added geofence audit logging with IP geolocation via IPData API for forensics.
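A header-priority IP resolver along these lines could be sketched as follows. The function names are hypothetical, and the priority order simply follows the order in which the headers are listed above, which is an assumption. Header names are lower-cased because Node.js exposes incoming header names in lower case.

```typescript
type HeaderMap = Record<string, string | string[] | undefined>;

// Take the first entry of a header value; X-Forwarded-For may hold a comma-separated chain
// in which the left-most entry is the original client.
function firstHeaderValue(value: string | string[] | undefined): string | undefined {
  const raw = Array.isArray(value) ? value[0] : value;
  if (raw === undefined) return undefined;
  const first = (raw.split(",")[0] ?? "").trim();
  return first.length > 0 ? first : undefined;
}

// Assumed priority order, highest first.
const IP_HEADER_PRIORITY: string[] = ["x-forwarded-for", "x-real-ip", "cf-connecting-ip"];

// Resolve the client IP from proxy headers, falling back to the socket address.
function resolveClientIp(headers: HeaderMap, socketIp: string): string {
  for (const headerName of IP_HEADER_PRIORITY) {
    const candidate = firstHeaderValue(headers[headerName]);
    if (candidate !== undefined) return candidate;
  }
  return socketIp;
}
```

These headers are set by whoever sends the request, so a resolver like this is only trustworthy when the application is reachable exclusively through proxies that overwrite them; otherwise a client could spoof an office IP.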

Why This Was Hard

ZKTeco biometric devices send data in specific proprietary formats via push API, requiring proper payload parsing for both iClock and ADMS protocols. Needed employee-device PIN mapping without breaking existing user IDs, plus deduplication to prevent duplicate attendance records from multiple pushes

Our Solution

Created a dedicated /iclock route handler registered before the CSRF middleware (devices can't send CSRF tokens). Implemented payload parsing for both iClock (cdata format) and ADMS (JSON) formats. Built employee-device PIN mapping with a sparse unique index in the Employee model. Added a ZKTecoLog model for deduplication tracking. Created an admin interface for PIN mapping management.

Why This Was Hard

Performance scores needed to be transparent, fair, and prevent manipulation. Early implementation allowed employees to delete tasks to improve their completion rate. Missing metrics (e.g., no tasks assigned) skewed scores by rescaling weights, creating unfair comparisons between employees

Our Solution

Implemented strict_zero policy where missing metrics are treated as 0% instead of being excluded from calculation. Used original weights without rescaling even when metrics are missing. Added soft deletion for tasks instead of hard delete. Scores clamped to 0-100% range with clear performance bands (Outstanding ≥95%, Excellent ≥85%, Good ≥70%, Needs Improvement ≥50%, Unsatisfactory <50%).

Why This Was Hard

WebSocket connections needed to scale with growing user base while ensuring role-filtered notifications reach only the correct recipients. Broadcasting to all clients would leak sensitive admin notifications to employees. FCM has batch limit of 500 tokens per request

Our Solution

Implemented Socket.io with room-based architecture (admin:onboarding, employee:{id}, user:{id}). JWT authentication for WebSocket connections with role extraction. Notifications filtered by role before broadcast. Firebase Cloud Messaging integrated with batch sending (chunks of 500 tokens). Failed notifications stored for manual retry. Invalid tokens automatically deactivated.
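Splitting device tokens into chunks of at most 500 is straightforward to sketch. The example below uses no Firebase calls: the `send` callback is a hypothetical stand-in for whatever function actually delivers a batch, and it is assumed to return the tokens that failed.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const FCM_BATCH_LIMIT = 500;

// Hypothetical sender: delivers one batch and returns the tokens that failed.
type BatchSender = (tokens: string[]) => string[];

interface BatchSendResult {
  batches: number;
  failed: string[]; // tokens to persist for a later retry
}

function sendInBatches(tokens: string[], send: BatchSender): BatchSendResult {
  const failed: string[] = [];
  let batches = 0;
  for (const batch of chunk(tokens, FCM_BATCH_LIMIT)) {
    batches += 1;
    try {
      failed.push(...send(batch));
    } catch {
      // If the whole request fails, every token in the batch is kept for retry.
      failed.push(...batch);
    }
  }
  return { batches, failed };
}
```

With 1,200 tokens this produces three requests (500, 500, and 200 tokens), and the `failed` list is what would be stored for manual retry.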

Why This Was Hard

The system needed to comply with GDPR data retention policies while maintaining functionality and audit trails. Different data types require different retention periods (sessions 90 days, legal documents 7 years). Manual cleanup was not scalable, and bulk deletion risked impacting performance

Our Solution

Implemented an automated daily cleanup that runs at 2 AM via the node-cron scheduler. Retention periods are configurable per collection in system settings. Deletion runs in limited batches to avoid long database locks. Added privacy consent management with versioning and IP tracking, and a GDPR Right to Erasure endpoint that anonymizes personal data while preserving legal records. Result: 30-40% storage cost reduction.
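A rough sketch of the retention logic, under stated assumptions: the collection names, the `ExpiredStore` interface, and the batch size are hypothetical, and 7 years is approximated as 7 × 365 days. Only the 90-day and 7-year periods and the batched deletion come from the text.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Retention periods in days per collection (names are illustrative).
const RETENTION_DAYS: Record<string, number> = {
  sessions: 90,
  legalDocuments: 7 * 365, // approximation of 7 years
};

// Records created before the returned date are eligible for cleanup.
function retentionCutoff(collection: string, now: Date): Date {
  const days = RETENTION_DAYS[collection];
  if (days === undefined) {
    throw new Error(`No retention period configured for "${collection}"`);
  }
  return new Date(now.getTime() - days * DAY_MS);
}

// Hypothetical storage interface: deletes at most `limit` expired records
// and resolves with the number actually deleted.
interface ExpiredStore {
  deleteExpired(cutoff: Date, limit: number): Promise<number>;
}

// Deletes in bounded batches so that no single operation holds the
// database for long; stops when a batch comes back short.
async function purgeInBatches(
  store: ExpiredStore,
  cutoff: Date,
  batchSize = 1000,
): Promise<number> {
  let total = 0;
  while (true) {
    const deleted = await store.deleteExpired(cutoff, batchSize);
    total += deleted;
    if (deleted < batchSize) break;
  }
  return total;
}
```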

Engineering Excellence

Performance, Security & Resilience

Performance

  • Redis caching with 85%+ cache hit rate and automatic invalidation on mutations using cache-aside pattern
  • MongoDB compound indexes on frequently queried fields (employeeId + date, userId + type)
  • Aggregation pipelines optimized with $match early in pipeline to reduce documents processed
  • Batch processing for FCM notifications (up to 500 tokens per request instead of individual sends)
  • Selective field projection using Mongoose .select() to retrieve only needed fields
  • Pagination with offset/limit for large datasets to prevent memory overload
  • Connection pooling via Mongoose for efficient database connection reuse
  • WebSocket room-based broadcasting to avoid sending events to all connected clients
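The cache-aside pattern with invalidation on mutation can be illustrated with a small in-memory stand-in. This sketch is synchronous for brevity and uses a `Map` instead of Redis; the class name, key format, and TTL are assumptions, and a real Redis client would be asynchronous.

```typescript
// In-memory stand-in for a Redis-backed cache-aside layer.
class CacheAside<T> {
  private readonly store = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  hits = 0;
  misses = 0;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Read path: return the cached value if it is still fresh, otherwise
  // load from the source of truth and populate the cache.
  get(key: string, load: () => T): T {
    const entry = this.store.get(key);
    if (entry !== undefined && entry.expiresAt > this.now()) {
      this.hits++;
      return entry.value;
    }
    this.misses++;
    const value = load();
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Write path: after mutating the database, drop the cached copy so the
  // next read reloads fresh data.
  invalidate(key: string): void {
    this.store.delete(key);
  }

  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Tracking hits and misses in the cache layer itself is one simple way to measure a hit rate like the one quoted above.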

Error Handling

  • Express global error handling middleware with secure error responses (no stack traces in production)
  • Try-catch blocks in all async route handlers with detailed error logging to console and files
  • Graceful fallback for external service failures (FCM, Redis, S3) with retry logic and error queues
  • Failed notification storage in FailedNotification model for manual retry capability
  • Mongoose validation errors caught and transformed into user-friendly messages
  • JWT verification errors handled with specific error codes (expired, invalid, missing)
  • Geofencing errors logged but allow access (fail-open) to prevent lockout if IP detection fails
  • Database transaction rollback on partial failures to maintain data consistency
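As an illustration of the error-mapping idea in the list above, the sketch below turns a thrown error into a response body without leaking stack traces in production. The error codes, status codes, and messages are hypothetical. The error names matched here are the ones set by the jsonwebtoken and Mongoose libraries, but the function only inspects `error.name` and does not depend on either library.

```typescript
interface ErrorBody {
  status: number;
  code: string;
  message: string;
  stack?: string; // only included outside production
}

// Maps an arbitrary thrown value to a safe response body.
function toErrorResponse(err: unknown, isProduction: boolean): ErrorBody {
  const error = err instanceof Error ? err : new Error(String(err));
  let body: ErrorBody;
  switch (error.name) {
    case "TokenExpiredError":
      body = { status: 401, code: "TOKEN_EXPIRED", message: "Session expired" };
      break;
    case "JsonWebTokenError":
      body = { status: 401, code: "TOKEN_INVALID", message: "Invalid token" };
      break;
    case "ValidationError":
      body = { status: 400, code: "VALIDATION_FAILED", message: error.message };
      break;
    default:
      // Unknown errors get a generic message so internals are not exposed.
      body = { status: 500, code: "INTERNAL_ERROR", message: "Something went wrong" };
  }
  if (!isProduction && error.stack !== undefined) {
    body.stack = error.stack;
  }
  return body;
}
```

A function like this would typically be called from the Express global error-handling middleware, which then sends `body.status` and the JSON body to the client.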

Security

  • JWT tokens in httpOnly cookies to prevent XSS attacks (client JavaScript cannot access)
  • CSRF token validation using csurf middleware for all state-changing requests
  • Password hashing with bcryptjs using salt rounds of 10 for secure password storage
  • Rate limiting: 1000 requests/15min (general), 100 requests/15min (auth routes)
  • IP geofencing with CIDR validation supporting both hard block and soft log modes
  • Trusted device registration and validation for sensitive operations
  • Request signing using HMAC for critical endpoints (device registration, data erasure)
  • Security headers via Helmet middleware (CSP, HSTS, X-Frame-Options)
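The geofencing rule can be sketched as an IPv4 CIDR check plus a decision function. This is a simplified illustration under assumptions: it handles IPv4 only, the function and type names are hypothetical, and the fail-open behaviour for an unknown IP follows the error-handling note earlier in this section.

```typescript
// Parses a dotted IPv4 address into an integer, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Returns true/false for a valid check, or null if either input is malformed.
function inCidr(ip: string, cidr: string): boolean | null {
  const pieces = cidr.split("/");
  if (pieces.length !== 2) return null;
  const base = pieces[0] ?? "";
  const prefixText = pieces[1] ?? "";
  if (!/^\d{1,2}$/.test(prefixText)) return null;
  const prefix = Number(prefixText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null || prefix > 32) return null;
  // Two addresses share a network if they fall in the same aligned block.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

type GeofenceMode = "hard" | "soft";
interface GeofenceDecision { allow: boolean; log: boolean }

function geofenceDecision(
  ip: string | undefined,
  allowedCidrs: readonly string[],
  mode: GeofenceMode,
): GeofenceDecision {
  // Fail-open: an undetectable or unparsable IP is logged but not blocked.
  if (ip === undefined || ipv4ToInt(ip) === null) return { allow: true, log: true };
  const inside = allowedCidrs.some((cidr) => inCidr(ip, cidr) === true);
  if (inside) return { allow: true, log: false };
  // Outside the fence: hard mode blocks, soft mode only logs.
  return { allow: mode === "soft", log: true };
}
```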

Design Decisions

Visual & UX Choices

Micro-Frontend Architecture

Rationale

Separate portals enable role-specific optimization, independent deployment cycles, security isolation with different authentication TTLs, and team scalability without merge conflicts.

Details

Three Next.js applications on different ports (3000, 3001, 3002) with shared component library. Each portal bundles only code needed for that role, reducing bundle size and improving load times.

Dark Theme Interface

Rationale

Reduces eye strain for users working long hours, provides a modern, professional appearance, improves focus by minimizing visual distractions, and conserves battery on mobile devices.

Details

Neutral color palette (neutral-950 to neutral-50) with blue accents for primary actions. ThemeContext supports light/dark mode toggle. Consistent spacing using Tailwind scale.

Dashboard Cards with Micro-interactions

Rationale

Visual hierarchy helps users quickly identify key metrics, hover states provide feedback, and animations create an engaging experience without overwhelming users.

Details

Framer Motion for page transitions and ScrollReveal animations. GSAP for complex animations. Skeleton loaders for perceived performance during data loading.

Attendance Heatmaps

Rationale

Visual pattern recognition is faster than scanning tabular data, color coding instantly highlights attendance issues, and historical trends become immediately apparent.

Details

Recharts calendar heatmap with green gradient for present, red for absent, yellow for late. Hover tooltips show detailed breakdown. Supports monthly and yearly views.

Modal-Based Forms

Rationale

Keeps users in context without navigation, reduces cognitive load by focusing attention, enables quick actions without full page loads, and works well on mobile.

Details

React Hook Form with Zod schema validation. Real-time field validation. Optimistic updates via TanStack React Query. Toast notifications for success/error feedback.

Impact

The Result

What We Achieved

Successfully deployed an enterprise-grade HR platform serving thousands of employees, with a 95% API performance improvement (response times from 2-3 s down to 50-200 ms), an 85%+ cache hit rate, a 60% database query reduction, a 30-40% storage cost reduction, and 95%+ anomaly detection accuracy. The system supports 4 user roles with granular permissions, real-time WebSocket updates, biometric device integration, and GDPR-compliant data management.

Who It Helped

HR administrators gained centralized control over workforce management, automated approval workflows, and data-driven insights. Employees benefited from self-service leave requests, transparent performance tracking, mobile accessibility, and real-time notifications. Management received comprehensive analytics, performance leaderboards, and export capabilities for compliance reporting.

Why It Matters

Transformed manual, error-prone HR processes into an automated, secure, and scalable system. Eliminated paper-based workflows, reduced HR administrative overhead, enabled data-driven decision making, ensured regulatory compliance, and improved employee satisfaction through transparency and mobile accessibility. Demonstrated the ability to architect complex enterprise systems with multiple stakeholders.

Reflections

Key Learnings

Technical Learnings

  • MongoDB indexing strategy is critical for aggregation performance at scale - compound indexes on frequently queried fields reduced query time by 60%
  • Redis caching with cache-aside pattern significantly improves response times, but requires careful cache invalidation strategy to prevent stale data
  • WebSocket room-based architecture effectively handles role-based event distribution without broadcasting to all clients
  • IP geofencing requires careful configuration behind proxies and CDNs - must check multiple headers in priority order
  • Firebase Cloud Messaging batch limits (500 tokens) require chunking for large audiences and proper error handling
  • JWT tokens in httpOnly cookies provide better security than localStorage for SPA applications
  • TypeScript strict mode catches many bugs at compile time but requires careful type definitions for complex data structures
  • Bull job queue handles background processing reliably, but queue monitoring is essential to detect stuck jobs
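The point about checking proxy headers in priority order can be sketched like this. The header names and their order are assumptions chosen for illustration, and such headers should only be trusted when the request is known to have passed through your own proxy or CDN, since clients can otherwise forge them.

```typescript
type HeaderMap = Record<string, string | string[] | undefined>;

// Hypothetical priority order: CDN header first, then reverse-proxy headers.
const IP_HEADER_PRIORITY = ["cf-connecting-ip", "x-real-ip", "x-forwarded-for"] as const;

// Returns the first usable client IP, falling back to the socket address.
function clientIp(headers: HeaderMap, socketAddress?: string): string | undefined {
  for (const name of IP_HEADER_PRIORITY) {
    const raw = headers[name];
    const text = Array.isArray(raw) ? raw[0] : raw;
    if (!text) continue;
    // X-Forwarded-For may hold a comma-separated chain; the first entry is
    // the original client as reported by the first proxy.
    const first = text.split(",")[0]?.trim();
    if (first) return first;
  }
  return socketAddress;
}
```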

Architectural Insights

  • Micro-frontend architecture enables independent deployment but increases complexity in shared state management and inter-app communication
  • Centralized backend API with role-based authorization is simpler to maintain than microservices for single-organization deployments
  • Missing performance metrics should be treated as 0% rather than excluded to prevent score manipulation
  • Automated data retention policies balance compliance requirements with operational needs, but require careful configuration per data type
  • Device validation adds security layer but requires thoughtful UX to avoid frustrating legitimate users
  • Separating concerns between service layer (business logic) and route handlers (HTTP logic) improves testability and reusability
  • Audit logging for security events (geofencing, login attempts) is essential for forensics and compliance
  • Consistent error codes and messages across API endpoints improve debugging and client-side error handling experience

What I'd Improve

  • Implement two-factor authentication (2FA) for enhanced security beyond password + device validation
  • Add SSO/SAML integration for enterprise customers with existing identity providers
  • Convert to multi-tenant architecture with tenant isolation for SaaS deployment
  • Implement email notification system as fallback when push notifications fail
  • Add Redis cluster configuration for high availability instead of single instance
  • Configure Socket.io with Redis adapter for horizontal scaling across multiple server instances
  • Implement automated backup and disaster recovery system with point-in-time restore
  • Add password complexity enforcement and password history tracking to prevent reuse

Roadmap

Future Enhancements

01

Implement two-factor authentication (2FA) using TOTP or SMS for enhanced account security

02

Add SSO/SAML integration for seamless authentication with enterprise identity providers

03

Multi-tenant architecture with tenant isolation for SaaS deployment to multiple organizations

04

Localization/i18n support for multiple languages (English, Arabic, Spanish) with RTL layout support

05

Redis cluster configuration with sentinel for high availability and automatic failover

06

Socket.io Redis adapter for horizontal scaling across multiple backend instances

07

Password complexity enforcement (minimum length, character requirements) and password history tracking

08

Automated backup system with incremental backups and point-in-time disaster recovery

09

Email notification system integration as fallback when push notifications fail or user opts out

10

Mobile offline-first architecture with sync queue for managing data when connectivity is poor

11

Advanced analytics dashboard with machine learning predictions for attrition risk and performance trends

12

Integration with payroll systems for automated salary calculation based on attendance and performance

13

Video calling integration for remote team meetings and interviews

14

Document OCR for automatic data extraction from uploaded IDs, certificates, and contracts

15

Employee training module with course management, progress tracking, and certification

16

Asset management system for tracking company equipment assigned to employees