Building MANTAX: An Ethical Facial Recognition System for NGOs

#Introduction

This is a technical deep-dive into MANTAX, an ethical facial recognition system designed for NGO use cases. In this blogpost, we'll explore the complete architecture, data flows, and implementation details—everything a developer needs to understand how this system works.


#System Architecture Overview

MANTAX follows a three-tier architecture with clear separation between presentation, business logic, and machine learning components.

Loading diagram...

#Technology Stack

| Layer | Technology | Purpose | |-------|------------|---------| | Frontend | Electron + Vanilla JS | Desktop app with custom titlebar | | Styling | SCSS → CSS | macOS Tahoe Liquid Glass design | | Backend | Flask (Python) | REST API with 31 endpoints | | ML Runtime | ONNX Runtime (ArcFace) + PyTorch (FaceNet) | Dual-model embedding extraction | | Face Detection | OpenCV DNN + MediaPipe | Face localization + 468-point landmarks |

MANTAX Interface


#The Data Pipeline

When a user uploads an image, it flows through a well-defined pipeline. Let's trace this journey:

Loading diagram...

#Face Detection Module

The detection module (src/detection/__init__.py) handles the first critical step: finding where faces are in an image.

#Detection Flow

Loading diagram...

#Primary Detection Method: OpenCV DNN

The system uses a pre-trained Caffe model for deep learning-based face detection:

Model files required:

  • deploy.prototxt.txt - Caffe architecture definition
  • res10_300x300_ssd_iter_140000.caffemodel - Pre-trained weights

#Fallback: Haar Cascade

If DNN fails to load, the system gracefully falls back to Haar Cascade:

#Face ROI Extraction

Once faces are detected, the API extracts the region of interest (ROI):

Face Detection Visualization


#Facial Landmark Detection

After detecting faces, the system extracts 468 facial landmarks using MediaPipe Face Mesh:

Loading diagram...

#Landmark Extraction Code

#Geometric Feature Extraction

For comparison purposes, we extract scale-invariant geometric features:

Landmark Detection


#Embedding Extraction (Dual-Model)

This is the core of the system—converting face images into mathematical embeddings that can be compared.

Loading diagram...

#ArcFace Implementation

ArcFace provides superior discrimination between different faces:

#FaceNet Implementation

FaceNet provides secondary signals and neural activation visualizations:

#Neural Network Activations

FaceNet also extracts intermediate layer activations for visualization:

Activation layers extracted: | Layer | Output Shape | Purpose | |-------|-------------|---------| | conv1 | (64, 112, 112) | First convolutions | | bn1 | (64, 112, 112) | First batch norm | | layer1 | (64, 56, 56) | Low-level features | | layer2 | (128, 28, 28) | Mid-level features | | layer3 | (256, 14, 14) | High-level features | | layer4 | (512, 7, 7) | Final features | | embedding | (128,) | Final embedding |

ArcFace Embedding


#The Compare Endpoint (Complete Flow)

Here's the full comparison flow from the /api/compare endpoint:

Loading diagram...

#Dual-Model Scoring

The scoring combines multiple signals with learned weights:

#Confidence Bands

Rather than binary decisions, the system outputs confidence bands:


#API Endpoints Reference

The Flask API exposes 31 endpoints for all operations:

Loading diagram...

| Endpoint | Method | Purpose | |----------|--------|---------| | /api/health | GET | System health check | | /api/embedding-info | GET | Current model info | | /api/diagnostics | GET | System diagnostics | | /api/detect | POST | Face detection | | /api/extract | POST | Embedding extraction | | /api/add-reference | POST | Add reference image | | /api/references | GET | List all references | | /api/references/<id> | DELETE | Remove reference | | /api/compare | POST | Compare embeddings | | /api/visualizations/<type> | GET | Get visualization | | /api/clear | POST | Clear session |


#Visualizations (14 Types)

The system provides 14 different AI visualizations to help investigators understand why scores were computed:

Loading diagram...

#Visualization Implementation Example

Neural Activations


#Session State Management

The API maintains in-memory session state:

#Persistence

References are saved to JSON for persistence across restarts:


#Testing Infrastructure

The system includes comprehensive tests:

Loading diagram...

#Running Tests


#File Structure


#Key Design Decisions

#1. Dual-Model Architecture

Using both ArcFace (512-dim) and FaceNet (128-dim) together provides better discrimination than either alone. ArcFace handles the primary matching while FaceNet provides secondary signals and activation visualizations.

#2. Confidence Bands, Not Binary Decisions

The system outputs confidence bands (Very High/High/Moderate/Insufficient) instead of "match/no-match". This ensures human investigators always make the final decision.

#3. Local-Only Processing

No images are sent to external servers. All computation happens on the user's machine, addressing NGO privacy concerns.

#4. Non-Reversible Embeddings

Facial embeddings cannot be used to reconstruct the original face—providing an additional layer of privacy protection.

Every reference image includes metadata about consent status, source, and purpose—essential for NGO documentation requirements.


#Summary

MANTAX is a fully functional ethical facial recognition system built with:

  • Flask API (2,131 lines) with 31 endpoints
  • Dual-model embedding (ArcFace 512-dim + FaceNet 128-dim)
  • OpenCV DNN face detection with MediaPipe landmarks
  • Electron desktop app with macOS Tahoe Liquid Glass UI
  • Comprehensive testing (E2E, edge cases, frontend)

The system is designed for NGO use cases with:

  • Local-only processing (no cloud)
  • Human-in-the-loop verification
  • Consent tracking
  • Confidence bands instead of binary decisions

In the next blogpost, we'll explore the JavaScript refactoring journey—how we tackled a 3,429-line monolithic app.js and broke it into 7 modular files following best practices.


#Demo Video

Here's a demo showing a no-match scenario:

Next: The JavaScript Refactoring Story

Created:
4/10/2026
Last Updated:
4/10/2026