Real-Time Log Security Scanner & Secret Redaction Microservice
Features • Architecture • Quick Start • API Reference • Benchmarks • Security
OpSecGuard API is a lightweight, production-ready microservice that scans real-time application logs for sensitive data leaks (API keys, secrets, credentials, database connection strings) and redacts them instantly.
Built with Python/FastAPI and engineered for ultra-low latency (< 10ms per log batch), it uses pre-compiled, ReDoS-safe regex patterns with zero external database dependencies.
- 8 Specialized Detectors: OpenAI keys, AWS keys, Stripe keys, GitHub tokens, Bearer tokens, MongoDB/PostgreSQL URIs, Private keys
- Optional Entropy Detection: Shannon entropy analysis for catching unknown secret formats (configurable, off by default)
- Ultra-Fast Scanning: Synchronous sequential processing optimized for CPU-bound regex work, with no GIL contention
- ReDoS-Safe: All patterns use bounded character classes, so there is no catastrophic backtracking
- Batch + Stream Modes: REST batch endpoint + WebSocket streaming for real-time log tailing
- Docker-Ready: Multi-stage Dockerfile with google-re2 for linear-time regex guarantees
- Built-in Benchmarking: Measure throughput and P50/P95/P99 latency
┌─────────────────────────────────────────────────────┐
│                 FastAPI Application                 │
│                                                     │
│  ┌──────────────┐    ┌────────────────────────────┐ │
│  │ POST /batch  │───▶│      Scanning Engine       │ │
│  └──────────────┘    │                            │ │
│  ┌──────────────┐    │  ┌──────────────────────┐  │ │
│  │ WS /stream   │───▶│  │   Regex Detectors    │  │ │
│  └──────────────┘    │  │   (pre-compiled,     │  │ │
│                      │  │    ReDoS-safe)       │  │ │
│  ┌──────────────┐    │  └──────────────────────┘  │ │
│  │ GET /health  │    │  ┌──────────────────────┐  │ │
│  └──────────────┘    │  │   Entropy Detector   │  │ │
│                      │  │   (optional, gated)  │  │ │
│                      │  └──────────────────────┘  │ │
│                      │  ┌──────────────────────┐  │ │
│                      │  │   Redaction Engine   │  │ │
│                      │  │  (single-pass merge) │  │ │
│                      │  └──────────────────────┘  │ │
│                      └────────────────────────────┘ │
│                                                     │
│  Multi-worker Uvicorn (process-level parallelism)   │
└─────────────────────────────────────────────────────┘
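The "single-pass merge" step in the Redaction Engine can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: it assumes each detector reports a `(start, end, tag)` span, and the names `merge_spans` and `redact` are hypothetical. Overlapping spans are merged (the earliest span's tag wins, which is an assumption), then the line is rebuilt in one left-to-right pass.

```python
from typing import List, Tuple

# (start offset, end offset, redaction tag) -- a hypothetical span shape
Span = Tuple[int, int, str]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Merge overlapping spans; the earliest span keeps its tag."""
    merged: List[Span] = []
    # Sort by start, and by longest span first when starts are equal.
    for start, end, tag in sorted(spans, key=lambda s: (s[0], -s[1])):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_tag = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_tag)
        else:
            merged.append((start, end, tag))
    return merged


def redact(line: str, spans: List[Span]) -> str:
    """Rebuild the line once, swapping each merged span for its tag."""
    parts: List[str] = []
    cursor = 0
    for start, end, tag in merge_spans(spans):
        parts.append(line[cursor:start])
        parts.append(tag)
        cursor = end
    parts.append(line[cursor:])
    return "".join(parts)
```

Merging first means a secret flagged by two detectors is replaced once, and offsets never shift mid-pass because the original line is only read, never mutated.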
| Decision | Rationale |
|---|---|
| Synchronous sequential scanning | Python's GIL makes threading counterproductive for CPU-bound regex work; a tight loop avoids thread-pool overhead. |
| Process-level parallelism | uvicorn --workers N spawns independent processes, each with its own GIL. True multi-core scaling. |
| Entropy detector gated | math.log2 per-token computation is expensive. Disabled by default (ENABLE_ENTROPY=false). |
| Bounded regex quantifiers | [^=:\s]{0,20} instead of [^=]* prevents runaway scanning on long lines. |
| Stripe before OpenAI ordering | Both start with sk_/sk-. Stripe patterns match first to prevent misclassification. |
| Password-only URI redaction | DB URIs redact only the password (group capture), preserving connection info for debugging. |
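To illustrate the ordering, bounded-quantifier, and password-only decisions in the table above, here is a simplified sketch. The patterns are illustrative assumptions, not the ones shipped in `detectors.py`; in particular the OpenAI pattern is deliberately loose (it accepts `sk_` as well as `sk-`) to show why Stripe has to run first. The function name `scan_line` is hypothetical, and sequential substitution is used here for brevity.

```python
import re
from typing import List, Tuple

# Order matters: Stripe detectors run before the looser OpenAI one.
# All quantifiers are bounded ({m,n}) rather than open-ended (* or +).
DETECTORS = [
    ("Stripe Live Key", re.compile(r"sk_live_[A-Za-z0-9]{10,99}"), "[REDACTED_STRIPE_LIVE_KEY]"),
    ("Stripe Test Key", re.compile(r"sk_test_[A-Za-z0-9]{10,99}"), "[REDACTED_STRIPE_TEST_KEY]"),
    ("OpenAI API Key", re.compile(r"sk[-_][A-Za-z0-9_-]{16,200}"), "[REDACTED_OPENAI_KEY]"),
]

# Group 1 = scheme and user, group 2 = password, group 3 = "@".
DB_URI = re.compile(
    r"((?:mongodb(?:\+srv)?|postgres(?:ql)?)://[^:@/\s]{1,128}:)([^@\s]{1,256})(@)"
)


def scan_line(line: str) -> Tuple[str, List[str]]:
    """Return the sanitized line and the detector names that fired."""
    found: List[str] = []
    for name, pattern, tag in DETECTORS:
        line, count = pattern.subn(tag, line)
        if count:
            found.append(name)
    # Only the password group is replaced; host, port and database survive.
    line, count = DB_URI.subn(r"\g<1>[REDACTED_PASSWORD]\g<3>", line)
    if count:
        found.append("Database URI")
    return line, found
```

With the order reversed, the loose OpenAI pattern would claim `sk_live_...` keys first and label them incorrectly.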
# Clone the repository
git clone https://github.com/CS-Fasih/OpSecGuard-API.git
cd OpSecGuard-API
# Install dependencies
pip install -r requirements.txt
# Start the server
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
# Run tests
pytest tests/ -v
# Run benchmark (server must be running)
python benchmark.py

# Build and run
docker compose up --build
# Or manually
docker build -t opsecguard-api .
docker run -p 8000:8000 opsecguard-api

POST /v1/scan/batch: Scan a batch of log lines for sensitive data.
Request:
{
"logs": [
"2026-05-27 23:15:00 INFO: User login successful for admin",
"2026-05-27 23:15:02 ERROR: OpenAI initialization failed with key sk-proj-1234567890abcdef"
]
}
Response:
{
"leak_detected": true,
"leaks_found": [
{
"type": "OpenAI API Key",
"line_index": 1
}
],
"sanitized_logs": [
"2026-05-27 23:15:00 INFO: User login successful for admin",
"2026-05-27 23:15:02 ERROR: OpenAI initialization failed with key [REDACTED_OPENAI_KEY]"
],
"scan_time_ms": 0.245
}

WebSocket /v1/scan/stream: Real-time streaming scan via WebSocket.
Send:
{"log": "ERROR: key sk-proj-1234567890abcdef used"}
or
{"logs": ["line1", "line2"]}
Receive (per line):
{
"line_index": 0,
"leak_detected": true,
"leaks": [{"type": "OpenAI API Key", "line_index": 0}],
"sanitized_line": "ERROR: key [REDACTED_OPENAI_KEY] used"
}

GET /health response:
{"status": "healthy", "service": "OpSecGuard API", "version": "1.0.0"}

| Detector | Pattern Example | Redaction Tag |
|---|---|---|
| OpenAI API Key | sk-proj-abc123... | [REDACTED_OPENAI_KEY] |
| AWS Access Key ID | AKIAIOSFODNN7EXAMPLE | [REDACTED_AWS_ACCESS_KEY] |
| AWS Secret Key | aws_secret_access_key=... | [REDACTED_AWS_SECRET_KEY] |
| Stripe Live Key | sk_live_abc123... | [REDACTED_STRIPE_LIVE_KEY] |
| Stripe Test Key | sk_test_abc123... | [REDACTED_STRIPE_TEST_KEY] |
| GitHub Token | ghp_abc123... | [REDACTED_GITHUB_TOKEN] |
| Bearer Token | Bearer eyJhbG... | [REDACTED_BEARER_TOKEN] |
| MongoDB URI | mongodb://user:pass@host | Password → [REDACTED_PASSWORD] |
| PostgreSQL URI | postgresql://user:pass@host | Password → [REDACTED_PASSWORD] |
| Private Key | -----BEGIN PRIVATE KEY----- | [REDACTED_PRIVATE_KEY] |
| High-Entropy String* | Random 20+ char tokens | [REDACTED_HIGH_ENTROPY] |
*Requires ENABLE_ENTROPY=true environment variable.
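For readers unfamiliar with the entropy check, the sketch below shows one common way to compute Shannon entropy per token and apply a threshold. It is an illustration under assumptions, not the contents of `entropy.py`: the function names are hypothetical, and the default arguments simply mirror the documented `ENTROPY_THRESHOLD` and `ENTROPY_MIN_LENGTH` values.

```python
import math
from collections import Counter


def shannon_entropy(token: str) -> float:
    """Shannon entropy of the token's character distribution, in bits per character."""
    if not token:
        return 0.0
    total = len(token)
    counts = Counter(token)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_high_entropy(token: str, threshold: float = 4.5, min_length: int = 20) -> bool:
    """Flag tokens that are both long enough and random-looking enough."""
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

Because entropy cannot exceed log2 of the number of distinct characters, a threshold of 4.5 can only be reached by tokens with at least 23 distinct characters, which is worth keeping in mind when tuning the threshold against the minimum length.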
All settings are controlled via environment variables:
| Variable | Default | Description |
|---|---|---|
| HOST | 0.0.0.0 | Server bind address |
| PORT | 8000 | Server bind port |
| LOG_LEVEL | info | Logging verbosity |
| MAX_BATCH_SIZE | 50000 | Maximum lines per batch request |
| ENABLE_ENTROPY | false | Toggle entropy detector (CPU-intensive) |
| ENTROPY_THRESHOLD | 4.5 | Minimum Shannon entropy to flag |
| ENTROPY_MIN_LENGTH | 20 | Minimum token length for entropy analysis |
| WORKERS | 4 | Uvicorn worker processes |
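As a rough illustration of how these variables could be read, the sketch below loads them into a frozen dataclass using only the standard library. The `Settings` class and `load_settings` function are hypothetical; the real `config.py` may be structured differently (for example, with Pydantic), and the set of accepted truthy strings is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the configuration table.
    host: str = "0.0.0.0"
    port: int = 8000
    log_level: str = "info"
    max_batch_size: int = 50000
    enable_entropy: bool = False
    entropy_threshold: float = 4.5
    entropy_min_length: int = 20
    workers: int = 4


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, falling back to defaults."""
    d = Settings()
    truthy = {"1", "true", "yes", "on"}  # assumed spellings for "enabled"
    return Settings(
        host=env.get("HOST", d.host),
        port=int(env.get("PORT", d.port)),
        log_level=env.get("LOG_LEVEL", d.log_level),
        max_batch_size=int(env.get("MAX_BATCH_SIZE", d.max_batch_size)),
        enable_entropy=env.get("ENABLE_ENTROPY", "false").strip().lower() in truthy,
        entropy_threshold=float(env.get("ENTROPY_THRESHOLD", d.entropy_threshold)),
        entropy_min_length=int(env.get("ENTROPY_MIN_LENGTH", d.entropy_min_length)),
        workers=int(env.get("WORKERS", d.workers)),
    )
```

Passing the mapping explicitly, as in `load_settings({"ENABLE_ENTROPY": "true"})`, keeps the loader easy to test without touching the real process environment.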
Run the benchmark suite:
# Start the server
uvicorn app.main:app --host 0.0.0.0 --port 8000
# Run with defaults (10,000 lines, batch size 100)
python benchmark.py
# Custom parameters
python benchmark.py --lines 50000 --batch-size 500 --poison-ratio 0.20

- All regex patterns use bounded character classes and constrained quantifiers
- No nested quantifiers ((a+)+) or overlapping alternations
- Production Docker image includes google-re2 for O(n) guaranteed matching
- Adversarial input tests included in test suite
- Global exception handler prevents stack trace leaks
- Batch size limits protect against resource exhaustion
- No sensitive data is stored; processing is purely stateless
- CORS middleware configured (restrict allow_origins in production)
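An adversarial-input check of the kind mentioned above might look like the sketch below. It is an assumption-laden illustration, not a test taken from `test_scanner.py`: the pattern, the helper names, and the one-second budget are all hypothetical. The idea is to feed a long line that repeats a near-match many times and confirm that a bounded pattern gives up quickly.

```python
import re
import time
from typing import Optional, Tuple

# Hypothetical bounded pattern in the style of the design table:
# at most 20 filler characters may sit between the key name and "=" or ":".
BOUNDED_KEY = re.compile(
    r"aws_secret_access_key[^=:\s]{0,20}[=:]\s{0,5}[A-Za-z0-9/+]{40}"
)


def adversarial_line(repeats: int) -> str:
    """A long line full of near-matches that never contains '=' or ':'."""
    return ("aws_secret_access_key" + "x" * 50) * repeats


def timed_search(pattern: "re.Pattern[str]", text: str) -> Tuple[Optional["re.Match[str]"], float]:
    """Run pattern.search and return the match together with elapsed seconds."""
    started = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - started


if __name__ == "__main__":
    match, seconds = timed_search(BOUNDED_KEY, adversarial_line(2000))
    print(f"match={match!r} elapsed={seconds:.4f}s")
```

Each near-match costs at most about 20 backtracking steps before the engine moves on, so the total work stays proportional to the line length. An unbounded `[^=]*` in the same position would rescan to the end of the line for every near-match.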
OpSecGuard-API/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app, middleware, routes
│   ├── config.py            # Environment-based settings
│   ├── models.py            # Pydantic request/response schemas
│   ├── scanner/
│   │   ├── __init__.py
│   │   ├── detectors.py     # Pre-compiled regex detectors
│   │   ├── entropy.py       # Shannon entropy calculator
│   │   └── engine.py        # Scanning orchestration engine
│   └── routes/
│       ├── __init__.py
│       ├── batch.py         # POST /v1/scan/batch
│       └── stream.py        # WebSocket /v1/scan/stream
├── tests/
│   └── test_scanner.py      # Comprehensive test suite
├── benchmark.py             # Performance benchmarking script
├── Dockerfile               # Multi-stage production image
├── docker-compose.yml       # One-command deployment
├── requirements.txt         # Python dependencies
├── requirements-docker.txt  # Docker-only dependencies (re2)
├── LICENSE
└── README.md