⚡ NemulAI

Cut your GPU bill by 15-40%. Automatically.

Self-learning GPU cost intelligence. Per-job attribution, waste detection, and automated optimization for AI teams.

Website · Docs · Dashboard · Report a Bug

The Problem

An 8×A100 node can burn $30-40/hour on demand. Do you know which of your training jobs was worth it?

Most ML teams can't answer that. nvidia-smi shows real-time watts. Cloud providers show a monthly bill. Neither tells you which specific job was your $800 run.

The hidden cost of GPU waste compounds fast:

Training jobs left running overnight when convergence stalled hours ago
Idle GPUs sitting at 3% utilization, eating full power draw
No per-team attribution → no accountability → no improvement
Finance asks "can we cut GPU spend?" and nobody has data to answer with

NemulAI closes that gap — and gets smarter over time. A lightweight Python agent runs on your GPU machines, attributes energy to individual jobs in real time, and streams dollar costs to a dashboard. The self-learning optimizer adapts to your workload patterns and improves its recommendations the longer it runs.
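To make the watts-to-dollars step concrete, here is a minimal sketch of how periodic power samples could be turned into a per-job energy cost. It assumes a fixed sampling interval and a flat electricity rate. `PowerSample`, `cost_per_job`, and the 0.12 $/kWh rate are illustrative names and values, not NemulAI's actual implementation or pricing model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PowerSample:
    job_id: str   # hypothetical job label attached to the sample
    watts: float  # instantaneous board power for this sample


def cost_per_job(samples, interval_s=5.0, usd_per_kwh=0.12):
    """Sum energy per job and convert it to dollars.

    Each sample is treated as constant power over one sampling interval,
    so energy in joules is watts * interval_s.
    """
    joules = defaultdict(float)
    for s in samples:
        joules[s.job_id] += s.watts * interval_s
    # 1 kWh = 3.6e6 joules
    return {job: j / 3.6e6 * usd_per_kwh for job, j in joules.items()}


# One hour of 5-second samples for two jobs (720 samples each).
samples = [PowerSample("train-a", 300.0)] * 720 + [PowerSample("train-b", 150.0)] * 720
costs = cost_per_job(samples)
```

A job drawing twice the power for the same duration ends up with twice the cost, which is the property per-job attribution relies on.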

Features

Per-job cost attribution — tracks energy ($) per training run, not just per machine
Real-time power monitoring — samples NVML every 5 seconds via nvidia-ml-py
Self-learning optimizer — the agent learns your workload patterns over time; recommendations get better every week
Team chargeback — tag workloads with ALUMINATAI_TEAM to split costs by team
Waste detection — idle GPUs flagged automatically, saving 15-40% on compute spend
Budget alerts — get notified before costs spike, not after
WAL-backed reliability — metrics buffer locally during API outages, replay on reconnect
Multi-scheduler support — Kubernetes, Slurm, Run:ai, and manual tagging
MLflow & W&B callbacks — tag experiment runs with energy cost automatically
Prometheus endpoint — expose metrics to your existing Grafana stack (METRICS_PORT=9100)
Zero infra overhead — ~0% CPU, ~50 MB RAM, single pip install
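Waste detection can be pictured as a sliding-window check on utilization. The sketch below flags a GPU as idle once its utilization has stayed under a threshold for a full window of samples. The 5% threshold, the window length, and the `IdleDetector` name are assumptions for illustration, not the agent's actual rules.

```python
from collections import deque


class IdleDetector:
    """Flags a GPU as idle after `window` consecutive low-utilization samples."""

    def __init__(self, threshold_pct=5.0, window=60):
        self.threshold_pct = threshold_pct
        self.recent = deque(maxlen=window)  # keeps only the newest samples

    def observe(self, utilization_pct):
        """Record one sample; return True when the whole window is below threshold."""
        self.recent.append(utilization_pct)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(u < self.threshold_pct for u in self.recent)


detector = IdleDetector(threshold_pct=5.0, window=3)
flags = [detector.observe(u) for u in [3.0, 2.0, 80.0, 3.0, 1.0, 2.0]]
```

With 5-second samples, a window of 60 corresponds to five minutes of sustained low utilization before a GPU is flagged.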

Quick Start

Install

pip install nemulai

Run

export ALUMINATAI_API_KEY=alum_your_key_here
nemulai

That's it. The agent starts streaming GPU metrics to your dashboard immediately.

Get your API key at nemulai.com/dashboard — 7-day free trial, no credit card required.

Docker

docker run --rm --runtime=nvidia --pid=host \
  -e ALUMINATAI_API_KEY=alum_your_key_here \
  ghcr.io/agentmulder404/nemulai-agent:latest

Configuration

All settings are environment variables — no config files required.

| Variable | Default | Description |
|---|---|---|
| `ALUMINATAI_API_KEY` | (required) | Your API key from the dashboard |
| `ALUMINATAI_API_ENDPOINT` | `https://nemulai.com/v1/metrics/ingest` | Ingest endpoint |
| `SAMPLE_INTERVAL` | `5.0` | Seconds between NVML samples |
| `UPLOAD_INTERVAL` | `60` | Seconds between metric flushes |
| `ALUMINATAI_TEAM` | (none) | Team tag for chargeback attribution |
| `ALUMINATAI_MODEL` | (none) | Model tag for per-experiment tracking |
| `LOG_LEVEL` | `INFO` | Logging verbosity |
| `METRICS_PORT` | `9100` | Prometheus scrape port (`0` = disabled) |
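As a rough illustration of how these variables map to settings, the sketch below reads them from a dict-like environment and applies the defaults listed above. The `AgentConfig` class and `load_config` function are hypothetical names and are not part of the nemulai package.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentConfig:
    api_key: str
    api_endpoint: str
    sample_interval: float
    upload_interval: float
    team: Optional[str]
    model: Optional[str]
    log_level: str
    metrics_port: int  # 0 disables the Prometheus endpoint


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build a config from environment variables, applying the documented defaults."""
    api_key = env.get("ALUMINATAI_API_KEY")
    if not api_key:
        raise ValueError("ALUMINATAI_API_KEY is required")
    return AgentConfig(
        api_key=api_key,
        api_endpoint=env.get("ALUMINATAI_API_ENDPOINT", "https://nemulai.com/v1/metrics/ingest"),
        sample_interval=float(env.get("SAMPLE_INTERVAL", "5.0")),
        upload_interval=float(env.get("UPLOAD_INTERVAL", "60")),
        team=env.get("ALUMINATAI_TEAM"),
        model=env.get("ALUMINATAI_MODEL"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        metrics_port=int(env.get("METRICS_PORT", "9100")),
    )


# Passing a plain dict keeps the example independent of the real environment.
cfg = load_config({"ALUMINATAI_API_KEY": "alum_example", "METRICS_PORT": "0"})
```

Failing fast on a missing API key mirrors the table, where it is the only required variable.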

Job Attribution

Tag your workloads at launch for per-job cost breakdown:

ALUMINATAI_TEAM=nlp-team \
ALUMINATAI_MODEL=llama3-finetune \
ALUMINATAI_API_KEY=alum_... \
python train.py

Or use the MLflow callback:

import mlflow
from nemulai.integrations.mlflow_callback import NemulMLflowCallback

# `trainer` is your existing training object that accepts callbacks
with mlflow.start_run():
    cb = NemulMLflowCallback()
    trainer.add_callback(cb)

Or W&B:

import wandb
from nemulai.integrations.wandb_callback import NemulWandbCallback

# `trainer` is your existing training object that accepts callbacks
wandb.init(project="my-project")
trainer.add_callback(NemulWandbCallback())
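Once runs carry team and model tags, chargeback reduces to grouping cost records by tag. The following is a minimal sketch over made-up records. The record fields (`team`, `model`, `usd`) and the `chargeback` function are assumed for illustration and do not reflect the dashboard's actual schema.

```python
from collections import defaultdict


def chargeback(records, key="team"):
    """Total cost per tag value; untagged records are grouped under 'untagged'."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.get(key) or "untagged"] += rec["usd"]
    return dict(totals)


records = [
    {"team": "nlp-team", "model": "llama3-finetune", "usd": 12.50},
    {"team": "nlp-team", "model": "llama3-eval", "usd": 2.25},
    {"team": "vision", "model": "vit-base", "usd": 7.00},
    {"model": "scratch-run", "usd": 1.00},  # launched without a team tag
]
by_team = chargeback(records)
by_model = chargeback(records, key="model")
```

Keeping an explicit "untagged" bucket makes it visible how much spend was launched without `ALUMINATAI_TEAM` set.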

Architecture

┌─────────────────────────────────────────────────────────┐
│                   GPU Machine                           │
│                                                         │
│  ┌──────────────┐     ┌──────────────┐                  │
│  │  NVML/NVIDIA │────▶│   Sampler    │  5s interval     │
│  │  Driver      │     │  (nvidia-    │                  │
│  └──────────────┘     │   ml-py)     │                  │
│                       └──────┬───────┘                  │
│                              │                          │
│                       ┌──────▼───────┐                  │
│                       │  Attributor  │ job_id / team    │
│                       │  (process +  │ tagging          │
│                       │  scheduler)  │                  │
│                       └──────┬───────┘                  │
│                              │                          │
│                       ┌──────▼───────┐                  │
│                       │  WAL Buffer  │ survives         │
│                       │  (local)     │ outages          │
│                       └──────┬───────┘                  │
└──────────────────────────────┼──────────────────────────┘
                               │ HTTPS
                               ▼
                     ┌─────────────────────┐
                     │  nemulai.com        │
                     │  /v1/metrics/ingest │
                     └─────────┬───────────┘
                               │
                     ┌─────────▼───────────┐
                     │  Dashboard          │
                     │  watts → $ per job  │
                     │  team attribution   │
                     │  chargeback reports │
                     └─────────────────────┘
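The WAL buffer stage in the diagram can be approximated by an append-only file of JSON lines that is drained once uploads succeed again. This is a simplified sketch, not the agent's actual on-disk format: `WalBuffer`, the file layout, and the `send` callable are all hypothetical.

```python
import json
import os
import tempfile


class WalBuffer:
    """Append-only JSON-lines log; records stay on disk until an upload succeeds."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record durable before returning

    def replay(self, send):
        """Try to send every buffered record; keep the ones that fail."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]
        remaining, sent = [], 0
        for rec in pending:
            try:
                send(rec)
                sent += 1
            except OSError:  # ConnectionError and timeouts are OSError subclasses
                remaining.append(rec)
        # Rewrite the log atomically with whatever could not be delivered.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for rec in remaining:
                f.write(json.dumps(rec) + "\n")
        os.replace(tmp, self.path)
        return sent


def offline(_record):
    """Stand-in uploader that simulates an API outage."""
    raise ConnectionError("ingest endpoint unreachable")


wal = WalBuffer(os.path.join(tempfile.mkdtemp(), "metrics.wal"))
wal.append({"gpu": 0, "watts": 287.5})
wal.append({"gpu": 1, "watts": 61.0})
sent_during_outage = wal.replay(offline)  # nothing delivered, records kept
delivered = []
sent_after_reconnect = wal.replay(delivered.append)
```

Writing to disk before attempting the upload is what lets buffered metrics survive both network outages and agent restarts.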

Deployment

systemd (recommended for production)

# /etc/systemd/system/nemulai.service
[Unit]
Description=NemulAI GPU Agent
After=network.target

[Service]
ExecStart=/usr/local/bin/nemulai
Restart=on-failure
RestartSec=10
EnvironmentFile=/etc/nemulai.env

[Install]
WantedBy=multi-user.target

sudo systemctl enable --now nemulai

Kubernetes DaemonSet

kubectl apply -f https://raw.githubusercontent.com/AgentMulder404/NemulAI/main/deploy/k8s/daemonset.yaml

Slurm (Prolog/Epilog)

#!/bin/bash
# /etc/slurm/prolog.d/nemulai.sh
set -a   # export the KEY=VALUE pairs so the agent process inherits them
source /etc/nemulai.env
nemulai &

Full deployment docs at nemulai.com/docs/agent.

Why Open Source?

GPU cost optimization should be a solved problem, not a proprietary feature gate.

The GPU monitoring space is full of tools that show you what's happening (nvidia-smi, Grafana) or what happened (cloud billing dashboards). NemulAI is the missing link: what each specific job cost, in real time, in dollars.

Because the agent is open source, anyone can:

Audit exactly what data is collected (it's just power draw and metadata you tag)
Run a fully self-hosted stack against their own endpoint
Contribute integrations for their scheduler, experiment tracker, or cloud provider
Build on the primitives for their own cost tooling

The hosted dashboard at nemulai.com is how the project is sustained. The agent that collects your data will always be free and open.

Self-Hosting

The agent is fully functional without the hosted dashboard. Point it at your own ingest endpoint:

ALUMINATAI_API_ENDPOINT=https://your-internal-api.com/v1/metrics/ingest \
ALUMINATAI_API_KEY=your_key \
nemulai

The ingest API schema is documented at nemulai.com/docs/api.
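For experimenting with a self-hosted setup, a stand-in ingest endpoint can be written with the Python standard library alone. The sketch below only accepts JSON POSTs on `/v1/metrics/ingest` and keeps them in memory. The payload shape, the 204 response, and the absence of authentication are assumptions for illustration, so check the documented schema before relying on it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = []  # in-memory store; a real service would persist these


class IngestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/v1/metrics/ingest":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_error(400, "body must be JSON")
            return
        RECEIVED.append(payload)
        self.send_response(204)  # accepted, no response body
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass


# Bind to port 0 so the OS picks a free port for this demo.
server = HTTPServer(("127.0.0.1", 0), IngestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/v1/metrics/ingest"
req = urllib.request.Request(
    url,
    data=json.dumps({"gpu": 0, "watts": 250.0}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# An empty ProxyHandler bypasses any proxy configured in the environment.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(req) as resp:
    status = resp.status
server.shutdown()
```

Pointing `ALUMINATAI_API_ENDPOINT` at a server like this is a quick way to inspect what the agent actually sends before building a real backend.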

Contributing

Contributions are welcome. The project follows a standard fork → branch → PR workflow.

Good first issues: scheduler integrations, new MLflow/W&B/OTEL hooks, packaging improvements, docs.

Fork the repo
Create a branch: git checkout -b feat/your-feature
Make your changes with tests where applicable
Open a PR against main

By contributing, you agree your code will be licensed under Apache 2.0 and credited in the NOTICE file.

Code of conduct: Be direct, be useful, don't be a jerk.

Citation

If you use NemulAI in research, please cite:

@software{nemulai2026,
  author    = {Kevin},
  title     = {NemulAI: Per-Job GPU Energy Monitoring and Cost Attribution},
  year      = {2026},
  url       = {https://github.com/AgentMulder404/NemulAI},
  version   = {0.2.1}
}

See CITATION.cff for the machine-readable format.

Authorship & Credit

NemulAI was created and is maintained by Kevin.

X/Twitter: @NemulAI_Dev
Website: nemulai.com
GitHub: @AgentMulder404

The name "NemulAI" is a trademark of the original author. Forks and derivative works are welcome under the Apache 2.0 license, but may not use the NemulAI name or logo to represent their products without written permission.

If you build something with or on top of NemulAI, a mention or link back is appreciated — it helps others find the original project.

License

Apache 2.0 — see LICENSE for full terms.

In plain English: use it, fork it, build on it, sell products with it. Keep the copyright notice, don't call your fork "NemulAI", and don't claim you wrote it.

Built with obsession by Kevin · Star ⭐ if this saves you money
