OfficialWhyEd/WhyJarv

WhyJarv


Personal voice assistant that races Groq LLaMA 70B against Gemini Flash in parallel. Fastest reply wins. Under 300ms. Claude Code CLI executes real Mac actions via MCP. SQLite memory. Zero cloud storage.


Philosophy

Race mode — Two AI models fire simultaneously the moment you finish speaking. The first HTTP chunk back wins; the other connection is silently cancelled. Nobody does this at the personal assistant level. The result is sub-300ms latency on a 2015 MacBook Pro.
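
The race described above can be sketched with `asyncio`. This is a minimal illustration, not the code in `server.py`: `fake_provider` is a hypothetical stand-in for the real streaming calls to Groq and Gemini, and the delays are invented for the demo.

```python
import asyncio


async def fake_provider(name: str, delay: float) -> str:
    """Hypothetical stand-in for a streaming LLM call: wait, then yield a first chunk."""
    await asyncio.sleep(delay)
    return f"{name}: first chunk"


async def race(*coros) -> str:
    """Return the result of whichever coroutine finishes first and cancel the rest."""
    tasks = [asyncio.create_task(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    if pending:
        # Wait for the cancellations to settle so no task is left dangling.
        await asyncio.gather(*pending, return_exceptions=True)
    # If several tasks finished in the same loop tick, any one of them is used.
    # A failed winner re-raises here; a real pipeline would fall back to the loser.
    return done.pop().result()


def run_demo() -> str:
    return asyncio.run(
        race(fake_provider("groq", 0.01), fake_provider("gemini", 0.2))
    )
```

Cancelling the slower task, rather than merely ignoring it, is what releases the losing connection in this sketch.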

Real actions — Claude Code CLI runs with full MCP access. It opens apps, writes and runs code, reads files, controls the Mac via AppleScript. You hear the answer in 247ms while Claude executes in the background.
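
As a rough idea of how AppleScript control can be driven from Python, the sketch below shells out to the macOS `osascript` tool. The helper names are hypothetical and the project may route these calls differently (for example through Claude Code and MCP).

```python
import subprocess


def osascript_command(script: str) -> list[str]:
    """Build the argv for running one AppleScript snippet with macOS `osascript`."""
    return ["osascript", "-e", script]


def open_app_script(app_name: str) -> str:
    """AppleScript that launches an application (if needed) and brings it to the front."""
    # Escape backslashes and quotes so the name cannot break out of the string literal.
    escaped = app_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'tell application "{escaped}" to activate'


def run_applescript(script: str) -> str:
    """Execute the script (macOS only) and return its standard output."""
    result = subprocess.run(
        osascript_command(script), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

On a Mac, `run_applescript(open_app_script("Xcode"))` would bring Xcode to the front. The two builder functions are pure, so they can be tested on any platform.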

Local memory — SQLite + TF-IDF indexing. Every conversation is stored and retrievable. Preferences, context, history — forever, offline, private.

Zero cloud storage — No conversation data leaves your Mac. No account required. Your Groq and Gemini API keys stay in .env locally.

Always ready — Menu bar icon via rumps. Wake with "Let's start" or two claps (PyAudio). Uses zero resources while silent.
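
The two-clap wake gesture comes down to spotting two short loud onsets close together. Below is a minimal sketch of that logic only, assuming the audio has already been reduced to per-frame peak amplitudes in the range 0 to 1 (for example from PyAudio buffers). The function name, threshold, and gap limits are illustrative assumptions, not values taken from the project.

```python
def detect_double_clap(levels, threshold=0.6, min_gap=2, max_gap=12):
    """Return True if two loud onsets occur between min_gap and max_gap frames apart.

    `levels` is a sequence of per-frame peak amplitudes normalised to 0..1.
    """
    onsets = []
    was_loud = False
    for index, level in enumerate(levels):
        is_loud = level >= threshold
        if is_loud and not was_loud:  # rising edge marks the start of a clap
            onsets.append(index)
        was_loud = is_loud
    # Compare each onset with the next one and accept a plausible clap spacing.
    return any(min_gap <= b - a <= max_gap for a, b in zip(onsets, onsets[1:]))
```

Counting rising edges rather than loud frames keeps one long noise from registering as several claps.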


How it works

Voice → Apple STT → Groq LLaMA 70B ──────────────────→ Apple TTS
                         ↓ (if action needed)
                   Gemini Flash (context compression)
                         ↓
                   Claude Code CLI (executes via MCP)

| Layer | Technology | Latency |
| --- | --- | --- |
| STT | Apple Web Speech API | ~200ms |
| Conversation | Groq LLaMA 3.3 70B + Gemini Flash (race) | <300ms |
| Actions | Claude Code CLI + MCP | 3–8s |
| TTS | Apple speechSynthesis (Alice voice) | <50ms |
| Memory | SQLite local + TF-IDF retrieval | instant |
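
The split between the fast conversational reply and the slower action layer can be sketched as a background task. This is an assumption-laden illustration: `handle_utterance`, `reply_fn`, `action_fn`, and `needs_action` are hypothetical names, and the Gemini context-compression step from the diagram is left out.

```python
import asyncio


async def handle_utterance(text, reply_fn, action_fn, needs_action):
    """Return the fast reply as soon as it is ready; run the slow action in the background."""
    background = None
    if needs_action(text):
        # Start the action now so it overlaps with generating and speaking the reply.
        background = asyncio.create_task(action_fn(text))
    reply = await reply_fn(text)
    return reply, background


async def demo():
    events = []

    async def quick_reply(text):
        await asyncio.sleep(0.01)  # stands in for the sub-300ms conversational model
        events.append("reply")
        return "On it."

    async def slow_action(text):
        await asyncio.sleep(0.05)  # stands in for the multi-second action layer
        events.append("action")
        return "done"

    reply, task = await handle_utterance(
        "open Xcode", quick_reply, slow_action, lambda t: t.startswith("open")
    )
    events.append("spoken")  # the reply can be spoken before the action finishes
    result = await task
    return reply, result, events
```

In the demo, the reply is available and "spoken" before the action completes, which is the behaviour the latency table implies.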

What you can ask

| Ask | What happens |
| --- | --- |
| "Open Xcode and show me the build errors" | AppleScript opens Xcode, Claude reads the log, speaks the errors |
| "Remember I hate meetings before 10am" | Stored in SQLite, surfaced whenever schedule is discussed |
| "Write a commit message for these changes" | Claude reads the diff via MCP, drafts conventional commit |
| "What did we discuss yesterday about WhyPost?" | TF-IDF retrieval returns the exact exchange in milliseconds |
| "Play something calm on Spotify" | AppleScript fires the Spotify command |
| "I have a call in 40 min. Brief me on the deck" | Claude reads the file + fetches calendar context, speaks 3 key points |
| "Run the tests and tell me what broke" | MCP executes tests, Claude reads output, gives you the summary |
| "Set a 25-minute focus timer" | Native macOS timer via AppleScript |

Why this is different

| Feature | WhyJarv | Siri | ChatGPT voice | Alexa |
| --- | --- | --- | --- | --- |
| Parallel model race | Yes | No | No | No |
| Real code execution | Yes | No | Limited | No |
| Local SQLite memory | Yes | No | No | No |
| Zero cloud storage | Yes | No | No | No |
| Claude CLI actions | Yes | No | No | No |
| Sub-300ms response | Yes | Sometimes | No | Sometimes |

Stack

  • Backend: FastAPI + WebSocket on :8340
  • Frontend: React + Three.js (signal-orange orb #c94b25)
  • AI: Groq LLaMA 3.3 70B · Gemini Flash · Claude Code CLI
  • TTS/STT: Apple native (Web Speech API + speechSynthesis)
  • Memory: SQLite + TF-IDF (memory_store.py)
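
To make the SQLite + TF-IDF pairing concrete, here is a minimal sketch of a store that persists text in SQLite and ranks it by TF-IDF at query time. It is not the contents of `memory_store.py`: the `MemoryStore` class, table schema, and scoring formula are assumptions, and a real implementation would likely keep a precomputed index instead of rescoring every row.

```python
import math
import re
import sqlite3
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class MemoryStore:
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories "
            "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )

    def add(self, text):
        with self.db:  # commits on success, rolls back on error
            self.db.execute("INSERT INTO memories (text) VALUES (?)", (text,))

    def search(self, query, limit=3):
        """Return up to `limit` stored texts, best TF-IDF match first."""
        rows = self.db.execute("SELECT text FROM memories").fetchall()
        docs = [(text, Counter(tokenize(text))) for (text,) in rows]
        terms = set(tokenize(query))
        scored = []
        for text, counts in docs:
            total = sum(counts.values()) or 1
            score = 0.0
            for term in terms:
                if term not in counts:
                    continue
                doc_freq = sum(1 for _, c in docs if term in c)
                idf = math.log((1 + len(docs)) / (1 + doc_freq)) + 1.0  # smoothed IDF
                score += (counts[term] / total) * idf
            if score > 0:
                scored.append((score, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]
```

Passing a file path instead of `":memory:"` keeps the data on disk between runs, which is what makes the memory both local and persistent.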

Quick start

git clone https://github.com/OfficialWhyEd/WhyJarv
cd WhyJarv

cp .env.example .env   # add GEMINI_API_KEY and GROQ_API_KEY
./start.sh             # start backend + open browser

Say "Let's start" to activate. Say "chiuditi" (Italian for "close yourself") to shut down.
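
Matching these two phrases against a speech-to-text transcript could look roughly like the sketch below. The function names and the substring-matching rule are assumptions for illustration, not the logic in `server.py`.

```python
import re

WAKE_PHRASES = {"let's start"}
SHUTDOWN_PHRASES = {"chiuditi"}


def normalize(transcript):
    """Lowercase, unify apostrophes, and strip punctuation from a transcript."""
    text = transcript.lower().replace("\u2019", "'")  # curly apostrophe to straight
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return " ".join(text.split())


def classify(transcript):
    """Return "wake", "shutdown", or "ignore" for one transcript."""
    text = normalize(transcript)
    if any(phrase in text for phrase in SHUTDOWN_PHRASES):
        return "shutdown"
    if any(phrase in text for phrase in WAKE_PHRASES):
        return "wake"
    return "ignore"
```

Substring matching is forgiving of surrounding words but can misfire on longer sentences that happen to contain a phrase, so stricter matching may be preferable in practice.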


Structure

WhyJarv/
├── server.py          # FastAPI + WebSocket + AI race pipeline
├── memory_store.py    # SQLite atomic memory + TF-IDF retrieval
├── menu_bar.py        # macOS menu bar icon (rumps)
├── frontend/
│   ├── src/orb.ts     # Three.js particle orb
│   └── src/main.ts    # state machine, barge-in
└── workspace/         # SOUL.md, PROTOCOL.md, IDENTITY.md

Contributing

Pull requests welcome. Highest-value contributions:

  • Windows/Linux port — remove the macOS-only Apple TTS/STT dependency
  • New wake words — expand the activation vocabulary in server.py
  • Memory improvements — better retrieval algorithms in memory_store.py
  • New MCP tools — extend Claude's action capabilities

Built by @whyed · macOS only · local-first
