108 changes: 108 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,108 @@
# SpeedReader - Copilot Instructions

## Project Overview
A Python desktop application that uses text-to-speech (TTS) to read text aloud at high speeds (500+ WPM). Built with tkinter for the GUI and pyttsx3 for speech synthesis.

## Architecture

### MVC-like Structure
```
SpeedReader.py                    # Entry point - instantiates controller and starts mainloop
Controllers/                      # Application controllers (extend Tk)
    SpeedReaderController.py      # Main window controller, sets up grid layout
Frames/                           # UI components (extend ttk.Frame)
    MainFrame.py                  # All UI logic, TTS engine management, event handlers
```

### Key Patterns
- **Controller as Tk root**: `SpeedReaderController` subclasses `Tk` itself, so the controller is the root window rather than a separate object that wraps one
- **Frame-based UI**: UI components are `ttk.Frame` subclasses passed `master=self` from controller
- **Threaded TTS**: Speech runs in daemon threads via `threading.Thread` to keep UI responsive
- **Fresh engine per session**: pyttsx3 engine is created fresh for each speech session to avoid state issues after interruption
- **Session ID tracking**: `speech_session_id` increments on new speech; callbacks check `current_session_id` to ignore stale events
- **Windows media control**: Pauses system music when TTS starts, resumes when finished (via `VK_MEDIA_PLAY_PAUSE` key simulation)
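
The session ID pattern above can be sketched without tkinter or pyttsx3. This is a minimal illustration, not the project's actual code: the `SessionTracker` class, its methods, and `speak_in_background` are hypothetical; only the names `speech_session_id` and `current_session_id` come from the bullets above.

```python
import threading


class SessionTracker:
    """Hypothetical helper: ignores callbacks from superseded speech sessions."""

    def __init__(self):
        self.speech_session_id = 0      # bumped for every new speech request
        self.current_session_id = None  # session whose callbacks are still valid
        self.events = []

    def start_session(self):
        # A new request invalidates callbacks from any earlier session.
        self.speech_session_id += 1
        self.current_session_id = self.speech_session_id
        return self.current_session_id

    def on_word(self, session_id, word):
        # Stale events (from an interrupted session) are dropped.
        if session_id != self.current_session_id:
            return False
        self.events.append((session_id, word))
        return True


def speak_in_background(tracker, session_id, words):
    """Stand-in for the TTS worker: reports each word from a daemon thread."""
    worker = threading.Thread(
        target=lambda: [tracker.on_word(session_id, w) for w in words],
        daemon=True,
    )
    worker.start()
    return worker


tracker = SessionTracker()
first = tracker.start_session()
second = tracker.start_session()  # speech restarted; `first` is now stale
speak_in_background(tracker, first, ["old"]).join()
speak_in_background(tracker, second, ["new"]).join()
print(tracker.events)  # [(2, 'new')]
```

Comparing against an ID captured when the session started means a late callback from an interrupted engine cannot touch the UI state of the session that replaced it.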

### Important Code Patterns

**Widget state checking** - uses string comparison:
```python
if self.speak_button['state'].__str__() == NORMAL:
```

**Text widget tagging** for highlighting current word:
```python
self.text_area.tag_config(TAG_CURRENT_WORD, foreground="red")
self.text_area.tag_add(TAG_CURRENT_WORD, index1, index2)
```
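
One way to obtain `index1` and `index2` is from the character offset and length that pyttsx3 reports to the `started-word` callback. The helper below is a sketch under two assumptions: the spoken text starts at the beginning of the widget (`1.0`), and the reported offset counts the same characters the widget holds. The name `word_indices` is hypothetical.

```python
def word_indices(location, length, start="1.0"):
    """Return (index1, index2) Tk text indices for a word.

    `location` and `length` are character counts. Tk resolves
    "<base>+<n>c" to the position n characters after <base>, and a
    newline counts as one character, so multi-line text needs no
    special handling.
    """
    index1 = f"{start}+{location}c"
    index2 = f"{start}+{location + length}c"
    return index1, index2


print(word_indices(6, 5))  # ('1.0+6c', '1.0+11c')
```

The returned pair can be passed straight to `tag_add`. Removing the tag from the previous word first keeps only one word highlighted.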

**pyttsx3 callbacks** - connect to engine events:
```python
self.engine.connect('started-utterance', self.onStart)
self.engine.connect('started-word', self.onStartWord)
self.engine.connect('finished-utterance', self.onEnd)
```
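
The pyttsx3 documentation describes these callbacks as receiving `name` for `started-utterance`, `name`, `location` and `length` for `started-word`, and `name` and `completed` for `finished-utterance`. The sketch below exercises handlers with those parameters against a stand-in engine, so no speech is produced. `StubEngine` and its `fire` method are hypothetical, and calling the handlers with keyword arguments is an assumption about how the real engine dispatches.

```python
class StubEngine:
    """Hypothetical stand-in for a pyttsx3 engine: stores and fires callbacks."""

    def __init__(self):
        self._callbacks = {}

    def connect(self, topic, callback):
        self._callbacks.setdefault(topic, []).append(callback)

    def fire(self, topic, **kwargs):
        # Assumption: handlers are invoked with keyword arguments.
        for callback in self._callbacks.get(topic, []):
            callback(**kwargs)


log = []


def onStart(name):
    log.append(("start", name))


def onStartWord(name, location, length):
    log.append(("word", name, location, length))


def onEnd(name, completed):
    log.append(("end", name, completed))


engine = StubEngine()
engine.connect('started-utterance', onStart)
engine.connect('started-word', onStartWord)
engine.connect('finished-utterance', onEnd)

engine.fire('started-utterance', name="s1")
engine.fire('started-word', name="s1", location=0, length=5)
engine.fire('finished-utterance', name="s1", completed=True)
print(log)
```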

## Build & Run

### Development
```powershell
# Activate venv (may need execution policy)
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process
.\.venv\Scripts\Activate.ps1

# Run the app
python SpeedReader.py
```

### Build Executable
```powershell
pyinstaller SpeedReader.spec
# Output: dist/SpeedReader.exe (single file, no console)
```

## Dependencies
- `pyttsx3` - Cross-platform TTS (uses SAPI5 on Windows)
- `pyinstaller` - Build standalone executables
- `tkinter` - GUI (included with Python)

## UI Keyboard Shortcuts
- `Ctrl+A` - Select all text in text area
- `Ctrl+B` - Paste clipboard and immediately start speaking

## Testing Practices

### Test-Driven Development (TDD)
Follow the TDD cycle: **Red → Green → Refactor**
1. Write a failing test first
2. Write minimal code to make it pass
3. Refactor while keeping tests green

### Unit Test Structure
Use **Arrange-Act-Assert** pattern for all tests:
```python
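# Import path assumed from the project layout described above
from Controllers.SpeedReaderController import SpeedReaderController


def test_speed_entry_default_value():
    # Arrange
    controller = SpeedReaderController()
    try:
        frame = controller.winfo_children()[0]

        # Act
        speed_value = frame.speed_entry.get()

        # Assert
        assert speed_value == "500"
    finally:
        # Clean up the Tk instance even if the assertion fails
        controller.destroy()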
```

### Testing tkinter Components
- Always call `controller.destroy()` in teardown to clean up Tk instances
- Use `controller.update()` to process pending UI events in tests
- Mock `pyttsx3.init()` to avoid actual speech synthesis during tests
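
The last point can be sketched with `unittest.mock`. This assumes the code under test accepts an engine factory (the `init` argument below) instead of calling `pyttsx3.init()` directly. The `Speaker` class is a hypothetical stand-in, not a class from this project.

```python
from unittest.mock import MagicMock


class Speaker:
    """Hypothetical wrapper that receives its engine factory via `init`."""

    def __init__(self, init):
        self._init = init

    def speak(self, text, rate):
        engine = self._init()
        engine.setProperty('rate', rate)
        engine.say(text)
        engine.runAndWait()


def test_speak_uses_requested_rate():
    # Arrange
    fake_engine = MagicMock()
    speaker = Speaker(init=lambda: fake_engine)

    # Act
    speaker.speak("hello", rate=500)

    # Assert
    fake_engine.setProperty.assert_called_once_with('rate', 500)
    fake_engine.say.assert_called_once_with("hello")
    fake_engine.runAndWait.assert_called_once_with()


test_speak_uses_requested_rate()
```

Where the code calls `pyttsx3.init()` directly, `unittest.mock.patch("pyttsx3.init")` gives the same isolation without changing the constructor.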

## Agent Self-Improvement
**When you discover something new about this project**, update this instructions file:
- New patterns or conventions you observe in the code
- Build/run commands that aren't documented
- Gotchas or workarounds you encounter
- Integration points with external systems

Keep this file current so future AI agents benefit from your learnings.
2 changes: 2 additions & 0 deletions .gitignore
@@ -59,3 +59,5 @@ target/

#Ipython Notebook
.ipynb_checkpoints

.vscode/
4 changes: 3 additions & 1 deletion AGENTS.md
@@ -65,6 +65,7 @@ The tests mock `pyttsx3` (see [tests/test_speech_engine.py](tests/test_speech_en

# Custom Instructions
- ALWAYS dogfood your own MCP server. Whenever you need to ask me a question or finish a task, use the MCP to read it out loud. If the MCP is not available, remind me to turn it on for dogfooding purposes.
- ALWAYS add or update unit tests whenever functionality is added or changed — in the same change, never defer it. Put GUI-free logic in [Core/](Core/) so it can be tested, and run `python -m pytest -q` to confirm green before finishing.
- be repetitive in instructions and skills with high risk items due to the unpredictability of large language models
- list skills you used in completing the prompt
- ALWAYS create (not just suggest) any skill you wished you had before starting the prompt — create the missing `SKILL.md` under `.github/skills/<name>/` before finishing, then list what you created
@@ -73,4 +74,5 @@ The tests mock `pyttsx3` (see [tests/test_speech_engine.py](tests/test_speech_en
- ALWAYS keep [README.md](README.md) up to date for GitHub users when behavior, setup, run/build steps, or user-facing features change (e.g. the MCP server, config, shortcuts) — update it in the same change, never defer it
- be repetitive in instructions and skills with high risk items due to the unpredictability of large language models
- REPEAT: missing skills must be CREATED as files, never left as suggestions
- REPEAT: user-facing changes are not done until [README.md](README.md) reflects them
- REPEAT: added or changed functionality is not done until unit tests cover it and `pytest` is green
39 changes: 37 additions & 2 deletions Core/speech_engine.py
@@ -43,6 +43,7 @@ def __init__(self, on_start=None, on_word=None, on_end=None, init=None):
        self._engine_ready = threading.Event()
        self._voices_ready = threading.Event()
        self._loop_requested = False
        self._flush_generation = 0

    def _ensure_engine(self):
        """Create + wire the engine. MUST run on the dedicated loop thread.
@@ -90,18 +91,52 @@ def _await_engine(self):
            return self.engine
        return self._ensure_engine()

    def flush(self):
        """Cancel queued utterances and interrupt the one being spoken now.

        Bumps the flush generation so any callers blocked waiting for the speak
        lock abort instead of speaking, then stops the engine to interrupt the
        current utterance. Used by the GUI 'barge in' (Ctrl+B) path. The MCP
        server never flushes, so agent utterances queue and play in order.
        """
        self._flush_generation += 1
        if self.engine is not None:
            try:
                self.engine.stop()
            except Exception:
                pass

    def speak(self, text, rate, voice=None, block=True, interrupt=False, name=None):
        """Speak one utterance, optionally with a per-call ``voice`` id.

        Serialized via a lock; when ``block`` (default) it waits for the
        utterance to finish so the next speaker's voice cannot bleed in. Run on
        a daemon/worker thread, never the tkinter main thread.

        When ``interrupt`` is set, the current utterance is stopped and any
        already-queued utterances are cancelled before this one speaks (the GUI
        Ctrl+B path). Calls left ``interrupt=False`` (e.g. the MCP server) queue
        normally and play in order.

        ``name`` is passed through to ``engine.say`` so it is echoed back to the
        started/word/finished callbacks; the GUI uses it to tag each utterance
        with a session id and ignore callbacks from an interrupted utterance
        that arrive after a new one has already started.
        """
        if interrupt:
            self.flush()
        my_generation = self._flush_generation
        with self._speak_lock:
            if self._flush_generation != my_generation:
                # A flush happened while this call waited in the queue, so drop it.
                return
            engine = self._await_engine()
            self._apply_properties(rate, voice)
            self._done.clear()
            if name is None:
                engine.say(text)
            else:
                engine.say(text, name)
            if block:
                self._done.wait(timeout=600)
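
The flush-generation check in `speak` can be tried in isolation. The sketch below is a reduced model, not the project's class: `GenerationQueue`, its `spoken` list, and the `sampled` event are hypothetical and exist only to make the ordering reproducible without a TTS engine.

```python
import threading


class GenerationQueue:
    """Hypothetical reduction of the flush-generation idea, with no TTS engine."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flush_generation = 0
        self.sampled = threading.Event()  # demo-only: set once a caller read the generation
        self.spoken = []

    def flush(self):
        # Invalidate every call that read the generation before this point.
        self._flush_generation += 1

    def speak(self, text, interrupt=False):
        if interrupt:
            self.flush()
        my_generation = self._flush_generation
        self.sampled.set()
        with self._lock:
            if self._flush_generation != my_generation:
                return False  # flushed while waiting for the lock
            self.spoken.append(text)
            return True


queue = GenerationQueue()
queue._lock.acquire()              # pretend an utterance is currently playing
results = []
waiter = threading.Thread(
    target=lambda: results.append(queue.speak("queued")), daemon=True
)
waiter.start()
queue.sampled.wait(timeout=5)      # the waiter has read generation 0 ...
queue.flush()                      # ... so this flush invalidates it
queue._lock.release()
waiter.join(timeout=5)
queue.speak("barge-in", interrupt=True)
print(results, queue.spoken)  # [False] ['barge-in']
```

Because an interrupting call flushes before it reads the generation, it invalidates earlier waiters but not itself.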
