Interactive chat mode with separate performance tracking #209

Merged
dddimcha merged 2 commits into main from feature/interactive-chat-mode on Jan 29, 2026

@dddimcha

Owner

Summary

Adds a dedicated interactive chat mode (talk) and separates performance metrics from chat output.

New Commands

talk - Interactive Chat Mode

  • Continuous conversation loop with You> / AI> prompts
  • Auto-initialization of model/tokenizer/engine on start
  • Inline perf command for quick stats during chat
  • Session summary on exit (messages, tokens, time)

perf - Performance Metrics

  • Last message timing (prompt tokens, generated tokens, time, tok/s)
  • Session aggregates (total tokens, average throughput)
  • Keeps benchmarks out of the chat flow
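
The per-message and session numbers that perf reports can be derived from a few counters. The following is a minimal sketch, not the code from this PR: the struct name `perf_sketch_t`, its fields, and the helper functions are hypothetical, and it assumes that throughput counts generated tokens only and that the session average is total generated tokens divided by total time.

```c
#include <stdint.h>

/* Hypothetical layout: the PR names chat_perf_t but does not list its members. */
typedef struct {
    /* last message */
    uint32_t last_prompt_tokens;
    uint32_t last_generated_tokens;
    uint64_t last_time_us;
    /* session aggregates */
    uint32_t messages;
    uint64_t total_prompt_tokens;
    uint64_t total_generated_tokens;
    uint64_t total_time_us;
} perf_sketch_t;

/* Throughput in hundredths of a token per second, using integer math only.
 * Returns 0 when no time has elapsed. */
uint64_t tok_per_sec_x100(uint64_t tokens, uint64_t time_us)
{
    if (time_us == 0)
        return 0;
    return (tokens * 100u * 1000000u) / time_us;
}

/* Store the latest message and fold it into the session totals. */
void perf_record(perf_sketch_t *p, uint32_t prompt_tokens,
                 uint32_t generated_tokens, uint64_t time_us)
{
    p->last_prompt_tokens = prompt_tokens;
    p->last_generated_tokens = generated_tokens;
    p->last_time_us = time_us;

    p->messages++;
    p->total_prompt_tokens += prompt_tokens;
    p->total_generated_tokens += generated_tokens;
    p->total_time_us += time_us;
}

/* tok/s (x100) for the most recent message. */
uint64_t perf_last_tps_x100(const perf_sketch_t *p)
{
    return tok_per_sec_x100(p->last_generated_tokens, p->last_time_us);
}

/* Session average tok/s (x100): total generated tokens over total time. */
uint64_t perf_session_tps_x100(const perf_sketch_t *p)
{
    return tok_per_sec_x100(p->total_generated_tokens, p->total_time_us);
}
```

With this shape, a chat command only has to call `perf_record` after each reply, and a perf command can print the values without touching the chat output.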

Console UX Improvements

  • Redesigned help command with ASCII box formatting
  • Added help ai for AI-specific command reference
  • Added status, clear/cls, version/ver commands
  • Better unknown command handling with suggestions
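
One common way to suggest a command for a typo is to pick the known command with the smallest edit distance. The sketch below illustrates that idea only; the PR does not say which matching technique it uses, and the function names, the command table, and the distance threshold of 2 are assumptions.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CMD_LEN 31

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Levenshtein distance with a single rolling row.
 * Inputs longer than MAX_CMD_LEN are reported as "very far". */
size_t edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t row[MAX_CMD_LEN + 1];

    if (la > MAX_CMD_LEN || lb > MAX_CMD_LEN)
        return (size_t)-1;

    for (size_t j = 0; j <= lb; j++)
        row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];          /* previous row, column j-1 */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];        /* previous row, column j */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = min3(up + 1, row[j - 1] + 1, diag + cost);
            diag = up;
        }
    }
    return row[lb];
}

/* Returns the closest known command within distance 2, or NULL. */
const char *suggest_command(const char *input)
{
    /* Command names taken from this PR's description. */
    static const char *const known[] = {
        "help", "status", "clear", "cls", "version", "ver",
        "chat", "talk", "perf",
    };
    const char *best = NULL;
    size_t best_dist = 3;              /* accept only distance <= 2 */

    for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        size_t d = edit_distance(input, known[i]);
        if (d < best_dist) {
            best_dist = d;
            best = known[i];
        }
    }
    return best;
}
```

When `suggest_command` returns NULL, the console can fall back to pointing the user at help.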

Implementation

  • Added chat_perf_t struct for timing tracking
  • rdtsc (x86) / cntvct_el0 (AArch64) cycle counting for accurate timing
  • Session-level aggregation for average throughput
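
For illustration, reading the two counters named above might look like the sketch below. It assumes a GCC/Clang-style compiler with inline assembly; `read_cycles` and `ticks_to_us` are hypothetical names, not functions from this PR. The counter frequency has to come from elsewhere (for example cntfrq_el0 on AArch64, or a calibration step on x86), which this sketch does not cover.

```c
#include <stdint.h>

/* Read a free-running hardware counter.
 * x86: the time-stamp counter via rdtsc.
 * AArch64: the virtual counter register cntvct_el0.
 * Other targets: returns 0 (no counter in this sketch). */
static inline uint64_t read_cycles(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#elif defined(__aarch64__)
    uint64_t v;
    __asm__ volatile ("mrs %0, cntvct_el0" : "=r"(v));
    return v;
#else
    return 0;
#endif
}

/* Convert a tick delta to microseconds for a counter running at freq_hz.
 * Whole seconds and the remainder are converted separately so that
 * ticks * 1000000 cannot overflow for large deltas. */
uint64_t ticks_to_us(uint64_t ticks, uint64_t freq_hz)
{
    if (freq_hz == 0)
        return 0;
    uint64_t secs = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;
    return secs * 1000000u + (rem * 1000000u) / freq_hz;
}
```

A caller would take `read_cycles()` before and after generation and pass the difference to `ticks_to_us`.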

Improvements to console UX:
- Redesigned help command with ASCII box formatting
- Added 'help ai' for AI-specific command reference
- Added 'help all' for comprehensive command list
- Added 'status' command for system/AI status overview
- Added 'clear'/'cls' for terminal clear (ANSI escape)
- Added 'version'/'ver' to show kernel version
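
As a reference for the ANSI-based clear, the usual sequence is ESC[2J (erase the display) followed by ESC[H (move the cursor to the top-left). The helper name below is hypothetical, and whether this PR also homes the cursor is an assumption.

```c
/* ESC[2J erases the whole display; ESC[H moves the cursor to row 1, column 1. */
static const char ANSI_CLEAR_SEQ[] = "\x1b[2J\x1b[H";

/* Returns the sequence so the caller can send it through its own console
 * output routine. */
const char *ansi_clear_sequence(void)
{
    return ANSI_CLEAR_SEQ;
}
```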

Chat command improvements:
- Better usage examples when called without message
- Progress messages during model/tokenizer/engine init
- Improved output formatting with labeled sections
- Better error messages with actionable suggestions

Unknown command handling:
- Suggests similar commands for common typos
- Points users to 'help' for available commands

New commands:
- 'talk' - Interactive chat session with continuous conversation loop
  - Auto-initializes model, tokenizer, and inference engine on start
  - Shows session banner and usage hints
  - Type 'perf' inline to see session stats
  - Type 'exit'/'quit'/'q' to leave with session summary
  - Tracks tokens/time per message and session totals

- 'perf' - Display chat performance metrics separately
  - Last message: prompt tokens, generated tokens, time, throughput
  - Session totals: messages, total tokens, average throughput
  - Keeps benchmarks out of the chat flow
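
The talk loop has to decide, for each input line, whether it is an inline command or a message for the model. A minimal sketch of that dispatch step follows, assuming the keywords listed above ('perf', 'exit', 'quit', 'q'); the enum, the function names, and the whitespace trimming are hypothetical and not taken from the PR.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    TALK_EMPTY,    /* blank line: prompt again */
    TALK_PERF,     /* show session stats inline */
    TALK_EXIT,     /* leave the loop and print the session summary */
    TALK_MESSAGE   /* anything else goes to the model */
} talk_action_t;

/* True if the first len bytes of s are exactly the keyword kw. */
static int word_is(const char *s, size_t len, const char *kw)
{
    return strlen(kw) == len && memcmp(s, kw, len) == 0;
}

talk_action_t classify_talk_input(const char *line)
{
    /* Skip leading blanks. */
    while (*line == ' ' || *line == '\t')
        line++;

    /* Ignore trailing blanks and line endings. */
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
                       line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    if (len == 0)
        return TALK_EMPTY;
    if (word_is(line, len, "perf"))
        return TALK_PERF;
    if (word_is(line, len, "exit") || word_is(line, len, "quit") ||
        word_is(line, len, "q"))
        return TALK_EXIT;
    return TALK_MESSAGE;
}
```

Matching the whole trimmed line, rather than a prefix, keeps a message such as "quite good" from ending the session.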

Performance tracking:
- Added chat_perf_t struct to track timing
- rdtsc/cntvct_el0 cycle counting for accurate timing
- Session-level aggregation for average throughput

Chat command updated:
- Cleaner output (just the response, no extra formatting)
- Records timing but doesn't display it
- Points users to 'talk' for interactive mode

dddimcha merged commit 8d1b2af into main on Jan 29, 2026
4 checks passed
dddimcha deleted the feature/interactive-chat-mode branch on January 29, 2026, 11:47