| Notebook | Link |
|---|---|
| Logprobs Generation | |
| Token Tree Analysis |
This repository explores the internal stochastic nature of Gemma models. By extracting transition scores and logits from the Hugging Face transformers generation loop, we can analyze the model's confidence levels and visualize "competing" tokens at each step of the sequence.
This repository contains no confidential data/IP and is intended for demonstration and research use.
- Log Probability Analysis: Extract and analyze the log probabilities of generated tokens to understand model confidence.
- Top-K Candidates: View the top alternative tokens considered by the model at each step.
- Guided Generation: Steer the model's output by providing a specific starting prefix (e.g., forcing a code block).
- Token Tree Exploration: Construct and visualize decision trees of token generation paths based on probability thresholds.
- Data Export: Save analysis results to JSONL or JSON for further processing.
The dynamic thresholding logic in Token_Tree_Analysis.ipynb adapts how "picky" the model is about branching based on how busy the search queue currently is.
Where:
-
$T_{current}$ is the calculated probability threshold for the current step. -
$T_{min}$ is themin_branch_threshold(e.g., 0.1). -
$T_{max}$ is themax_branch_threshold(e.g., 0.5). -
$|Q|$ is the current length of the queue (number of active paths). -
$Q_{limit}$ is thesoft_queue_limit(target number of active paths).
Note: The saturation ratio is capped at 1.0.
-
Empty Queue: When the queue is small, the threshold is close to
$T_{min}$ . This encourages the model to branch out and explore even low-probability alternatives. -
Full Queue: As the queue fills up (approaching
soft_queue_limit), the threshold rises toward$T_{max}$ . This forces the model to be very selective, only branching on highly probable tokens to prevent the search from exploding exponentially.
Logprobs_in_Gemma.ipynb: The main Jupyter Notebook containing the log probability analysis code, helper functions, and experiments.Token_Tree_Analysis.ipynb: Notebook for generating and analyzing token trees.token_tree_analysis/: Contains the visualizer and sample outputs for the token tree analysis.
- Clone the repository.
- Install the required dependencies:
pip install -U torch transformers pandas accelerate numpy huggingface-hub- Open
Logprobs_in_Gemma.ipynbin VS Code or Jupyter Lab. - Ensure you have a Hugging Face account and an access token.
- Run the notebook cells to:
- Authenticate with Hugging Face.
- Load the Gemma model (default:
google/gemma-2-2b-it). - Run the log probability analysis experiment.
- Run the guided generation experiment.
- Open
Token_Tree_Analysis.ipynbto generate and analyze token decision trees.
-
You can visualize the generated JSONL data using the Gemma Token Analysis Visualizer.
-
For analyzing token generation trees, use the Token Tree Visualizer.
- Response Visualizer: Click any token to see alternatives and regenerate sequences
- Tree Visualizer: Interactive D3.js visualization of token generation paths
- Python 3.8+
- PyTorch
- Transformers
- Pandas
- Accelerate
- Bitsandbytes
- A GPU is recommended for faster inference.