GUI Testing Project

This repository contains the experimental engine for Plura, a research-focused VLM (Vision-Language Model) designed to audit GPT-5.2 and to outperform it on specific UI navigation tasks.

Key Modules for Reviewers

Core Logic: plura_engine.py - The main orchestration pipeline.
Visual Physics: indexing/visual_physics/ - Modules for "Spectral Saliency" and click refinement (auditing perception artifacts).
Architecture: indexing/ocumamba_lite/ - Implementation of Mamba-based vision encoders for high-efficiency inference.
Benchmarking: scripts/gpt52_benchmark_fixed.py - The evaluation harness used to compare speed and cost against the state of the art (SOTA).

Infrastructure

See scripts/VASTAI_DEPLOY.md for GPU cluster deployment notes.

Note: This is an active research repo. It includes scripts from failed experiments (e.g., files suffixed _v1 or _debug), which are preserved as an audit trail.

GUI Testing Project

Research and benchmarking for GUI visual grounding models.

Project Structure

scripts/ - Benchmark and evaluation scripts
indexing/ - Active Inference GUI grounding implementation
docs/ - Research reports and documentation
security/ - Rate limiting and security utilities

Key Scripts

gpt52_cot_benchmark.py - GPT-5.2 benchmark with Chain-of-Thought reasoning
gpt_benchmark.py - GPT-4o benchmark script
active_gui_grounding.py - OcuMamba-Lite + Active Inference implementation

Setup

pip install openai datasets pillow
export OPENAI_API_KEY="your-key"
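
Before launching a benchmark, it can help to confirm that the setup above actually took effect. The following is a minimal sketch of such a check, not part of the repository: the function names `missing_modules` and `preflight` are hypothetical, and the only facts it relies on are the three packages in the pip command and the `OPENAI_API_KEY` variable.

```python
import importlib.util
import os

# Import names for the packages listed in the pip command above.
# Note that the "pillow" distribution is imported as "PIL".
REQUIRED_MODULES = ("openai", "datasets", "PIL")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def preflight(env=os.environ):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; the repository may do no such check.
    """
    problems = [f"missing package: {name}" for name in missing_modules()]
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

Running the file directly prints one line per problem and prints nothing when the environment looks ready.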

Running Benchmarks

python scripts/gpt52_cot_benchmark.py
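
As a rough illustration of what a GUI grounding benchmark typically measures, the sketch below scores predicted clicks against target bounding boxes. This is an assumption about the metric, following a common convention in which a click counts as correct when it lands inside the target element's box; the benchmark scripts in this repository may score results differently. The names `click_hits` and `click_accuracy` and the `(left, top, right, bottom)` box layout are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed layout: (left, top, right, bottom)


def click_hits(point: Point, box: Box) -> bool:
    """Return True if the click lies inside the box, edges included."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def click_accuracy(points: Sequence[Point], boxes: Sequence[Box]) -> float:
    """Fraction of predicted clicks that land inside their target box."""
    pairs = list(zip(points, boxes))
    if not pairs:
        return 0.0  # no samples: report zero rather than divide by zero
    return sum(click_hits(p, b) for p, b in pairs) / len(pairs)


if __name__ == "__main__":
    predictions = [(5.0, 5.0), (50.0, 50.0)]
    targets = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
    print(click_accuracy(predictions, targets))
```

In this toy input the first click falls inside its box and the second does not, so half of the predictions count as hits.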

Research

See docs/research_report.md for full analysis.

