Vision: Build the "IDE for Science" by transforming VS Code into an Agentic Research Environment.
Strategy: Start with VS Code Extension → Strip the "Coder" UI → Inject "Researcher" Tools → Connect the Agentic Brain.
This roadmap is organized in phases, not timelines. Each phase has clear objectives and "Definition of Done" criteria. The speed of execution depends on development resources and community contributions.
Core Principle: Ship working software early, iterate based on real researcher feedback.
- 🏗️ The "Zen" Foundation - Clean document editor, no AI yet
- 🧠 The "Read" Loop - PDF intelligence and library management
- 🎓 The "Scholar" Loop - Citations and academic workflows
- 🚀 The "Power" Loop - Production-ready performance and scale
- 🌐 Cloud & Collaboration - Web version and team features
- 🔮 The Moonshots - Advanced features and ecosystem
Goal: A working, branded application that opens .docx files in a clean, distraction-free interface. No AI yet. Just a better writer than Word.
- Create VS Code Extension scaffold with TypeScript
- Successfully build with `npm run compile`
- Update `package.json`: Set name to ScienceStudio
- Design and add extension icon (Emerald theme)
- Hide the Noise: Configure to hide "Run", "Debug", and "Source Control" panels by default
- Zen Status Bar: Remove code-specific items, show "Word Count" only
- Welcome View: Replace default with "My Research" dashboard (Recent Papers, Thesis Progress)
- Focus Mode Command: Implement command to toggle minimal UI
- Custom Editor API: Register provider for `.docx` and `.research` files
- The Webview: Mount React app running ProseMirror inside editor pane
- The Bridge: Implement `vscode.postMessage` bridge for file system sync
- Academic Styles: Apply paper-appropriate CSS (Times New Roman, standard margins)
- Basic .docx Import: Use mammoth.js for initial conversion
✅ Definition of Done: You can install the extension, open thesis.docx, write formatted text, hit Save, and it persists correctly.
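The Bridge item above could be sketched roughly as follows. This is a minimal illustration, not a settled design: the message shapes, the `FileStore` interface, and the `DocumentBridge` class are all hypothetical names. In a real extension the webview side would obtain its `vscode` object from `acquireVsCodeApi()` and call `vscode.postMessage(...)`, while the host side would receive messages via `webview.onDidReceiveMessage`. Here the file system and the send function are injected so the logic runs without VS Code.

```typescript
// Hypothetical message protocol between the editor webview and the extension host.
// None of these names come from the roadmap; they are illustrative only.
type WebviewMessage =
  | { type: "ready" }
  | { type: "edit"; content: string }
  | { type: "save" };

type HostMessage =
  | { type: "load"; content: string }
  | { type: "saved"; length: number };

// Minimal file-system surface, injected so the logic can run outside VS Code.
interface FileStore {
  read(path: string): string;
  write(path: string, content: string): void;
}

class DocumentBridge {
  private readonly path: string;
  private readonly store: FileStore;
  private readonly send: (msg: HostMessage) => void;
  private content: string;
  private dirty = false;

  constructor(path: string, store: FileStore, send: (msg: HostMessage) => void) {
    this.path = path;
    this.store = store;
    this.send = send;
    this.content = store.read(path);
  }

  get isDirty(): boolean {
    return this.dirty;
  }

  // Handle one message coming from the webview.
  handle(msg: WebviewMessage): void {
    switch (msg.type) {
      case "ready":
        // The webview has mounted; send it the current document.
        this.send({ type: "load", content: this.content });
        break;
      case "edit":
        // Keep the in-memory copy current and remember unsaved changes.
        this.dirty = this.dirty || msg.content !== this.content;
        this.content = msg.content;
        break;
      case "save":
        // Persist to disk, clear the dirty flag, and acknowledge.
        this.store.write(this.path, this.content);
        this.dirty = false;
        this.send({ type: "saved", length: this.content.length });
        break;
    }
  }
}
```

Keeping the protocol as a small discriminated union means the React/ProseMirror side and the extension side can share one type definition, which should make the "hit Save, and it persists correctly" criterion easier to test in isolation.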
Goal: The application becomes "aware" of the user's library. Drop 50 PDFs in, and the system indexes them intelligently.
- Library Folder: Create workspace structure with `library/` folder
- File Watcher: Monitor for new PDF additions
- PDF.js Integration: Basic PDF rendering in custom editor
- Text Extraction: Extract text with layout preservation
- Semantic Parser: Integrate LlamaParse or MarkItDown for structure extraction
- Section Detection: Identify Abstract, Methods, Results, Discussion
- Vector Storage: Implement ChromaDB for semantic search
- Metadata Extraction: Parse title, authors, year, journal
- Sidebar Chat: Add webview panel for AI interaction
- Search Tool: Implement `search_papers` command
- Context Retrieval: RAG pipeline with source citations
- Example Queries: "What methods did Smith 2023 use?" with accurate answers
✅ Definition of Done: Drop a folder of PDFs, ask "What do my papers say about cognitive load?", get accurate answer with sources.
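The retrieval step behind `search_papers` might look something like the sketch below. It assumes each PDF has already been split into chunks with precomputed embedding vectors (the roadmap plans ChromaDB for storage; this example swaps in a plain in-memory array and brute-force cosine similarity purely for illustration). The `Chunk` and `SearchHit` shapes and the `searchPapers` function name are assumptions, not an existing API.

```typescript
// Hypothetical shape of an indexed chunk; field names are illustrative.
interface Chunk {
  paperId: string; // e.g. a BibTeX key such as "smith2023"
  page: number;    // page the text came from, used for source citations
  text: string;
  embedding: number[];
}

interface SearchHit {
  paperId: string;
  page: number;
  text: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions do not match");
  }
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (normA * normB);
}

// Return the k most similar chunks, each carrying its paper and page
// so that an answer can cite where it came from.
function searchPapers(query: number[], chunks: Chunk[], k = 3): SearchHit[] {
  return chunks
    .map((c) => ({
      paperId: c.paperId,
      page: c.page,
      text: c.text,
      score: cosineSimilarity(query, c.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Carrying `paperId` and `page` through every hit is what would let the chat answer point back to a specific source rather than returning unattributed text.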
Goal: Move from "Chatbot" to "Research Assistant." Handle the strict rules and workflows of academia.
- BibTeX Support: Parse and manage `references.bib`
- Citation Autocomplete: Type `@` to trigger paper dropdown
- Smart Citations: Create citation nodes `<cite id="smith2023"/>` not just text
- Reference List: Auto-generate bibliography from used citations
- `RESEARCH.md`: Create project context file that feeds every AI prompt
- Document Awareness: AI understands current section and structure
- Writing Suggestions: Context-aware next section recommendations
- Argument Tracking: Monitor claims and supporting evidence
- Synchronized Highlighting: Click citation → open PDF to exact location
- Annotation Sync: PDF highlights create notes in document
- Evidence Linking: Drag PDF text to create supported claim
- Split View: PDF and document side-by-side with sync scrolling
✅ Definition of Done: Write a page citing 5 papers with autocomplete, click any citation to verify source at exact page.
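As a rough sketch of how the citation items above could fit together: because citations are stored as `<cite id="..."/>` nodes rather than plain text, the reference list can be derived from the document itself. The example below assumes a serialized document string and an already-parsed library keyed by BibTeX key. The `BibEntry` shape, the function names, and the output format are hypothetical; a real implementation would likely walk the ProseMirror document and use a proper citation style.

```typescript
// Hypothetical parsed BibTeX entry; only the fields this sketch needs.
interface BibEntry {
  authors: string;
  year: number;
  title: string;
}

// Collect citation keys from <cite id="..."/> nodes, in order of first use.
function extractCitedKeys(doc: string): string[] {
  const pattern = /<cite\s+id="([^"]+)"\s*\/>/g;
  const seen = new Set<string>(); // a Set keeps insertion order and drops repeats
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(doc)) !== null) {
    const key = match[1];
    if (key) seen.add(key);
  }
  return Array.from(seen);
}

// Build the bibliography from only the papers actually cited.
function buildReferenceList(doc: string, library: Map<string, BibEntry>): string[] {
  const lines: string[] = [];
  for (const key of extractCitedKeys(doc)) {
    const entry = library.get(key);
    lines.push(
      entry
        ? `${entry.authors} (${entry.year}). ${entry.title}.`
        : `[missing reference: ${key}]`, // surface broken citations instead of hiding them
    );
  }
  return lines;
}

// Suggest keys for the text typed after "@" (simple prefix match).
function suggestCitations(typed: string, library: Map<string, BibEntry>): string[] {
  const needle = typed.toLowerCase();
  return Array.from(library.keys())
    .filter((key) => key.toLowerCase().startsWith(needle))
    .sort();
}
```

For example, typing `@sm` with `smith2023` in the library would offer `smith2023`, and a document citing it twice would still list it once in the bibliography.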
Goal: Handle real research workloads - 100+ page theses, 1000+ papers, zero data loss.
- LanceDB Upgrade: Implement high-performance vector store
- Lazy Loading: Stream large documents efficiently
- Background Processing: Non-blocking PDF analysis
- Cache Strategy: Smart caching for instant response
- Auto-Save with Git: Every save creates hidden commit
- Time Travel UI: Visual slider for document history
- Diff Viewer: See what changed between versions
- Branch for Reviews: Create branches for supervisor feedback
- Perfect .docx Export: Maintain all Word formatting
- LaTeX Pipeline: Clean .tex generation via Pandoc
- Journal Templates: One-click format for target journal
- Submission Package: Generate all required files
✅ Definition of Done: A 100-page thesis edits with no noticeable lag, and the Word export is indistinguishable from a document written natively in Word.
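One possible shape for the "Auto-Save with Git" item is to turn each save into a pair of Git invocations. The sketch below only builds the argument lists; the function name and the commit message format are assumptions made for illustration, and the roadmap does not specify how the "hidden" commits are stored or surfaced.

```typescript
// Build the git argument lists for one auto-save snapshot.
// Hypothetical helper: the name and the message format are illustrative only.
function buildSnapshotCommands(filePath: string, savedAt: Date): string[][] {
  const message = `autosave: ${filePath} @ ${savedAt.toISOString()}`;
  return [
    // Stage only the saved file; "--" guards against paths that look like options.
    ["add", "--", filePath],
    // Record the snapshot without printing a summary.
    ["commit", "--quiet", "-m", message],
  ];
}
```

Each argument list could then be passed to something like Node's `child_process.execFile("git", args, { cwd })`. Passing arguments as an array rather than a shell string avoids quoting problems with file names that contain spaces. Note that `git commit` exits with a non-zero status when there is nothing to commit, so a caller would probably need to tolerate that case for saves that change nothing.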
Goal: Enable anywhere access and team research via vscode.dev.
- Browser Compatibility: Full feature parity in browser
- Cloud Storage: Sync documents and library
- Offline Mode: Progressive web app capabilities
- Mobile View: Responsive design for tablets
- Real-time Editing: Multiple users in document
- Comment Threads: Contextual discussions
- Shared Libraries: Team PDF collections
- Review Workflows: Supervisor approval process
- Graph View: Visualize paper connections like Obsidian
- Podcast Mode: AI converts papers to audio summaries
- Writing Analytics: Track productivity and progress
- Grant Assistant: Specialized mode for proposals
- Peer Review Mode: Anonymize and format for review
- Citation Network: Discover related papers automatically
- Faster to Dogfooding: Focus on working writer first, AI second
- Concrete Checkboxes: More specific, actionable tasks
- Git Integration: Version control as core feature, not afterthought
- MCP Architecture: Consider Model Context Protocol for AI integration
- Clear "Definition of Done": Each phase has specific success criteria
- Performance Metrics: Specific targets for speed and scale
- Market Validation: User testing throughout
- Technical Decisions: Explicit choice points
- Risk Mitigation: Proactive problem solving
- Each phase builds on the previous one
- Complete "Definition of Done" before moving to next phase
- User feedback drives priority within phases
- Technical debt addressed between phases
- Alpha: Internal testing with trusted researchers
- Beta: Limited release to academic community
- Public: Open availability after Phase 3 completion
- Researchers can complete real work in the tool
- Performance meets targets under real workloads
- Community contributors start submitting PRs
- Users report time savings and quality improvements