Historical document: This roadmap was created during the early baseline phase. Many items listed as blockers or missing features have since been implemented. For current status, see PROJECT_STATUS.md and IMPLEMENTATION_SUMMARY.md.
This forked Copilot CLI extends beyond a terminal tool into a local agentic desktop assistant with:
- Electron Overlay: Transparent grid system for precise screen targeting
- Visual Awareness: Real-time screen capture, OCR, and UI element detection
- System Automation: Mouse, keyboard, and window control via native APIs
- AI Integration: Multi-provider support (Copilot, OpenAI, Anthropic, Ollama)
The goal is to create a baseline application where the AI agent can:
- See the user's screen via screen capture
- Identify UI elements via accessibility APIs and inspect mode
- Execute actions (click, type, scroll) with precision
- Verify outcomes and self-correct
- File: `src/renderer/overlay/preload.js`
- Issue: `require('../../shared/grid-math')` fails in Electron sandbox
- Impact: `window.electronAPI` = undefined, overlay doesn't render
- Status: [x] Fixed
Solution Applied:
- Inlined grid-math constants and `labelToScreenCoordinates()` directly in preload.js
- Sandboxed preload can't use `require('path')` or load external modules
- Removed dependency on external grid-math.js in preload context
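A minimal sketch of what the inlining can look like, so the preload needs no `require()` of local modules. The label format (one column letter plus a 1-based row number, such as `B3`), the two constants, and the function body are assumptions made for illustration; the real values and logic live in grid-math.js.

```javascript
// Hypothetical inlined constants; the real values come from grid-math.js.
const GRID_SPACING = 100; // pixels between neighbouring dots (assumed)
const GRID_OFFSET = 50;   // distance of the first dot from the screen edge (assumed)

// Assumed label format: one column letter (A-Z) followed by a 1-based row number.
function labelToScreenCoordinates(label) {
  const match = /^([A-Z])(\d+)$/.exec(String(label).trim().toUpperCase());
  if (!match) return null;
  const col = match[1].charCodeAt(0) - 'A'.charCodeAt(0);
  const row = Number(match[2]) - 1;
  if (row < 0) return null; // row numbers start at 1
  return {
    x: GRID_OFFSET + col * GRID_SPACING,
    y: GRID_OFFSET + row * GRID_SPACING,
  };
}
```

In the actual preload, a function like this would be handed to the renderer through Electron's `contextBridge.exposeInMainWorld()` rather than called directly.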
- File: `src/main/visual-awareness.js`
- Issue: `.replace(/\n/g, ' ')` breaks Here-Strings (`@" ... "@`)
- Impact: All 4 PowerShell functions return parse errors
- Status: [x] Fixed
Solution Applied:
- Created `executePowerShellScript()` helper function
- Writes PS1 to temp files in `os.tmpdir()/liku-ps`
- Executes with `powershell -NoProfile -ExecutionPolicy Bypass -File <path>`
- Cleans up temp files after execution
- Updated all 4 functions: `getActiveWindow()`, `extractWithWindowsOCR()`, `detectUIElements()`, `findElementAtPoint()`
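The steps above can be sketched as follows. This is an illustration under assumptions, not the code in visual-awareness.js: the temp-file naming scheme and the injectable `run` parameter (added here so the helper can be exercised without PowerShell installed) are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execFile } = require('child_process');

const TEMP_DIR = path.join(os.tmpdir(), 'liku-ps');

// Default runner: invokes powershell with the flags listed above.
function runPowerShellFile(scriptPath) {
  return new Promise((resolve, reject) => {
    execFile(
      'powershell',
      ['-NoProfile', '-ExecutionPolicy', 'Bypass', '-File', scriptPath],
      (error, stdout) => (error ? reject(error) : resolve(stdout.trim()))
    );
  });
}

// Writes the script verbatim (newlines intact, so Here-Strings survive),
// runs it, and removes the temp file even when execution fails.
async function executePowerShellScript(script, run = runPowerShellFile) {
  fs.mkdirSync(TEMP_DIR, { recursive: true });
  const unique = `${process.pid}-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  const scriptPath = path.join(TEMP_DIR, `script-${unique}.ps1`);
  fs.writeFileSync(scriptPath, script, 'utf8');
  try {
    return await run(scriptPath);
  } finally {
    fs.rmSync(scriptPath, { force: true });
  }
}
```

Because the script reaches PowerShell as a file rather than a flattened one-line command, the closing `"@` of a Here-String stays at the start of its own line, which is what the parser requires.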
- File: `src/main/system-automation.js`
- Issue: AI clicks not reaching VS Code through transparent overlay
- Root Cause: `mouse_event()` is deprecated and doesn't work reliably with:
  - Electron's `setIgnoreMouseEvents(true, { forward: true })` (only forwards hardware events)
  - Layered windows with `WS_EX_TRANSPARENT` style
  - Background windows that aren't focused
- Status: [x] Fixed
Solution Applied:
- Replaced `mouse_event()` with `SendInput()`: Modern Win32 API with better UIPI handling
- Added `SetForegroundWindow()` activation: Activates target window before clicking
- Implemented `GetRealWindowFromPoint()`: Finds actual window under cursor, skipping transparent overlays
- Added `ForceForeground()` with thread attachment: Uses `AttachThreadInput()` to overcome focus restrictions
- Updated functions: `click()`, `doubleClick()`, `drag()`
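The window-selection idea behind `GetRealWindowFromPoint()` can be illustrated in plain JavaScript. This sketch only models the decision logic on an in-memory list; the window object shape and the function name `findRealWindowAtPoint` are hypothetical, and the real implementation queries Win32 from the PowerShell/C# side.

```javascript
const WS_EX_TRANSPARENT = 0x00000020; // Win32 extended window style bit

// `windows` is assumed to be ordered top-most first. Each entry is a plain
// object { hwnd, exStyle, bounds: { x, y, width, height } } (hypothetical shape).
function findRealWindowAtPoint(windows, x, y) {
  for (const win of windows) {
    const { bounds } = win;
    const inside =
      x >= bounds.x && x < bounds.x + bounds.width &&
      y >= bounds.y && y < bounds.y + bounds.height;
    const isTransparent = (win.exStyle & WS_EX_TRANSPARENT) !== 0;
    // The first hit that is not click-through is the real target.
    if (inside && !isTransparent) return win;
  }
  return null;
}
```

With a full-screen overlay carrying `WS_EX_TRANSPARENT` at the top of the list, the function skips it and returns the application window underneath.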
Technical Details:
```csharp
// Skip transparent windows (like Electron overlay)
int exStyle = GetWindowLong(hwnd, GWL_EXSTYLE);
bool isTransparent = (exStyle & WS_EX_TRANSPARENT) != 0;

// Force activate target window
AttachThreadInput(currentThread, foregroundThread, true);
SetForegroundWindow(hwnd);
AttachThreadInput(currentThread, foregroundThread, false);

// Use SendInput instead of deprecated mouse_event
SendInput(2, inputs, Marshal.SizeOf(typeof(INPUT)));
```

- Spec: "Prefer `targetId` actions; fallback to grid if no regions"
- Status: [ ] Not Implemented
Implementation Tasks:
- Add `targetId` field to action schema in system prompt
- Create `resolveTargetId(targetId)` function in ai-service.js
- Implement fallback: targetId → inspect region center → grid coordinate
- Update action executor to accept targetId
- File: `src/main/inspect-service.js`
- Status: [ ] Defined but never called
Implementation Tasks:
- Call `generateAIInstructions()` in ai-service.js message builder
- Append to system prompt when inspect mode is active
- Include instructions for referencing regions by ID
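A rough sketch of the wiring, assuming the message builder has access to the inspect state. The `generateAIInstructions()` below is a stand-in written for this example, since its real signature and output in inspect-service.js are not shown here; `buildSystemPrompt` and the region fields are likewise hypothetical.

```javascript
// Stand-in for inspect-service's generateAIInstructions(); the real output
// format is not documented here, so this text is illustrative only.
function generateAIInstructions(regions) {
  const lines = regions.map((r) => `- ${r.id}: ${r.role} "${r.label}"`);
  return [
    'Inspect mode is active. Detected regions:',
    ...lines,
    'Reference a region by its id via "targetId" when choosing an action.',
  ].join('\n');
}

// Appends the inspect instructions only when inspect mode is active and
// at least one region was detected; otherwise the base prompt is unchanged.
function buildSystemPrompt(basePrompt, { inspectActive, regions }) {
  if (!inspectActive || regions.length === 0) return basePrompt;
  return `${basePrompt}\n\n${generateAIInstructions(regions)}`;
}
```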
- File: `src/main/visual-awareness.js`
- Status: [ ] Placeholder only
Implementation Tasks:
- Implement pixel-level comparison using canvas or native image lib
- Generate bounding boxes for changed regions
- Create heatmap overlay visualization
- Expose diff results to AI context
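As a starting point for the pixel-level comparison, the sketch below diffs two raw RGBA buffers and returns one bounding box around everything that changed. It is a simplification under stated assumptions: both frames have identical dimensions, the buffers hold 4 bytes per pixel, and a single box is produced instead of one box per changed region. The function name and the default threshold are hypothetical.

```javascript
// Compares two 4-bytes-per-pixel buffers of identical dimensions and returns
// the bounding box of all pixels whose largest colour-channel difference
// exceeds `threshold`, or null when nothing changed. Alpha is ignored.
function diffBoundingBox(before, after, width, height, threshold = 16) {
  if (before.length !== after.length || before.length !== width * height * 4) {
    throw new Error('Buffers must both be width * height * 4 bytes');
  }
  let minX = width, minY = height, maxX = -1, maxY = -1, changedPixels = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const delta = Math.max(
        Math.abs(before[i] - after[i]),
        Math.abs(before[i + 1] - after[i + 1]),
        Math.abs(before[i + 2] - after[i + 2])
      );
      if (delta > threshold) {
        changedPixels++;
        if (x < minX) minX = x;
        if (y < minY) minY = y;
        if (x > maxX) maxX = x;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (changedPixels === 0) return null;
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1, changedPixels };
}
```

The channel order does not matter for this comparison, so the same loop works on RGBA or BGRA data as long as both frames use the same layout. Splitting the result into multiple boxes or a heatmap would build on the same per-pixel delta.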
- Spec: "Attach verification summary to AI response"
- Status: [ ] Not Implemented
Implementation Tasks:
- Capture screenshot after action sequence
- Compare with expected outcome (from `verification` field)
- Generate verification report
- Feed back to AI for confirmation/retry
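The loop described by these tasks might look like the sketch below. The three callbacks (`execute`, `capture`, `matches`), the retry limit, and the report shape are assumptions chosen to keep the example independent of how capture and comparison end up being implemented.

```javascript
// Runs an action, captures the screen, and checks it against the action's
// `verification` field, retrying up to `maxAttempts` times.
// All three callbacks are injected (hypothetical interface).
async function executeWithVerification(action, { execute, capture, matches }, maxAttempts = 2) {
  let attempts = 0;
  let verified = false;
  while (attempts < maxAttempts && !verified) {
    attempts++;
    await execute(action);
    const screenshot = await capture();
    verified = await matches(screenshot, action.verification);
  }
  const outcome = verified ? 'Verified' : 'Could not verify';
  return {
    verified,
    attempts,
    summary: `${outcome} "${action.verification}" after ${attempts} attempt(s).`,
  };
}
```

The returned `summary` string is the piece that would be attached to the AI response, so the model can confirm the outcome or decide to try a different action.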
- Spec: "Highlight focused control; expose caret position"
- Status: [ ] Not Implemented
Implementation Tasks:
- Query focused element via UIAutomation API
- Extract caret position from text controls
- Highlight focused element in overlay
- Include focus info in AI context
- File: `src/shared/inspect-types.js`
- Status: [ ] Hardcoded to 0
Implementation Tasks:
- Query window z-order via Win32 API
- Populate `zOrder` field in window context
- Use for multi-window targeting
- Status: ✅ Code exists, blocked by BLOCKER-1 and BLOCKER-2
- Files: `overlay.js`, `index.html` (styles exist)
Unblock Tasks:
- Fix BLOCKER-1 (preload)
- Fix BLOCKER-2 (PowerShell)
- Test region rendering end-to-end
- Status: ⚠️ Regions appended to user message
Completion Tasks:
- Call `generateAIInstructions()` (FEATURE-4)
- Add `targetId` support (FEATURE-3)
- Include window context with zOrder (FEATURE-8)
Priority: 🔴 CRITICAL Estimate: 2-3 hours
| Task ID | Description | File | Status |
|---|---|---|---|
| P1.1 | Fix preload require path | `preload.js` | [ ] |
| P1.2 | Rewrite PowerShell execution | `visual-awareness.js` | [ ] |
| P1.3 | Test overlay renders dots | Manual test | [ ] |
| P1.4 | Test inspect mode toggle | Manual test | [ ] |
Priority: 🟠 HIGH Estimate: 4-6 hours
| Task ID | Description | File | Status |
|---|---|---|---|
| P2.1 | Implement targetId resolution | `ai-service.js` | [ ] |
| P2.2 | Update system prompt with targetId | `ai-service.js` | [ ] |
| P2.3 | Inject generateAIInstructions | `ai-service.js` | [ ] |
| P2.4 | Test AI uses targetId for clicks | Manual test | [ ] |
Priority: 🟡 MEDIUM Estimate: 4-6 hours
| Task ID | Description | File | Status |
|---|---|---|---|
| P3.1 | Implement pixel diff comparison | `visual-awareness.js` | [ ] |
| P3.2 | Create heatmap overlay rendering | `overlay.js` | [ ] |
| P3.3 | Add verification summary to response | `ai-service.js` | [ ] |
| P3.4 | Implement retry logic on failure | `ai-service.js` | [ ] |
Priority: 🟢 LOW Estimate: 2-4 hours
| Task ID | Description | File | Status |
|---|---|---|---|
| P4.1 | Track focused control | `visual-awareness.js` | [ ] |
| P4.2 | Extract caret position | `visual-awareness.js` | [ ] |
| P4.3 | Populate window zOrder | `visual-awareness.js` | [ ] |
| P4.4 | Add focus info to AI context | `ai-service.js` | [ ] |
- `grid-math.js` coordinate conversions
- `inspect-types.js` region functions
- `system-automation.js` action parsing
- Overlay renders with electronAPI available
- Inspect mode detects UI elements
- AI receives inspect context in prompt
- Actions execute at correct coordinates
- Launch app: `npm start`
- Overlay dots visible on screen
- Ctrl+Alt+I toggles inspect mode
- Hover shows tooltips on detected regions
- Chat AI can describe screen contents
- Chat AI can click detected buttons
- Screenshot diff shows changes after action
```
┌──────────────────────────────────────────────────────────────┐
│                    Electron Main Process                     │
├──────────────────────────────────────────────────────────────┤
│ ai-service.js     │ system-automation.js │ visual-awareness  │
│ - Multi-provider  │ - Mouse/keyboard     │ - Screen capture  │
│ - Context builder │ - PowerShell exec    │ - UIAutomation    │
│ - Action parser   │ - Grid math          │ - OCR integration │
├──────────────────────────────────────────────────────────────┤
│                          IPC Bridge                          │
├──────────────────────────────────────────────────────────────┤
│ Overlay Renderer             │ Chat Renderer                 │
│ - Canvas grid dots           │ - Message history             │
│ - Inspect region boxes       │ - Action confirmation         │
│ - Tooltip rendering          │ - Visual context preview      │
└──────────────────────────────────────────────────────────────┘
```
Future: The Electron features can be exposed to a CLI interface:
- `liku inspect`: Launch overlay in inspect mode
- `liku click <targetId>`: Click a detected region
- `liku capture`: Take screenshot and analyze
- `liku ask "click the submit button"`: Natural language action
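Since the CLI is still a future item, the following is only a sketch of how the proposed commands could be mapped onto action descriptors before being forwarded to the Electron process. The command names come from the list above; the function name `parseLikuCommand` and the descriptor shape are hypothetical.

```javascript
// Maps `liku <command> [args...]` to a plain action descriptor.
// `argv` is the argument list after the program name, e.g. process.argv.slice(2).
function parseLikuCommand(argv) {
  const [command, ...rest] = argv;
  switch (command) {
    case 'inspect':
      return { type: 'inspect' };
    case 'capture':
      return { type: 'capture' };
    case 'click':
      if (!rest[0]) throw new Error('Usage: liku click <targetId>');
      return { type: 'click', targetId: rest[0] };
    case 'ask':
      if (rest.length === 0) throw new Error('Usage: liku ask "<request>"');
      return { type: 'ask', prompt: rest.join(' ') };
    default:
      throw new Error(`Unknown command: ${command}`);
  }
}
```

For example, `parseLikuCommand(['click', 'btn-submit'])` yields a click descriptor carrying the `targetId`, which the same resolution path used by the chat-driven actions could then consume.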
- ✅ Overlay renders dot grid without errors
- ✅ Inspect mode detects and displays UI regions
- ✅ AI receives region data in context
- ✅ AI can target regions by ID or coordinates
- ✅ Actions execute and verify outcomes
- ✅ Screenshot diff shows what changed
- Code implemented and compiles without errors
- Feature works in manual testing
- No regression in existing functionality
- Console shows expected log messages
| Date | Version | Changes |
|---|---|---|
| 2026-01-28 | 0.0.2 | Identified critical blockers, created baseline roadmap |
| TBD | 0.1.0 | Phase 1 complete - Overlay functional |
| TBD | 0.2.0 | Phase 2 complete - targetId actions working |
| TBD | 0.3.0 | Phase 3 complete - Verification loop |
| TBD | 1.0.0 | Baseline application complete |