Show HN: SoMatic – Vision-based OS automation framework for AI agents
Category: cli-tool
Tags: desktop-automation, computer-vision, ai-agents
Score: 7.0/10 (Innovation: 7, Technical: 7, Documentation: 7, Utility: 7)
SoMatic is a vision-based OS automation framework for AI agents. It uses a local YOLO model to detect and number the interactive UI elements on screen, so that agents can take grounded actions through CLI commands. The core is MIT-licensed and the vision component is AGPL-licensed. Together with support for headless operation and MCP server integration, this makes it an innovative option for agent-based desktop automation.
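The detect-and-number idea can be illustrated with a small Python sketch. It is not SoMatic's code or API: the `Box` type, `number_elements`, `resolve_click`, and the sample coordinates are all hypothetical, and the detector output is stubbed with fixed boxes where a real run would use the vision model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in screen pixels (hypothetical detector output)."""
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def center(self) -> tuple[int, int]:
        # Integer midpoint of the box, used as the click target.
        return ((self.x1 + self.x2) // 2, (self.y1 + self.y2) // 2)


def number_elements(boxes: list[Box], row_height: int = 20) -> dict[int, Box]:
    """Assign marks 1..N in rough reading order (top-to-bottom, then left-to-right).

    Boxes whose top edges fall in the same `row_height` band count as one row.
    """
    ordered = sorted(boxes, key=lambda b: (b.y1 // row_height, b.x1))
    return {i: box for i, box in enumerate(ordered, start=1)}


def resolve_click(marks: dict[int, Box], mark: int) -> tuple[int, int]:
    """Translate an agent's chosen mark number into a click point (box center)."""
    if mark not in marks:
        raise KeyError(f"no element numbered {mark}")
    return marks[mark].center


# Stand-in detections; a real run would take these from the vision model.
detections = [
    Box(300, 10, 380, 40),   # e.g. a button on the right of the toolbar
    Box(10, 12, 90, 42),     # e.g. a menu on the left of the toolbar
    Box(10, 100, 200, 130),  # e.g. a text field further down
]
marks = number_elements(detections)
```

An agent that answers "click element 2" is then grounded by `resolve_click(marks, 2)`, which returns the center of the second box in reading order instead of relying on the model to guess pixel coordinates.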
Target audience: backend devs, devops, AI engineers
Repository: https://github.com/Smyan1909/SoMatic · Python · MIT · 1 star