Launch HN: Mosaic (YC W25) – Agentic Video Editing
Hey HN! We’re Adish & Kyle from Mosaic (https://edit.mosaic.so, https://docs.mosaic.so/, https://mosaic.so). Mosaic lets you create and run your own multimodal video editing agents in a node-based canvas. It’s different from traditional video editing tools in two ways: (1) the user interface and (2) the visual intelligence built into our agent.
We were engineers at Tesla and one day had a fun idea to make a YouTube video of Cybertrucks in Palo Alto. We recorded hours of cars driving by, but got stuck on how to scrub through all this raw footage to edit it down to just the Cybertrucks.
We got frustrated trying to accomplish simple tasks in video editors like DaVinci Resolve and Adobe Premiere Pro. Features are hidden behind menus, buttons, and icons, and we often found ourselves Googling or asking ChatGPT how to do certain edits.
We thought that surely now, with multimodal AI, we could accelerate this process. Better yet, an AI video editor could automatically apply edits based on what it sees and hears in your video. The idea quickly snowballed and we began our side quest to build “Cursor for Video Editing”.
We put together a prototype and to our amazement, it was able to analyze and add text overlays based on what it saw or heard in the video. We could now automate our Cybertruck counting with a single chat prompt. That prototype is shown here: https://www.youtube.com/watch?v=GXr7q7Dl9X0.
After that, we spent a chunk of time building our own timeline-based video editor and making our multimodal copilot powerful and stateful. In natural language, we could now ask chat to help with AI asset generation, enhancements, searching through assets, and automatically applying edits like dynamic text overlays. That version is shown here: https://youtu.be/X4ki-QEwN40.
After talking to users though, we realized that the chat UX has limitations for video: (1) the longer the video, the more time it takes to process. Users have to wait too long between chat responses. (2) Users have set workflows that they use across video projects. Especially for people who have to produce a lot of content, the chat interface is a bottleneck rather than an accelerant.
That took us back to first principles to rethink what a “non-linear editor” really means. The result: a node-based canvas which enables you to create and run your own multimodal video editing agents. https://screen.studio/share/SP7DItVD.
Each tile in the canvas represents a video editing operation and is configurable, so you still have creative control. You can also branch and run edits in parallel, creating multiple variants from the same raw footage to A/B test different prompts, models, and workflows. In the canvas, you can see inline how your content evolves as the agent goes through each step.
The idea is that the canvas will run your video editing on autopilot and get you 80-90% of the way there. Then you can adjust and refine the result in an inline timeline editor. We support exporting your timeline state out to traditional editing tools like DaVinci Resolve, Adobe Premiere Pro, and Final Cut Pro.
We’ve also used multimodal AI to build in visual understanding and intelligence. This gives our system a deep understanding of video concepts, emotions, actions, spoken word, light levels, and shot types.
We’re doing a ton of additional processing in our pipeline, such as saliency analysis, audio analysis, and determining objects of significance—all to help guide the best edit. These are things that we as human editors internalize so deeply that we may not think twice about them, but reverse-engineering that process and building it into the AI agent has been an interesting challenge.
Some of our analysis findings:
Optimal Safe Rectangles: https://assets.frameapp.ai/mosaicresearchimage1.png
Video Analysis: https://assets.frameapp.ai/mosaicresearchimage2.png
Saliency Analysis: https://assets.frameapp.ai/mosaicresearchimage3.png
Mean Movement Analysis: https://assets.frameapp.ai/mosaicresearchimage4.png
Use cases for editing include:
- Removing bad takes or creating script-based cuts from videos / talking-heads
- Repurposing longer-form videos into clips, shorts, and reels (e.g. podcasts, webinars, interviews)
- Creating sizzle reels or montages from one or many input videos
- Creating assembly edits and rough cuts from one or many input videos
- Optimizing content for various social media platforms (reframing, captions, etc.)
- Dubbing content with voice cloning and lip syncing
We also support generative use cases such as motion graphic animations, cinematic captions, AI UGC content, adding contextual AI-generated B-roll to existing content, or modifying existing footage (changing lighting, applying VFX).
Currently, our canvas can be used to build repeatable agentic workflows, but we’re working on a fully autonomous agent that will be able to transfer style from existing video content, define its own editing sequence / workflow without needing a canvas, do research and pull assets from web references, and so on.
You can try it today at https://edit.mosaic.so. You can sign up for free and get started playing with the interface by uploading videos, making workflows on the canvas, and editing them in the timeline editor. We do paywall node runs to help cover model costs. Our API docs are at https://docs.mosaic.so. We’d love to hear your feedback!
Comments URL: https://news.ycombinator.com/item?id=45980760
Points: 109
# Comments: 105
Wed, 19 Nov 2025, 3:28 pm
GitHub Down
Seeing:
"""
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
"""
From multiple accounts across multiple orgs + repos.
Edit: there it is https://www.githubstatus.com
Comments URL: https://news.ycombinator.com/item?id=45971741
Points: 98
# Comments: 2
Tue, 18 Nov 2025, 8:41 pm
Show HN: RowboatX – open-source Claude Code for everyday automations
Claude Code is great, but it’s focused on coding. The missing piece is a native way to build and run custom background agents for non-code tasks. We built RowboatX as a CLI tool modeled after Claude Code that lets you do that. It uses the file system and unix tools to create and monitor background agents for everyday tasks, connect them to any MCP server for tools, and reason over their outputs.
Because RowboatX runs locally with shell access, the agents can, with your explicit permission, install tools, execute code, and automate anything you could do in a terminal. It works with any compatible LLM, including open-source ones.
Our repo is https://github.com/rowboatlabs/rowboat, and there’s a demo video here: https://youtu.be/cyPBinQzicY
For example, you can connect RowboatX to the ElevenLabs MCP server and create a background workflow that produces a NotebookLM-style podcast every day from recent AI-agent papers on arXiv. Or you can connect it to Google Calendar and Exa Search to research meeting attendees and generate briefs before each event.
You can try these with: `npx @rowboatlabs/rowboatx`
We combined three simple ideas:
1. File system as state: Each agent’s instruction, memory, logs, and data are just files on disk, grepable, diffable, and local. For instance, you can just run grep -rl '"agent":""' ~/.rowboat/runs to list every run for a particular workflow (see the sketch after this list).
2. The supervisor agent: A Claude Code style agent that can create and run background agents. It predominantly uses Unix commands to monitor, update, and schedule agents. LLMs handle Unix tools better than backend APIs [1][2], so we leaned into that. It can also probe any MCP server and attach the tools to the agents.
3. Human-in-the-loop: Each background agent can emit a human_request message when needed (e.g. before drafting a tricky email or installing a tool); this pauses execution and waits for your input before continuing. The supervisor coordinates this.
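To make “file system as state” concrete, here is a minimal sketch of how runs could be inspected with ordinary Unix tools. It assumes only the ~/.rowboat/runs directory and the "agent" field mentioned above; the agent name and run file names are hypothetical, not RowboatX's actual schema.

  # list every run file for a (hypothetical) agent name
  grep -rl '"agent":"daily-arxiv-podcast"' ~/.rowboat/runs

  # skim the most recently modified run files while an agent is working
  find ~/.rowboat/runs -type f | xargs ls -t | head

  # diff two runs of the same workflow (file names are hypothetical)
  diff ~/.rowboat/runs/run-0001 ~/.rowboat/runs/run-0002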
I started my career over a decade ago building spam detection models at Twitter, spending a lot of my time in the terminal with Unix commands for data analysis [0] and Vowpal Wabbit for modeling. When Claude Code came along, it felt familiar and amazing to work with. But trying to use it beyond code always felt a bit forced. We built RowboatX to bring that same workflow to everyday tasks. It is Apache-2.0 licensed and easily extendable.
While there are many agent builders, running on the user's terminal enables unique use cases like computer and browser automation that cloud-based tools can't match. This power requires careful safety design. We implemented command-level allow/deny lists, with containerization coming next. We’ve tried to design for safety from day one, but we’d love to hear the community’s perspective on what additional safeguards or approaches you’d consider important here.
We’re excited to share RowboatX with everyone here. We’d love to hear your thoughts and welcome contributions!
—
[0] https://web.stanford.edu/class/cs124/kwc-unix-for-poets.pdf
[1] https://arxiv.org/pdf/2405.06807
[2] https://arxiv.org/pdf/2501.10132
Comments URL: https://news.ycombinator.com/item?id=45970338
Points: 101
# Comments: 39
Tue, 18 Nov 2025, 6:50 pm
Show HN: Browser-based interactive 3D Three-Body problem simulator
Features include:
- Several preset periodic orbits: the classic Figure-8, plus newly discovered 3D solutions from Li and Liao's recent database of 10,000+ orbits (https://arxiv.org/html/2508.08568v1)
- Full 3D camera controls (rotate/pan/zoom) with body-following mode
- Force and velocity vector visualization
- Timeline scrubbing to explore the full orbital period
The 3D presets are particularly interesting. Try "O₂(1.2)" or "Piano O₆(0.6)" from the Load Presets menu to see configurations where bodies weave in and out of the orbital plane. Most browser simulators I've seen have been 2D.
Built with Three.js. Open to suggestions for additional presets or features!
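For reference, the dynamics behind any such simulator are the textbook Newtonian three-body equations of motion (standard form in LaTeX notation, not code taken from the project):

  \ddot{\mathbf r}_i = \sum_{j \ne i} \frac{G\, m_j\, (\mathbf r_j - \mathbf r_i)}{\lVert \mathbf r_j - \mathbf r_i \rVert^{3}}, \qquad i \in \{1,2,3\}

The force vectors it can visualize correspond to the per-body sums on the right-hand side; the velocity vectors are the integrated \dot{\mathbf r}_i.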
Comments URL: https://news.ycombinator.com/item?id=45967079
Points: 173
# Comments: 61
Tue, 18 Nov 2025, 3:00 pm
Show HN: A subtly obvious e-paper room air monitor
In the cold season we tend to keep the windows closed. The air gets "stale": humidity often rises above 60%, which can harm our wellbeing and promote mould. At the same time the CO₂ level in the air increases, which impairs our ability to concentrate.
So I built a room air monitor that stays unobtrusive as long as everything is in the green zone, but becomes deliberately noticeable once thresholds are exceeded. Because I personally love statistics, I also visualise the measurements in a clear dashboard.
Comments URL: https://news.ycombinator.com/item?id=45962266
Points: 54
# Comments: 21
Tue, 18 Nov 2025, 7:14 am