I’ve been spending some time this weekend hacking on a side project I call Vibe Check. The idea is that instead of doing agentic code review, you let the agents check your code by actually running your software.
The long-term question is: if these agents get good enough, can these checks basically substitute for review and verification that the software does what it’s expected to do? If so, you could bypass the review step for some portion of your code.
In some ways it’s like a new type of CI platform, but everything happens inside a sandbox, and the agent is the runtime. It makes decisions like “given this particular change, how would I run this code?” and then invokes a series of skills.
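To make that concrete, here’s a rough sketch of the shape of that loop. None of this is actual Vibe Check code: the `Change` type, the three skill names, and `plan_checks` are all made up for illustration, and the file-path rules are only a stand-in for the decision a real agent would make by reasoning about the change.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Change:
    """Hypothetical description of a change: just the paths it touched."""
    files: list[str]


# Hypothetical skills. A real skill would run something inside the sandbox;
# these only return a one-line report so the sketch stays self-contained.
def run_unit_tests(change: Change) -> str:
    return f"ran unit tests covering {len(change.files)} file(s)"


def boot_web_app(change: Change) -> str:
    return "started the app and loaded the affected pages"


def call_cli(change: Change) -> str:
    return "invoked the CLI with sample arguments"


SKILLS: dict[str, Callable[[Change], str]] = {
    "run_unit_tests": run_unit_tests,
    "boot_web_app": boot_web_app,
    "call_cli": call_cli,
}


def plan_checks(change: Change) -> list[str]:
    """Stand-in for the agent's "how would I run this code?" decision.

    Assumption: simple path rules replace whatever the agent would
    actually do to pick skills for a given change.
    """
    plan = ["run_unit_tests"]
    if any(f.endswith((".html", ".css", ".tsx")) for f in change.files):
        plan.append("boot_web_app")
    if any(f.startswith("cli/") for f in change.files):
        plan.append("call_cli")
    return plan


def vibe_check(change: Change) -> list[str]:
    """Run each planned skill in order and collect its report."""
    return [SKILLS[name](change) for name in plan_checks(change)]
```

In this toy version, a change touching `cli/main.py` gets the unit tests plus the CLI skill, while a change to `web/index.html` gets the unit tests plus the web app skill.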
It’s been pretty interesting to play with. Not sure if it goes anywhere.