
I’m Head of Agentic AI and a lead research engineer at Dynamo AI (YC 22). Currently I focus on building AgentWarden: a product that detects agentic risk vectors, provides guardrails and tooling to reduce that risk, and includes an observability tool with the intelligence to flag when things are going wrong. I have spent 2 years at Dynamo working on (synthetic) data flywheels, evaluations, and training (SFT / RL), all focused on creating efficient and aligned custom guardrailing and judge models. What makes it hard (and thus fun) is that the objectives are subjective, under-specified in natural language, and require iterative human-model alignment through extensive evals.
Before joining Dynamo I worked on the RL for Combinatorial Optimization and Code Generation teams at Qualcomm AI Research in Amsterdam. I studied Artificial Intelligence at the University of Amsterdam, specializing in Reinforcement Learning, and spent 9 months at the Amsterdam Machine Learning Lab with Prof. Herke van Hoof.
Projects I am most proud of:
- Built Dynamo’s output guardrail offering and team from the ground up into a mature, high-demand product. I touched every part of the stack: working with PMs on defining evaluation sets, setting up annotation procedures and feedback loops, synthetic data generation, training, and post-training interventions for greater customizability and efficient inference. The product is used by a few Fortune 500 companies (1, 2, 3) to safeguard their AI deployments.
- Together with my team at Qualcomm, we achieved SOTA on The Abstraction and Reasoning Challenge (ARC) with a ~220M-parameter language model by combining hindsight relabeling of erroneous programs with learning from prioritized hindsight replay (ICML ’24 paper). Despite it being more or less a dead end, I am also proud of our attempt to use MCTS as a neurally guided search method for language model decoding, providing a natural curriculum for learning to write simple programs in a zero-human-data regime (ICML ’24 workshop paper).
- Demonstrated that (hierarchical) RL can mitigate congestion in power grids up to 6x more efficiently than a physics-based simulator and that hierarchical policies can outperform non-hierarchical ones. Wrote a paper about it.
Outside of work, I love endurance sports and the science behind achieving peak human performance. I swim, bike, and run, and like middle-distance training (70.3 IM) the most. I have a sub-10 Ironman race under my belt and want to go sub-9 at some point. I lack time for other sports, but I do enjoy them: despite failing at learning to surf, I am not giving up :)
Sometimes I write about stuff; you can read it here: /posts/.
Contact: if you’d like to chat about AI, go for a bike ride, or grab coffee, send me a DM on X / LinkedIn / Strava.
Updates
- December 2025: Promoted to Head of Agentic AI at Dynamo, overseeing all of our agentic products.
- December 2025: Presented our paper at NeurIPS in San Diego.
- November 2025: Personal: took part in and crashed the Ironman 70.3 World Championship.
- November 2025: Introducing AgentWarden, a product where I am a lead research engineer.