RL environment design for computer-use agents. Reward shaping, reward hacking detection, trajectory collection for browser-based tasks.
Mario Brajkovski
Building RL environments and evaluation systems for frontier AI agents
North Macedonia · Remote
Founding engineer at HUD, building RL training environments and evaluation infrastructure for computer-use agents. Co-author of ZeroDayBench (ICLR 2026), a benchmark for evaluating whether LLM agents can autonomously discover and patch zero-day vulnerabilities. Most of my time goes into the infrastructure that makes models better: RL environments, reward design that holds up against reward hacking, and evals that actually mean something.
What I Work On
Evaluation infrastructure for frontier models. Figuring out what to measure and how to measure it without fooling yourself.
Agent safety and red-teaming. Trying to break autonomous agents before they go to production.
Research
Can frontier LLMs autonomously find and patch real vulnerabilities? We tested GPT-5.2, Claude Sonnet 4.5, and Grok 4.1 on 22 novel critical zero-days in open-source codebases. Short answer: not yet. But the failure modes are telling. A couple of frontier labs are now looking at running it internally.
Privacy-preserving ML using AMD SEV-SNP encrypted VMs. Full memory encryption and attestation with <20% overhead on 110M-param BERT.
Selected Work
RL environments for training computer-use agents. Reward hacking analysis and prevention, red-teaming infrastructure, alignment testing.
Eval infrastructure for an unreleased frontier model. Designed the evals from scratch, including the scoring and validation.
Eval environment for AI agents on competitive programming. Anti-cheating grading server, 21 problem scenarios with hidden test cases.
Writing
Projects
AI companion with a Live2D avatar that talks, emotes, and moves in sync with the conversation. Sub-second voice latency, persistent memory.
Background
2019–2025: Infrastructure and reliability engineering at Joyn (streaming platform, 4 yrs) and HanseMerkur (insurance). Distributed systems, large-scale ops.