Mario Brajkovski

AI Research Engineer at HUD (Founding Team)

San Francisco / Munich · Remote

I build custom AI solutions for frontier labs, working directly with product and engineering teams to deliver production systems on tight timelines. Currently at HUD building evaluation infrastructure for computer-use agents. I've shipped full-stack AI products and delivered bespoke LLM evaluation pipelines for major AI companies.

Technical Highlights

Enterprise AI Delivery: Built custom evaluation pipelines and model testing infrastructure for frontier AI labs, collaborating directly with product teams from scoping through deployment.

Full-Stack Production: Shipped end-to-end AI applications with Next.js, TypeScript, PostgreSQL, and Stripe, serving 320+ active users with sub-second latency and real-time WebRTC streaming.

Client Collaboration: Track record of on-time delivery while embedded with client engineering and PM teams on complex, deadline-driven AI projects.

RL & Evaluation Systems: Designed full-scale reinforcement learning environments for next-generation AI models, including reward hacking analysis and prevention.

Safety & Red-teaming: Identified failure modes, alignment issues, and adversarial vulnerabilities in autonomous agents through systematic red-teaming.

Now

HUD San Francisco · Jul 2025–present

Founding Engineer at a YC-backed company. Building evaluation infrastructure and RL environments for computer-use agents. Embedded with frontier AI labs to deliver custom evaluation systems: scoping requirements directly with product managers, iterating on solutions, and shipping production-ready tooling on tight deadlines.

Client Engagements

Frontier Lab (2025): Designed and delivered custom LLM evaluation infrastructure for an unreleased model. Worked directly with product managers from requirements gathering through deployment. Delivered on schedule.

Frontier Lab (2025): Built full-scale RL environments for next-generation AI model training. Designed reward hacking analysis and prevention systems, red-teaming infrastructure, and alignment testing frameworks.

Projects

Tutor.mk

AI tutoring platform for the Macedonian curriculum serving 320+ students. Full-stack TypeScript with Next.js, PostgreSQL, Stripe integration, and custom LLM prompting for curriculum-aligned responses.

TwoPeas

Voice-first AI companion with sub-second end-to-end latency. Custom WebRTC implementation with SDP negotiation and voice activity detection (VAD), plus a persistent memory system. Deployed on Firebase with real-time audio streaming.

Tutorist

Real-time tutoring with Live2D avatars and the OpenAI Realtime API. Custom WebRTC audio session management, dynamic tool injection, and emotion-mapped avatar expressions. TypeScript throughout, with Terraform for infrastructure as code.

Before

HanseMerkur Munich · 2024–25

DevOps. On-prem to AWS EKS migration, Crossplane.

Joyn Munich · 2019–23

Site Reliability. Kafka, Vault, observability stack.

Research

Confidential LLM inference in trusted execution environments

Demonstrated practical privacy-preserving ML using AMD SEV-SNP encrypted VMs with full memory encryption and attestation. Achieved <20% performance overhead on a 110M-parameter BERT model, showing a viable path for confidential AI in production environments.
