Posted May 31, 2026

Senior Staff Machine Learning Engineer, GenAI Platform

Job Description:
• Lead and execute the vision, strategy, and roadmap for Reddit’s large-scale GenAI Platform.
• Define the platform architecture and operating model that enable teams to build, deploy, and scale GenAI products reliably.
• Drive the strategy for a unified LLM Gateway supporting internally and externally hosted LLMs through consistent APIs and abstractions.
• Set the direction for core platform capabilities such as rate and token limit management, intelligent failover, and production resilience.
• Shape Reddit’s approach to an enterprise-grade RAG system.
• Establish the strategic direction for agentic AI workflows and tool-use patterns across the platform.
• Own the end-to-end platform strategy from concept through production adoption and long-term evolution.
• Drive MLOps and LLMOps standards across CI/CD, testing, versioning, evaluation, and lifecycle management.
• Define best practices for observability, monitoring, governance, and operational excellence across GenAI systems.
• Partner across engineering, product, and leadership to align platform investments with company priorities and user needs.
• Champion platform thinking with a strong focus on scalability, reliability, performance, and developer experience.
• Influence technical direction across teams by turning emerging AI capabilities into a scalable platform strategy.

Requirements:
• 10+ years of experience in ML Engineering, AI Platform Engineering, or Cloud AI Deployment roles.
• Have a track record of leading technical strategy and delivering AI platforms in cloud-based production environments at scale.
• Demonstrate strong execution by turning strategy into action, driving complex initiatives end to end, and consistently delivering high-quality platform outcomes.
• Bring deep experience operating Kubernetes and other orchestration systems in large-scale production environments.
• Deep experience with cloud-based technologies for supporting an ML platform, including tools like AWS, Google Cloud Storage, and infrastructure-as-code (Terraform).
• Proficiency with the common programming languages and frameworks of ML, such as Go and Python.
• Excellent communication skills, with the ability to articulate technical AI concepts to non-technical stakeholders.
• Strong focus on scalability, reliability, performance, and developer experience. You are an undying advocate for platform users and have a deep intuition for the GenAI product development lifecycle.
• Strong knowledge of model serving, inference pipelines, monitoring, and observability for AI systems is a plus.

Benefits:
• Comprehensive Healthcare Benefits and Income Replacement Programs
• 401k with Employer Match
• Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
• Family Planning Support
• Gender-Affirming Care
• Mental Health & Coaching Benefits
• Flexible Vacation & Paid Volunteer Time Off
• Generous Paid Parental Leave