Job Description
Our client is building a cutting-edge AI platform for autonomous system design and operations. The platform understands workload characteristics, adapts to dynamic environments, and continuously optimizes multiple layers of the AI stack to meet performance and cost objectives. It can design and deliver system optimizations 10–100x faster than human experts, compressing weeks of effort into just hours.
The technology is grounded in over a decade of award-winning research originating from the Networks and Mobile Systems group at MIT CSAIL.
The company was founded to commercialize this technology, with an initial focus on AI inference optimization, spanning the inference engine as well as higher-level orchestration layers such as scheduling, routing, autoscaling, and GPU selection.
As one example, the platform recently discovered a novel request-routing algorithm for distributed LLM serving, achieving more than a 10x reduction in latency slowdown while cutting GPU costs by over 20%. The solution was discovered and validated in simulation in just two hours, a task that would typically take an experienced systems researcher weeks.
About the Role
Our client is hiring an AI Systems Engineer who combines strong systems engineering instincts with a genuine enthusiasm for AI. This is not a role for someone who is skeptical of AI’s role in systems development; the ideal candidate actively embraces AI as a collaborator and a tool for discovering new ideas in computer and networked systems.
In this role, you will help design, build, and ship the first versions of the company’s product. You’ll work closely with the core engineering team to translate deep research into robust, production-ready systems, with the broader goal of transforming how computer systems are designed and optimized using AI.
What You’ll Do
- Design, build, and deploy core components of the AI-driven optimization stack.
- Collaborate with the founders to translate research insights into real, usable systems.
- Own critical pieces of the platform: routing, scheduling, autoscaling, hardware selection, inference engine integrations, and more.
- Build prototypes quickly, validate them, and iterate based on user and customer feedback.
- Help establish technical architecture, coding standards, and best practices for a high-velocity engineering culture.
- Set up development workflows, infrastructure, tooling, and deployment pipelines.
- Evaluate third-party tools/services and make key build vs. buy decisions.
- Document design and development work and help establish the foundation for future engineering hires.
What We’re Looking For
Because this field is new, we don’t expect years of experience. We’re looking for engineers with great systems intuition, intellectual curiosity, and a strong desire to embrace AI-driven engineering.
Strong signals include:
- Experience with AI inference, distributed serving, model optimization, and related topics.
- Hands-on experience with vLLM, PyTorch, NVIDIA Triton Inference Server, Ray Serve, or other inference frameworks.
- Experience with GPU optimization, kernel design, CUDA programming, and low-level systems performance engineering.
- Strong Python and PyTorch fundamentals.
- Comfort building large-scale, high-performance distributed systems.
- PhD or equivalent experience (a PhD or equivalent plus ~5 years is a good sweet spot).
- A dislike for rigid, bureaucratic engineering processes, paired with an appreciation for disciplined engineering rigor.
