Software Engineer, Data Platform Infrastructure (4-7 years)
Job Description
Senior Data Platform Engineer (AI / Data Fabric / Iceberg Lakehouse)
Location: Boston, MA (Onsite 4 days/week)
About Our Client
Our client is a global investment firm building foundational AI, data, and platform capabilities to enable scale and business transformation. Their platform organization is responsible for evolving technology into robust, reusable, platform-based solutions that increase agility and deliver material business impact across the firm.
The Team
You'll join a core AI, Data, and Platform Technologies engineering team that architects and supports the firm's foundational data and AI capabilities. This team is a key enabler for firm-wide services—designing platform primitives that power AI and investment workflows at enterprise scale.
The Role
Our client is hiring a Senior Engineer to take a hands-on technical leadership role in Boston. You will help design and build world-class data engineering capabilities that run large-scale pipelines, apply AI-powered insights and document extraction, and integrate with a diverse set of cloud databases.
You'll define the technical blueprint for how the firm structures, stores, governs, and leverages data to support critical AI and investment platforms, ensuring integrity, performance, and accessibility at scale. This is an onsite role: you are expected to be in the Boston office 4 days per week.
What You'll Do
- Lead the architecture, buildout, and modernization of a unified data fabric, establishing scalable patterns for access, interoperability, governance, and productization across business and technology teams
- Own the design and evolution of Apache Iceberg-based data platform capabilities: ingestion/egress, replication, streaming, performance tuning, lifecycle management, and adoption standards for analytical and operational use cases
- Define and implement compute architecture across stateful, stateless, and distributed processing layers—balancing performance, resiliency, scalability, and cost
- Design and drive adoption of event-driven patterns for real-time ingestion and data movement with low-latency, reliable, observable flows
- Extend platform capabilities to support the firm's AI/ML ecosystem, including curated datasets, feature-ready pipelines, training/inference data services, and integration points for model development and deployment
- Translate business and engineering priorities into a clear technical roadmap—sequencing platform investments for long-term value
- Serve as a senior engineering lead across strategic platform domains, partnering with app engineering, enterprise architecture, data consumers, and ML stakeholders
- Establish standards for data quality, observability, lineage, governance, security, and operational excellence across batch and streaming environments
- Mentor junior engineers and raise the bar on design rigor, implementation quality, and operational ownership
- Evaluate emerging technologies, run PoCs, and recommend production-ready solutions aligned with target-state architecture
What Our Client Is Looking For
- Bachelor's degree in CS/Engineering/IS (advanced degree preferred)
- 4+ years in data engineering, distributed systems, or platform engineering, operating at a senior technical leadership level
- Proven platform mindset: designed/delivered enterprise-scale data platforms or lakehouse ecosystems with emphasis on scalability, reliability, governance, and developer enablement
- Deep hands-on experience with Apache Iceberg and modern open table formats (modeling, partitioning, tuning, metadata management, operational best practices)
- Strong understanding of distributed compute architectures (stateful/stateless processing, orchestration, fault tolerance, performance optimization)
- Experience implementing event-driven/streaming architectures using modern messaging and data movement patterns
- Experience enabling AI/ML platform capabilities (pipeline design, feature/data prep, integration with model development or production ML systems)
- Strong proficiency in Python and SQL; Java/Scala and modern data processing frameworks highly desirable
- Cloud-native/containerized experience, including orchestration, infra automation, and observability tooling
- Ability to operate as a senior technical decision-maker: influence architecture, drive execution through others, and partner with technical and non-technical stakeholders
- Highly desirable: experience in AI Harness engineering
Why This Role
- Own core architectural decisions for a firm-wide data fabric and lakehouse platform
- Work at the intersection of high-scale data engineering and AI enablement
- Build real-time and batch systems with strong governance, lineage, and production standards
- High-visibility, high-leverage platform work that compounds across many business units
