Principal Software Engineer – Model Inference
Posted: June 19, 2025
Job Type: Permanent
Industry: Computer and Mathematical
Our client, a recognized leader in the technology sector, is seeking a highly skilled Principal Software Engineer to join their Software Engineering department in direct support of the OpenShift AI team. The ideal candidate has strong communication skills, a collaborative mindset, and a passion for innovation, and will thrive in the organization’s forward-thinking, high-impact environment.
Location & Compensation:
- Location: Raleigh, NC, or Boston, MA (This is a hybrid role.)
- Salary Range: Competitive
What’s the Job?
As a Principal Software Engineer focused on Model Inference, you will play a critical role in advancing the capabilities of AI and Machine Learning platforms. Your key responsibilities will include:
- High-Performance ML Inference Runtime Development: Leading the design, development, and maintenance of a high-quality, high-performing ML inference runtime platform. This platform is crucial for enabling multi-modal and distributed model serving at scale.
- Open-Source Community Contribution: Directly contributing to significant upstream inference runtime communities. This includes active participation in projects and libraries such as vLLM, TGI, PyTorch, OpenVINO, and other relevant open-source initiatives.
- CI/CD Pipeline Maintenance & Optimization: Maintaining and optimizing robust CI/CD (Continuous Integration/Continuous Delivery) build pipelines specifically for container images. This ensures faster, more secure, reliable, and frequent releases of ML inference components.
- Stakeholder Coordination & Communication: Effectively coordinating and communicating with various internal and external stakeholders to ensure clear project alignment, transparency, and successful delivery of AI/ML solutions.
- Continuous Learning & AI/ML Advancement: Staying current with the latest advancements in the rapidly evolving fields of Artificial Intelligence (AI) and Machine Learning (ML), and translating those insights into practical applications and improvements.
- Problem Solving: Tackling complex technical challenges related to model serving scalability, performance, and efficiency.
What’s Needed?
We’re looking for a highly experienced and technically accomplished individual with:
- Python & PyTorch Expertise: Extensive hands-on experience programming in Python and deep proficiency with PyTorch, a foundational framework for deep learning.
- Model Optimization Familiarity: Strong familiarity with critical model optimization techniques such as model parallelization, quantization, and memory optimization. This includes practical experience using relevant libraries like vLLM, TGI, and other specialized inference libraries.
- Python Packaging Experience: Proven experience with Python packaging, including building and managing PyPI libraries.
- C++ & CUDA (Bonus): Development experience with C++, especially the CUDA APIs, is a significant advantage, demonstrating capability in high-performance computing for AI.
- Model Inferencing Architectures: A solid understanding of the fundamental principles and architectural patterns behind efficient model inferencing, from single-node to distributed deployments.
What’s in it for Me?
This role offers compelling opportunities for significant professional growth and impact:
- Cutting-Edge AI/ML: The unique opportunity to work daily on cutting-edge AI and machine learning technologies, pushing the boundaries of what’s possible in the field.
- Collaborative Environment: Join a highly collaborative and inclusive work environment that fosters teamwork, innovation, and mutual support among talented engineers.
- Open-Source Contribution: A direct chance to contribute to significant open-source development communities, making a broader impact on the AI/ML ecosystem.
- Diverse Team Engagement: Engage with a diverse team of exceptionally talented engineers, fostering knowledge exchange and continuous learning.
- Professional Growth: Access to excellent professional growth and development opportunities, supporting your continuous learning journey and career advancement.
If this challenging and rewarding permanent role interests you and you’d like to learn more, click “apply now,” and a recruiter will be in touch to discuss this great opportunity. We look forward to speaking with you!
Job Features
Job Category: AI, Artificial Intelligence, Engineering, Hybrid