AI Operations Platform Consultant
Job Description:
- The candidate brings extensive experience operating large-scale GPU-accelerated AI platforms, deploying and managing LLM inference systems on Kubernetes, with strong expertise in Triton Inference Server and TensorRT-LLM.
- They have repeatedly built and optimized production-grade LLM pipelines with GPU-aware scheduling, load balancing, and real-time performance tuning across multi-node clusters. Their background includes designing containerized microservices, implementing robust deployment workflows, and maintaining operational reliability in mission-critical environments.
- They have led end-to-end LLMOps processes involving model versioning, engine builds, automated rollouts, and secure runtime controls.
- The candidate has also developed comprehensive observability for inference systems, using telemetry and custom dashboards to track GPU health, latency, throughput, and service availability.
- Their work consistently incorporates advanced optimization methods such as mixed precision, quantization, sharding, and batching to improve efficiency. Overall, they bring a strong blend of platform engineering, AI infrastructure, and hands-on operational experience running high-performance LLM systems in production.
Basic Info:
- Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes for mission-critical applications (OpenShift)
- Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
- Managing MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
- Setup and operation of AI inference service monitoring for performance and availability.
- Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
- Operation and support of MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
- Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
- Managing scalable infrastructure for deploying and managing LLMs
- Deploying models in production environments, including containerization, microservices, and API design
- Knowledge of Triton Inference Server, including its architecture, configuration, and deployment
- Model optimization techniques using Triton Inference Server with TensorRT-LLM
- Model optimization techniques, including pruning, quantization, and knowledge distillation