AI Operations Platform Consultant
Job Description:
- The candidate brings extensive experience operating large-scale GPU-accelerated AI platforms, deploying and managing LLM inference systems on Kubernetes, with strong expertise in Triton Inference Server and TensorRT-LLM.
- They have repeatedly built and optimized production-grade LLM pipelines with GPU-aware scheduling, load balancing, and real-time performance tuning across multi-node clusters. Their background includes designing containerized microservices, implementing robust deployment workflows, and maintaining operational reliability in mission-critical environments.
- They have led end-to-end LLMOps processes involving model versioning, engine builds, automated rollouts, and secure runtime controls.
- The candidate has also developed comprehensive observability for inference systems, using telemetry and custom dashboards to track GPU health, latency, throughput, and service availability.
- Their work consistently incorporates advanced optimization methods such as mixed precision, quantization, sharding, and batching to improve efficiency. Overall, they bring a strong blend of platform engineering, AI infrastructure, and hands-on operational experience running high-performance LLM systems in production.
Basic Info:
- AI Operations Platform Consultant
- Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
- Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server.
- Managing, operating, and supporting MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
- Setup and operation of AI inference service monitoring for performance and availability.
- Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring, load balancing, etc.
- Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
- Managing scalable infrastructure for deploying and managing LLMs
- Deploying models in production environments, including containerization, microservices, and API design
- Triton Inference Server, including its architecture, configuration, and deployment.
- Model optimization techniques using Triton with TensorRT-LLM
- Model optimization techniques, including pruning, quantization, and knowledge distillation