artificial intelligence (AI) consultant

Title posted on CareerBeacon - Staff Platform Engineer - AI Infrastructure

Posted on April 25, 2026 by Employer details Medium

Job details

About the Role

As a Staff Platform Engineer - AI Infrastructure, you will build and scale the infrastructure behind Paytm's AI inference platform, serving internal teams and enterprise customers and supporting new customer use cases from the ground up. You will own GPU infrastructure, model hosting and serving, and multi-model routing across modalities. This includes running our own coding and domain-specific models (voice, vision, risk, fintech workflows) as well as third-party models on shared GPU and accelerator clusters.

You will also build self-service platforms that let teams provision compute, deploy and customize models, and manage resources through APIs and control planes, so they can use AI without rebuilding infrastructure each time.

Your work will form the AI control plane for Paytm Intelligence (Pi): policy-driven routing, quotas, observability, and usage and cost visibility. It will directly affect how fast we ship agents and AI features, how reliably they run, and how efficiently we use our hardware across payments, risk, fraud, collections, support, and developer experience.

What You'll Do

- Design and operate GPU infrastructure for model hosting, including provisioning, scheduling, and cost optimization across cloud and on-premise environments
- Build and scale model serving systems using vLLM, TensorRT-LLM, Triton, or equivalent, supporting real-time inference with strong latency and availability guarantees
- Implement multi-model routing to serve multiple models across modalities (text, voice, code, vision) on shared infrastructure
- Own the model lifecycle end to end: download, deploy, serve, monitor, swap, and scale
- Drive inference optimization, including quantization strategies (AWQ, GPTQ), batching, caching, and cold start reduction
- Build self-service infrastructure platforms where teams provision compute, storage, and model endpoints through APIs and control planes
- Implement infrastructure-as-code at scale using Terraform, Pulumi, or CDK
- Build observability and reliability for inference systems: SLIs/SLOs, GPU utilization monitoring, latency tracking, automated capacity planning, and alerting
- Define platform standards and governance, including multi-tenant isolation, cost attribution, and resource quotas
- Lead architectural design and influence engineering direction across the AI infrastructure stack

What You'll Bring

- 8+ years of software engineering experience, including 3+ years building infrastructure platforms or ML/AI infrastructure
- Deep experience with cloud infrastructure (AWS, GCP) and Kubernetes
- Hands-on experience with GPU workloads and model serving (vLLM, TensorRT-LLM, Triton, or similar)
- Strong software engineering fundamentals in Python, Go, or C++
- Experience with infrastructure-as-code (Terraform, Pulumi, CDK)
- Experience designing self-service platforms or internal developer tooling
- Understanding of model optimization: quantization, batching, serving architectures
- Proven ability to lead complex cross-team technical initiatives
- Strong communication skills and the ability to influence technical direction

Nice to Have

- Experience building or operating inference infrastructure at scale
- Experience with CUDA, GPU scheduling, or hardware-level optimization
- Experience with multi-model serving across different modalities
- Experience with edge inference or on-device model deployment
- Experience with model fine-tuning infrastructure (LoRA, QLoRA, PEFT)
- Background in fintech or regulated industries

Go Big or Go Home!

Paytm Labs believes in diversity and equal opportunity, and we will not tolerate any form of discrimination or harassment. Our people are critical to our success, and we know the more inclusive we are, the better our work will be.

Paytm Labs is committed to meeting the accessibility needs of all individuals in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code (OHRC). Should you require accommodations during the recruitment and selection process, please let us know.

Location Toronto, ON
Work location On site
Salary $26.00 to $78.00 hourly
Terms of employment Permanent employment, Full time
Starts as soon as possible
Source CareerBeacon #3120883

Advertised until

2026-05-24

Important notice: This job posting has been provided by a partner site. Job Bank is not responsible for this content.

Job market information

artificial intelligence (AI) consultant NOC 21211 Toronto Region

Median wage: $46.33/hour

Page details

Date modified: 2026-04-21
