About the role
The MTS, AI Engineering role at Micron Technology focuses on developing and optimizing AI/ML solutions for manufacturing processes, including architecting large-scale model training, optimizing GPU performance, designing autonomous AI agents, and collaborating with hardware architects on next-generation GPU features. This senior-level position requires deep expertise in GPU architecture, distributed training, and AI agent development, with 9+ years of experience in performance optimization and parallel computing.
IDM · Full-time · Smart MFG/AI
Key Responsibilities
- Architect and execute large-scale custom model training and fine-tuning jobs (SFT, RLHF) on multi-node, multi-GPU clusters
- Optimize training throughput and memory efficiency using distributed training strategies (FSDP, DeepSpeed, Megatron-LM) and mixed-precision techniques (FP16/BF16); a minimal FSDP/BF16 sketch follows this list
- Design and develop autonomous AI Agents capable of multi-step reasoning, planning, and tool execution to automate complex manufacturing workflows (see the agent-loop sketch after this list)
- Analyze and profile complex workloads (e.g., LLM training, rendering pipelines) to identify bottlenecks in compute, memory bandwidth, and latency (see the profiling sketch after this list)
- Write and optimize high-performance kernels using CUDA, HIP, or custom assembly (PTX/SASS) to unlock hardware capabilities
- Collaborate with Hardware Architects to define features for next-generation GPUs based on workload characterization
- Design and implement performance regression testing suites to catch degradations in drivers or compilers
- Mentor junior engineers on parallel programming paradigms and optimization techniques
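Illustrative sketch for the distributed-training responsibility above: wrapping a model in PyTorch FSDP with a BF16 mixed-precision policy. The placeholder model, torchrun-style launch, and hyperparameters are assumptions for illustration, not a prescribed stack.

```python
# Minimal FSDP + BF16 mixed-precision sketch (assumes a torchrun launch that sets
# RANK/WORLD_SIZE/LOCAL_RANK and one CUDA device per process).
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision

def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; a real training job would load an LLM here.
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
    ).cuda()

    bf16 = MixedPrecision(
        param_dtype=torch.bfloat16,    # compute and shard parameters in BF16
        reduce_dtype=torch.bfloat16,   # gradient reduction in BF16
        buffer_dtype=torch.bfloat16,
    )
    model = FSDP(model, mixed_precision=bf16, device_id=local_rank)

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(8, 4096, device="cuda")
    loss = model(x).float().pow(2).mean()            # dummy loss for illustration
    loss.backward()
    optim.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

A job like this would typically be launched on each node with something like `torchrun --nproc_per_node=<num_gpus> train_fsdp.py`.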
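Illustrative sketch for the workload-profiling responsibility: using torch.profiler to break a few training steps into CUDA kernel time and memory activity. The toy model and step counts are assumptions.

```python
# Sketch: profiling a few training steps to surface compute/memory hotspots.
import torch
from torch.profiler import profile, ProfilerActivity, schedule

model = torch.nn.Linear(2048, 2048).cuda()
optim = torch.optim.SGD(model.parameters(), lr=1e-3)
data = torch.randn(64, 2048, device="cuda")

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    schedule=schedule(wait=1, warmup=1, active=3),   # skip warm-up noise
    profile_memory=True,
    record_shapes=True,
) as prof:
    for _ in range(5):
        loss = model(data).pow(2).mean()
        loss.backward()
        optim.step()
        optim.zero_grad()
        prof.step()                                  # advance the profiler schedule

# Rank kernels by GPU time to judge whether the step is compute- or bandwidth-bound.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```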
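Illustrative sketch for the AI-agent responsibility: the core propose/execute/observe loop behind multi-step tool use. The `call_llm` helper and the `lookup_yield` tool are hypothetical stand-ins for a real LLM endpoint and manufacturing API (any of the frameworks named below could supply them), stubbed here so the example runs.

```python
# Minimal agent-loop sketch: the model proposes a tool call, the loop executes it,
# and the observation is fed back until the model returns a final answer.
# `call_llm` and `lookup_yield` are hypothetical stubs, not a real API.
import json

TOOLS = {
    "lookup_yield": lambda lot_id: {"lot": lot_id, "yield_pct": 97.2},  # toy tool
}

def call_llm(messages):
    # Stub: a real implementation would call an LLM with function calling enabled.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_yield", "arguments": {"lot_id": "L123"}}
    return {"final": "Lot L123 yield is 97.2%."}

def run_agent(task, max_steps=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = call_llm(messages)
        if "final" in decision:                       # model is done reasoning
            return decision["final"]
        result = TOOLS[decision["tool"]](**decision["arguments"])  # execute the tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Step budget exhausted."

print(run_agent("What is the yield of lot L123?"))
```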
Requirements
- Technical degree required; a Ph.D. in Computer Science or a Statistics background is highly desired
- Deep understanding of GPU architecture (memory hierarchy, tensor cores, interconnects like NVLink) and experience managing GPU resources in both cloud environments and on-prem
- Hands-on experience with Distributed Data Parallel (DDP), Fully Sharded Data Parallel (FSDP), and model parallelism techniques
- Proficiency in fine-tuning Large Language Models using PEFT techniques (LoRA, QLoRA) and optimizing inference engines (vLLM, TensorRT-LLM); LoRA and vLLM sketches follow this list
- Experience developing GenAI applications and AI Agents using frameworks like LangChain, LangGraph, LlamaIndex, or AutoGen
- Proficiency with Large Language Models (LLMs), including prompt engineering, function calling/tool use, and Chain-of-Thought (CoT) reasoning
- Experience building and operating end-to-end ML systems that automate the training, testing, and deployment of machine learning models
- Familiarity with machine learning frameworks (PyTorch required; TensorFlow, scikit-learn, etc.)
- Software development skills and the desire to work on cutting-edge development in a cloud environment
- Strong scripting and programming skills in Python or Java (Python preferred)
- Experience with continuous integration/continuous delivery (CI/CD) tools (Jenkins, Git, Docker, Kubernetes)
- 9+ years of experience in performance optimization, parallel computing, or low-level systems programming
- Deep expertise in C++ and at least one GPGPU framework (CUDA is preferred, but HIP/OpenCL/Metal are acceptable)
- Outstanding analytical thinking and interpersonal, oral, and written communication skills
- Ability to prioritize and meet critical deadlines
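Illustrative sketch for the PEFT requirement above: attaching LoRA adapters to a causal LM with Hugging Face PEFT. The base checkpoint and adapter hyperparameters are placeholders, not recommendations.

```python
# Sketch: LoRA fine-tuning setup with Hugging Face PEFT (placeholder model/values).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "facebook/opt-350m"                       # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

lora_cfg = LoraConfig(
    r=16,                                        # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],         # attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()               # only the adapter weights train
# From here, training proceeds with a standard Trainer/SFT loop over the adapter params.
```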
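Illustrative sketch for the inference-engine requirement: batched offline generation with vLLM. The checkpoint, prompts, and sampling parameters are placeholders.

```python
# Sketch: batched offline inference with vLLM (placeholder checkpoint and settings).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-350m")             # placeholder checkpoint
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = [
    "Summarize the last 24 hours of etch-tool alarms:",
    "Explain the difference between FP16 and BF16 training:",
]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```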
