Micron Technology

MTS, AI Engineering, SMAI

Integrated Device Manufacturing · Singapore, Singapore · Onsite · Posted 4 weeks ago

About the role

Join the Smart Manufacturing and AI team at Micron Technology to deliver industry-leading machine learning, custom GenAI, and Agentic AI solutions. This role focuses on optimizing GPU performance for large-scale AI/ML workloads to drive value from Micron’s manufacturing processes and systems.


Key Responsibilities

  • Architect and execute large-scale custom model training and fine-tuning jobs (SFT, RLHF) on multi-node, multi-GPU clusters.
  • Optimize training throughput and memory efficiency using distributed training strategies (FSDP, DeepSpeed, Megatron-LM) and mixed-precision techniques (FP16/BF16).
  • Design and develop autonomous AI Agents capable of multi-step reasoning, planning, and tool execution to automate complex manufacturing workflows.
  • Analyze and profile complex workloads (e.g., LLM training, rendering pipelines) to identify bottlenecks in compute, memory bandwidth, and latency.
  • Write and optimize high-performance kernels using CUDA, HIP, or custom assembly (PTX/SASS) to unlock hardware capabilities.
  • Collaborate with Hardware Architects to define features for next-generation GPUs based on workload characterization.
  • Design and implement performance regression testing suites to catch degradations in drivers or compilers.
  • Mentor junior engineers on parallel programming paradigms and optimization techniques.
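As a flavor of the distributed-training and mixed-precision work described above, the sketch below shows a minimal single-process BF16 training step in PyTorch. It is illustrative only: the model, shapes, and learning rate are placeholders, and on a real multi-node cluster the model would additionally be wrapped in FSDP (or driven by DeepSpeed/Megatron-LM) after initializing a process group.

```python
# Minimal mixed-precision (BF16) training-step sketch in PyTorch.
# Assumptions: single process stands in for a multi-GPU FSDP job;
# model, data, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

x = torch.randn(8, 16, device=device)
y = torch.randn(8, 1, device=device)

for _ in range(5):
    opt.zero_grad()
    # Forward pass runs eligible ops in BF16; master weights stay FP32.
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```

In a real job, the `model = ...` line is where FSDP sharding would be applied (e.g., `FullyShardedDataParallel(model)` after `torch.distributed.init_process_group`), which is what makes the memory-efficiency techniques in the bullet above pay off at scale.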

Requirements

  • 9+ years of experience in performance optimization, parallel computing, or low-level systems programming.
  • Deep understanding of GPU architecture (memory hierarchy, tensor cores, interconnects like NVLink) and experience managing GPU resources in cloud/on-prem environments.
  • Hands-on experience with Distributed Data Parallel (DDP), Fully Sharded Data Parallel (FSDP), and model parallelism techniques.
  • Proficiency in fine-tuning Large Language Models using PEFT techniques (LoRA, QLoRA) and optimizing inference engines (vLLM, TensorRT-LLM).
  • Experience developing GenAI applications and AI Agents using frameworks like LangChain, LangGraph, LlamaIndex, or AutoGen.
  • Proficiency with Large Language Models (LLMs), including prompt engineering, function calling/tool use, and Chain-of-Thought (CoT) reasoning.
  • Experience building and operating end-to-end ML systems that automate the training, testing, and deployment of machine learning models.
  • Strong scripting and programming skills in Python or Java (Python preferred).
  • Deep expertise in C++ and at least one GPGPU framework (CUDA preferred, HIP/OpenCL/Metal acceptable).
  • Technical degree required; Ph.D. in Computer Science or Statistics highly desired.
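To make the PEFT requirement concrete, the sketch below implements the core LoRA idea from scratch in PyTorch: freeze a base linear layer and learn only a low-rank update B·A added to its output. The rank, alpha, and layer sizes are illustrative assumptions, not values from this posting; in practice a library such as Hugging Face PEFT would apply this to an LLM's attention projections.

```python
# Minimal from-scratch LoRA sketch (illustrative rank/alpha/sizes).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base Linear plus a trainable low-rank update (scale * B @ A)."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False           # base weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        # B is zero-initialized, so training starts exactly at the base model.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
```

Only the adapter matrices A and B receive gradients, which is why LoRA/QLoRA fine-tuning fits on far less GPU memory than full fine-tuning.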