Our premier AI/ML client is seeking an AI Compiler Architect to lead the design and development of cutting-edge compiler architectures for their NPUs (Neural Processing Units). The role also covers end-to-end pipelines for training and fine-tuning a range of state space models, as well as a distillation pipeline for Large Language Models (LLMs). The ideal candidate will have 3+ years of experience in a similar role, be well-versed in embedded systems or edge-based AI, and possess a strong professional background in software engineering.
Responsibilities:
- Define, architect, and optimize compilers for NPUs, including code generation and optimization techniques for AI workloads.
- Work closely with hardware and R&D teams to ensure seamless integration of compiler features and hardware capabilities.
- Architect and implement scalable pipelines for model training, fine-tuning, and distillation of LLMs.
- Work cross-functionally with the ML Scientist, Software, and Hardware Design teams to define project requirements, architectural decisions, and timelines.
- Develop compiler optimizations and passes that convert high-level AI models (e.g., from TensorFlow, PyTorch) into intermediate representations (IR).
- Adapt and optimize AI models for deployment on embedded systems and edge devices, focusing on performance, memory footprint, and power consumption.
- Provide technical leadership and mentorship to junior engineers and developers.
- Stay current with the latest trends in compiler design, AI frameworks, and hardware acceleration.
Requirements:
- Master’s degree in Computer Science, Computer Engineering, or a related field (Ph.D. preferred).
- Experience: 3+ years of hands-on work in compiler architecture, AI model training pipelines, and/or related embedded systems development.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and related libraries.
- Strong background in compiler internals (e.g., TVM, XLA, or similar) and code optimization strategies.
- Proven experience in designing and maintaining training pipelines for large-scale machine learning models.
- Understanding of quantization, pruning, and distillation techniques for AI model optimization.
- Embedded Systems & Edge AI: Demonstrated experience with low-level programming and embedded or edge-based deployments of AI models.
- Software Engineering Proficiency: Strong skills in at least one systems language (C, C++) and one scripting language (Python, Bash). Version control (Git) and continuous integration (CI) knowledge is a plus.
Preferred Qualifications:
- Proven track record of leading complex, multidisciplinary projects from concept to completion.
- Experience with hardware abstraction layers or hardware accelerators beyond NPUs (e.g., GPUs, FPGAs).
- Prior work on AI model optimization and deployment in resource-constrained environments.
- Contributions to open-source compiler or AI projects.
Immigration Requirements:
This role will consider an H-1B transfer for a strong candidate currently residing in the United States.
Relocation Assistance:
This position will provide relocation assistance for the selected candidate.
#machinelearning #ml #artificialintelligence #ai #compiler #PyTorch #TensorFlow #npu #llm