Our premier AI/ML client is seeking an AI Compiler Architect to lead the design and development of cutting-edge compiler architectures for their NPUs (Neural Processing Units), end-to-end pipelines for training and fine-tuning a range of state space models, and a distillation pipeline for Large Language Models (LLMs). The ideal candidate will have 3+ years of experience in a similar role, be well-versed in embedded systems or edge-based AI, and possess a strong professional background in software engineering.

Responsibilities:

Define, architect, and optimize compilers for NPUs, including code generation and optimization techniques for AI workloads.
Work closely with hardware and R&D teams to ensure seamless integration of compiler features and hardware capabilities.
Architect and implement scalable pipelines for model training, fine-tuning, and distillation of LLMs.
Work cross-functionally with the ML Scientist, Software, and Hardware Design teams to define project requirements, architectural decisions, and timelines.
Develop compiler optimizations and passes that convert high-level AI models (e.g., from TensorFlow, PyTorch) into intermediate representations (IR).
Adapt and optimize AI models for deployment on embedded systems and edge devices, focusing on performance, memory footprint, and power consumption.
Provide technical leadership and mentorship to junior engineers and developers.
Stay current with the latest trends in compiler design, AI frameworks, and hardware acceleration.

Requirements:

Master’s degree in Computer Science, Computer Engineering, or a related field (Ph.D. preferred).
Experience: 3+ years of hands-on experience in compiler architecture, AI model training pipelines, and/or related embedded systems development.
Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and related libraries.
Strong background in compiler internals (e.g., TVM, XLA, or similar) and code optimization strategies.
Proven experience in designing and maintaining training pipelines for large-scale machine learning models.
Understanding of quantization, pruning, and distillation techniques for AI model optimization.
Embedded Systems & Edge AI: Demonstrated experience with low-level programming and embedded or edge-based deployments of AI models.
Software Engineering Proficiency: Strong skills in at least one systems language (C, C++) and one scripting language (Python, Bash). Version control (Git) and continuous integration (CI) knowledge is a plus.

Preferred Qualifications:

Proven track record of leading complex, multidisciplinary projects from concept to completion.
Experience with hardware abstraction layers or hardware accelerators beyond NPUs (e.g., GPUs, FPGAs).
Prior work on AI model optimization and deployment in resource-constrained environments.
Contributions to open-source compiler or AI projects.

Immigration Requirements:

This role will consider an H-1B transfer for a strong candidate currently residing in the United States.

Relocation Assistance:

This position will provide relocation assistance for the selected candidate.

#machinelearning #ml #artificialintelligence #ai #compiler #PyTorch #TensorFlow #npu #llm

Job Category: Large Language Models (LLMs), Machine Learning, Neural Processing Units (NPUs), SoC Machine Learning

Job Type: Full Time, Hybrid

Job Location: Southern California

AI Compiler Architect (AI/ML Startup)

© Tech Integrated Staffing, LLC | All Rights Reserved