
Cloud Machine Learning LLM Serving Staff Engineer
Qualcomm · Bengaluru, Karnataka, India
- On site
- Full-time
- $150,000 / year
Job highlights
- Develop ML hardware and software for cloud solutions.
- Optimize Deep Learning models on Qualcomm AI 100.
- Build LLM framework extensions and tools.
- Collaborate on training and inference optimization.
- Requires strong C++/Python and ML expertise.
About the role
Cloud Machine Learning LLM Serving Staff Engineer
Qualcomm is seeking ambitious, bright, and innovative engineers with experience in machine learning framework development to join its Cloud Computing team. This role involves developing hardware and software for Machine Learning solutions across data center, edge, infrastructure, and automotive markets. The position spans the entire product life cycle, from early design to commercial deployment, in a fast-paced, cross-functional environment requiring strong communication, planning, and execution skills.
Key Responsibilities
- Analyze software requirements and design feasibility within given constraints, collaborating with architecture and HW engineers to implement optimal software solutions for Qualcomm's SoCs.
- Identify and analyze system-level issues, working closely with software development, integration, and test teams.
- Lead high-performing Machine Learning software engineering teams, backed by a proven track record.
- Apply a strong foundation in mathematical modeling and linear algebra, coupled with state-of-the-art ML/AI algorithms.
- Improve and optimize key Deep Learning models on Qualcomm AI 100 hardware.
- Build deep learning framework extensions for Qualcomm AI 100 in upstream open-source repositories.
- Collaborate with internal teams to analyze and optimize training and inference for deep learning workloads.
- Develop software tools and build the ecosystem around the AI SW Stack.
- Work with vLLM, Triton, ExecuTorch, Inductor, and TorchDynamo to create abstraction layers for inference accelerators.
- Optimize workloads for both scale-up (multi-SoC) and scale-out (multi-card) systems.
- Optimize the entire deep learning pipeline, including graph compiler integration.
- Apply software engineering best practices throughout the development process.
Desirable Skills and Aptitudes
- Deep Learning experience or knowledge in areas such as LLMs, Natural Language Processing, Vision, Audio, and Recommendation systems.
- Understanding of PyTorch and TensorFlow software stacks, including their component structures and functions.
- Excellent C/C++/Python programming and software design skills, including debugging, performance analysis, and test design.
- Ability to work independently, define requirements and scope, and lead development efforts.
- Proficiency with open-source development practices.
- Strong developer with a research mindset, driven to innovate and solve complex problems.
- Knowledge of tiling and scheduling for Machine Learning operators is a plus.
- Experience with advanced C++14 features.
- Experience in software profiling and optimization techniques.
- Hands-on experience with SIMD and/or multi-threaded high-performance code is a plus.
- Experience with ML compilers and auto-code generation (using MLIR) is a plus.
- Experience running workloads on large-scale heterogeneous clusters is a plus.
- Hands-on experience with CUDA and cuDNN is a plus.
Qualifications
- Bachelor's/Master's/PhD degree in Engineering, Machine learning/AI, Information Systems, Computer Science, or a related field.
- 8+ years of Software Engineering or related work experience.
- 8+ years of experience with programming languages such as C++ and Python.
Minimum Qualifications
- Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 4+ years of Software Engineering or related work experience.
- OR Master's degree in Engineering, Information Systems, Computer Science, or related field and 3+ years of Software Engineering or related work experience.
- OR PhD in Engineering, Information Systems, Computer Science, or related field and 2+ years of Software Engineering or related work experience.
- 2+ years of work experience with programming languages such as C, C++, Java, Python, etc.
Key skills/competency
- Machine Learning
- Deep Learning
- LLM
- Python
- C++
- Software Engineering
- Optimization
- Framework Development
- Cloud Computing
- System Design
Skills & topics
- Cloud Machine Learning
- LLM
- Staff Engineer
- Machine Learning
- Deep Learning
- Python
- C++
- Software Engineering
- Optimization
- Framework Development
- Qualcomm
- AI
- Data Center
- Edge Computing
- Automotive
How to get hired
- Tailor your resume: Highlight your experience with Machine Learning, Deep Learning, LLMs, C++, and Python, aligning it with Qualcomm's focus areas.
- Showcase open-source contributions: Emphasize your experience with open-source development practices and contributions to projects like PyTorch or TensorFlow.
- Prepare for technical interviews: Be ready to discuss your understanding of ML/AI algorithms, mathematical modeling, and optimizations for deep learning pipelines.
- Demonstrate leadership: Highlight any experience leading technical teams and driving projects from conception to deployment.
Technical preparation
- Master deep learning frameworks (PyTorch, TensorFlow).
- Practice ML algorithm implementation and optimization.
- Prepare for system design and coding challenges.
- Review Qualcomm AI 100 architecture details.
Behavioral questions
- Describe leading a complex ML project.
- How do you handle cross-functional team challenges?
- Share an example of innovative problem-solving.
- How do you stay updated with ML advancements?
Frequently asked questions
- What specific LLMs or Deep Learning frameworks are prioritized for the Cloud Machine Learning LLM Serving Staff Engineer role at Qualcomm?
- The role specifically mentions working with vLLM, Triton, ExecuTorch, Inductor, and TorchDynamo. Experience with PyTorch and TensorFlow is also highly desirable, indicating a focus on these widely used frameworks for LLM and deep learning development.
- What level of experience is expected for a Staff Engineer at Qualcomm in this role?
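- The listed qualifications call for 8+ years of Software Engineering or related work experience and 8+ years of experience with programming languages such as C++ and Python. The minimum qualifications are lower: a Bachelor's degree with 4+ years, a Master's degree with 3+ years, or a PhD with 2+ years of Software Engineering or related work experience. A Staff Engineer typically implies a senior individual contributor role with significant technical leadership and deep expertise.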
- Does Qualcomm India Private Limited offer opportunities for remote work for this Staff Engineer position?
- The listing is marked on site in Bengaluru, Karnataka, India, and does not mention remote or hybrid work. Given the emphasis on hardware and SoC development, expect the role to be based at Qualcomm's India facilities.
- What is the typical interview process for a Staff Engineer at Qualcomm?
- The interview process at Qualcomm typically involves multiple rounds. Expect technical screening calls, followed by in-depth interviews focusing on system design, algorithms, data structures, and your specific experience in Machine Learning, LLMs, and software optimization. Behavioral questions assessing leadership and collaboration skills are also common.
- How does Qualcomm India foster innovation and a research mindset among its engineers for roles like the LLM Serving Staff Engineer?
- Qualcomm encourages a research mindset by involving engineers in the entire product lifecycle, from early design to deployment. The role involves building framework extensions, optimizing models, and working with cutting-edge technologies, providing ample opportunities for innovation and contribution to open-source projects.