Networking Operating System Firmware Engineer
OpenAI
Job Overview
Job Description
About The Team
OpenAI’s Hardware organization develops silicon and system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. In addition to delivering production-grade silicon for OpenAI’s supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.
About The Role
We’re seeking a Networking Operating System Firmware Engineer to help bootstrap and scale the switching layer of our AI supercomputers. In this role, you’ll build and maintain custom SONiC NOS images from scratch, working across the Linux kernel, switch ASIC SAI/SDKs, platform drivers, control-plane services, and orchestration layers.
You will validate, configure, and optimize switch platforms used across our high-bandwidth cluster fabric, ensuring performance, reliability, availability, and seamless integration with fleet automation. You’ll collaborate with hardware and systems teams and guide vendors to meet stringent technical expectations.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.
In This Role, You Will
- Design, develop, and maintain custom SONiC NOS images for large-scale, bleeding-edge AI fabrics.
- Integrate and configure Linux kernel components, device drivers, switch ASIC SDKs, and SAI layers.
- Bring up new switch platforms (thermal/fan control, power monitoring, transceiver management, watchdogs, OSFP CMIS, LEDs, CPLDs, etc.).
- Extend and customize SONiC services for routing, telemetry, control-plane state, and distributed automation.
- Work with hardware teams to validate ASIC configurations, link bring-up, SerDes tuning, buffer profiles, and performance baselines.
- Evaluate switch silicon SDK releases, track vendor deliverables, and define platform requirements with vendors and ASIC partners.
- Debug complex issues spanning kernel, platform drivers, SONiC dockers, routing agents, orchestration services, hardware signals, and network topology.
- Integrate switches into fleet-wide monitoring, remote diagnostics, telemetry pipelines, and automated lifecycle workflows.
- Develop robust CI/build pipelines for reproducible NOS builds and controlled rollout across the fleet.
- Support factory bring-up and qualification all the way through mass deployment.
- Collaborate across teams to architect, implement, and deploy novel networking protocols and technologies that achieve maximum performance and reliability at AI factory scale.
You Might Thrive In This Role If You
- Have proven experience working with SONiC or comparable NOS stacks (FBOSS, Cumulus Linux, Arista EOS, Junos PFE-level integration, etc.).
- Have experience updating OpenConfig gNMI interfaces and YANG data models.
- Have a strong background in the Linux kernel, network device drivers, and low-level OS internals.
- Have experience integrating Broadcom / Marvell / NVIDIA / Intel ASIC SDKs and SAI implementations.
- Are proficient in C, C++, and Python; familiarity with Rust or Go is a plus.
- Have a deep understanding of L2/L3 forwarding, ECMP, RoCE, BGP, QoS, PFC, buffer tuning, and telemetry.
- Have hands-on experience with hardware platform bring-up and board-level debugging.
- Are familiar with CI/CD pipelines, distributed config/state management, and large-scale automation.
- Are a strong cross-functional problem solver in high-performance, distributed environments.
- Can lead teams to deliver a project end to end.
Key skills/competencies
- SONiC
- Linux Kernel
- Switch ASIC
- C/C++
- Python
- L2/L3 Forwarding
- RoCE
- Hardware Debugging
- CI/CD
- Distributed Systems
How to Get Hired at OpenAI
- Research OpenAI's mission: Study its commitment to ensuring that artificial general intelligence benefits all of humanity, its safety principles, and its deployment strategies for AI systems.
- Tailor your resume for firmware expertise: Highlight extensive experience with SONiC, Linux kernel, network device drivers, and switch ASIC SDKs.
- Showcase deep networking knowledge: Emphasize your understanding of L2/L3 forwarding, RoCE, BGP, QoS, and buffer tuning in high-performance environments.
- Prepare for systems-level debugging: Be ready to discuss experience solving complex issues spanning kernel, platform drivers, and hardware signals.
- Demonstrate collaboration and leadership: Provide examples of cross-functional problem-solving, guiding vendors, and leading project delivery end-to-end.