PitchMeAI
NVIDIA

Senior Network Infrastructure Engineer

NVIDIA · Mumbai Metropolitan Region

  • On site
  • Full-time
  • $120,000 / year
  • Mumbai Metropolitan Region

Job highlights

  • Support and maintain NVIDIA's cloud network infrastructure.
  • Remediate critical alerts and triage network incidents.
  • Manage large-scale IP network technologies and infrastructures.
  • Monitor health of on-premises and cloud infrastructures.
  • Automate provisioning, monitoring, and management tasks.

About the role

Network Operations Engineer at NVIDIA

NVIDIA is looking for a Network Operations Engineer to support and maintain our cloud network infrastructure. This network serves needs across NVIDIA's whole software stack, from Graphics Drivers to Autonomous Vehicles and Artificial Intelligence. In this role, the Network Operations Engineer will remediate critical alerts within defined SLAs, triage production-impacting network incidents, and interact with internal customers on network-related issues. They will also be responsible for engaging with external vendors to remediate hardware and software issues, and will participate in project-related work such as network device upgrades and capacity augmentations. An ideal candidate will possess a wide range of skills, including alert monitoring and resolution in large-scale networks and CSP environments, outstanding troubleshooting skills, an understanding of L3 underlay networks, and network protocol knowledge in large multi-vendor infrastructures.

What You Will Be Doing

  • Engage in 24/7 global shift rotations to provide remote support for network repairs and changes while collaborating across teams and updating customers on status and ticket information.
  • Drive operational improvements in change management and daily operations by following procedures.
  • Manage and operate large-scale IP network technologies and infrastructures.
  • Utilise your skills in Peering and Datacenter interconnect technologies: PNI, Transit, Exchange, Passive DWDM, Wave circuits.
  • Monitor and support the network health of on-premises and cloud infrastructures.
  • Collaborate and develop workflow enhancements while documenting best practices.

What We Need To See

  • Deep knowledge of and experience with TCP/IP, BGP, OSPF, MPLS, IS-IS, VXLAN, EVPN, QoS, GRE, IPsec, DNS, and MACsec.
  • Over 4 years of experience in network operations.
  • Skilled in network troubleshooting techniques and creative problem-solving.
  • Strong track record of alert response within defined SLAs and of incident management.
  • Experience with one or more of the following CSP environments: AWS, Azure, GCP, OCI.
  • Familiarity with Arista, Fortinet and Juniper.
  • Hands-on experience contributing to tooling and automation for provisioning, monitoring, and managing complex network infrastructures.
  • Bachelor’s degree in Computer Science, related technical field, or equivalent experience.
  • Excellent verbal and written communication skills.

Ways To Stand Out From The Crowd

  • Strong background in Mellanox/Cumulus OS.
  • Working knowledge of InfiniBand technology.
  • Skilled in Unix/Linux system administration, with the ability to write and understand Python/Shell scripts to enhance productivity in hyperscale environments.
  • Familiarity with tools such as NetBox/Nautobot, Prometheus, Grafana, and Panoptes for monitoring and managing a global network.
  • Passionate about innovating and investing in groundbreaking technologies.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hard-working people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Key skills/competency
  • Network Operations
  • Cloud Networking
  • TCP/IP
  • BGP
  • OSPF
  • MPLS
  • Troubleshooting
  • Incident Management
  • Automation
  • Scripting

Skills & topics

  • Network Engineer
  • Network Operations
  • Cloud Networking
  • TCP/IP
  • BGP
  • OSPF
  • MPLS
  • Troubleshooting
  • Incident Management
  • Automation
  • Python
  • Shell Scripting
  • AWS
  • Azure
  • GCP
  • Arista
  • Juniper
  • Fortinet
  • NVIDIA

How to get hired

  • Customize your resume: Highlight your 4+ years of network operations experience and specific skills in TCP/IP, BGP, OSPF, MPLS, and troubleshooting.
  • Showcase cloud and vendor experience: Emphasize experience with AWS, Azure, GCP, OCI, and familiarity with Arista, Fortinet, or Juniper.
  • Demonstrate automation skills: Detail your experience with scripting (Python/Shell) and tools like NetBox, Prometheus, or Grafana.
  • Prepare for technical interviews: Be ready to discuss L3 underlay networks, peering technologies, and incident management scenarios.
  • Highlight problem-solving: Bring examples of creative troubleshooting and alert response within SLAs.

Technical preparation

  • Master TCP/IP, BGP, OSPF, and MPLS protocols.
  • Practice troubleshooting large-scale multi-vendor networks.
  • Familiarize yourself with cloud environments (AWS, Azure, GCP).
  • Gain experience with automation and scripting (Python/Shell).

Behavioral questions

  • Describe a critical network incident you resolved.
  • How do you prioritize alerts under pressure?
  • Share an example of driving operational improvements.
  • How do you collaborate with internal/external teams?

Frequently asked questions

What specific network protocols are essential for the NVIDIA Network Operations Engineer role?
For the NVIDIA Network Operations Engineer position, deep knowledge of TCP/IP, BGP, OSPF, MPLS, IS-IS, VXLAN, EVPN, QoS, GRE, IPsec, DNS, and MACsec is required. These protocols are central to managing and troubleshooting large-scale IP networks.

How does NVIDIA handle 24/7 support for its network infrastructure?
NVIDIA operates on 24/7 global shift rotations to provide continuous remote support for network repairs and changes. This ensures that network issues are addressed promptly regardless of the time zone, with cross-team collaboration and customer updates.

What cloud environments does NVIDIA utilize, and is experience required?
The job description lists AWS, Azure, GCP, and OCI. Experience with one or more of these CSP environments is a required qualification for the Network Operations Engineer role.

What kind of automation experience is NVIDIA looking for in this role?
NVIDIA seeks hands-on experience contributing to tooling and automation for provisioning, monitoring, and managing complex network infrastructures. Scripting with Python/Shell and familiarity with tools like NetBox/Nautobot, Prometheus, Grafana, and Panoptes are listed as ways to stand out.

Is a Bachelor's degree mandatory for the Network Operations Engineer position at NVIDIA?
A Bachelor’s degree in Computer Science, a related technical field, or equivalent experience is required, so equivalent practical experience can stand in for the degree.

What are the key hardware and software vendors mentioned for this NVIDIA role?
The job description mentions familiarity with network equipment vendors such as Arista, Fortinet, and Juniper. Additionally, experience with Mellanox/Cumulus OS and InfiniBand technology is considered advantageous.

How important is incident management and alert response for this role?
A strong track record of alert response within defined SLAs and of incident management is a stated requirement for the Network Operations Engineer. This role involves triaging production-impacting network incidents and remediating critical alerts within defined SLAs.