
Distributed Systems Engineer - Data Platform (Delivery, Database, Retrieval)
Cloudflare · Austin, TX
- On site
- Full-time
- $150,000 / year
Job highlights
- Build scalable distributed data systems.
- Work with Go, ClickHouse, and GraphQL.
- Optimize high-throughput data pipelines.
- Ensure real-time data visibility for customers.
- Contribute to data platform innovation.
About the role
About Us
At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world’s largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine’s Top Company Cultures list and ranked among the World’s Most Innovative Companies by Fast Company.
At Cloudflare, we’re not looking for people who wait for a polished roadmap; we’re looking for the builders who see the cracks in the Internet that everyone else has simply learned to live with. We value candidates who have the instinct to spot a "normalized" problem and the AI-native curiosity to create a solution using the latest tools. Our culture is built on iteration: leveraging AI to ship faster today and make it better tomorrow, while ensuring that every improvement, no matter how small, is shared across the team to lift everyone up. If you’re the type of person who values curiosity over bureaucracy and sees AI as a partner in solving tough problems to keep the Internet moving forward, you’ll fit right in.
Locations Available: Austin (US), Atlanta (US), Denver (US), Toronto (Canada)
About the Role
We are looking for experienced and highly motivated engineers to join our Data Org and help build the future of data at Cloudflare. Our organisation is responsible for the entire data lifecycle - from ingestion and processing to storage and retrieval - powering the critical logs and analytics that provide our customers with real-time visibility into the health and performance of their online properties.
Our mission is to empower customers to leverage their data to drive better outcomes for their business. We build and maintain a suite of high-performance, scalable systems that handle more than a billion events per second. As an engineer in our organisation, you will have the opportunity to work on complex distributed systems challenges across different parts of our data stack.
Our Data Org is composed of several key teams, and you could contribute to any of the following areas:
- Data Delivery: You will build and operate our distributed data delivery pipeline, a high-throughput, low-latency system (primarily written in Go) responsible for ingesting, processing, and routing massive volumes of data from across Cloudflare's global network to multiple destinations.
- Analytical Database Platform: Contribute to our core analytical platform powered by ClickHouse. This team builds and maintains a high-performance, scalable database platform optimised for the immense analytical workloads generated by our products and services.
- Data Retrieval: Be responsible for building the customer-facing products that make data accessible and actionable. This includes developing our public GraphQL API, building robust log delivery solutions and integrations with customer destinations, and contributing to our alerting products, which empower users to configure and receive near real-time alerts based on the logs and metrics observed by our data platform.
Responsibilities
As a Software Engineer in our Data Organisation, you will focus on a subset of the following areas, depending on the team you join:
- Design, develop, and maintain scalable and reliable distributed systems across the entire data lifecycle.
- Build and optimise key components of our high-throughput data delivery platform to ensure data integrity and low-latency delivery.
- Develop new and improve existing components for the Cloudflare Analytical Platform to extend functionality and performance.
- Scale, monitor, and maintain the performance of our large-scale database clusters to accommodate the growing volume of data.
- Develop and enhance our customer-facing GraphQL APIs, log delivery, and alerting solutions, focusing on performance, reliability, and user experience.
- Work to identify and remove bottlenecks across our data platforms, from streamlining data ingestion processes to optimizing query performance.
- Collaborate with other teams across Cloudflare to understand their data needs and build solutions that empower them to make data-driven decisions.
- Collaborate with the ClickHouse open-source community to add new features and contribute to the upstream codebase.
- Participate in the development of the next generation of our data platforms, including researching and evaluating new technologies and approaches.
Key Qualifications
- 3+ years of experience working in software development covering distributed systems and databases.
- Strong programming skills (Golang is preferable), as well as a deep understanding of software development best practices and principles.
- Hands-on experience with modern observability stacks, including Prometheus and Grafana, and a strong understanding of handling high-cardinality metrics at scale.
- Strong knowledge of SQL and database internals, including experience with database design, optimisation, and performance tuning.
- A solid foundation in computer science, including algorithms, data structures, distributed systems, and concurrency.
- Strong analytical and problem-solving skills, with a willingness to debug, troubleshoot, and learn about complex problems at high scale.
- Ability to work collaboratively in a team environment and communicate effectively with other teams across Cloudflare.
- Experience with ClickHouse is a plus.
- Experience with data streaming technologies (e.g., Kafka, Flink) is a plus.
- Experience developing and scaling APIs, particularly GraphQL, is a plus.
- Experience with Infrastructure as Code tools like Salt or Terraform is a plus.
- Experience with Linux container technologies, such as Docker and Kubernetes, is a plus.
If you're passionate about building scalable and performant data platforms using cutting-edge technologies and want to work with a world-class team of engineers, then we want to hear from you! Join us in our mission to help build a better Internet for everyone!
This role requires flexibility to be on-call outside of standard working hours to address technical issues as needed.
What Makes Cloudflare Special?
We’re not just a highly ambitious, large-scale technology company. We’re a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.
- Project Galileo: Since 2014, we've equipped more than 2,400 journalism and civil society organizations in 111 countries, at no cost, with powerful tools to defend themselves against attacks that would otherwise censor their work. This is the same technology already used by Cloudflare’s enterprise customers.
- Athenian Project: In 2017, we created the Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration. Since the project began, we've provided services to more than 425 local government election websites in 33 states.
- 1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure and privacy-centric public DNS resolver. This is available publicly for everyone to use - it is the first consumer-focused service Cloudflare has ever released. Here’s the deal - we never, ever store client IP addresses. We will continue to abide by our privacy commitment and ensure that no user data is sold to advertisers or used to target consumers.
Sound like something you’d like to be a part of? We’d love to hear from you!
Please note that applicants who progress to the offer stage of the interview process may be asked to attend an in-person interview within one of the Cloudflare Offices or Cloudflare Hubs. More details about this will be available at that stage of the interview process.
This position may require access to information protected under U.S. export control laws, including the U.S. Export Administration Regulations. Please note that any offer of employment may be conditioned on your authorization to receive software or technology controlled under these U.S. export laws without sponsorship for an export license.
Cloudflare is proud to be an equal opportunity employer. We are committed to providing equal employment opportunity for all people and place great value in both diversity and inclusiveness. All qualified applicants will be considered for employment without regard to their, or any other person's, perceived or actual race, color, religion, sex, gender, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship, age, physical or mental disability, medical condition, family care status, or any other basis protected by law. We are an AA/Veterans/Disabled Employer.
Cloudflare provides reasonable accommodations to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job. Examples of reasonable accommodations include, but are not limited to, changing the application process, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you require a reasonable accommodation to apply for a job, please contact us via e-mail at hr@cloudflare.com or via mail at 101 Townsend St., San Francisco, CA 94107.
Key skills/competency
- Distributed Systems Engineering
- Data Platform Development
- Database Optimization
- Go Programming (Golang)
- SQL and Database Internals
- API Development (GraphQL)
- Observability (Prometheus, Grafana)
- Scalability and Performance Tuning
- Data Ingestion and Processing
- Cloudflare Data Org
Skills & topics
- Distributed Systems Engineer
- Data Platform
- Go
- Golang
- ClickHouse
- GraphQL
- Databases
- Software Engineer
- Cloudflare
- Data Engineering
How to get hired
- Tailor your resume: Highlight distributed systems, database experience, and Go programming skills. Quantify achievements where possible.
- Showcase Go expertise: Emphasize any projects or contributions using Golang, as it's a preferred language.
- Demonstrate data platform knowledge: Detail your experience with data ingestion, processing, storage, and retrieval systems.
- Prepare for technical interviews: Be ready to discuss distributed systems concepts, algorithms, data structures, and problem-solving at scale.
- Understand Cloudflare's mission: Connect your passion for building a better internet with the company's values and projects.
Frequently asked questions
- What kind of distributed systems experience is Cloudflare looking for in a Data Platform Engineer?
  Cloudflare seeks engineers with at least 3 years of experience in developing and maintaining scalable and reliable distributed systems, specifically focusing on the entire data lifecycle from ingestion and processing to storage and retrieval. Experience with high-throughput, low-latency systems, particularly those written in Go, is highly valued.
- What programming languages and technologies are most important for this Distributed Systems Engineer role at Cloudflare?
  Golang is the preferred programming language for this role due to its use in the high-throughput data delivery platform. Strong knowledge of SQL, database internals, and experience with analytical databases like ClickHouse are also crucial. Familiarity with GraphQL APIs and observability tools like Prometheus and Grafana is beneficial.
- How does Cloudflare handle data at scale for its Data Platform?
  Cloudflare's Data Org manages systems that handle over a billion events per second. This involves building and optimizing high-performance, scalable analytical database platforms (like ClickHouse) and data delivery pipelines to ensure data integrity and low-latency delivery for massive data volumes.
- What are the key responsibilities of a Distributed Systems Engineer on the Data Platform team at Cloudflare?
  Responsibilities include designing and developing scalable distributed systems, optimizing data delivery pipelines, enhancing analytical database platforms, scaling and monitoring large-scale database clusters, and developing customer-facing APIs and alerting solutions. You'll also focus on identifying and removing bottlenecks across the data platform.
- What opportunities are there for growth and learning in Cloudflare's Data Org?
  The Data Org offers opportunities to work on complex distributed systems challenges across various parts of the data stack, collaborate with the ClickHouse open-source community, and research new technologies. You'll be part of developing the next generation of data platforms, contributing to a world-class engineering team.
- Is on-call availability required for the Distributed Systems Engineer - Data Platform position?
  Yes, this role requires flexibility to be on-call outside of standard working hours to address technical issues as needed, which is typical for critical systems engineering roles managing large-scale data platforms.
