Senior Data Engineer

Pythian

Job Overview

Job Title: Senior Data Engineer
Job Type: Full Time
Offered Salary: $145,000
Location: Hybrid

Job Description

Why Pythian

At Pythian, we are experts in strategic database and analytics services, driving digital transformation and operational excellence. Founded in 1997, Pythian is a multinational company that began by ensuring the reliability and performance of mission-critical databases, and we quickly earned a reputation for solving tough data challenges. We were there when the industry moved from on-premises to cloud environments, and as enterprises sought more from their data, we expanded our competencies to include advanced analytics.

Today, we empower organizations to embrace transformation and leverage advanced technologies, including AI, to stay competitive. We deliver innovative solutions that meet each client’s data goals and have built strong partnerships with Google Cloud, AWS, Microsoft, Oracle, SAP, and Snowflake. The powerful combination of our extensive expertise in data and cloud and our ability to stay on top of the latest bleeding-edge technologies makes us the perfect partner to help mid- and large-sized businesses transform to stay ahead in today’s rapidly changing digital economy.

Why You

As a Senior Data Engineer, you will collaborate with a globally distributed team of architects, engineers, and consultants to design and deliver impactful solutions for enterprise data platforms, primarily focused on cloud technologies. Your role will involve delivering outcomes on real-world customer projects, contributing to software artifacts, and driving automation in data platform implementations and migrations.

What You Will Be Doing

  • Design and develop end-to-end cloud-based solutions with a strong emphasis on data applications and infrastructure.
  • Lead discovery and design sessions with customers to gather requirements and translate functional needs into detailed designs.
  • Create and contribute to technical design documents and other project-related documentation.
  • Work with stakeholders to identify technical and business requirements, and apply best practices and standards to achieve successful project outcomes.
  • Consistently apply established practices and standards for cloud solutions.
  • Write high-performance, reliable, and maintainable code.
  • Develop test automation frameworks and associated tooling to ensure project success.
  • Handle complex and diverse cloud-based projects, including tasks such as collecting, managing, analyzing, and visualizing very large datasets.
  • Build efficient and scalable data pipelines for batch and real-time use cases across various source and target systems (a minimal PySpark sketch follows this list).
  • Optimize ETL/ELT pipelines, troubleshoot pipeline issues, and enhance observability dashboards.
  • Execute data pipeline-specific DevOps activities, such as IaC provisioning, implementing data security, and automation.
  • Analyze potential issues, perform root cause analyses, and resolve technical challenges.
  • Review bug descriptions, functional requirements, and design documents to develop comprehensive test plans and test cases.
  • Tune the performance of batch and real-time data processing pipelines.
  • Ensure security best practices are followed when working on internal and customer-facing cloud data platforms.
  • Build foundational CI/CD pipelines for all infrastructure components, data pipelines, and custom data applications.
  • Develop observability and data quality solutions for data platforms, including ML and AI applications.
  • Act as a trusted advisor for customers, addressing technical queries and providing support.
  • Engage in thought leadership activities such as whitepaper authoring, conference presentations, and podcasting.
  • Suggest and implement ways to improve project progress and efficiency.
  • Participate in pre-sales activities when required.
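
To make the pipeline work above concrete, here is a minimal PySpark batch-job sketch: extract raw events, deduplicate, and write a partitioned, analytics-ready table. The bucket paths, column names, and deduplication rule are hypothetical placeholders, not Pythian's actual stack.

    # Minimal PySpark batch pipeline sketch: ingest raw JSON events,
    # deduplicate, and write a partitioned Parquet table. Paths and
    # column names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("events-batch-pipeline").getOrCreate()

    # Extract: read raw events from object storage (illustrative path).
    raw = spark.read.json("gs://example-raw-bucket/events/dt=2024-01-01/")

    # Transform: drop duplicate event IDs and derive a date partition column.
    cleaned = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_date", F.to_date("event_timestamp"))
           .filter(F.col("event_date").isNotNull())
    )

    # Load: write Parquet partitioned by date for efficient downstream queries.
    (cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("gs://example-curated-bucket/events/"))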

Behavioral Expectations

What we need from you:

  • Demonstrate professional and respectful conduct in all interactions with customers, peers, and stakeholders.
  • Manage time effectively and attend all internal and customer meetings punctually and prepared.
  • Adhere strictly to organizational processes, including accurate and timely completion of timesheets.
  • Communicate promptly, clearly, and responsibly through email, messaging tools, and meetings.
  • Take ownership of commitments and deliverables without requiring repeated follow-ups.

Technical Expectations (Must-Haves)

  • Experience in implementing complex data architecture, data modeling, data design, and persistence (e.g., warehousing, data marts, data lakes).
  • Proficiency in a programming language such as Python, Java, Go, or Scala.
  • Experience with big data cloud technologies like Microsoft Fabric, Databricks, EMR, Athena, Glue, BigQuery, Dataproc, and Dataflow.
  • Ideally, you will have strong hands-on experience with Google Cloud Platform data technologies (BigQuery and Dataflow) and with running PySpark and SparkSQL code on Dataproc.
  • Solid understanding of Spark (PySpark or SparkSQL), including use of the DataFrame API as well as analyzing and performance-tuning Spark queries.
  • Strong experience in data orchestration using Apache Airflow (a minimal DAG sketch follows this list).
  • Ability to develop frameworks and solutions for acquiring, processing, monitoring, and extracting value from large datasets.
  • Highly proficient in SQL.
  • Strong experience with code repositories such as GitHub, along with demonstrable GitOps best practices.
  • Good knowledge of popular database and data warehouse technologies and concepts from Google, Amazon, or Microsoft (cloud and conventional RDBMS), such as BigQuery, Redshift, Microsoft Azure SQL Data Warehouse, Snowflake, etc.
  • Knowledge of how to design distributed systems and the trade-offs involved, along with software engineering best practices for development, networking, source control, automated deployment pipelines such as Jenkins, and DevOps tools such as Terraform.
  • Strong knowledge of CI/CD tools and frameworks such as Jenkins and GitLab for implementing DevOps pipelines.
  • Proficiency with GenAI productivity tools, e.g., Copilot.
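
As a rough illustration of the Airflow orchestration expectation, below is a minimal DAG sketch using the TaskFlow API (assuming Airflow 2.4+ for the schedule parameter). The schedule, task bodies, and path are hypothetical stand-ins for real pipeline steps.

    # Minimal Airflow DAG sketch (TaskFlow API, Airflow 2.x): a daily
    # extract -> transform chain. Task bodies are hypothetical placeholders.
    from datetime import datetime
    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
    def daily_events_pipeline():
        @task
        def extract() -> str:
            # A real task might land raw files in object storage and
            # return the path for downstream tasks.
            return "gs://example-raw-bucket/events/"

        @task
        def transform(raw_path: str) -> None:
            # Placeholder: submit a Spark job, trigger a BigQuery load, etc.
            print(f"transforming data at {raw_path}")

        transform(extract())

    daily_events_pipeline()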

Technical Expectations (Desired)

  • Strong knowledge of data orchestration solutions such as Oozie, Luigi, or Talend.
  • Strong knowledge of dbt (Data Build Tool) or Dataform.
  • Experience with Snowflake.
  • Experience with Apache Iceberg, Hudi, and query engines such as Presto or Trino (a minimal Iceberg sketch follows this list).
  • Knowledge of data catalogs (AWS Glue, Google Dataplex) and data governance or data quality solutions (e.g., Great Expectations) is an added advantage.
  • Experience in performing DevOps activities such as IaC using Terraform, provisioning infrastructure in GCP/AWS/Azure, and defining data security layers.
  • Experience in designing microservice architectures and REST API gateways is a plus.
  • Knowledge of MLOps frameworks and orchestration pipelines such as Kubeflow or TFX is a plus.
  • Certification in GCP, Azure, AWS, Snowflake, or Databricks.
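
As a sketch of what the Iceberg experience above can look like in practice, the following configures a local Hadoop-type Iceberg catalog in PySpark, mirroring Iceberg's own Spark quickstart. It assumes the iceberg-spark-runtime jar matching your Spark version is on the classpath; the catalog name ("demo"), namespace, and warehouse path are illustrative.

    # Minimal Iceberg-on-Spark sketch: register a Hadoop-type Iceberg
    # catalog, then create and query a table through Spark SQL.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("iceberg-sketch")
        .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.demo.type", "hadoop")
        .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
        .getOrCreate()
    )

    spark.sql("CREATE TABLE IF NOT EXISTS demo.db.events (id BIGINT, ts TIMESTAMP) USING iceberg")
    spark.sql("INSERT INTO demo.db.events VALUES (1, current_timestamp())")
    spark.sql("SELECT * FROM demo.db.events").show()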

What You Will Receive

  • Love your career: Competitive total rewards package. Blog during work hours; take a day off and volunteer for your favorite charity.
  • Love your work/life balance: Work flexibly and remotely from your home; there’s no daily travel requirement to an office! All you need is a stable internet connection.
  • Love your coworkers: Collaborate with some of the best and brightest in the industry!
  • Love your development: Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like!
  • Love your workspace: We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!
  • Love yourself: Pythian cares about the health and well-being of our team. You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more). Additionally, you will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity.

Key Skills/Competencies

  • Cloud Data Platforms
  • Data Architecture
  • Data Pipelines
  • ETL/ELT Optimization
  • Big Data Technologies
  • Apache Airflow
  • Spark (PySpark/SparkSQL)
  • SQL Proficiency
  • DevOps (CI/CD, IaC)
  • Data Security

Tags:

Senior Data Engineer, data architecture, data modeling, data pipelines, ETL, cloud solutions, automation, DevOps, data quality, performance tuning, big data, Python, SQL, GCP, Spark, Airflow, BigQuery, Dataflow, Dataproc, GitHub, Terraform

How to Get Hired at Pythian

  • Research Pythian's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor, focusing on "data challenges" and "digital transformation."
  • Tailor your resume: Highlight experience with cloud data platforms, big data technologies, and data pipeline automation, using keywords like "GCP," "Spark," "Airflow," and "data modeling" for the Senior Data Engineer role.
  • Showcase problem-solving: Prepare examples demonstrating your ability to design end-to-end cloud solutions, optimize data pipelines, and troubleshoot complex technical challenges effectively.
  • Emphasize collaboration & communication: Pythian values professional conduct and effective communication within globally distributed teams; illustrate these skills with past project experiences.
  • Highlight cloud and big data expertise: Emphasize hands-on experience with Google Cloud Platform data technologies, Spark, SQL, and DevOps practices relevant to data platforms.
