Senior Research Data Engineer
DeepL

Job Description
About DeepL
DeepL is a global AI product and research company dedicated to building secure, intelligent solutions for complex business problems. Trusted by over 200,000 business customers and millions of individuals across 228 global markets, DeepL's Language AI platform provides human-like translation, improved writing, and real-time voice translation. Building on a history of innovation, quality, and security, DeepL is expanding its offerings beyond language, including DeepL Agent, an autonomous AI assistant. Founded in 2017 by CEO Jarek Kutylowski, DeepL boasts over 1,000 passionate employees and is supported by world-renowned investors like Benchmark, IVP, and Index Ventures.
Our goal is to become the global leader in trusted, intelligent AI technology, creating products that drive better communication, foster connections, and create meaningful impact. We are looking for talented individuals to join our journey and shape the future of AI in a fast-moving, purpose-driven environment.
What Sets DeepL Apart
DeepL offers a unique blend of cutting-edge AI technology, meaningful work, and a thriving culture. We are a team of innovators, researchers, and creators driven by a shared purpose: to unlock human potential by making work simpler, smarter, and more connected. Our technology helps millions of people and businesses communicate and work better every day, underpinned by a culture of trust, curiosity, and care. Being part of DeepL means joining a team dedicated to innovation, growth, and well-being. Discover more about life at DeepL on LinkedIn, Instagram, and our Blog.
Meet the Foundation Model Team
Innovation at DeepL begins in the research department, driven by researchers, engineers, and developers passionate about advancing AI. Data is the lifeblood fueling this passion, crucial for model training and quality evaluation. You will join our Foundation Model track, a cross-functional group of research scientists and data engineers specializing in machine learning. This team develops foundation models for use in DeepL's AI products. Data engineers in this team create, refine, and manage multi-modal training corpora, owning the associated data collection and preparation pipelines. The team works with unstructured data on a petabyte scale and leverages tens of thousands of cores in a hybrid cloud setting for ambitious projects.
Your Responsibilities as a Senior Research Data Engineer
- Work on ambitious frontier research projects as part of a team comprising research scientists and research data engineers.
- Architect, design, and build data pipelines capable of handling petabytes of multi-modal unstructured data.
- Develop a modern data engineering stack based on state-of-the-art technologies for orchestration and parallel computation, making extensive use of actively developed open-source solutions.
- Identify performance bottlenecks, debug issues, and create stable pipelines, from individual components to system-wide views.
- Leverage DeepL's large on-prem data centers and AWS cloud infrastructure for blazing-fast data processing.
- Go beyond traditional “Big Data” and ETL, engineering and operating complex Python data solutions for real-world unstructured data, including text, code, image, and audio modalities.
- Collaborate effectively with stakeholders, research scientists, other research data engineers, and data tooling/platform teams.
- Raise the standard for excellence and act as owner and champion for the quality and availability of our foundation model training data.
- Ensure mission-critical reliability of data pipeline jobs and maintain high-quality code.
We encourage you to contribute with creativity, thoroughness, pragmatism, foresight, ingenuity, persistence, and every quality that elevates the team.
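To give a flavor of the "stable, reproducible pipeline" work described above, here is a minimal stdlib-only sketch of an idempotent processing stage: outputs are keyed by a content hash, so rerunning the stage skips work already done. The stage name, record shape, and cleaning rule are illustrative assumptions, not DeepL's actual stack.

```python
import hashlib
import json
from pathlib import Path

def content_key(record: dict) -> str:
    """Derive a stable key from record content so reruns are idempotent."""
    blob = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]

def clean(record: dict) -> dict:
    """Toy normalization step: strip whitespace, drop records with empty text."""
    text = record.get("text", "").strip()
    return {**record, "text": text} if text else {}

def run_stage(records: list[dict], out_dir: Path) -> int:
    """Process records, skipping outputs that already exist.

    Rerunning on the same inputs writes nothing new, which is the
    property that makes large pipeline jobs safely restartable.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for record in records:
        cleaned = clean(record)
        if not cleaned:
            continue  # filtered out by the quality-control step
        path = out_dir / f"{content_key(cleaned)}.json"
        if path.exists():
            continue  # already produced by a previous run
        path.write_text(json.dumps(cleaned, sort_keys=True))
        written += 1
    return written
```

At petabyte scale the same idea applies per shard rather than per record, but the contract is identical: a stage rerun over unchanged inputs is a no-op.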
Qualities We Look For
- Professional Experience: In data, platform, or software engineering, ideally with a focus on large-scale unstructured data.
- Python Expertise: Extensive professional experience in Python software engineering, ideally maintaining proprietary or open-source software products.
- Data Handling: Experience with exploratory data analysis, cleaning, validation, and quality control at a scale beyond business intelligence and analytics.
- Pipeline Development: Experience building reproducible pipelines for storing and processing petabytes of data.
- Operations Proficiency: In containerization and automatic deployment, ideally with Kubernetes and cloud infrastructure.
- Scaling Knowledge: Experience with highly scalable, parallel compute workloads (e.g., Dask, Ray, Celery).
- Performance Optimization: Experience writing and optimizing highly performant code.
- Cross-functional Affinity: Ability to collaborate directly with researchers and engineering stakeholders to translate needs into data products with desired user experience and performance.
- Soft Skills: Excellent problem-solving abilities, strong communication skills, and a collaborative mindset.
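The "highly scalable, parallel compute workloads" point can be illustrated with the map/gather pattern that frameworks like Dask and Ray generalize across clusters. This stdlib-only sketch uses a local thread pool; the per-document token-count task is an illustrative assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def token_count(doc: str) -> int:
    """Toy per-document task: whitespace token count."""
    return len(doc.split())

def map_over_docs(docs: list[str], workers: int = 4) -> list[int]:
    """Fan work out across a worker pool and gather results in input order.

    Dask and Ray apply the same map/gather idea, but schedule the tasks
    across many processes or machines instead of local threads.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(token_count, docs))
```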
Ideally, You Have Domain-Specific Experience
- LLM or VLM training data preparation.
- NLP, text classification, reinforcement learning, model-based/GPU workflows.
- Dynamic workflow orchestration frameworks like Argo Workflows, Airflow, Dagster, or Flyte.
- Linguistics expertise or speaking multiple languages.
- Experience in a high-performance programming language like C++, Go, or Rust.
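Orchestration frameworks like Argo Workflows, Airflow, Dagster, and Flyte all schedule tasks from a dependency graph. As a rough sketch of that core idea using only the standard library, here is a hypothetical four-stage training-data DAG resolved with `graphlib.TopologicalSorter`; the task names are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DAG = {
    "crawl": set(),
    "dedupe": {"crawl"},
    "filter": {"crawl"},
    "tokenize": {"dedupe", "filter"},
}

def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return one valid task order respecting all dependencies.

    Real orchestrators go further: independent tasks (here, dedupe and
    filter) run in parallel, and failed tasks are retried or resumed.
    """
    return list(TopologicalSorter(dag).static_order())
```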
What DeepL Offers
- Diverse & International Team: Join a global community of over 90 nationalities, with a presence in the UK, Germany, Netherlands, Poland, US, and Japan.
- Open Communication & Feedback: A culture that values clear, honest communication, smooth collaboration, direct feedback, and a growth mindset.
- Hybrid Work & Flexible Hours: Hybrid schedule (2 days in office), flexible working hours, and trust in your productivity.
- Regular In-person Team Events: Vibrant local, business unit, new-joiner, and company-wide gatherings.
- Monthly Full-day Hacking Sessions: “Hack Fridays” for passionate project work and cross-team collaboration.
- Generous Annual Leave: 30 days of annual leave (excluding public holidays) and access to mental health resources.
- Competitive Benefits: Tailored benefits package reflecting the diversity of our global team and locations.
If this role resonates with you, but you don't check every box, we encourage you to apply. DeepL values the potential you bring and the growth we can foster together.
Key skills/competency
- Data Engineering
- Foundation Models
- Python Programming
- Large-Scale Data
- Distributed Systems
- Cloud Infrastructure (AWS)
- Data Pipelines
- Performance Optimization
- Kubernetes
- Machine Learning Data
How to Get Hired at DeepL
- Research DeepL's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Customize your resume: Tailor your Senior Research Data Engineer resume to highlight experience with large-scale data, Python, cloud platforms, and ML data pipelines.
- Showcase data expertise: Prepare examples demonstrating your ability to architect, build, and optimize data solutions for unstructured, petabyte-scale data.
- Understand foundation models: Familiarize yourself with DeepL's AI products and the role of data in training and evaluating large language/vision models.
- Prepare for technical interviews: Be ready to discuss data architecture, distributed systems, Python performance, and problem-solving relevant to complex data challenges.