20 days ago

Data Engineer

Cohorte Constances

On Site
Full Time
€60,000
Greater Paris Metropolitan Region

Job Overview

Job Title: Data Engineer
Job Type: Full Time
Offered Salary: €60,000
Location: Greater Paris Metropolitan Region

Job Description

About the Cohorte Constances

The Cohorte Constances is a research study of 220,000 volunteers aged 18-69 at inclusion. The cohort is followed longitudinally through annual self-administered questionnaires, regular medical check-ups at health examination centers, and linkage with national administrative databases (socio-professional and health). The data are further enriched through specific questionnaires and linkage with occupational or environmental exposure data.

The cohort is organized into functional poles, guided by a data governance framework and overseen by a project committee (COPROJ) to ensure a unified, strategic approach. The poles are:

  • a strategy pole for scientific and technical direction;
  • a data sourcing pole for data collection from questionnaires, health centers, and national databases;
  • a methods and data processing pole for data quality, enrichment, weighting scores, and algorithm implementation;
  • a project support pole for external data user access;
  • a digital infrastructure and data flow management pole.

The Constances database contains over 4,000 variables and hundreds of tables spread across multiple storage spaces. Data ingestion is managed by automated or semi-automated flows.

Missions and Responsibilities

Within the Digital Infrastructure and Data Flow Management pole, the Data Engineer will be primarily responsible for managing the Constances database within their scope, as well as maintaining and developing the cohort's analysis and monitoring interface.

Main Activities

1. Database Management:

  • Responsible for data integration within their scope.
  • Implementation of data integration tools.
  • Monitoring of rejected data.
  • Documentation of implemented programs and processes.
  • Continuous improvement of processes and programs.
  • Database evolutions/corrections based on business needs (new variables, new tables, etc.).
  • Participation in migrating data integration programs from SAS to Python.
  • Orchestration of automated pipelines using Prefect.

2. Responsible for the Cohort's Analysis and Monitoring Interface (DataApp):

  • This interface, developed in Streamlit, centralizes data flows from the Data Lake (MariaDB) and various APIs.
  • Keep the tool responsive and fix bugs.
  • Integrate new data sources.
  • Propose new uses and functionalities.

3. Participation in various IT projects:

  • Data governance.
  • Metadata documentation projects.

Skills Required

  • Proficiency in Python and its data ecosystem (Pandas, NumPy). Knowledge of SAS is a plus.
  • Good command of database management systems (MySQL, MariaDB).
  • Experience in managing workflow orchestration tools (Prefect).
  • Data visualization skills (dashboard design, clear data presentation).
  • Proficiency with versioning tools (Git).
  • Good knowledge of Windows and Linux environments.
  • Awareness of data and system security issues.

Behavioral Skills

  • Strict adherence to data confidentiality and proper usage.
  • Initiative, proactivity, and a proposal-driven mindset.
  • Excellent communication skills, including with non-technical stakeholders.
  • Rigor, organizational skills, and methodology.
  • Good interpersonal skills and team spirit.

Contract Details

  • Public service contract.
  • 3 to 5 years of experience required.
  • Salary based on experience.
  • Fixed-term contract (CDD) for 24 months, renewable, full-time.
  • Work location: UMS 11, Hôpital Paul Brousse, Villejuif.
  • Confidentiality agreement required.

Key skills/competencies

  • Data Engineering
  • Python
  • SQL
  • Database Management
  • Data Pipelines
  • Data Visualization
  • Prefect
  • Streamlit
  • Data Governance
  • Data Quality

How to Get Hired at Cohorte Constances

  • Tailor your resume: Highlight your Python, SQL, and Prefect experience. Emphasize data integration and pipeline automation.
  • Showcase your portfolio: Include projects demonstrating data visualization and dashboard creation, ideally with Streamlit.
  • Prepare for technical questions: Be ready to discuss database management (MySQL, MariaDB) and data governance principles.
  • Demonstrate team fit: Highlight your communication skills, proactivity, and ability to work with non-technical teams.
  • Understand the context: Research the Constances cohort and its mission to align your application with their goals.
