
Data Migration Engineer
Capgemini · Dallas, TX
- On site
- Full-time
- $93,232 / year
Job highlights
- Migrate data pipelines and logic to Lakehouse environment.
- Execute physical data transfers ensuring data integrity.
- Translate and optimize SQL and Spark consumption patterns.
- Perform rigorous data validation and reconciliation.
- Collaborate with stakeholders and data owners.
About the role
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Job Location
Dallas, TX (onsite from day one, 5 days a week)
Key Responsibilities
- Pipeline Migration (Logic and Scheduling): Refactoring and migrating extraction logic and job scheduling from legacy frameworks to the new Lakehouse environment.
- Data Transfer: Executing the physical migration of underlying datasets while ensuring data integrity.
- Stakeholder Engagement: Acting as a technical liaison to internal clients, facilitating handoff and sign-off conversations with data owners to ensure migrated assets meet business requirements.
- Consumption Pattern Migration (Code Conversion): Translating and optimizing legacy SQL and Spark-based consumption patterns (raw and modeled) for compatibility with Snowflake and Iceberg.
- Usage Analysis: Understanding usage patterns to deliver the required data products.
- Data Reconciliation & Quality: A rigorous approach to data validation is required. Candidates must work with reconciliation frameworks to build confidence that migrated data is functionally equivalent to that already used within production flows.
- Data Engineer Collaboration: Working with the internal data management platforms team, which requires an aptitude for learning new workflows and language constructs as necessary.
Required Skills
- Experience: A minimum of 3-5 years of professional, hands-on-keyboard coding experience in a collaborative, team-based environment, including the ability to troubleshoot SQL and basic scripting experience.
- Languages: Professional proficiency in Python or Java.
- Methodology: Deep familiarity with the full Software Development Life Cycle (SDLC) and CI/CD best practices, plus Kubernetes (K8s) deployment experience.
- Core Data Engineering Competencies: Candidates must demonstrate a sophisticated understanding of the following modeling concepts to ensure data correctness during reconciliation:
  - Temporal Data Modeling: Managing state changes over time (e.g., SCD Type 2).
  - Schema Management: Expertise in schema evolution (see Apache Iceberg) and enforcement strategies.
  - Performance Optimization: Advanced knowledge of data partitioning and clustering.
  - Architectural Theory: Balancing Normalization vs. Denormalization and the strategic use of Natural vs. Surrogate Keys.
- Technologies: While candidates are not expected to be experts in every tool, the collective team must cover the following technologies: Extraction Logic, Kafka, ANSI SQL, FTP, Apache Spark.
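For candidates who want a concrete picture of the temporal modeling and key concepts listed above, here is a minimal Python sketch of an SCD Type 2 update. It is an illustration only, not Capgemini's or any client's implementation. The row layout (`sk`, `key`, `attrs`, `valid_from`, `valid_to`), the function name `apply_scd2`, and the use of `None` to mark the current version are all assumptions made for this example.

```python
from datetime import date


def apply_scd2(history, key, new_attrs, effective):
    """Return a new history with the change recorded as SCD Type 2.

    Assumed row shape (hypothetical, for illustration):
      sk         - surrogate key, unique per version
      key        - natural (business) key
      attrs      - tracked attributes
      valid_from - first date the version applies
      valid_to   - date the version stopped applying; None means current
    """
    # Find the open-ended (current) version for this natural key, if any.
    current = next(
        (r for r in history if r["key"] == key and r["valid_to"] is None),
        None,
    )

    # No attribute change: leave history as it is (return a copy).
    if current is not None and current["attrs"] == new_attrs:
        return [dict(r) for r in history]

    updated = []
    for row in history:
        row = dict(row)
        # Close the previous current version at the effective date.
        if current is not None and row["sk"] == current["sk"]:
            row["valid_to"] = effective
        updated.append(row)

    # Append the new version under a fresh surrogate key.
    next_sk = max((r["sk"] for r in history), default=0) + 1
    updated.append(
        {
            "sk": next_sk,
            "key": key,
            "attrs": dict(new_attrs),
            "valid_from": effective,
            "valid_to": None,
        }
    )
    return updated


history = []
history = apply_scd2(history, "C1", {"city": "Austin"}, date(2023, 1, 1))
history = apply_scd2(history, "C1", {"city": "Dallas"}, date(2024, 6, 1))
unchanged = apply_scd2(history, "C1", {"city": "Dallas"}, date(2024, 7, 1))
```

The natural key `C1` stays the same across versions while each version gets its own surrogate key, which is one common reason to keep both kinds of key in a history table.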
Life At Capgemini
Capgemini supports all aspects of your well-being throughout the changing stages of your life and career. For eligible employees, we offer:
- Flexible work
- Healthcare including dental, vision, mental health, and well-being programs
- Financial well-being programs such as 401(k) and Employee Share Ownership Plan
- Paid time off and paid holidays
- Paid parental leave
- Family building benefits like adoption assistance, surrogacy, and cryopreservation
- Social well-being benefits like subsidized back-up child/elder care and tutoring
- Mentoring, coaching and learning programs
- Employee Resource Groups
- Disaster Relief
Key skills/competency
- Data Migration
- Data Engineering
- Python
- Java
- SQL
- Apache Spark
- Snowflake
- Iceberg
- SDLC
- CI/CD
Skills & topics
- Data Migration Engineer
- Data Migration
- Data Engineering
- Pipeline Migration
- Data Transfer
- SQL
- Spark
- Python
- Java
- Lakehouse
- Snowflake
- Iceberg
- SDLC
- CI/CD
- Dallas
- On-site
How to get hired
- Tailor your resume: Highlight your 3-5 years of coding experience, SQL troubleshooting, and Python/Java proficiency.
- Showcase SDLC and CI/CD: Emphasize your familiarity with software development lifecycle and continuous integration/deployment practices.
- Demonstrate data modeling expertise: Detail your experience with temporal data modeling, schema evolution, partitioning, and normalization concepts.
- Prepare for technical questions: Be ready to discuss your experience with Kafka, Spark, Snowflake, and Iceberg.
- Research Capgemini: Understand their focus on digital transformation, sustainability, and inclusive work environment.
Technical preparation
- Practice SQL query optimization and complex data manipulation.
- Build small migration scripts using Python or Java.
- Familiarize yourself with Spark and data lakehouse concepts.
- Review data reconciliation and validation techniques.
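One way to practice the reconciliation techniques mentioned above is a small key-based comparison between a legacy extract and its migrated copy. The sketch below is a practice exercise under stated assumptions, not the reconciliation framework used on this project. The function names `row_fingerprint` and `reconcile`, and the sample rows, are hypothetical.

```python
import hashlib


def row_fingerprint(row, columns):
    """Hash the chosen columns of a row in a fixed order."""
    # The unit-separator character keeps adjacent values from running together.
    payload = "\x1f".join(repr(row.get(c)) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, key, columns):
    """Compare two row sets by key and report the differences.

    Assumes `key` is unique within each row set; duplicate keys would
    collapse to the last row seen.
    """
    src = {r[key]: row_fingerprint(r, columns) for r in source_rows}
    tgt = {r[key]: row_fingerprint(r, columns) for r in target_rows}
    shared = src.keys() & tgt.keys()
    return {
        "source_count": len(src),
        "target_count": len(tgt),
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in shared if src[k] != tgt[k]),
    }


# Hypothetical sample data for the exercise.
legacy = [
    {"id": 1, "name": "Ada", "balance": 100},
    {"id": 2, "name": "Grace", "balance": 250},
    {"id": 3, "name": "Edsger", "balance": 75},
]
migrated = [
    {"id": 1, "name": "Ada", "balance": 100},
    {"id": 2, "name": "Grace", "balance": 255},
    {"id": 4, "name": "Barbara", "balance": 10},
]

report = reconcile(legacy, migrated, key="id", columns=["name", "balance"])
print(report)
```

Matching row counts alone would hide all three problems in this sample, which is why the report separates missing keys, unexpected keys, and value mismatches.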
Behavioral questions
- Describe a complex data migration challenge you faced.
- How do you ensure data integrity during migration?
- How do you handle conflicting stakeholder requirements?
- Tell me about your experience with SDLC and CI/CD.
Frequently asked questions
- What are the primary responsibilities of a Data Migration Engineer at Capgemini in Dallas?
- As a Data Migration Engineer at Capgemini in Dallas, you will be responsible for migrating data pipelines and logic to a Lakehouse environment, executing physical data transfers while ensuring integrity, translating and optimizing SQL and Spark consumption patterns, and performing rigorous data validation and reconciliation. You will also act as a technical liaison to internal clients and data owners.
- What technical skills are essential for this Data Migration Engineer role?
- Essential technical skills include 3-5 years of hands-on coding experience, proficiency in Python or Java, SQL troubleshooting, and experience with the full SDLC and CI/CD best practices. You should also have a strong understanding of temporal data modeling, schema management (Apache Iceberg), performance optimization (partitioning, clustering), and architectural theory. Candidates are not expected to be experts in every tool, but the team collectively must cover extraction logic, Kafka, ANSI SQL, FTP, and Apache Spark, and the role targets Snowflake and Iceberg.
- Is this Data Migration Engineer position remote or on-site?
- This Data Migration Engineer position in Dallas, TX is an on-site role, requiring employees to be in the office five days a week from day one.
- What is the expected salary range for a Data Migration Engineer at Capgemini in Dallas?
- The base salary range for this role in Dallas is between $80,420 and $106,050 annually, with potential for additional compensation through variable incentives, bonuses, or commissions.
- What kind of career growth opportunities does Capgemini offer to Data Migration Engineers?
- Capgemini emphasizes career shaping and provides a collaborative community for support and inspiration. They offer mentoring, coaching, and learning programs, as well as Employee Resource Groups, to help you grow within the company.
- How does Capgemini support employee well-being for a Data Migration Engineer?
- For eligible employees, Capgemini offers well-being support including flexible work, healthcare (dental, vision, mental health, and well-being programs), financial well-being programs (401(k), Employee Share Ownership Plan), paid time off and paid holidays, paid parental leave, family building benefits, and social well-being benefits like subsidized back-up child/elder care and tutoring.
- What is the company culture like at Capgemini for a Data Migration Engineer?
- Capgemini fosters a collaborative and inclusive culture, empowering employees to shape their careers and encouraging them to reimagine what's possible through technology. They are committed to diversity and aim to build a more sustainable and inclusive world.