
Lead I - Data Engineering
UST · Pune Division, Maharashtra, India
- On-site
- Full-time
- $150,000 / year
Job highlights
- Design and build modern Azure data solutions.
- Develop data pipelines with Azure Data Factory.
- Transform data using Azure Databricks and PySpark.
- Implement data storage and modeling in Azure.
- Collaborate on data requirements and best practices.
About the role
Role Overview
We are seeking a skilled Azure Data Engineer with 5+ years of experience in designing, developing, and maintaining modern data pipelines and data integration solutions using Azure services. The ideal candidate should have strong expertise in Azure Data Factory (ADF), Azure Databricks, Azure Synapse, and Azure Data Lake Storage (ADLS). You will work closely with business analysts, architects, and data scientists to deliver reliable and scalable data solutions that power analytics and business intelligence platforms.
Key Responsibilities
Data Ingestion & Integration
- Design, build, and maintain data pipelines using Azure Data Factory for batch and incremental data ingestion.
- Connect to various data sources (SQL Server, REST APIs, CSV, JSON, SAP, etc.) and integrate them into the Azure ecosystem.
- Develop metadata-driven and parameterized pipelines to improve reusability.
- Implement data validation, error handling, and logging frameworks in ADF.
Data Transformation & Processing
- Use Azure Databricks (PySpark) for data cleansing, transformation, and enrichment.
- Optimize Spark jobs for performance and cost efficiency.
- Implement ETL/ELT workflows with Delta Lake and Medallion (Bronze, Silver, Gold) architecture.
Data Storage & Modeling
- Work with Azure Data Lake Storage Gen2 for raw and curated data zones.
- Develop data models in Azure Synapse Analytics / SQL Server for reporting and analytics.
- Implement partitioning, indexing, and performance tuning strategies.
Deployment & DevOps
- Implement CI/CD pipelines using Azure DevOps or GitHub Actions for data workflows.
- Collaborate with architects to automate deployments and manage version control with Git.
Security & Governance
- Manage data access using Azure RBAC, Managed Identities, and Key Vault.
- Ensure data security, compliance, and privacy as per organizational standards.
Collaboration
- Work with data analysts, BI developers, and business users to define data requirements.
- Participate in code reviews and adhere to best practices in data engineering.
Technical Skills Required
- Azure Services: Azure Data Factory, Azure Databricks, Azure Synapse, ADLS Gen2, Azure SQL Database
- Programming: Python (PySpark), SQL, Spark SQL
- Data Modeling: Star/Snowflake schema, Dimensional modeling
- Source Systems: SQL Server, Oracle, SAP, Flat Files (CSV, JSON, XML), REST APIs
- Version Control & CI/CD: Git, Azure DevOps
- Scheduling & Monitoring: ADF triggers, Databricks jobs, Log Analytics
- Security: Managed Identity, Key Vault, Access Control
- Preferred: Power BI basics, exposure to Databricks Delta Live Tables or Synapse Pipelines
Soft Skills
- Strong analytical and problem-solving skills.
- Good communication and collaboration abilities.
- Ability to work in agile/scrum environments.
- Self-driven and proactive in identifying process improvements.
Educational Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Azure Data Engineer Associate (DP-203) certification preferred.
Example Project Responsibilities
- Design and implement end-to-end data ingestion from on-prem SQL Server to Azure Data Lake using ADF.
- Build Databricks notebooks for data cleansing and transformations using PySpark.
- Implement Delta Lake tables and load curated data into Synapse for reporting.
- Collaborate with BI teams to publish Power BI dashboards on top of Synapse datasets.
Optional (Good To Have)
- Experience with real-time data processing (Event Hub / Stream Analytics).
- Knowledge of Infrastructure as Code (IaC) using Terraform or ARM templates.
- Familiarity with data quality and data catalog tools (Purview).
Key skills/competency
- Azure Data Factory
- Azure Databricks
- Azure Synapse
- Azure Data Lake Storage (ADLS)
- PySpark
- SQL
- Data Modeling
- ETL/ELT
- CI/CD
- Data Engineering
Skills & topics
- Data Engineering
- Azure Data Factory
- Azure Databricks
- Azure Synapse
- Azure Data Lake Storage
- PySpark
- SQL
- Data Modeling
- ETL
- CI/CD
- Lead Data Engineer
How to get hired
- Tailor your resume: Highlight Azure services like ADF, Databricks, Synapse, and ADLS experience. Emphasize PySpark, SQL, and data modeling skills.
- Showcase project impact: Quantify achievements in data ingestion, transformation, and modeling. Mention CI/CD and security experience.
- Prepare for technical questions: Be ready to discuss data pipeline design, Spark optimization, and Azure data architecture.
- Demonstrate collaboration: Provide examples of working with analysts, architects, and business users to meet data needs.
- Research UST: Understand their mission, values, and recent projects in data engineering to align your answers.
Technical preparation
- Master Azure Data Factory for pipeline orchestration.
- Practice PySpark for data transformation in Databricks.
- Build data models for Synapse Analytics.
- Implement CI/CD with Azure DevOps/Git.
Behavioral questions
- Describe a complex data challenge you solved.
- How do you collaborate with business stakeholders?
- Share an experience improving data pipeline efficiency.
- How do you handle data quality issues?
Frequently asked questions
- What are the key Azure services I need to be proficient in for the Lead Data Engineer role at UST?
- For the Lead Data Engineer position at UST, strong proficiency is required in Azure Data Factory (ADF), Azure Databricks, Azure Synapse, and Azure Data Lake Storage (ADLS) Gen2. Experience with Python (PySpark) and SQL is also crucial for data transformation and processing.
- How important is data modeling experience for this Lead Data Engineer position?
- Data modeling is a key responsibility for this Lead Data Engineer role at UST. You'll be expected to develop data models in Azure Synapse Analytics/SQL Server, utilizing concepts like Star/Snowflake schema and dimensional modeling for reporting and analytics.
- What kind of experience is expected regarding CI/CD for the Lead Data Engineer job at UST?
- The Lead Data Engineer at UST is expected to implement CI/CD pipelines using Azure DevOps or GitHub Actions for data workflows. Collaboration with architects to automate deployments and version control using Git is also a significant part of this role.
- Does UST prefer candidates with specific certifications for the Data Engineer role?
- While a Bachelor's degree in a related field is required, UST prefers candidates with an Azure Data Engineer Associate (DP-203) certification for this Lead Data Engineer position.
- What are the 'good to have' skills for the Lead Data Engineer at UST?
- For the Lead Data Engineer role at UST, 'good to have' skills include experience with real-time data processing (Event Hub/Stream Analytics), knowledge of Infrastructure as Code (Terraform/ARM templates), and familiarity with data quality/catalog tools like Purview.
- What is the expected level of experience for the Lead I - Data Engineering position at UST?
- UST is seeking a Lead I - Data Engineering with 5+ years of experience in designing, developing, and maintaining modern data pipelines and integration solutions using Azure services.