
Data Engineer
micro1 · NAMER
This listing has closed.
- Hybrid
- Full-time
- $120,000 / year
- NAMER
Job highlights
- Build and maintain scalable data pipelines.
- Prepare datasets for AI research and model training.
- Optimize data models, schemas, and storage systems.
- Write SQL and Python for data extraction and analysis.
- Ensure data quality and reliability in pipelines.
About the role
About Us
micro1 is a data engine that helps AI labs train foundational models and enterprises build AI agents. We provide frontier evaluations and reinforcement learning environments used to improve LLM capabilities, as well as contextual evaluations used to monitor and improve AI agents in enterprise settings. Our data engine includes an AI recruiter agent that sources and vets domain experts, a data platform that enables rapid production of high-quality training data, and a pipeline performance system that ensures both quality and velocity.
The Role
We are looking for a Data Engineer to support data infrastructure and experimentation in an AI research environment. In this role, you will build reliable data pipelines, explore datasets, and help transform raw data into structured formats that enable research and model development.
Key Responsibilities
- Design, build, and maintain scalable data pipelines to ingest, process, and transform data from multiple sources.
- Collaborate with AI researchers and data scientists to structure and prepare datasets for experimentation and model training.
- Develop and maintain data models, schemas, and storage systems optimized for large-scale datasets.
- Write efficient SQL queries and Python scripts to extract, transform, and analyze data.
- Ensure data quality, integrity, and reliability across data pipelines and storage layers.
- Implement data validation, monitoring, and automation workflows that support iterative research cycles.
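For candidates who want a concrete picture of the validation and loading responsibilities above, the following is a minimal sketch, not micro1's actual stack or code. It assumes a hypothetical input of one JSON record per line with made-up `id`, `prompt`, and `score` fields, and it uses Python's built-in SQLite module as a stand-in for a relational database such as PostgreSQL or MySQL. The names `RAW_LINES`, `validate`, `run_pipeline`, and the `samples` table are illustrative only.

```python
import json
import sqlite3

# Hypothetical semi-structured input: one JSON record per line.
RAW_LINES = [
    '{"id": 1, "prompt": "What is 2+2?", "score": 0.9}',
    '{"id": 2, "prompt": "", "score": 0.4}',
    '{"id": 3, "prompt": "Name a prime.", "score": 1.7}',
    'not valid json',
]


def validate(record):
    """Return a list of problems found in one record (empty if clean)."""
    problems = []
    if not isinstance(record.get("id"), int):
        problems.append("id must be an integer")
    if not record.get("prompt"):
        problems.append("prompt is empty")
    score = record.get("score")
    if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
        problems.append("score must be a number in [0, 1]")
    return problems


def run_pipeline(lines, conn):
    """Parse, validate, and load records; return (loaded, rejected) counts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY, prompt TEXT NOT NULL, score REAL NOT NULL)"
    )
    loaded, rejected = 0, 0
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Unparseable lines are counted instead of crashing the run.
            rejected += 1
            continue
        if not isinstance(record, dict) or validate(record):
            rejected += 1
            continue
        # INSERT OR REPLACE keeps reruns idempotent for the same id.
        conn.execute(
            "INSERT OR REPLACE INTO samples (id, prompt, score) VALUES (?, ?, ?)",
            (record["id"], record["prompt"], record["score"]),
        )
        loaded += 1
    conn.commit()
    return loaded, rejected


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_LINES, conn))
```

Returning the loaded and rejected counts, rather than silently dropping bad rows, is one simple way to feed the kind of monitoring the listing mentions. A production pipeline would likely also record why each row was rejected.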
Required Skills and Qualifications
- Strong proficiency in Python and SQL.
- Experience designing and maintaining ETL / ELT pipelines.
- Solid experience with data manipulation libraries such as Pandas and NumPy.
- Experience working with structured and semi-structured datasets.
- Familiarity with relational databases such as PostgreSQL or MySQL.
- Strong analytical thinking and ability to work in collaborative research-driven environments.
- Excellent written and verbal communication skills.
Nice to Have
- Exposure to AI/ML workflows or research environments.
- Experience with data visualization tools such as Matplotlib, Seaborn, or Plotly.
- Familiarity with LLM-related data workflows (datasets for training, evaluation, or prompt experimentation).
Key skills/competencies
- Data Engineering
- Python
- SQL
- ETL/ELT
- Data Pipelines
- Data Modeling
- Data Analysis
- PostgreSQL
- MySQL
- AI/ML
Skills & topics
- Data Engineer
- Python
- SQL
- ETL
- ELT
- Data Pipelines
- Data Modeling
- Data Analysis
- AI
- Machine Learning
- LLM
- Remote
How to get hired
- Tailor your resume: Highlight Python, SQL, and ETL/ELT pipeline experience relevant to micro1's data engineering needs.
- Showcase AI/ML exposure: Emphasize any experience with AI/ML workflows, LLM data, or research environments in your application.
- Prepare for technical interviews: Brush up on SQL, Python, data modeling, and common data engineering challenges micro1 faces.
- Demonstrate collaboration skills: Be ready to discuss how you work with researchers and data scientists to achieve data goals.
Frequently asked questions
- What are the core responsibilities of a Data Engineer at micro1?
- As a Data Engineer at micro1, you'll be instrumental in building and maintaining scalable data pipelines, transforming raw data into structured formats for AI research, and ensuring data quality and reliability. You'll collaborate closely with AI researchers and data scientists, write SQL and Python scripts, and optimize data models and storage systems.
- What technical skills are essential for the Data Engineer role at micro1?
- Strong proficiency in Python and SQL is crucial. You'll also need experience with ETL/ELT pipeline design and maintenance, data manipulation libraries like Pandas and NumPy, and familiarity with relational databases such as PostgreSQL or MySQL. Excellent analytical and communication skills are also key.
- Is this Data Engineer position remote?
- Yes, the Data Engineer position at micro1 is a fully remote role, offering flexibility in your work location.
- What kind of data will I be working with as a Data Engineer at micro1?
- You will work with diverse datasets to support AI labs training foundational models and enterprises building AI agents. This includes data for frontier evaluations, reinforcement learning environments, contextual evaluations for AI agents, and data for LLM training, evaluation, or prompt experimentation.
- What is micro1's mission as a company?
- micro1's mission is to be a data engine that empowers AI labs to train foundational models and enables enterprises to build AI agents. The company focuses on providing advanced evaluation environments and a robust data platform for high-quality AI development.
- What are the 'nice to have' skills for this Data Engineer role?
- While not strictly required, experience with AI/ML workflows or research environments, familiarity with data visualization tools (Matplotlib, Seaborn, Plotly), and knowledge of LLM-related data workflows are considered advantageous for this role.
- How does micro1 ensure data quality in its pipelines?
- micro1 emphasizes data quality through validation, monitoring, and automation workflows within its data pipelines and storage layers. The role involves ensuring the integrity and reliability of data to support iterative research cycles.