10 days ago

Data Scientist

Deepgram

Hybrid
Full Time
$150,000

Job Overview

Job Title: Data Scientist
Job Type: Full Time
Offered Salary: $150,000
Location: Hybrid
Job Description

About Deepgram

Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and production-grade voice agents at scale. More than 200,000 developers and 1,300+ organizations build voice offerings that are ‘Powered by Deepgram’, including Twilio, Cloudflare, Sierra, Decagon, Vapi, Daily, Cresta, Granola, and Jack in the Box. Deepgram’s voice-native foundation models are accessed through cloud APIs or as self-hosted and on-premises software, with unmatched accuracy, low latency, and cost efficiency. Backed by a recent Series C led by leading global investors and strategic partners, Deepgram has processed over 50,000 years of audio and transcribed more than 1 trillion words. There is no organization in the world that understands voice better than Deepgram.

Company Operating Rhythm

At Deepgram, we expect an AI-first mindset: AI use and comfort aren’t optional; they’re core to how we operate, innovate, and measure performance. Every team member at Deepgram is expected to actively use and experiment with advanced AI tools, and even to build their own into their everyday work. We measure how effectively AI is applied to deliver results, and consistent, creative use of the latest AI capabilities is key to success here. Candidates should be comfortable adopting new models and modes quickly, integrating AI into their workflows, and continuously pushing the boundaries of what these technologies can do. Additionally, we move at the pace of AI. Change is rapid, and you can expect your day-to-day work to evolve just as quickly. This may not be the right role if you’re not excited to experiment, adapt, think on your feet, and learn constantly, or if you’re seeking something highly prescriptive with a traditional 9-to-5.

The Opportunity

Voice is the most natural modality for human interaction with machines. However, current sequence modeling paradigms based on jointly scaling model and data cannot deliver voice AI capable of universal human interaction. The challenges are rooted in fundamental data problems posed by audio: real-world audio data is scarce and enormously diverse, spanning a vast space of voices, speaking styles, and acoustic conditions. Even if billions of hours of audio were accessible, its inherent high dimensionality creates computational and storage costs that make training and deployment prohibitively expensive at world scale. We believe that entirely new paradigms for audio AI are needed to overcome these challenges and make voice interaction accessible to everyone.

The Role

Deepgram is currently looking for seasoned Data Scientists with demonstrated experience solving hard data problems while exploring research frontiers to join our Research Staff. Conversational audio presents incredibly rich scientific, engineering, and infrastructure challenges that are orders of magnitude harder than working with text. As a Member of the Research Staff, you will help us to build an industrial “data factory” that will be used to power the next generation of Voice AI systems - unlocking the creation of models that go beyond basic transcription and comprehension, capturing nuanced meanings in complex conversations, adapting robustly to diverse speech patterns, and generating empathic responses with human-like, contextualized speech. You will collaborate closely with our product, engineering, and data teams to build and deploy models in the most scalable voice API on the planet. We look forward to you bringing your expertise, sharing insights from your latest experiments, and collaborating with us to push the boundaries of AI and voice technology.

The Challenge

We are seeking Research Staff who:

  • See "unsolved" problems as opportunities to pioneer entirely new approaches
  • Can identify the one critical experiment that will validate or kill an idea in days, not months
  • Have the vision to scale successful proofs-of-concept 100x
  • Are obsessed with using AI to automate and amplify their own impact

If you find yourself energized rather than daunted by these expectations—if you're already thinking about five ideas to try while reading this—you might be the researcher we need. This role demands obsession with the problems, creativity in approach, and relentless drive toward elegant, scalable solutions. The technical challenges are immense, but the potential impact is transformative.

What You'll Do

  • Drive high-performance data acquisition, preparation, and synthesis pipelines to generate data for the next generation of speech and language AI foundation models
  • Develop advanced characterizations of complex conversational audio using a diverse toolkit of signal processing techniques and deep learning models
  • Collaborate with DataOps and Engineering to create automated systems that scale the ability of human annotators to label high-value data and provide critical feedback on model outputs
  • Build advanced benchmarking methodologies and curated datasets for evaluating conversational voice systems
  • Document and present results of data experiments and analysis for internal and external audiences

You’ll Love This Role If You

  • Are obsessed with making sense out of complex and/or messy data
  • Enjoy building from the ground up and love to create new systems from scratch
  • Are passionate about AI and interested in leveraging data to solve hard problems
  • Are motivated by the prospect of scaling yourself using automation and AI models

It's Important to Us That You Have

  • Experience building data processing pipelines from a blank page and owning the entire data stack including data acquisition, characterization, cleaning, serving and transformation
  • Experience and expertise applying statistical methods and deep learning models to understand complex data
  • Strong communication skills and the ability to translate complex concepts into simple terms, depending on the target audience
  • Strong software engineering skills with particular emphasis on developing clean, modular code in Python and working with PyTorch

Nice to Haves

  • Background in Physics, Mechanical Engineering or Language Processing
  • Experience building models
  • Speech and audio experience

Benefits & Perks*

  • Holistic health: Medical, dental, vision benefits, Annual wellness stipend, Mental health support, Life, STD, LTD Income Insurance Plans
  • Work/life blend: Unlimited PTO, Generous paid parental leave, Flexible schedule, 12 Paid US company holidays, Quarterly personal productivity stipend, One-time stipend for home office upgrades, 401(k) plan with company match, Tax Savings Programs
  • Continuous learning: Learning / Education stipend, Participation in talks and conferences, Employee Resource Groups, AI enablement workshops / sessions

For candidates outside of the US, we use an Employer of Record model in many countries, which means benefits are administered locally and governed by country-specific regulations. Because of this, benefits will differ by region — in some cases international employees receive benefits US employees do not, and vice versa. As we scale, we will continue to evaluate where we can create more alignment, but a 1:1 global benefits structure is not always legally or operationally possible.

Key skills/competency

  • Data Science
  • Speech-to-Text (STT)
  • Deep Learning
  • Python
  • PyTorch
  • Data Acquisition
  • Data Preparation
  • Audio Processing
  • Machine Learning Models
  • Conversational AI

How to Get Hired at Deepgram

  • Tailor your resume: Highlight experience with data processing pipelines, statistical methods, and deep learning models relevant to audio AI.
  • Showcase AI proficiency: Emphasize your AI-first mindset and experience integrating AI tools into your workflow.
  • Demonstrate problem-solving: Provide examples of how you tackle complex data challenges and drive impactful experiments.
  • Prepare for technical questions: Be ready to discuss your Python and PyTorch skills and experience with model building.
  • Research Deepgram: Understand their mission, technology, and AI-driven culture to align your application.
