Staff Applied Research Scientist – Descript
Our vision is to build the next‑generation platform for fast and easy creation of audio and video content. In November 2022, we launched Storyboard, the video editing foundation on which we’re excited to build AI capabilities such as Overdub, Studio Sound, and other generative AI features. Our product is used by some of the world’s top podcasters and influencers, as well as businesses like BBC, ESPN, HubSpot, Shopify and The Washington Post, to communicate via video. We’ve raised $100 million from investors including Andreessen Horowitz, OpenAI Startup Fund, Redpoint Ventures and Spark Capital.
We need great people to help us build these cutting‑edge technologies and guide their development. In particular, we’re always looking to hire smart applied research scientists. You will join a team of around a dozen researchers specialized in generative models and deep learning.
Key Research Publications
- SampleRNN: An Unconditional End‑to‑End Neural Audio Generation Model.
- Char2Wav: End‑to‑End Speech Synthesis.
- ObamaNet: Photo‑realistic lip‑sync from text.
- MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis.
- Chunked Autoregressive GAN for Conditional Waveform Synthesis.
- Wav2CLIP: Learning Robust Audio Representations From CLIP.
This role can be remote in the US or based in the SF Bay Area.
Responsibilities
- Oversee the research process from problem definition to running and analyzing experiments.
- Collaborate and communicate clearly with the team about status, results, and challenges.
- Contribute to designing the company’s research roadmap.
- Train and mentor other team members.
- Own the research function of specific product features.
Challenges
- Apply deep learning (including NLP, speech processing, and computer vision) to solve media creation and editing problems.
- Create realistic voice doubles from only a few minutes of audio.
- Develop tools to synthesize photo‑realistic videos for our Overdub (personalized speech synthesis) feature.
- Design and develop new algorithms for media synthesis, anomaly detection, speech recognition, speech enhancement, filler‑word detection, audio and video tagging, etc.
- Identify new research directions to improve our product.
Requirements
- Proven experience designing and implementing deep‑learning algorithms.
- Ph.D. or Master’s degree in Deep Learning or equivalent experience.
- Track record of new ideas in machine learning, demonstrated by first‑author publications or projects.
- Strong programming skills and experience with deep‑learning frameworks.
- Ability to generate more ideas than can be implemented.
- Efficient implementation of ideas; able to test many ideas per day and manage time productively.
- Desire for more GPUs to run desired experiments.
- Expertise with PyTorch / TensorFlow.
- Domain knowledge in computer vision or speech processing is not required.
Selection Criteria
- Lead author of an accepted publication in a top conference (ICLR, ICML, NeurIPS, ICASSP, ICCV, CVPR, InterSpeech, etc.).
- Played a key role in shipping a production feature that uses deep learning as a core component.
Base salary range: $190,000 – $300,000 per year. Final offers consider experience, expertise, location and other factors.
About Descript
Descript builds a simple, intuitive, fully‑powered editing tool for video and audio, driven by AI. We have a team of 150 and are backed by investors such as OpenAI, Andreessen Horowitz, Redpoint Ventures, and Spark Capital. We’re early enough that each new employee can measurably influence the company’s direction.
Benefits
- Generous healthcare package
- 401(k) matching program
- Catered lunches
- Flexible vacation time
- Option to work remotely or in a hybrid model
Descript is an equal‑opportunity workplace dedicated to equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.