June 2024
Building the Future of Speech AI: My 2024 Collaboration with Defined AI


In 2024, I worked with Defined AI on a fascinating project focused on the data that powers speech recognition systems.
My work spanned two key areas:
1. Voice Data Collection:
Defined AI provided a range of scenarios, and I partnered with a fellow freelance linguist to hold unscripted, natural telephone conversations in Mandarin on selected topics. Our goal was to capture the authentic, spontaneous speech patterns and acoustic variation needed to train robust AI models.
2. Voice Data Annotation:
I also transcribed pre-recorded Mandarin speech files. This meticulous task involved converting audio into accurate, time-aligned text, producing the labels from which speech models learn to recognize human language.
This project was a deep dive into the human-centric process behind machine learning. I'm proud to have played a part in helping build more natural and effective voice AI, and I'm excited to apply this unique experience to future challenges at the intersection of language and technology.


AI Project
Data Collection and Annotation