We are running an ongoing audio data collection project on code-switched conversations in Tamil-English and Malay-English, with participants based in Singapore.
Project Overview
The objective is to record natural, unscripted, in-person conversations between two native speakers. These recordings will support the development of AI systems for multilingual speech recognition.
Participant Requirements – Singapore
Tamil-English: 24 audio hours (12 speaker pairs / 24 participants)
Malay-English: 20 audio hours (10 speaker pairs / 20 participants)
Each participant contributes 1 hour of speech, so each two-speaker session yields 2 audio hours
Recording Details
Tool: Mobile-based app (Shaip App 2.0)
Device: Mobile phone microphone only (no external equipment)
Format: Scenario-guided, free-flowing conversations (not scripted)
Age Requirement: 18+
Gender Balance: At least 45% male and at least 45% female participants (±5%)
Language Ratio: ~50% English / 50% Tamil or Malay
Interested candidates may contact me at phoebe@unioglobal.com.
https://www.upwork.com/jobs/~021926513065417019064