Project Description
We are developing a web-based training tool for business English learners. The core functionality allows students to practice realistic voice roleplays with an AI that simulates natural conversations and provides automated feedback.
We’re seeking an experienced developer to take over the in-progress project and finalize the features listed under Deliverables.
Requirements:
Experience with OpenAI APIs, especially GPT-4o, Whisper for speech-to-text (STT), and text-to-speech (TTS).
Proven ability to build real-time voice interaction systems.
Familiarity with latency optimization, streaming STT, and prompt engineering.
Backend experience with Node.js or Python (FastAPI preferred).
Ability to integrate audio streaming and manage the voice flow between frontend and backend.
Strong communication and documentation skills.
Nice to Have:
Experience with Redis for queue management.
Familiarity with educational or language-learning platforms.
Understanding of prompt tuning for interactive scenarios.
Deliverables:
Real-time roleplay experience with low latency (maximum delay of 2–3 seconds).
Natural turn-taking logic between user and AI.
Storage of audio recordings and text transcripts.
Final MVP ready for internal pilot testing.
Start: Immediate.
Duration: 3–5 weeks for MVP.
Please include examples of similar projects or links to GitHub repos.