I have the spellings of 350,000 words along with the phonemes that make up their pronunciations (the CMU Pronouncing Dictionary 0.7 dataset). I also have a comprehensive list of the ways each phoneme can be spelled. For example, the long A phoneme (EY) can be spelled a, a_e, ai, ay, ei, eigh, and ea.
I need to determine which letters spell each phoneme in each word. I'm looking for an engineer familiar with the CMU 0.7 dataset who can use effective tools, such as the G2P toolkits from NVIDIA NeMo and others, to give me an accurate map from each phoneme to its grapheme in every word.
An AI assistant tells me I need the following:
Automated Approach for a Large Dataset: With a dataset of 350,000 words, a manual approach is not practical. Using or developing a grapheme-to-phoneme (G2P) alignment system or algorithm is needed.
- Utilize an Existing G2P Tool or Library: Several tools and libraries are designed to handle grapheme-to-phoneme conversion and alignment, often using machine learning models trained on large datasets. Some examples include:
- DeepPhonemizer: A PyTorch library for G2P conversion using Transformer models.
- g2p-seq2seq: CMUSphinx's G2P tool, built on TensorFlow's Tensor2Tensor transformer architecture.
- G2PU: A joint CTC-attention model that can perform simultaneous optimization of G2P, acoustic model, and acoustic alignment to a corpus, according to Illinois Experts.
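To make the alignment task concrete, here is a minimal sketch of the kind of phoneme-to-grapheme segmentation I'm after. It is not from any of the toolkits above; the grapheme candidate lists are illustrative stand-ins for my full spelling inventory, and a production system would also need to handle silent letters, doubled consonants, and ambiguous segmentations.

```python
# Illustrative phoneme -> candidate grapheme spellings (ARPAbet symbols).
# These lists are placeholders; the real inventory is far larger.
GRAPHEMES = {
    "K":  ["c", "k", "ck", "ch"],
    "EY": ["a_e", "ai", "ay", "ei", "eigh", "ea", "a"],
    "R":  ["r", "wr"],
    "N":  ["n", "kn"],
}

def align(spelling, phonemes):
    """Return one (phoneme, grapheme) segmentation that exactly covers
    the spelling, or None when no candidate sequence fits."""
    if not phonemes:
        return [] if not spelling else None
    ph, rest = phonemes[0], phonemes[1:]
    for g in GRAPHEMES.get(ph, []):
        if "_" in g:  # split digraph, e.g. a_e in "cake"
            head, tail = g.split("_")
            if (spelling.startswith(head) and spelling.endswith(tail)
                    and len(spelling) >= len(head) + len(tail)):
                sub = align(spelling[len(head):-len(tail)], rest)
            else:
                sub = None
        elif spelling.startswith(g):
            sub = align(spelling[len(g):], rest)
        else:
            sub = None
        if sub is not None:
            return [(ph, g)] + sub
    return None

# "cake" (K EY K) should segment as c / a_e / k,
# "rain" (R EY N) as r / ai / n.
print(align("cake", ["K", "EY", "K"]))
print(align("rain", ["R", "EY", "N"]))
```

The deliverable I need is essentially this output for all 350,000 entries, computed by a robust, statistically trained aligner rather than this toy recursion.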
I hope to have this finished within 4 weeks; I'm not sure what it should cost.
Could you please let me know about your experience with jobs like this, your estimated time to completion, and your cost?