FZZ: Multilingual Grapheme-to-Phoneme and Lyrics-to-Audio Alignment

This topic has 0 replies, 1 voice, and was last updated 1 month ago by wpfan.

Author

Posts
October 21, 2024 at 6:04 am #165
wpfan
Participant
🎉🎉🎉 Congratulations to MIREX on being held again in 2024 after a three-year hiatus! 🎉🎉🎉

🥳 We present our system for the MIREX 2024 Lyrics-to-Audio Alignment task. The system takes separated vocal tracks as input and trains an acoustic module with a joint objective that includes pitch prediction. We add pitch extraction and voice activity detection (VAD) modules to the alignment pipeline to further refine the output of the trained model, and we improve the overall performance of lyrics-to-phoneme transcription so that alignment quality holds up in multilingual application scenarios. Our experimental results show that the system performs well on multilingual lyrics alignment.
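For readers new to the task, here is a minimal sketch of the generic alignment step that pipelines of this kind usually end with: a monotonic Viterbi pass that assigns each audio frame to one phoneme of the transcribed lyrics. This is not the code from our submission; the function name `forced_align`, the input layout (per-frame log-probabilities from some acoustic model) and the toy numbers are assumptions for illustration only, and the pitch and VAD refinement described above is left out.

```python
import math


def forced_align(log_probs, phonemes):
    """Monotonic Viterbi alignment (illustrative sketch, hypothetical helper).

    log_probs: list of T frames, each a list of log-probabilities per phoneme class.
    phonemes:  list of N phoneme class indices obtained from the lyrics.
    Returns a list of (phoneme_position, start_frame, end_frame_exclusive).
    """
    T, N = len(log_probs), len(phonemes)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per phoneme")

    neg_inf = -math.inf
    # score[t][n]: best log-score of any path that ends at frame t in phoneme n
    score = [[neg_inf] * N for _ in range(T)]
    # back[t][n]: phoneme position occupied at frame t - 1 on that best path
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][phonemes[0]]

    for t in range(1, T):
        for n in range(min(t + 1, N)):
            stay = score[t - 1][n]                            # remain in phoneme n
            move = score[t - 1][n - 1] if n > 0 else neg_inf  # advance from n - 1
            if move > stay:
                best, back[t][n] = move, n - 1
            else:
                best, back[t][n] = stay, n
            score[t][n] = best + log_probs[t][phonemes[n]]

    # Backtrack from the last phoneme at the last frame.
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()

    # Collapse the frame-level path into one segment per phoneme.
    segments = []
    start = 0
    for t in range(1, T + 1):
        if t == T or path[t] != path[start]:
            segments.append((path[start], start, t))
            start = t
    return segments


# Toy example: 5 frames, 3 phoneme classes, lyrics transcribed as [0, 1, 2].
toy_probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
toy_log_probs = [[math.log(p) for p in frame] for frame in toy_probs]
print(forced_align(toy_log_probs, [0, 1, 2]))
# prints [(0, 0, 2), (1, 2, 3), (2, 3, 5)]
```

Multiplying the frame indices by the hop size of the acoustic features gives timestamps in seconds. A VAD mask could be combined with a sketch like this by adding silence states between words and penalizing lyric phonemes on unvoiced frames, but that is only one possible design.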

I hope our work can promote the development of the entire MIREX community and the AI music field. Thank you for your support.
Attachments:
1. FZZ-1.pdf