[Question] Segments: what is their role in training? #468

DoctorPopi · 2024-01-25T20:00:42Z

Hey there,

I'm building a tool to help me process and prepare audiobooks to feed into the ai-voice-cloning tool. I've reached the point where I want to transcribe my audio files. After digging into the ai-voice-cloning code (utils.py in particular), I found that I had to use the "medium.en" model to get the same segments as you (2 instead of 4, for instance).
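For context, here is a minimal sketch of how the segments can be inspected. It assumes the openai-whisper package, whose `transcribe()` result is a dict with a `segments` list where each entry has `start`, `end`, and `text`. I have not confirmed that ai-voice-cloning calls Whisper exactly this way, and `summarize_segments`, `transcribe_file`, and the file path are hypothetical names used only for illustration.

```python
from __future__ import annotations

from typing import Any


def summarize_segments(result: dict[str, Any]) -> list[tuple[float, float, str]]:
    """Return (start, end, text) for each segment of a Whisper-style result dict."""
    return [
        (float(seg["start"]), float(seg["end"]), seg["text"].strip())
        for seg in result.get("segments", [])
    ]


def transcribe_file(path: str, model_name: str = "medium.en") -> dict[str, Any]:
    """Transcribe one audio file (assumption: openai-whisper is installed)."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_name)
    return model.transcribe(path)


def print_segments(path: str, model_name: str = "medium.en") -> None:
    """Print one line per segment with its start and end time in seconds."""
    result = transcribe_file(path, model_name)
    for start, end, text in summarize_segments(result):
        print(f"{start:7.2f}s -> {end:7.2f}s  {text}")
```

Calling `print_segments("chapter_01.wav")` (a made-up path) with different model names is how the segment counts can be compared side by side.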

What I'm wondering now is: what exactly are segments? How are they calculated and, above all, does a different segmentation change the training much?

Thank you for any light you could shed on this!
