Is it possible to introduce voice from file instead of mic? #315
Reference: mrq/ai-voice-cloning#315
This is really a great contribution.
Is there any way to take the voice sample from an MP3 or WAV file instead?
Sorry, I can only comment from user experience, but presumably this can be done via
The WAV apparently should be at a sample rate of 22050 Hz (see https://git.ecker.tech/lightmare/tortoise-tts) and contain at least around 10 seconds of data. It should be clearly spoken, with no background noise and only one speaker, and the audio should end where a sentence ends. If different emotions are to be used, they might need to be named explicitly in words so they can later be mapped to the emotion prompts. Varying speed and delivery may also help if you later train / finetune (#307) or perform emotion transfer from other voices (https://github.com/neonbjb/tortoise-tts/issues/16).
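As a rough illustration of the two measurable points above (sample rate and length), here is a minimal sketch that inspects a WAV file with Python's standard `wave` module. The function name `check_voice_sample` and the two constants are hypothetical and not part of ai-voice-cloning; the thresholds simply mirror the numbers mentioned in this reply. It only reads uncompressed WAV files, so an MP3 would first have to be converted with an external tool, which is not shown here.

```python
import wave

# Assumed targets, taken from the reply above rather than from the project's code.
TARGET_RATE_HZ = 22050
MIN_SECONDS = 10.0


def check_voice_sample(path):
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()

    # Duration in seconds is the frame count divided by frames per second.
    duration = frames / rate

    if rate != TARGET_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {TARGET_RATE_HZ} Hz")
    if duration < MIN_SECONDS:
        problems.append(
            f"duration is {duration:.1f} s, expected at least {MIN_SECONDS:.0f} s"
        )
    return problems
```

Calling `check_voice_sample("my_voice.wav")` (the file name is only an example) returns the list of issues. The other points, such as a single speaker and no background noise, still have to be checked by ear.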
Thank you so much. Great help. So nice of you.