Does bark feature fine tuning? #324

Closed
opened 2023-08-09 08:15:17 +00:00 by Bluebomber182 · 1 comment

If not, can you add this feature from this fork that features fine tuning?
https://github.com/serp-ai/bark-with-voice-clone

Owner

mmm...

  • desu Bark's integration with the web UI isn't going to be any more tightly coupled than it loosely is now. I genuinely don't remember why I added it outside of having it be an easy way to test its output and draw conclusions from it (it didn't wow me enough to continue).
  • the patches I lifted from that fork itself to allow for user-provided voices aren't working. I don't know why, and I've made three attempts to get them to work, and nothing will make it sound decent. It'd just be better to use the original fork that allows it at that point.
  • integrating another training script is a bit too much; I've effectively neglected the training script for what was my VALL-E fork, since I never actually bothered using the web UI to do my test trainings with VALL-E. Any and all code that's going to be used to do Bark finetunes of the fine/coarse models is just going to be pain.
  • extending on that, the web UI's code is already a giant mess, and I'm still dreading taking the initiative to clean it up (or rather, rewrite it). Stapling on training for another model is just going to exacerbate issues.

and above all:

  • for an extreme lack of a better term, the """meta""" seems to be just using a normal TTS (base TorToiSe or base Bark) and running it through RVC for "great" quality output (I still need to actually see/hear it for myself). Aside from the rewrite, I imagine it would just be better to add RVC into the "stack".
    • and as an addendum, I feel that the RVC sphere of utilities is rather solved. From what pieces I gleaned over time, people just prefer using RVC.

If by some miracle I manage to get around to it, sure, I'll try. It's just, as things currently stand, Bark is already very low priority, much less finetuning it.

Reference: mrq/ai-voice-cloning#324