Is it possible to generate using the command-line? #242

Open
opened 2023-05-19 14:20:00 +00:00 by gshawn3 · 6 comments

Apologies if this is a dumb question, but I could not find an answer on the wiki or in existing issues.

Is it possible to generate outputs from the command line, or is that only doable from the Web UI? Thanks!

Owner

Right, I keep forgetting to re-implement a CLI. When I get a moment I can whip something up.

Author

Oh that would be super cool, thanks!

In the meantime, I figured out that I can use the autogenerated Gradio API to create files programmatically. The Gradio API kind of sucks but it's better than generating every file manually using the GUI.
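
For anyone trying the same workaround, here is a minimal sketch of driving a running Gradio app from Python with the `gradio_client` package. The URL, the `/generate` endpoint name, and the two-argument call are assumptions for illustration only; the real endpoint likely takes more inputs, and `client.view_api()` lists what the running app actually exposes.

```python
from typing import Any, Callable, Iterable, List


def connect(url: str = "http://127.0.0.1:7860/"):
    """Connect to a running Gradio app (the URL is an assumption; use your own)."""
    # Imported lazily so the rest of the sketch works without the package installed.
    from gradio_client import Client
    return Client(url)


def generate_all(predict: Callable[..., Any], lines: Iterable[str], voice: str,
                 api_name: str = "/generate") -> List[Any]:
    """Call the endpoint once per non-empty line and collect whatever it returns."""
    results = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        # Hypothetical argument list: check client.view_api() for the real inputs.
        results.append(predict(text, voice, api_name=api_name))
    return results
```

Usage would look roughly like `client = connect()`, then `client.view_api()` to inspect the endpoints, then `generate_all(client.predict, open("lines.txt"), "voice_name")`.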

Owner

Shit, I forgot to do this. Gomen. I'm a bit tied up for the rest of the week, so...

You might get lucky by just copying an existing do_tts.py (such as the one from 152334H's fork, which I think has one) and plunking it into my fork (https://git.ecker.teck/mrq/tortoise-tts/).

If I was a good dev, I would have:

  • kept all the added features both backwards compatible and localized in that repo
    • I think all the shit like per-model voice latents loading and voice latents calculation are handled on the tortoise-tts side, and can have their behaviors passed in with an arg
    • model loading can be passed in as an arg to the TortoiseTTS constructor
  • all the other higher-level stuff like split-lines that's in the web UI already exists in competent do_tts.py scripts
    • stuff like VoiceFixer can easily be done after the fact in a CLI

The way I had in mind was just another script that's guided by a bunch of argument flags that then get fed into ./src/utils.py's generate() function.
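
The idea above can be sketched roughly as follows. This is only an illustration of the "flags in, generate() out" shape: the `generate()` here is a stand-in, since the real signature of the function in `./src/utils.py` is not shown in this thread, and the flag names (`--text`, `--voice`, `--seed`) are illustrative.

```python
import argparse


def generate(**kwargs):
    """Stand-in for ./src/utils.py's generate(); the real signature is not shown here."""
    return kwargs


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a flag-driven wrapper")
    # Illustrative flags; a real script would mirror generate()'s parameters.
    parser.add_argument("--text", required=True)
    parser.add_argument("--voice", required=True)
    parser.add_argument("--seed", type=int, default=None)  # hypothetical option
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # vars() turns the parsed namespace into a dict that can be forwarded as kwargs.
    return generate(**vars(args))


if __name__ == "__main__":
    main()
```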

bruh you owe no one anything, I'm just grateful you gave us an interface to experiment with

Owner

Sorry it took so long, especially for how simple the cli.py script ended up being. My work ethic has taken quite the nosedive.

Should be added in commit 76ed34ddd2. Use (after activating the venv):

python ./src/cli.py --text="input text" --voice="voice_name"

and additional flags should be pretty much whatever keys are under ./cfg/generate.json, up to but not including experimentals (the values in the JSON are used as defaults).
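
To make the "JSON as defaults" behaviour concrete, here is a minimal sketch of how flags could be derived from such a file. It is not the actual `cli.py`: the sample JSON below is a hypothetical stand-in (the real keys of `./cfg/generate.json` are not listed in this thread), and it assumes a flat object whose key order matters.

```python
import argparse
import json

# Hypothetical stand-in for ./cfg/generate.json; the real keys are not shown in this thread.
SAMPLE_JSON = (
    '{"text": "", "voice": "random", "temperature": 0.2, '
    '"experimentals": [], "later_option": 1}'
)


def parser_from_defaults(defaults: dict) -> argparse.ArgumentParser:
    """Create one --flag per key, stopping before the "experimentals" key."""
    parser = argparse.ArgumentParser()
    for key, value in defaults.items():
        if key == "experimentals":
            break  # everything from this key onward gets no flag
        # Reuse the default's type for plain numbers and strings, else fall back to str.
        is_simple = isinstance(value, (int, float, str)) and not isinstance(value, bool)
        parser.add_argument(f"--{key}", type=type(value) if is_simple else str,
                            default=value)
    return parser


defaults = json.loads(SAMPLE_JSON)  # key order of the JSON object is preserved
args = parser_from_defaults(defaults).parse_args(
    ["--text=input text", "--voice=voice_name"]
)
print(vars(args))
```

Flags that are not passed keep the value from the JSON, which matches the "it will use the JSON as defaults" description.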

Author

Thank you so much, this is fantastic!

Reference: mrq/ai-voice-cloning#242