Question: How to continually run from CLI #261
Reference: mrq/ai-voice-cloning#261
Just a question.

I'm trying to run TTTS from a Python script. Right now I'm just invoking it like this:

```
!python /content/ai-voice-cloning/ai-voice-cloning/modules/tortoise-tts/scripts/tortoise_tts.py --seed 42 "The quick brown fox " -O /content/ai-voice-cloning/results
```

This works, but every time I run it, it has to go through all the bootup steps again:
```
Loaded tokenizer
Loading autoregressive model: /content/ai-voice-cloning/ai-voice-cloning/models/tortoise/autoregressive.pth
Loaded autoregressive model
Loaded diffusion model
Loading vocoder model: None
Loading vocoder model: bigvgan_24khz_100band.pth
Removing weight norm...
Loaded vocoder model
```
Is there a way to keep my current 'session' open and generate a few lines in a row? I think that's how the webui works; it only has to reload when you change the settings/model. I should probably also ask: is this actually utilizing this fork, or is it just using the base TTTS?
See notes from #242 regarding generation from CLI.
Okay, I got the Gradio API method working. However, after it finishes generating, I keep hitting an assertion:
```
8 frames
/usr/local/lib/python3.10/dist-packages/gradio_client/serializing.py in _deserialize_single(self, x, save_dir, root_url, hf_token)
    292         else:
    293             data = x.get("data")
--> 294             assert data is not None, f"The 'data' field is missing in {x}"
    295             file_name = utils.decode_base64_to_file(data, dir=save_dir).name
    296         else:

AssertionError: The 'data' field is missing in {'visible': False, 'value': None, 'type': 'update'}
```
It actually generates the audio file, but I think it's messing up somewhere when it tries to return, probably in output or source?
According to the API it returns
Any ideas?
That's a new one for me. Does it happen if you revert to a previous commit?
I can give it a try this weekend, but honestly I just decided to catch the exception and move on; since it's still generating the files, it's good enough lol
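Since the file is already written to disk before the deserializer trips, one way to "catch the exception and move on" cleanly is to tolerate the `AssertionError` and then locate the newest output yourself. A minimal sketch; the server URL, results directory, and the `gradio_client` wiring under the `__main__` guard are assumptions about the setup:

```python
from pathlib import Path


def newest_wav(results_dir):
    """Return the most recently modified .wav under results_dir, or None."""
    wavs = sorted(Path(results_dir).glob("**/*.wav"),
                  key=lambda p: p.stat().st_mtime)
    return wavs[-1] if wavs else None


def generate_ignoring_assert(call, results_dir):
    """Run a generation callable, swallowing the known AssertionError,
    then return the newest .wav it left behind."""
    try:
        call()
    except AssertionError:
        pass  # the audio is already on disk; only the return plumbing failed
    return newest_wav(results_dir)


if __name__ == "__main__":
    # Example wiring (assumed address and path; pass the full argument
    # list your /generate endpoint expects).
    from gradio_client import Client
    client = Client("http://127.0.0.1:7860")
    print(generate_ignoring_assert(
        lambda: client.predict("Hello World", api_name="/generate"),
        "ai-voice-cloning/results"))
```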
I'm also wondering if this is possible. Whenever I use the cli.py script, it still goes through the whole Loading/Loaded boot sequence. Is there a way to keep the loading "in session"?
```python
import subprocess
import time

import requests

# Launch the webui in the background and give it time to boot.
cmd = "cd ai-voice-cloning && start.bat"
process = subprocess.Popen(cmd, shell=True)
time.sleep(30)

voice_name = "name"

response = requests.post("http://127.0.0.1:7860/run/generate", json={
    "data": [
        "Hello World",  # 'Input Prompt'
        "\n",           # 'Line Delimiter'
        "Happy",        # 'Emotion'
        "",             # 'Custom Emotion'
        voice_name,     # 'Voice' dropdown component
        # 'Microphone Source' Audio component: filename plus base64 WAV data
        {"name": "audio.wav", "data": "data:audio/wav;base64,UklGRiQAAABXQVZFZm10IBAAAAABAAEARKwAAIhYAQACABAAZGF0YQAAAAA="},
        0,       # 'Voice Chunks'
        1,       # 'Candidates'
        0,       # 'Seed'
        256,     # 'Samples'
        400,     # 'Iterations'
        0.8,     # 'Temperature'
        "DDIM",  # 'Diffusion Samplers'
        2,       # 'Pause Size'
        0,       # 'CVVP Weight'
        0.55,    # 'Top P'
        1,       # 'Diffusion Temperature'
        1,       # 'Length Penalty'
        2,       # 'Repetition Penalty'
        2,       # 'Conditioning-Free K'
        ["Half Precision"],  # 'Experimental Flags'
        False,   # 'Use Original Latents Method (AR)'
        False,   # 'Use Original Latents Method (Diffusion)'
    ]
}).json()

data = response["data"]
```
I am able to generate audio with the above code, but it comes out as a garbled mess. I think the problem is that the 'Microphone Source' entry

`{"name": "audio.wav", "data": "data:audio/wav;base64,UklGRiQAAABXQVZFZm10IBAAAAABAAEARKwAAIhYAQACABAAZGF0YQAAAAA="}`

is not pointing at the correct sample audio clips, but I couldn't find any examples of how to get the data.
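For what it's worth, that placeholder base64 string appears to decode to an empty WAV, so the server receives no real reference audio from it. If you want to feed an actual clip through the 'Microphone Source' slot, you can inline your own file. A sketch using only the standard library (the file path is a placeholder):

```python
import base64
from pathlib import Path


def audio_payload(wav_path):
    """Build the {'name', 'data'} dict a Gradio Audio input expects,
    inlining the file's bytes as a base64 data URI."""
    path = Path(wav_path)
    b64 = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"name": path.name, "data": "data:audio/wav;base64," + b64}
```

Then swap the placeholder dict in the payload for `audio_payload("clips/reference.wav")`. (If you only want the voice selected by the 'Voice' dropdown, passing `None` for that slot might also work, though I haven't confirmed that.)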
What I do is use the trained voice data with the original tortoise-tts, as it's quite a bit faster on Linux than the version bundled in this project. Then I hacked do_tts.py to read through a txt file and generate each row, without having to reload the training data each time.
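The hacked loop can be sketched roughly like this: load the model and voice once, then iterate over the lines of a script file. The tortoise imports and calls follow the upstream tortoise-tts API (as in its do_tts.py) and may differ in this fork; the voice name and file paths are placeholders:

```python
from pathlib import Path


def script_lines(path):
    """Yield non-empty, stripped lines from a text file, one per generation."""
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            yield line


if __name__ == "__main__":
    # Heavy imports deferred; names follow upstream tortoise-tts.
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech()                      # load the models exactly once
    samples, latents = load_voice("myvoice")  # "myvoice" is a placeholder
    for i, text in enumerate(script_lines("script.txt")):
        gen = tts.tts_with_preset(text, voice_samples=samples,
                                  conditioning_latents=latents, preset="fast")
        torchaudio.save(f"out_{i:03d}.wav", gen.squeeze(0).cpu(), 24000)
```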