actually make the node.js script preprocess from an argument

master
mrq 2022-10-13 21:29:19 +07:00
parent 7676d56de5
commit c8475149f9
3 changed files with 37 additions and 22 deletions

@@ -310,13 +310,28 @@ If you're using an embedding primarily focused on an artstyle, and you're also u
Lastly, when you do use your embedding, make sure you're using the same model you trained against. You *can* use embeddings on different models, and you'll definitely get usable results, but don't expect stellar ones.
## What To Expect
Here I'll try to catalog my results, plus results I've caught from other anons (without consent). This is not necessarily a repository for embeddings/hypernetworks, but more a list showing results and their settings, so you can get a good idea of what can be achieved:
* [`aeris_(vg_cats)`](https://e621.net/posts?tags=aeris_%28vg_cats%29): [/trash/](https://desuarchive.org/trash/thread/51378045/#51380745) (with download)
- Textual Inversion embedding
* [`katt_(breath_of_fire)`](https://e621.net/posts?tags=katt_%28breath_of_fire%29): [/trash/](https://desuarchive.org/trash/thread/51599762/#51607820) (with download)
- Hypernetwork named `2bofkatt`
- 40,000 iterations, learning rate of 0.000000025
- Notes: [post](https://desuarchive.org/trash/thread/51599762/#51608587) highlights the difference with using and not using a hypernetwork
* [`leib_(tas)`](https://e621.net/posts?tags=leib_%28tas%29): (/trash/: [1](https://desuarchive.org/trash/thread/51397474/#51400626), [2](https://desuarchive.org/trash/thread/51419493/#51420988), [3](https://desuarchive.org/trash/thread/51450284/#51455288), [4](https://desuarchive.org/trash/thread/51450284/#51456619), [5](https://desuarchive.org/trash/thread/51463059/#51473173), [img2img](https://desuarchive.org/trash/thread/51502820/#51515396))
- Textual Inversion embedding
- ~100k iterations, 144 manually curated and cropped images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: the second of four attempts; it ended up excelling, though the exact prompt style used is unknown
* [`tsathoggua`](https://e621.net/posts?tags=tsathoggua): (/trash/: [1](https://desuarchive.org/trash/thread/51419493/#51425440), [2](https://desuarchive.org/trash/thread/51426117/#51428189))
- Textual Inversion embedding
- ~100k iterations, 194 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: accentuates the embedding's strengths and weaknesses, chiefly its failure to replicate hard-to-describe features
* [`juuichi_mikazuki`](https://e621.net/posts?tags=juuichi_mikazuki): ([/trash/](https://desuarchive.org/trash/thread/51436448/#51439697))
- Textual Inversion embedding
- ~170k iterations, 184 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: accentuates the embedding's strengths and weaknesses, chiefly its failure to replicate hard-to-describe features
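
For context on the `from 0.005 to 0.0005 to 0.0003` schedules listed above: the web UI's textual inversion training accepts a stepped learning rate written as comma-separated `rate:last_step` pairs. Since the posts mark the stepping as unknown, the boundaries below are placeholder guesses purely to show the shape such a schedule would take:

```
0.005:1000, 0.0005:10000, 0.0003:100000
```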
## After Words
@@ -337,30 +352,30 @@ Textual Inversion embeddings serve as mini-"models" to extend a current one. Whe
* you are not required to use prompts similar to what you trained it on
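
As a minimal sketch of what that mini-"model" is mechanically (PyTorch-flavored; the shapes and splice function here are illustrative, not the web UI's actual internals): an embedding is just a few learned vectors spliced into the prompt's token embeddings wherever your trigger word appears.

```python
import torch

embedding_dim = 768   # token width of the CLIP text encoder in SD v1 models
num_vectors = 4       # the "vectors per token" count chosen at creation

# this tensor is the entire "mini-model": it's all that trains, and all
# that gets written to the .pt file you share
embedding = torch.nn.Parameter(torch.randn(num_vectors, embedding_dim) * 0.01)

def splice(token_embeddings: torch.Tensor, trigger_pos: int) -> torch.Tensor:
    # swap the trigger token's single embedding for the learned vectors;
    # the rest of the model is untouched, which is why embeddings are tiny
    return torch.cat([
        token_embeddings[:trigger_pos],
        embedding,
        token_embeddings[trigger_pos + 1:],
    ], dim=0)
```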
Conversely, hypernetworks are another way of extending the model with a mini-"model". They apply to the last outer layers as a whole, allowing you to effectively re-tune the model: they modify what comes out of the prompt and into the image, amplifying/altering its effects. This is evident in that:
* when using a verbose prompt with one enabled, your output will have more detail in what you prompted
* in the context of NovelAI, you're still somewhat required to prompt what you want, but the associated hypernetwork will strongly bring it about.
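
A minimal sketch of what "applying to the last outer layers" means in practice, assuming the web UI's approach of feeding the text conditioning through small extra networks before the cross-attention key/value projections (the layer sizes and residual form here are illustrative):

```python
import torch
import torch.nn as nn

class HypernetworkModule(nn.Module):
    # a small MLP learned per attention width; starts near a no-op so the
    # base model's behavior is the starting point, not random noise
    def __init__(self, dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim * 2),
            nn.ReLU(),
            nn.Linear(dim * 2, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)  # residual: learn a correction, not a replacement

# one module for the key path, one for the value path
hyper_k, hyper_v = HypernetworkModule(), HypernetworkModule()

def modified_context(context: torch.Tensor):
    # the base model's own k/v projections then run on these modified
    # tensors, which is how the prompt's effect gets re-tuned without
    # touching any of the original weights
    return hyper_k(context), hyper_v(context)
```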
### Hiccups With Assessing Hypernetwork Training
I don't have a concrete way of getting consistent training results with Hypernetworks at the moment. Most of the headache seems to be from:
* working around a very sensitive learning rate, and finding the sweet spot between "too high, it'll fry" and "too low, it's so slow" (a hedged schedule sketch follows this list)
* figuring out what exactly is the best way to try and train it, and the best *thing* to train it on, such as:
- should I train it with tags like I do for Textual Inversion (the character + descriptor tags), or use more generalized tags (the various species, or very generic ones like `anthro male`)
- should I train it the same as my best embedding of a character, to try and draw comparisons between the two?
- should I train it on a character/art style I had a rough time getting accurate results from, to see if it's better suited for it?
+ given the preview training output at ~52k iterations w/ 184 images, I found it to not have any advantages over a regular Textual Inversion embedding
- should I train it on a broader concept, like a series of characters or a specific tag (fetish), so I can recommend it sooner to anyone interested in it, then train more to draw conclusions about the above afterwards?
+ given the preview training output at ~175k iterations w/ 9322 images, I found it to be *getting there* in looking like the eight or so characters I'm group batching for a "series of characters", but this doesn't really seem to be the way to go.
+ as for training it on a specific tag (fetish), I'd have to figure out which one I'd want to train it on, as I don't necessarily have any specific fetishes (at least, any that would be substantial to train against)
* it takes a long, long time to get to ~150k iterations, the sweet spot I found Textual Inversions to sit at. I feel it's better to just take the extra half hour to keep training it rather than waste it fiddling with the output.
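
On the learning-rate point above: the one hypernetwork testimonial in this doc used a flat 0.000000025 for 40,000 iterations. If you'd rather probe the "fry vs. slow" boundary than commit to one value, the same `rate:last_step` schedule syntax applies; the boundaries below are untested guesses, purely to show the shape of the idea:

```
0.00000005:10000, 0.000000025:40000
```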
There doesn't seem to be a good resource for the broader concepts like the above.
A rentry I found for hypernetwork training in the /g/ thread is low quality.
The other resources seem to be "lol go to the training discord".
The discussion on it on the Web UI github is pretty much just:
* *"I want to to face transfers onto Tom Cruise / a woman / some other thing"*
* *"habibi i want this art style please sir help"*
* dead end discussion about learning rates
* hopeless conjecture about how quick it is to get decent results, none of which actually applies to anything e621-related
* *"I want to to face transfers onto Tom Cruise / a woman / some other thing"*
* *"habibi i want this art style please sir help"*
* dead end discussion about learning rates
* hopeless conjecture about how quick it is to get decent results, but it failing to actually apply to anything for e621-related applications
I doubt anyone else can really give pointers in the right direction, so I have to bang my head against the wall to figure out the best path, as I feel that if it works even for me, it'll work for (You).

@@ -55,7 +55,6 @@ let config = {
tagDelimiter: ",", // what separates each tag in the filename; the web UI accepts comma-separated filenames, but will insert them without commas
}
let csv = FS.readFileSync(config.tags)
csv = csv.toString().split("\n")
config.tags = {}
@@ -95,6 +94,7 @@ for ( let k in {"input":null, "output":null} ) {
}
}
let files = FS.readdirSync(config.input);
console.log(`Parsing ${files.length} files from ${config.input} => ${config.output}`)
let parse = async () => {
@@ -185,7 +185,7 @@ let parse = async () => {
let joined = filtered.join(config.tagDelimiter)
// NOOOOOO YOU'RE SUPPOSED TO DO IT ASYNCHRONOUSLY SO IT'S NOT BLOCKING
FS.copyFileSync(`${config.input}/${file}`, `${config.output}/${file.replace(md5, joined).trim()}`)
if ( rateLimit && config.rateLimit ) await new Promise( (resolve) => {
setTimeout(resolve, config.rateLimit)

@@ -175,7 +175,7 @@ def parse():
filtered.append(i)
joined = config['tagDelimiter'].join(filtered)
shutil.copy(os.path.join(config['input'], file), os.path.join(config['output'], file.replace(md5, joined).strip()))
if rateLimit and config['rateLimit']:
time.sleep(config['rateLimit'] / 1000.0)
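
On the commit title itself: the `for ( let k in {"input":null, "output":null} )` context in the Node.js hunk above suggests the script now lets CLI arguments override `config.input`/`config.output` instead of requiring an edit to the file. A minimal Python sketch of the same idea for the Python port (the flag names and defaults here are assumptions, not the script's actual interface):

```python
import sys

# illustrative defaults; the real script defines these in its config block
config = {"input": "./images", "output": "./processed"}

# override config keys from CLI flags, e.g.:
#   python3 preprocess.py --input ./raw --output ./training
for key in ("input", "output"):
    flag = "--" + key
    if flag in sys.argv:
        config[key] = sys.argv[sys.argv.index(flag) + 1]
```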