From c8475149f9208cd5000b05f97bab654848ff83d6 Mon Sep 17 00:00:00 2001
From: mrq
Date: Thu, 13 Oct 2022 21:29:19 +0000
Subject: [PATCH] actually make the node.js script preprocess from an argument

---
 README.md         | 53 ++++++++++++++++++++++++++++++-----------------
 src/preprocess.js |  4 ++--
 src/preprocess.py |  2 +-
 3 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index 173ea46..3b9aa05 100755
--- a/README.md
+++ b/README.md
@@ -310,13 +310,28 @@ If you're using an embedding primarily focused on an artstyle, and you're also u
 Lastly, when you do use your embedding, make sure you're using the same model you trained against. You *can* use embeddings on different models, as you'll definitely get usable results, but don't expect it to give stellar ones.
 
-## Testimonials
+## What To Expect
 
-Here I'll try to catalog results my results, and results I've caught from other anons (without consent)
+Here I'll try to catalog my results, and results I've caught from other anons (without consent). This is not necessarily a repository for embeddings/hypernetworks, but more a list showing results and their settings, so you can get a good idea of what can be achieved:
 
-* Katt from Breath of Fire: ttps://desuarchive.org/trash/thread/51599762/#51607820
+* [`aeris_(vg_cats)`](https://e621.net/posts?tags=aeris_%28vg_cats%29): [/trash/](https://desuarchive.org/trash/thread/51378045/#51380745) (with download)
+ - Textual Inversion embedding
+* [`katt_(breath_of_fire)`](https://e621.net/posts?tags=katt_%28breath_of_fire%29): [/trash/](https://desuarchive.org/trash/thread/51599762/#51607820) (with download)
  - Hypernetwork named `2bofkatt`
  - 40,000 iterations, learning rate of 0.000000025
+ - Notes: [post](https://desuarchive.org/trash/thread/51599762/#51608587) highlights the difference between using and not using a hypernetwork
+* [`leib_(tas)`](https://e621.net/posts?tags=leib_%28tas%29): (/trash/: [1](https://desuarchive.org/trash/thread/51397474/#51400626), [2](https://desuarchive.org/trash/thread/51419493/#51420988), [3](https://desuarchive.org/trash/thread/51450284/#51455288), [4](https://desuarchive.org/trash/thread/51450284/#51456619), [5](https://desuarchive.org/trash/thread/51463059/#51473173), [img2img](https://desuarchive.org/trash/thread/51502820/#51515396))
+ - Textual Inversion embedding
+ - ~100k iterations, 144 manually curated and cropped images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
+ - Notes: second out of four attempts; it ended up excelling, though the exact prompt style used is unknown
+* [`tsathoggua`](https://e621.net/posts?tags=tsathoggua): (/trash/, [1](https://desuarchive.org/trash/thread/51419493/#51425440), [2](https://desuarchive.org/trash/thread/51426117/#51428189))
+ - Textual Inversion embedding
+ - ~100k iterations, 194 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
+ - Notes: accentuates the embedding's strengths/weaknesses, particularly its failure to replicate hard-to-describe features
+* [`juuichi_mikazuki`](https://e621.net/posts?tags=juuichi_mikazuki): ([/trash/](https://desuarchive.org/trash/thread/51436448/#51439697))
+ - Textual Inversion embedding
+ - ~170k iterations, 184 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
+ - Notes: accentuates the embedding's strengths/weaknesses, particularly its failure to replicate hard-to-describe features
 
 ## After Words
 
@@ -337,30 +352,30 @@ Textual Inversion embeddings serve as mini-"models" to extend a current one. Whe
 * you are not required to use the prompts similar to what you trained it on
 
 Contrarily, hypernetworks are another variation of extending the model with another mini-"model". They apply to the last outer layers as a whole, allowing you to effectively re-tune the model. They modify what comes out of the prompt and into the image, amplifying/modifying its effects. This is evident through:
- * using a verbose prompt with one enabled, your output will have more detail in what you prompted
- * in the context of NovelAI, you're still somewhat required to prompt what you want, but the associated hypernetwork will strongly bring about what you want.
+* when using a verbose prompt with one enabled, your output will have more detail in what you prompted
+* in the context of NovelAI, you're still somewhat required to prompt what you want, but the associated hypernetwork will strongly bring it about.
 
 ### Hiccups With Assessing Training A Hypernetwork
 
 I don't have a concrete way of getting consistent training results with Hypernetworks at the moment. Most of the headache seems to be from:
- * working around a very sensitive learning rate, and finding the sweet spot between "too high, it'll fry" and "too low, it's so slow"
- * figuring out what exactly is the best way to try and train it, and the best *thing* to train it on, such as:
-  - should I train it with tags like I do for Textual Inversion (the character + descriptor tags), or use more generalized tags (like all the various species, very generic tags like anthro male)
-  - should I train it the same as my best embedding of a character, to try and draw comparisons between the two?
-  - should I train it on a character/art style I had a rough time getting accurate results from, to see if it's better suited for it?
-   + given the preview training output at ~52k iterations w/ 184 images, I found it to not have any advantages over a regular Textual Inversion embedding
-  - should I train it on a broader concept, like a series of characters or a specific tag (fetish), to go ahead and recommend it quicker for anyone interested in it, then train to draw conclusions of the above after?
-   + given the preview training output at ~175k iterations w/ 9322 images, I found it to be *getting there* in looking like the eight or so characters I'm group batching for a "series of characters", but this doesn't really seem to be the way to go.
-   + as for training it on a specific tag (fetish), I'd have to figure out which one I'd want to train it on, as I don't necessarily have any specific fetishes (at least, any that would be susbtantial to train against)
- * it takes a long, long time to get to ~150k iterations, the sweet spot I found Textual Inversions to sit at. I feel it's better to just take the extra half hour to keep training it rather than waste it fiddling with the output.
+* working around a very sensitive learning rate, and finding the sweet spot between "too high, it'll fry" and "too low, it's so slow"
+* figuring out what exactly is the best way to try and train it, and the best *thing* to train it on, such as:
+ - should I train it with tags like I do for Textual Inversion (the character + descriptor tags), or use more generalized tags (like all the various species, very generic tags like anthro male)
+ - should I train it the same as my best embedding of a character, to try and draw comparisons between the two?
+ - should I train it on a character/art style I had a rough time getting accurate results from, to see if it's better suited for it?
+  + given the preview training output at ~52k iterations w/ 184 images, I found it to not have any advantages over a regular Textual Inversion embedding
+ - should I train it on a broader concept, like a series of characters or a specific tag (fetish), to go ahead and recommend it quicker for anyone interested in it, then train to draw conclusions about the above afterwards?
+  + given the preview training output at ~175k iterations w/ 9322 images, I found it to be *getting there* in looking like the eight or so characters I'm group batching for a "series of characters", but this doesn't really seem to be the way to go.
+  + as for training it on a specific tag (fetish), I'd have to figure out which one I'd want to train it on, as I don't necessarily have any specific fetishes (at least, any that would be substantial to train against)
+* it takes a long, long time to get to ~150k iterations, the sweet spot I found Textual Inversions to sit at. I feel it's better to just take the extra half hour to keep training it rather than waste it fiddling with the output.
 
 There doesn't seem to be a good resource for the less narrow concepts like the above. A rentry I found for hypernetwork training in the /g/ thread is low quality. The other resources seem to be "lol go to the training discord". The discussion on it on the Web UI github is pretty much just:
- * *"I want to to face transfers onto Tom Cruise / a woman / some other thing"*
- * *"habibi i want this art style please sir help"*
- * dead end discussion about learning rates
- * hopeless conjecture about how quick it is to get decent results, but it failing to actually apply to anything for e621-related applications
+* *"I want to do face transfers onto Tom Cruise / a woman / some other thing"*
+* *"habibi i want this art style please sir help"*
+* dead-end discussion about learning rates
+* hopeless conjecture about how quick it is to get decent results, none of it actually applying to e621-related applications
 
 I doubt anyone else can really give some pointers in the right direction, so I have to bang my head against the wall to figure out the best path, as I feel if it works even for me, it'll work for (You).
\ No newline at end of file
diff --git a/src/preprocess.js b/src/preprocess.js
index 116b230..acb45b0 100755
--- a/src/preprocess.js
+++ b/src/preprocess.js
@@ -55,7 +55,6 @@ let config = {
 	tagDelimiter: ",", // what separates each tag in the filename, web UI will accept comma separated filenames, but will insert it without commas
 }
 
-let files = FS.readdirSync(config.input);
 let csv = FS.readFileSync(config.tags)
 csv = csv.toString().split("\n")
 config.tags = {}
@@ -95,6 +94,7 @@ for ( let k in {"input":null, "output":null} ) {
 	}
 }
 
+let files = FS.readdirSync(config.input);
 console.log(`Parsing ${files.length} files from ${config.input} => ${config.output}`)
 
 let parse = async () => {
@@ -185,7 +185,7 @@ let parse = async () => {
 		let joined = filtered.join(config.tagDelimiter)
 
 		// NOOOOOO YOU'RE SUPPOSE TO DO IT ASYNCHRONOUSLY SO IT'S NOT BLOCKING
-		FS.copyFileSync(`${config.input}/${file}`, `${config.output}/${file.replace(md5, " "+joined).trim()}`)
+		FS.copyFileSync(`${config.input}/${file}`, `${config.output}/${file.replace(md5, joined).trim()}`)
 
 		if ( rateLimit && config.rateLimit ) await new Promise( (resolve) => {
 			setTimeout(resolve, config.rateLimit)
diff --git a/src/preprocess.py b/src/preprocess.py
index 0ea0b76..45100a9 100755
--- a/src/preprocess.py
+++ b/src/preprocess.py
@@ -175,7 +175,7 @@ def parse():
 			filtered.append(i)
 
 		joined = config['tagDelimiter'].join(filtered)
-		shutil.copy(os.path.join(config['input'], file), os.path.join(config['output'], file.replace(md5, " "+joined).strip()))
+		shutil.copy(os.path.join(config['input'], file), os.path.join(config['output'], file.replace(md5, joined).strip()))
 
 		if rateLimit and config['rateLimit']:
 			time.sleep(config['rateLimit'] / 1000.0)
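---

For clarity (not part of the patch itself), here is a minimal standalone sketch of the filename rewrite that the two changed `replace` calls perform. The `file`, `md5`, and `filtered` values below are made-up stand-ins for the script's real variables, which come from the e621 downloads and the tag CSV the script reads:

```js
// Minimal Node.js sketch of the rename performed by the patched copy lines.
// The md5 stem and tag list are hypothetical examples for illustration only.
const file = "8b1a9953c4611296a827abf8c47804d7.png"; // e621-style source filename (md5 stem)
const md5 = "8b1a9953c4611296a827abf8c47804d7";      // the stem the script swaps out
const filtered = ["katt_(breath_of_fire)", "anthro", "solo"]; // tags that survived filtering
const tagDelimiter = ",";                            // mirrors config.tagDelimiter

const joined = filtered.join(tagDelimiter);

// The patch drops the leading space that was previously prepended to the joined tags:
// the md5 stem is now replaced by the joined tags directly, then the name is trimmed.
const renamed = file.replace(md5, joined).trim();
console.log(renamed); // -> "katt_(breath_of_fire),anthro,solo.png"
```

The other half of the Node.js change moves `FS.readdirSync(config.input)` to after the loop over `{"input":null, "output":null}`, which appears to be where command-line overrides are applied; presumably this is what the commit subject refers to, since the directory listing is now taken from whatever `config.input` ends up being after arguments are handled rather than from its default value.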