actually make the node.js script preprocess from an argument

master
mrq 2022-10-13 21:29:19 +07:00
parent 7676d56de5
commit c8475149f9
3 changed files with 37 additions and 22 deletions

@@ -310,13 +310,28 @@ If you're using an embedding primarily focused on an artstyle, and you're also u
Lastly, when you do use your embedding, make sure you're using the same model you trained against. You *can* use embeddings on different models, and you'll definitely get usable results, but don't expect stellar ones.
## What To Expect
Here I'll try to catalog my results, plus results I've caught from other anons (without consent). This is not necessarily a repository for embeddings/hypernetworks, but more a list showing results and their settings, so you can get a good idea of what can be achieved:
* [`aeris_(vg_cats)`](https://e621.net/posts?tags=aeris_%28vg_cats%29): [/trash/](https://desuarchive.org/trash/thread/51378045/#51380745) (with download)
- Textual Inversion embedding
* [`katt_(breath_of_fire)`](https://e621.net/posts?tags=katt_%28breath_of_fire%29): [/trash/](https://desuarchive.org/trash/thread/51599762/#51607820) (with download)
- Hypernetwork named `2bofkatt`
- 40,000 iterations, learning rate of 0.000000025
- Notes: [post](https://desuarchive.org/trash/thread/51599762/#51608587) highlights the difference with using and not using a hypernetwork
* [`leib_(tas)`](https://e621.net/posts?tags=leib_%28tas%29): (/trash/: [1](https://desuarchive.org/trash/thread/51397474/#51400626), [2](https://desuarchive.org/trash/thread/51419493/#51420988), [3](https://desuarchive.org/trash/thread/51450284/#51455288), [4](https://desuarchive.org/trash/thread/51450284/#51456619), [5](https://desuarchive.org/trash/thread/51463059/#51473173), [img2img](https://desuarchive.org/trash/thread/51502820/#51515396))
- Textual Inversion embedding
- ~100k iterations, 144 manually curated and cropped images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: the second of four attempts; it ended up excelling, though the exact prompt style used is unknown
* [`tsathoggua`](https://e621.net/posts?tags=tsathoggua): (/trash/: [1](https://desuarchive.org/trash/thread/51419493/#51425440), [2](https://desuarchive.org/trash/thread/51426117/#51428189))
- Textual Inversion embedding
- ~100k iterations, 194 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: accentuates the embedding's strengths and weaknesses, chiefly its failure to replicate hard-to-describe features
* [`juuichi_mikazuki`](https://e621.net/posts?tags=juuichi_mikazuki): ([/trash/](https://desuarchive.org/trash/thread/51436448/#51439697))
- Textual Inversion embedding
- ~170k iterations, 184 manually cropped and curated images, learning rate from 0.005 to 0.0005 to 0.0003 (stepping unknown)
- Notes: accentuates the embedding's strengths and weaknesses, chiefly its failure to replicate hard-to-describe features
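
For context on the `from 0.005 to 0.0005 to 0.0003` schedules listed above: the web UI's textual inversion training accepts a stepped learning rate written as comma-separated `rate:last_step` pairs. Since the posts mark the stepping as unknown, the boundaries below are placeholder guesses purely to show the shape such a schedule would take:

```
0.005:1000, 0.0005:10000, 0.0003:100000
```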
## After Words
@@ -337,30 +352,30 @@ Textual Inversion embeddings serve as mini-"models" to extend a current one. Whe
* you are not required to use prompts similar to what you trained it on
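
As a minimal sketch of what that mini-"model" is mechanically (PyTorch-flavored; the shapes and splice function here are illustrative, not the web UI's actual internals): an embedding is just a few learned vectors spliced into the prompt's token embeddings wherever your trigger word appears.

```python
import torch

embedding_dim = 768   # token width of the CLIP text encoder in SD v1 models
num_vectors = 4       # the "vectors per token" count chosen at creation

# this tensor is the entire "mini-model": it's all that trains, and all
# that gets written to the .pt file you share
embedding = torch.nn.Parameter(torch.randn(num_vectors, embedding_dim) * 0.01)

def splice(token_embeddings: torch.Tensor, trigger_pos: int) -> torch.Tensor:
    # swap the trigger token's single embedding for the learned vectors;
    # the rest of the model is untouched, which is why embeddings are tiny
    return torch.cat([
        token_embeddings[:trigger_pos],
        embedding,
        token_embeddings[trigger_pos + 1:],
    ], dim=0)
```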
Conversely, hypernetworks are another way of extending the model with a mini-"model". They apply to the last outer layers as a whole, allowing you to effectively re-tune the model: they modify what comes out of the prompt and into the image, amplifying/altering its effects. This is evident in that:
* when using a verbose prompt with one enabled, your output will have more detail in what you prompted
* in the context of NovelAI, you're still somewhat required to prompt what you want, but the associated hypernetwork will strongly bring it about.
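
A minimal sketch of what "applying to the last outer layers" means in practice, assuming the web UI's approach of feeding the text conditioning through small extra networks before the cross-attention key/value projections (the layer sizes and residual form here are illustrative):

```python
import torch
import torch.nn as nn

class HypernetworkModule(nn.Module):
    # a small MLP learned per attention width; starts near a no-op so the
    # base model's behavior is the starting point, not random noise
    def __init__(self, dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim * 2),
            nn.ReLU(),
            nn.Linear(dim * 2, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)  # residual: learn a correction, not a replacement

# one module for the key path, one for the value path
hyper_k, hyper_v = HypernetworkModule(), HypernetworkModule()

def modified_context(context: torch.Tensor):
    # the base model's own k/v projections then run on these modified
    # tensors, which is how the prompt's effect gets re-tuned without
    # touching any of the original weights
    return hyper_k(context), hyper_v(context)
```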
### Hiccups With Assessing Hypernetwork Training
I don't have a concrete way of getting consistent training results with Hypernetworks at the moment. Most of the headache seems to be from:
* working around a very sensitive learning rate, and finding the sweet spot between "too high, it'll fry" and "too low, it's so slow" (a hedged schedule sketch follows this list)
* figuring out what exactly is the best way to try and train it, and the best *thing* to train it on, such as:
- should I train it with tags like I do for Textual Inversion (the character + descriptor tags), or use more generalized tags (the various species, or very generic ones like `anthro male`)
- should I train it the same as my best embedding of a character, to try and draw comparisons between the two?
- should I train it on a character/art style I had a rough time getting accurate results from, to see if it's better suited for it?
+ given the preview training output at ~52k iterations w/ 184 images, I found it to not have any advantages over a regular Textual Inversion embedding
- should I train it on a broader concept, like a series of characters or a specific tag (fetish), so I can recommend it sooner to anyone interested in it, then train more to draw conclusions about the above afterwards?
+ given the preview training output at ~175k iterations w/ 9322 images, I found it to be *getting there* in looking like the eight or so characters I'm group batching for a "series of characters", but this doesn't really seem to be the way to go.
+ as for training it on a specific tag (fetish), I'd have to figure out which one I'd want to train it on, as I don't necessarily have any specific fetishes (at least, any that would be substantial to train against)
* it takes a long, long time to get to ~150k iterations, the sweet spot I found Textual Inversions to sit at. I feel it's better to just take the extra half hour to keep training it rather than waste it fiddling with the output.
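
On the learning-rate point above: the one hypernetwork testimonial in this doc used a flat 0.000000025 for 40,000 iterations. If you'd rather probe the "fry vs. slow" boundary than commit to one value, the same `rate:last_step` schedule syntax applies; the boundaries below are untested guesses, purely to show the shape of the idea:

```
0.00000005:10000, 0.000000025:40000
```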
There doesn't seem to be a good resource for the broader concepts like the above.
A rentry I found for hypernetwork training in the /g/ thread is low quality.
The other resources seem to be "lol go to the training discord".
The discussion on it on the Web UI github is pretty much just:
* *"I want to to face transfers onto Tom Cruise / a woman / some other thing"*
* *"habibi i want this art style please sir help"*
* dead end discussion about learning rates
* hopeless conjecture about how quick it is to get decent results, none of which actually applies to anything e621-related
* *"I want to to face transfers onto Tom Cruise / a woman / some other thing"*
* *"habibi i want this art style please sir help"*
* dead end discussion about learning rates
* hopeless conjecture about how quick it is to get decent results, but it failing to actually apply to anything for e621-related applications
I doubt anyone else can really give pointers in the right direction, so I have to bang my head against the wall to figure out the best path, as I feel that if it works even for me, it'll work for (You).

@@ -55,7 +55,6 @@ let config = {
tagDelimiter: ",", // what separates each tag in the filename; the web UI accepts comma-separated filenames, but will insert them without commas
}
let csv = FS.readFileSync(config.tags)
csv = csv.toString().split("\n")
config.tags = {}
@@ -95,6 +94,7 @@ for ( let k in {"input":null, "output":null} ) {
}
}
let files = FS.readdirSync(config.input);
console.log(`Parsing ${files.length} files from ${config.input} => ${config.output}`)
let parse = async () => {
@@ -185,7 +185,7 @@ let parse = async () => {
let joined = filtered.join(config.tagDelimiter)
// NOOOOOO YOU'RE SUPPOSED TO DO IT ASYNCHRONOUSLY SO IT'S NOT BLOCKING
FS.copyFileSync(`${config.input}/${file}`, `${config.output}/${file.replace(md5, joined).trim()}`)
if ( rateLimit && config.rateLimit ) await new Promise( (resolve) => {
setTimeout(resolve, config.rateLimit)

@@ -175,7 +175,7 @@ def parse():
filtered.append(i)
joined = config['tagDelimiter'].join(filtered)
shutil.copy(os.path.join(config['input'], file), os.path.join(config['output'], file.replace(md5, joined).strip()))
if rateLimit and config['rateLimit']:
time.sleep(config['rateLimit'] / 1000.0)
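
On the commit title itself: the `for ( let k in {"input":null, "output":null} )` context in the Node.js hunk above suggests the script now lets CLI arguments override `config.input`/`config.output` instead of requiring an edit to the file. A minimal Python sketch of the same idea for the Python port (the flag names and defaults here are assumptions, not the script's actual interface):

```python
import sys

# illustrative defaults; the real script defines these in its config block
config = {"input": "./images", "output": "./processed"}

# override config keys from CLI flags, e.g.:
#   python3 preprocess.py --input ./raw --output ./training
for key in ("input", "output"):
    flag = "--" + key
    if flag in sys.argv:
        config[key] = sys.argv[sys.argv.index(flag) + 1]
```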