A ResNet-based image classifier for """specific""" images

mrq 62cba62bbb An amazing commit (:		2023-08-05 03:44:38 +00:00
captcha	An amazing commit (:	2023-08-05 03:44:38 +00:00
scripts	An amazing commit :)	2023-08-05 03:40:14 +00:00
.gitignore	An amazing commit :)	2023-08-05 03:40:14 +00:00
LICENSE	An amazing commit :)	2023-08-05 03:40:14 +00:00
README.md	An amazing commit (:	2023-08-05 03:44:38 +00:00
setup.py	An amazing commit :)	2023-08-05 03:40:14 +00:00

Tentative Title For A ResNet-Based Image Classifier

This is a simple ResNet-based image classifier for """specific images""", using a training framework similar to the one I use to train VALL-E.

Training

Throw the images you want to train on under ./data/images/.
Modify the ./data/config.yaml accordingly.
Install using pip3 install -e ./captcha/.
Train using python3 -m captcha.train yaml='./data/config.yaml'.
Wait.
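
Before kicking off a run, it can help to sanity-check that the files from the steps above are where the trainer expects them. The following is only a minimal sketch: `check_layout` and `IMAGE_SUFFIXES` are hypothetical names that are not part of this repo, and the list of accepted image extensions is an assumption.

```python
from pathlib import Path

# Assumption: which file extensions count as training images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_layout(root="./data"):
    """Report whether config.yaml exists and how many images sit in images/."""
    root = Path(root)
    images_dir = root / "images"
    config_path = root / "config.yaml"

    images = []
    if images_dir.is_dir():
        # Only look at files directly inside images/, not nested folders.
        images = [
            p for p in images_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        ]

    return {
        "config_exists": config_path.is_file(),
        "image_count": len(images),
    }


if __name__ == "__main__":
    print(check_layout("./data"))
```

If `image_count` comes back as 0 or `config_exists` is False, fix the layout before installing and training.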

Inferencing

Simply invoke the inferencer with the following command: python3 -m captcha "./data/path-to-your-image.png" yaml="./data/config.yaml" --temp=1.0
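
The README does not spell out what --temp does. Assuming it is a sampling temperature applied to the classifier's logits (the usual meaning of the term), the idea can be sketched in plain Python as below. The function names and the example logits are made up for illustration and are not taken from the captcha package.

```python
import math
import random


def softmax_with_temperature(logits, temp=1.0):
    """Turn raw scores into probabilities, dividing by temp first."""
    if temp <= 0:
        raise ValueError("temp must be positive")
    scaled = [x / temp for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temp=1.0, rng=random):
    """Draw one class index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temp)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for three classes.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temp=0.5)  # more peaked
flat = softmax_with_temperature(logits, temp=2.0)   # closer to uniform
```

Under this assumption, a temperature below 1.0 concentrates probability on the top class, a temperature above 1.0 spreads it out, and 1.0 leaves the model's distribution unchanged.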

Caveats

This was cobbled together in a night, partly to test how well my training framework fares when not married to my VALL-E implementation, and partly to solve a problem I recently faced. Since I've been balls deep in learning the ins and outs of making VALL-E work, why not do the exact opposite (a tiny image classification model of fixed lengths) to test the framework and my knowledge? Thus, this """ambiguous""" project was born.

This is by no means state of the art, as it just leverages an existing ResNet architecture provided by torchvision.