A ResNet-based image classifier for """specific""" images

Tentative Title For A ResNet-Based Image Classifier

This is a simple ResNet-based image classifier for """specific images""", built on a training framework similar to the one I use to train VALL-E.

Training

  1. Throw the images you want to train under ./data/images/.

  2. Modify ./data/config.yaml accordingly.

  3. Install using pip3 install -e ./captcha/.

  4. Train using python3 -m captcha.train yaml='./data/config.yaml'.

  5. Wait.
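Before kicking off a long run, it can help to check that the layout from steps 1 and 2 is actually in place. The following is a minimal sketch and not part of this repo: the helper names (find_images, build_train_command, preflight) are hypothetical, and the list of image extensions is an assumption, since nothing above says which formats are supported.

```python
from pathlib import Path

# Assumed set of image extensions; the README does not state the supported formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_images(image_dir):
    """Return image files directly under image_dir, sorted by path."""
    root = Path(image_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_train_command(config_path):
    """Build the training command from step 4 as an argument list."""
    return ["python3", "-m", "captcha.train", f"yaml={config_path}"]


def preflight(image_dir="./data/images", config_path="./data/config.yaml"):
    """Return a list of problems; an empty list means the layout looks ready."""
    problems = []
    if not find_images(image_dir):
        problems.append(f"no images found under {image_dir}")
    if not Path(config_path).is_file():
        problems.append(f"missing config: {config_path}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        raise SystemExit("\n".join(issues))
    # Print the command rather than running it, so nothing starts by accident.
    print(" ".join(build_train_command("./data/config.yaml")))
```

Run it from the repo root. It prints the training command once both the images and the config are found, and exits with the list of problems otherwise.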

Inferencing

Simply invoke the inferencer with the following command: python3 -m captcha "./data/path-to-your-image.png" yaml="./data/config.yaml" --temp=1.0
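The command above does not say what --temp does. Assuming it is a softmax temperature applied to the classifier's raw scores (a guess on my part, not something this README confirms), the effect can be sketched in plain Python. The function name and the scores below are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temp=1.0):
    """Turn raw scores into probabilities, dividing by temp first."""
    if temp <= 0:
        raise ValueError("temp must be positive")
    scaled = [x / temp for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
    for temp in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(logits, temp)
        print(temp, [round(p, 3) for p in probs])
```

Under that assumption, a temperature below 1.0 sharpens the distribution toward the top class, a temperature above 1.0 flattens it, and 1.0 leaves the scores as they are. The top-ranked class does not change either way.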

Caveats

This was cobbled together in a night, partly to test how well my training framework fares when not married to my VALL-E implementation, and partly to solve a problem I recently faced. Since I've been balls deep in learning the ins and outs of making VALL-E work, why not do the exact opposite (a tiny image classification model of fixed lengths) to test the framework and my knowledge? Thus, this """ambiguous""" project was born.

This is by no means state of the art, as it just leverages an existing ResNet architecture provided by torchvision.