# Generative Agents
This serves as yet another cobbled-together application of generative agents, utilizing LangChain as the core dependency and substituting in a "proxy" for GPT4.

In short, by utilizing a language model to summarize, rank, and query against information using natural-language queries/instructions, immersive agents can be attained.
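For context, the ranking step follows the generative-agents idea of scoring each stored memory by recency, importance, and relevance before feeding the top results back into the prompt. A minimal sketch of that scoring (the field names, decay rate, and equal weighting are illustrative, not taken from this repo):

```python
import math
import time

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def score_memory(memory: dict, query_embedding: list[float], decay: float = 0.995) -> float:
    """Rank a stored memory against a query: recency + importance + relevance."""
    hours_since_access = (time.time() - memory["last_accessed"]) / 3600.0
    recency = decay ** hours_since_access      # decays the longer a memory sits unused
    importance = memory["importance"] / 10.0   # e.g. an LLM-rated 1-10 score
    relevance = cosine_similarity(query_embedding, memory["embedding"])
    return recency + importance + relevance    # equal weights, for simplicity
```

At query time, every stored memory gets scored this way and the top-k are summarized into the agent's context.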
## Features

- gradio web UI
- saving and loading of agents
- works with non-OpenAI LLMs and embeddings (tested `llamacpp`)
- modified prompts for use with Vicuna (see the sketch below)
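To illustrate that last point: Vicuna-finetuned models expect instructions wrapped in a specific turn structure, so prompts have to be reformatted accordingly. A rough sketch of the v1.1-style format (the repo's actual templates may differ):

```python
# Illustrative only: roughly the Vicuna v1.1 instruction format.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def format_vicuna_prompt(instruction: str) -> str:
    # Wrap a bare instruction in the USER/ASSISTANT turn structure.
    return f"{VICUNA_SYSTEM}\nUSER: {instruction}\nASSISTANT:"
```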
## Installation

```sh
pip install -r requirements.txt
```
## Usage

Set your environment variables accordingly:
- `LLM_TYPE`: (`oai`, `llamacpp`): the LLM backend to use in LangChain. OpenAI requires some additional environment variables:
  - `OPENAI_API_BASE`: URL for your target OpenAI
  - `OPENAI_API_KEY`: authentication key for OpenAI
  - `OPENAI_API_MODEL`: target model
- `LLM_MODEL`: (`./path/to/your/llama/model.bin`): path to your GGML-formatted LLaMA model, if using `llamacpp` as the LLM backend
- `LLM_EMBEDDING_TYPE`: (`oai`, `llamacpp`, `hf`): the embedding model to use for similarity computing
- `LLM_VECTORSTORE_TYPE`: (`chromadb`): the vector store to use for "infinite" context
- `LLM_PROMPT_TUNE`: (`oai`, `vicuna`, `supercot`, `cocktail`): prompt formatting to use, for variants with specific finetunes for instructions, etc.
- `LLM_CONTEXT`: sets maximum context size
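For example, a fully local `llamacpp` setup might look like this (the model path and context size are placeholders, not defaults):

```sh
export LLM_TYPE=llamacpp
export LLM_MODEL=./models/your-ggml-model.bin
export LLM_EMBEDDING_TYPE=llamacpp
export LLM_VECTORSTORE_TYPE=chromadb
export LLM_PROMPT_TUNE=vicuna
export LLM_CONTEXT=2048
```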
To run:

```sh
python .\src\main.py
```
## Plans
- I do not plan on making this uber user-friendly like mrq/ai-voice-cloning, as this is just a stepping stone for a bigger project integrating generative agents.
- I need to re-implement grabbing relevant context, as I've moved almost entirely away from the provided LangChain implementation (not to knock it; it's just not in my nature to piggyback off of it).
- the """grand""" end goal of this is to either have it hosted as an addon server (more likely, given the... prohibitive VRAM requirements) or live in a C++ library to be used in other programs.
## Caveats
A local LLM isn't quite ready to be truly relied on yet:
- It works, but it's a bit slow.
- You're really, really dependent on how well your variant is performing.
- You have to condition your prompts properly for decent results.
- Some flavors will handle the instruction-based method used here better than others.
- Model size is also a factor in how much memory it will consume.
  - for example, if this were to be used strictly in C++ in a game engine, you'd have to compete for limited (V)RAM
  - however, I don't think mine consumes all that many resources.
- ChromaDB seems to be strictly Python (its node.js bindings still use Python, to my knowledge), yet it's what everyone seems to prefer over FAISS as a vector store (see the sketch below).
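For reference, a minimal sketch of the ChromaDB Python API (the collection name and documents here are made up):

```python
import chromadb

client = chromadb.Client()  # in-memory instance
memories = client.create_collection(name="agent_memories")

# Store observations; ChromaDB embeds them with its default embedding function.
memories.add(
    documents=["Alice greeted Bob at the market.", "Bob bought three apples."],
    ids=["mem-0", "mem-1"],
)

# Query by natural-language similarity.
results = memories.query(query_texts=["What did Bob buy?"], n_results=1)
print(results["documents"])
```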
Utilizing GPT4 (or Anthropic Claude) Just Works a little too nicely, even without conditioning the prompts for a "chat" model. But SaaS models are inherently limited by cost-per-generation, and not everyone will have that luxury for [enter use case for this application of generative agents here].