# Generative Agents

This serves as yet another cobbled-together application of [generative agents](https://arxiv.org/pdf/2304.03442.pdf), utilizing [LangChain](https://github.com/hwchase17/langchain/tree/master/langchain) as the core dependency and substituting a "proxy" for GPT-4. In short, by utilizing a language model to summarize, rank, and query information through natural-language queries/instructions, immersive agents can be attained.

## Features

* Gradio web UI
* saving and loading of agents
* works with non-OpenAI LLMs and embeddings (tested with llamacpp)
* modified prompts for use with Vicuna

## Installation

```
pip install -r requirements.txt
```

## Usage

Set your environment variables accordingly (an example configuration is sketched at the end of this README):

* `LLM_TYPE`: (`oai`, `llamacpp`): the LLM backend to use in LangChain. OpenAI requires some additional environment variables:
  - `OPENAI_API_BASE`: URL of your target OpenAI endpoint
  - `OPENAI_API_KEY`: authentication key for OpenAI
  - `OPENAI_API_MODEL`: target model
* `LLM_MODEL`: (`./path/to/your/llama/model.bin`): path to your GGML-formatted LLaMA model, if using `llamacpp` as the LLM backend
* `LLM_EMBEDDING_TYPE`: (`oai`, `llamacpp`, `hf`): the embedding model to use for similarity computation
* `LLM_VECTORSTORE_TYPE`: (`chromadb`): the vector store to use for "infinite" context
* `LLM_PROMPT_TUNE`: (`oai`, `vicuna`, `supercot`, `cocktail`): the prompt format to use, for model variants finetuned on specific instruction styles
* `LLM_CONTEXT`: sets the maximum context size

To run:

```
python .\src\main.py
```

## Plans

* I ***do not*** plan on making this uber user-friendly like [mrq/ai-voice-cloning](https://git.ecker.tech/mrq/ai-voice-cloning), as this is just a stepping stone for a bigger project integrating generative agents.
* I need to re-implement grabbing relevant context, as I've moved almost entirely away from the provided LangChain implementation (not to knock it; it's just not in my nature to piggyback off of it).
* The """grand""" end goal is either to have this hosted as an addon server (more likely, given the... prohibitive VRAM requirements) or to have it live in a C++ library to be used in other programs.

## Caveats

A local LLM isn't *quite* ready to be truly relied on yet:

* It works, but it's a bit slow.
* You're really, really dependent on how well your model variant performs.
  - You ***have*** to condition your prompts properly for decent results.
  - Some flavors handle the instruction-based method used here better than others.
* Model size is also a factor in how much memory it will consume.
  - For example, if this were to be used strictly in C++ in a game engine, it would have to compete for limited (V)RAM.
    + However, I don't think mine consumes all that many resources.
* ChromaDB seems to be strictly Python (the Node.js bindings still use Python, to my knowledge), yet everyone seems to prefer it over FAISS as a vector store.

Utilizing GPT-4 (or Anthropic Claude) Just Works a little too nicely, even without conditioning the prompts for a "chat" model. But SaaS models are inherently limited by cost per generation, and not everyone will have that luxury for `[enter use case for this application of generative agents here]`.
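
## Example configuration

As a concrete illustration of the variables described under Usage, a `llamacpp` setup might look something like the following. The model path and values here are placeholders for illustration, not defaults shipped with this repo:

```
# point the LLM backend and embeddings at a local GGML model (path is a placeholder)
export LLM_TYPE=llamacpp
export LLM_MODEL=./models/your-ggml-model.bin
export LLM_EMBEDDING_TYPE=llamacpp
# use ChromaDB as the vector store and Vicuna-style prompt formatting
export LLM_VECTORSTORE_TYPE=chromadb
export LLM_PROMPT_TUNE=vicuna
export LLM_CONTEXT=2048
python ./src/main.py
```

On Windows, use `set` in `cmd` (or `$env:` in PowerShell) instead of `export`, with the `python .\src\main.py` invocation shown above.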