Generative Agents

This serves as yet another cobbled-together application of generative agents, utilizing LangChain as the core dependency and substituting a local "proxy" for GPT-4.

In short, by using a language model to summarize, rank, and query stored information through natural-language queries/instructions, reasonably immersive agents can be attained.
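Concretely, LangChain's experimental generative-agents module implements this loop: memories are embedded into a vector store, retrieved with a time-weighted retriever that scores on recency, importance, and relevance, and summarized on demand. A minimal sketch using the OpenAI backend (imports reflect LangChain as of this writing; the persona and parameters are illustrative, and this repo's actual wiring may differ):

import faiss
from langchain.chat_models import ChatOpenAI
from langchain.docstore import InMemoryDocstore
from langchain.embeddings import OpenAIEmbeddings
from langchain.experimental.generative_agents import GenerativeAgent, GenerativeAgentMemory
from langchain.retrievers import TimeWeightedVectorStoreRetriever
from langchain.vectorstores import FAISS

llm = ChatOpenAI(max_tokens=1500)
embeddings = OpenAIEmbeddings()

# FAISS holds the embedded memories; the retriever re-ranks them by
# recency, importance, and relevance when the agent is queried.
index = faiss.IndexFlatL2(1536)  # 1536 = OpenAI ada-002 embedding width
vectorstore = FAISS(embeddings.embed_query, index, InMemoryDocstore({}), {})
retriever = TimeWeightedVectorStoreRetriever(
    vectorstore=vectorstore, other_score_keys=["importance"], k=15
)

memory = GenerativeAgentMemory(llm=llm, memory_retriever=retriever, reflection_threshold=8)
agent = GenerativeAgent(
    name="Tommie",  # illustrative persona, not from this repo
    age=25,
    traits="anxious, likes design",
    status="looking for a job",
    llm=llm,
    memory=memory,
)

agent.memory.add_memory("Tommie saw a stray dog on the way to work.")
stay_in_dialogue, response = agent.generate_dialogue_response("Hello Tommie, how was your morning?")
print(response)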

Features

  • Gradio web UI
  • saving and loading of agents (see the sketch below)
  • works with non-OpenAI LLMs and embeddings (tested with llamacpp)
  • modified prompts for use with Vicuna
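Saving/loading works by persisting the memory Documents themselves (content plus metadata) and re-adding them on load, rather than serializing the entire memory object, which tends to break between systems. Roughly, assuming LangChain's TimeWeightedVectorStoreRetriever (function names here are hypothetical):

import json
from datetime import datetime

from langchain.schema import Document

def save_memories(retriever, path):
    # Persist only the raw Documents; timestamps are stringified for JSON.
    records = [
        {"page_content": d.page_content, "metadata": d.metadata}
        for d in retriever.memory_stream
    ]
    with open(path, "w") as f:
        json.dump(records, f, default=str)

def load_memories(retriever, path):
    # Re-add the Documents into a fresh retriever, restoring timestamps.
    with open(path) as f:
        records = json.load(f)
    docs = []
    for r in records:
        meta = r["metadata"]
        for key in ("created_at", "last_accessed_at"):
            if key in meta:
                meta[key] = datetime.fromisoformat(meta[key])
        docs.append(Document(page_content=r["page_content"], metadata=meta))
    retriever.add_documents(docs)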

Installation

pip install -r requirements.txt

Usage

Set your environment variables accordingly (an example invocation follows the list):

  • LLM_TYPE: (oai, llamacpp): the LLM backend to use in LangChain. OpenAI requires some additional environment variables:
    • OPENAI_API_BASE: URL of your target OpenAI(-compatible) API endpoint
    • OPENAI_API_KEY: authentication key for the OpenAI API
    • OPENAI_API_MODEL: name of the target model
  • LLM_MODEL: (./path/to/your/llama/model.bin): path to your GGML-formatted LLaMA model, if using llamacpp as the LLM backend
  • LLM_EMBEDDING_TYPE: (oai, llamacpp, hf): the embedding backend to use for similarity computations
  • LLM_PROMPT_TUNE: (oai, vicuna, supercot, cocktail): prompt formatting to use, for model variants finetuned on specific instruction formats
  • LLM_CONTEXT: sets the maximum context size, in tokens
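For example, a llamacpp-backed session on a POSIX shell might be configured like so (paths and values illustrative):

export LLM_TYPE=llamacpp
export LLM_MODEL=./models/your-model.ggml.bin
export LLM_EMBEDDING_TYPE=llamacpp
export LLM_PROMPT_TUNE=vicuna
export LLM_CONTEXT=2048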

To run:

python .\src\main.py

Plans

I do not plan on making this über user-friendly like mrq/ai-voice-cloning, as this is just a stepping stone for a bigger project that integrates generative agents.

Caveats

A local LM is quite slow, although things seem to be getting faster as llama.cpp develops. GPU offloading (and the OpenCL PR) brings some very real hope of eventually scrapping the Python side and integrating this entirely in C++.

Even with a model that is more instruction-tuned, like Vicuna (which structures prompts as SYSTEM:\nUSER:\nASSISTANT:), results are still inconsistent.
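For reference, a Vicuna-style prompt for one of these queries looks roughly like this (wording illustrative):

SYSTEM: You are roleplaying as the agent described below.
USER: Rate the poignancy of the following memory: ...
ASSISTANT: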

However, I seem to get consistent results with SuperCOT 33B; it's just, well, slow. SuperCOT 13B also seems to give better answers than Vicuna-1.1 13B, and Cocktail 13B seems to be the best of the 13Bs.

A lot of prompt wrangling is needed, and a lot of the routines could be polished up. For example, each observation queries the LM for an importance rating, and each reaction requires querying for the observed entity, then for the relationship between the agent and that entity (which ends up just summarizing relevant context/memories), and then for the response itself; if any one of these steps fails, the overall failure rate compounds. If anything, I might as well work from the ground up and only really salvage the use of FAISS to store embedded vectors.
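To illustrate how fragile a single step is: the importance rating is itself just another LM query whose free-text reply has to be parsed back into a number, so any formatting slip by the model breaks the chain. A sketch of that step with defensive parsing (prompt wording is illustrative, not this repo's exact prompt):

import re

from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

llm = ChatOpenAI()  # stand-in; the same fragility applies to local backends

prompt = PromptTemplate.from_template(
    "On a scale of 1 to 10, where 1 is mundane and 10 is extremely poignant, "
    "rate the poignancy of the following memory. Respond with a single integer.\n"
    "Memory: {observation}\nRating:"
)
chain = LLMChain(llm=llm, prompt=prompt)

reply = chain.run(observation="ate breakfast at my desk")
match = re.search(r"\d+", reply)
# If the model rambles instead of emitting a number, fall back to a
# neutral score rather than crashing the whole reaction pipeline.
rating = int(match.group()) if match else 5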

GPT-4 seems to Just Work, unfortunately.