# Generative Agents
This serves as yet another cobbled-together application of generative agents, using LangChain as the core dependency and substituting a "proxy" for GPT-4.
In short, by using a language model to summarize, rank, and query stored information through natural-language queries/instructions, immersive agents can be attained.
## Features
- gradio web UI
- saving and loading of agents
- works with non-OpenAI LLMs and embeddings (tested with llamacpp)
- modified prompts for use with vicuna
## Installation

```sh
pip install -r requirements.txt
```
## Usage
Set your environment variables accordingly:
- `LLM_TYPE`: (`oai`, `llamacpp`): the LLM backend to use in LangChain. OpenAI requires some additional environment variables:
  - `OPENAI_API_BASE`: URL for your target OpenAI
  - `OPENAI_API_KEY`: authentication key for OpenAI
  - `OPENAI_API_MODEL`: target model
- `LLM_MODEL`: (`./path/to/your/llama/model.bin`): path to your GGML-formatted LLaMA model, if using `llamacpp` as the LLM backend
- `LLM_EMBEDDING_TYPE`: (`oai`, `llamacpp`, `hf`): the embedding model to use for similarity computing
- `LLM_PROMPT_TUNE`: (`oai`, `vicuna`, `supercot`, `cocktail`): prompt formatting to use, for variants with specific finetunes for instructions, etc.
- `LLM_CONTEXT`: sets maximum context size
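
For example, a minimal llamacpp setup might look like the following (Linux/macOS shell; the model path and context size are placeholders, and Windows users would use `set` or `$env:` instead of `export`):

```sh
# Illustrative only: placeholder paths/values for a local llamacpp backend.
export LLM_TYPE=llamacpp
export LLM_MODEL=./path/to/your/llama/model.bin   # GGML-formatted LLaMA model
export LLM_EMBEDDING_TYPE=llamacpp
export LLM_PROMPT_TUNE=vicuna
export LLM_CONTEXT=2048
```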
To run:

```sh
python .\src\main.py
```
## Plans
I do not plan on making this as über-user-friendly as mrq/ai-voice-cloning, as this is just a stepping stone for a bigger project integrating generative agents.
## Caveats
A local LM is quite slow. Things seem to be getting faster as llama.cpp is being developed.
Even using one that's more instruction-tuned, like Vicuna (with a `SYSTEM:\nUSER:\nASSISTANT:` prompt structure), it's still inconsistent.
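
For reference, the general shape of that prompt format is roughly (illustrative, not the exact template used here):

```
SYSTEM: <system prompt>
USER: <instruction>
ASSISTANT: <model response>
```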
However, I seem to be getting consistent results with SuperCOT 33B; it's just, well, slow. SuperCOT 13B seems to give better answers than Vicuna-1.1 13B, and Cocktail 13B seems to be the best of the 13Bs.
A lot of prompt wrangling is needed, and a lot of the routines could be polished up. For example, an observation queries the LM for a rating, and each response/reaction requires querying for the observed entity, then for the relationship between the agent and that entity (which ends up just summarizing relevant context/memories), and then querying for a response; if any one of these steps fails, the overall fail rate is higher. If anything, I might as well just work from the ground up and only really salvage the use of FAISS to store embedded vectors.
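
As a rough illustration of that chain (not the actual code in `src/`; `llm`, the prompts, and the argument names are hypothetical placeholders), the flow looks something like this:

```python
# Hypothetical sketch of the observation -> reaction chain described above;
# names and prompt strings are illustrative, not the repo's actual routines.
from typing import Callable


def react(
    llm: Callable[[str], str],   # any text-in/text-out LLM call (OpenAI, llamacpp, ...)
    agent: str,
    observation: str,
    memories: list[str],
) -> str | None:
    # 1. Ask the LM to rate how important the observation is.
    rating = llm(f"On a scale of 1-10, rate the importance of: {observation}\nRating:")

    # 2. Ask the LM what entity is being observed.
    entity = llm(f"What entity is being observed in: {observation}\nEntity:")

    # 3. Summarize the relationship between the agent and that entity
    #    from whatever memories are relevant.
    context = "\n".join(memories)
    relationship = llm(
        f"Given these memories:\n{context}\n"
        f"Summarize the relationship between {agent} and {entity}.\nSummary:"
    )

    # 4. Finally, ask for the agent's reaction. If any earlier step returned
    #    garbage, this step inherits it, which is why failures compound with
    #    a weaker local model.
    for step in (rating, entity, relationship):
        if not step.strip():
            return None
    return llm(
        f"{agent} observes: {observation}\n"
        f"Relationship: {relationship}\n"
        f"How does {agent} react?\nReaction:"
    )
```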
GPT-4 seems to Just Work, unfortunately.