
Generative Agents

This serves as yet another cobbled-together application of generative agents, utilizing LangChain as the core dependency and substituting a "proxy" for GPT4.

In short, by utilizing a language model to summarize, rank, and query stored information through natural-language queries/instructions, reasonably immersive agents can be attained.
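
For a concrete picture of what "summarize, rank, and query" means here, below is a minimal sketch of the recency/importance/relevance memory ranking popularized by the generative agents paper. The field names, weights, and helper functions are assumptions for the sake of the example, not this repo's implementation:

    # Illustrative memory ranking in the spirit of the generative agents paper
    # (recency + importance + relevance); field names, weights, and helpers are
    # assumptions for the example, not lifted from src/.
    import time

    def score_memory(memory: dict, query_embedding, similarity, decay: float = 0.995) -> float:
        hours_stale = (time.time() - memory["last_accessed"]) / 3600.0
        recency = decay ** hours_stale                    # fades for untouched memories
        importance = memory["importance"] / 10.0          # LLM-assigned 1-10 rating
        relevance = similarity(memory["embedding"], query_embedding)
        return recency + importance + relevance

    def retrieve(memories: list, query_embedding, similarity, k: int = 5) -> list:
        ranked = sorted(memories, key=lambda m: score_memory(m, query_embedding, similarity), reverse=True)
        return ranked[:k]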

Features

  • gradio web UI
  • saving and loading of agents
  • works with non-OpenAI LLMs and embeddings (tested with llamacpp)
  • modified prompts for use with Vicuna
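
As a rough sketch of what the llamacpp route looks like under the hood (these are LangChain class names from the versions this repo targets; the model path is a placeholder, and none of this is lifted from src/):

    # Sketch: a llamacpp-backed LLM and embeddings via LangChain.
    # The model path is a placeholder.
    from langchain.llms import LlamaCpp
    from langchain.embeddings import LlamaCppEmbeddings

    llm = LlamaCpp(model_path="./models/your-model.bin", n_ctx=2048)
    embeddings = LlamaCppEmbeddings(model_path="./models/your-model.bin")

    print(llm("Summarize in one sentence: the agent woke up, ate breakfast, and left for work."))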

Installation

pip install -r requirements.txt

Usage

Set your environment variables accordingly (an example configuration follows this list):

  • LLM_TYPE: (oai, llamacpp): the LLM backend to use in LangChain. OpenAI requires some additional environment variables:
    • OPENAI_API_BASE: base URL of your target OpenAI(-compatible) API
    • OPENAI_API_KEY: authentication key for OpenAI
    • OPENAI_API_MODEL: target model
  • LLM_MODEL: (./path/to/your/llama/model.bin): path to your GGML-formatted LLaMA model, if using llamacpp as the LLM backend
  • LLM_EMBEDDING_TYPE: (oai, llamacpp, hf): the embedding backend to use for computing similarity.
  • LLM_VECTORSTORE_TYPE: (chromadb): the vector store to use for "infinite" context.
  • LLM_PROMPT_TUNE: (oai, vicuna, supercot, cocktail): the prompt formatting to use, for variants finetuned on specific instruction formats (see the sketch at the end of this section).
  • LLM_CONTEXT: sets the maximum context size, in tokens
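
For example, a fully local llamacpp setup might look like the following (the variable names are the ones documented above; the values are illustrative):

    export LLM_TYPE=llamacpp
    export LLM_MODEL=./models/your-vicuna-13b.ggml.bin
    export LLM_EMBEDDING_TYPE=llamacpp
    export LLM_VECTORSTORE_TYPE=chromadb
    export LLM_PROMPT_TUNE=vicuna
    export LLM_CONTEXT=2048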

To run:

python .\src\main.py
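
To illustrate what LLM_PROMPT_TUNE actually changes, here is a hypothetical wrapper in the shape of a Vicuna-v1.1-style template; the actual template strings live in src/ and may well differ:

    # Hypothetical illustration of what a prompt tune does: wrap the same
    # instruction in the markers a given finetune expects. Not this repo's
    # actual templates.
    def tune_prompt(instruction: str, style: str = "vicuna") -> str:
        if style == "vicuna":
            return f"USER: {instruction}\nASSISTANT:"
        return instruction  # e.g. "oai" defers to the chat API's own role structure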

Plans

  • I do not plan on making this uber user-friendly like mrq/ai-voice-cloning, as this is just a stepping stone for a bigger project integrating generative agents.
  • I need to re-implement grabbing relevant context, as I've moved almost entirely away from the provided LangChain implementation (not to knock it; it's just not in my nature to piggyback off of it).
  • the """grand""" end goal of this is either to host it as an add-on server (more likely, given the... prohibitive VRAM requirements) or to have it live in a C++ library to be used in other programs.

Caveats

A local LLM isn't quite ready to be truly relied on yet:

  • It works, but it's a bit slow.
  • You're really, really dependent on how well your model variant performs.
    • You have to condition your prompts properly to get decent results.
    • Some flavors will handle the instruction-based approach used here better than others.
  • Model size is also a factor in how much memory it consumes.
    • for example, if this were used strictly from C++ in a game engine, you'd be competing for limited (V)RAM
      • however, I don't think mine consumes all that many resources.
  • ChromaDB seems to be strictly Python (the Node.js bindings still call into Python, to my knowledge), yet everyone seems to prefer it over FAISS as a vector store.

Utilizing GPT4 (or Anthropic's Claude) Just Works a little too nicely, even without conditioning the prompts for a "chat" model. But SaaS models are inherently limited by their cost per generation, and not everyone will have that luxury for [enter use case for this application of generative agents here].