Skip to content

AI chat llama

Llama.cpp based VoiceDock AI chat implementation. Provides gRPC API for AI chat based on llama.cpp project. Provides download of new model via API.

Features

  • Browse and download AI model (models in ggml format)
  • Query-based text generation through AI model
  • GPU support
  • Fast performance on cpu

Installation

Create directories for model data and configuration:

mkdir dataset
mkdir config
Create a config/aichatllama.json configuration file or download an example configuration:
curl -o config/aichatllama.json https://raw.githubusercontent.com/voicedock/aichatllama/main/config/aichatllama.json

docker run --rm \
    -v "$(pwd)/config:/data/config" \
    -v "$(pwd)/dataset:/data/dataset" \
    -p 9999:9999 \
    ghcr.io/voicedock/aichatllama:latest aichatllama
docker run --rm --gpus all \
    -v "$(pwd)/config:/data/config" \
    -v "$(pwd)/dataset:/data/dataset" \
    -e LLAMA_GPU_LAYERS=2 \
    -p 9999:9999 \
    ghcr.io/voicedock/aichatllama:gpu aichatllama

Configuration

See on Github.