# AI chat llama
Llama.cpp-based VoiceDock AI chat implementation. Provides a gRPC API for AI chat built on the llama.cpp project, and supports downloading new models via the API.
## Features
- Browse and download AI models (GGML format)
- Query-based text generation through an AI model
- GPU support
- Fast performance on CPU
## Installation
Create directories for model data and configuration:

```shell
mkdir dataset
mkdir config
```
Write your own `config/aichatllama.json` configuration file or download the example configuration:

```shell
curl -o config/aichatllama.json https://raw.githubusercontent.com/voicedock/aichatllama/main/config/aichatllama.json
```
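The configuration file describes which models the service can browse and download. The exact schema is not shown in this README, so the field names below are purely illustrative assumptions; consult the example file fetched above for the real structure.

```json
{
  "models": [
    {
      "name": "example-model",
      "url": "https://example.com/example-model.ggml"
    }
  ]
}
```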
Run the Docker container:

```shell
docker run --rm \
  -v "$(pwd)/config:/data/config" \
  -v "$(pwd)/dataset:/data/dataset" \
  -p 9999:9999 \
  ghcr.io/voicedock/aichatllama:latest aichatllama
```
To run with GPU acceleration, use the `gpu` image tag, pass your GPUs through to the container, and set the number of model layers to offload via `LLAMA_GPU_LAYERS`:

```shell
docker run --rm --gpus all \
  -v "$(pwd)/config:/data/config" \
  -v "$(pwd)/dataset:/data/dataset" \
  -e LLAMA_GPU_LAYERS=2 \
  -p 9999:9999 \
  ghcr.io/voicedock/aichatllama:gpu aichatllama
```
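Once the container is up, you can sanity-check the gRPC endpoint from the host. This sketch assumes you have `grpcurl` (a generic gRPC command-line client) installed and that the server exposes gRPC reflection; if reflection is disabled, you would need the service's `.proto` files instead.

```shell
# List services exposed on the gRPC port (assumes server reflection is enabled).
grpcurl -plaintext localhost:9999 list
```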