How To Run
- Install dependencies:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[dev]"
- Configure:
cp .env.example .env
The default DUCK_MAIN_MODEL_PATH points to ./models/Qwen3.6/nonMTP/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf.
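Before starting the services, it can help to confirm that the configured model file actually exists. The sketch below is only an illustration: it assumes `.env` uses plain `KEY=value` lines (optionally double-quoted), and the helper names `model_path_from_env` and `check_model_path` are hypothetical, not part of DuckLM.

```shell
# Print the value of DUCK_MAIN_MODEL_PATH from a dotenv-style file.
# Assumption: the file uses simple KEY=value lines, optionally double-quoted.
model_path_from_env() {
  # Use the last matching assignment, drop the key, strip surrounding quotes.
  grep -E '^DUCK_MAIN_MODEL_PATH=' "$1" | tail -n 1 | cut -d= -f2- \
    | sed -e 's/^"//' -e 's/"$//'
}

# Report whether the configured model file is present on disk.
check_model_path() {
  path="$(model_path_from_env "${1:-.env}")"
  if [ -n "$path" ] && [ -f "$path" ]; then
    echo "model found: $path"
  else
    echo "model missing: ${path:-<unset>}" >&2
    return 1
  fi
}
```

Running `check_model_path .env` from the repository root prints the resolved path, or returns a non-zero status if the file is absent.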
- Start DuckLM:
bash scripts/duck.sh start
This starts both processes:
- llama-server on http://127.0.0.1:8081/v1
- DuckLM API/WebChat on http://127.0.0.1:8000/
Useful process commands:
bash scripts/duck.sh status
bash scripts/duck.sh logs --follow
bash scripts/duck.sh restart
bash scripts/duck.sh stop
Use live probes when you need backend diagnostics, not just process status:
bash scripts/duck.sh status --probe
curl --noproxy '*' 'http://127.0.0.1:8000/v1/status?probe=true'
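In scripts, a short polling loop can wait for the API to come up before sending work. This is a minimal sketch, assuming the status endpoint answers with a 2xx code once DuckLM is reachable; it checks reachability only and does not inspect the response body. The function name `wait_for_url` is hypothetical.

```shell
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as URL answers with a 2xx status, 1 after ATTEMPTS failures.
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --noproxy keeps localhost traffic local.
    if curl --noproxy '*' -fs --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_url 'http://127.0.0.1:8000/v1/status?probe=true' 60 1 && echo ready` waits up to about a minute after `bash scripts/duck.sh start`.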
- Open WebChat:
http://127.0.0.1:8000/
Low-level llama-only commands are still available when needed:
bash scripts/llama/start_main.sh status
bash scripts/llama/start_main.sh logs --follow
MTP/speculative variant:
bash scripts/duck.sh stop
bash scripts/duck-mtp.sh start
bash scripts/duck-mtp.sh status
bash scripts/duck-mtp.sh logs --follow
duck-mtp.sh keeps DuckLM on http://127.0.0.1:8000/ and starts the MTP-backed
llama-server on the normal role endpoint http://127.0.0.1:8081/v1, so
config/models.yaml does not need to change.
- Send a task:
curl -X POST http://127.0.0.1:8000/v1/chat \
-H "Content-Type: application/json" \
-d '{"message":"Briefly say that you are DuckLM","workspace":"./workspace","debug":true}'
- Inspect events (replace <task_id> with the ID of your task):
curl http://127.0.0.1:8000/v1/tasks/<task_id>/events
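The two steps above can be chained in one script. The sketch below is a guess at the wiring, not a documented interface: it assumes the `/v1/chat` response is a JSON object with a top-level `task_id` field, and the helper names and the `DUCK_BASE_URL` variable are hypothetical. It uses `python3` for JSON handling so that messages containing quotes are escaped correctly.

```shell
# Build the JSON request body for /v1/chat from a message string.
build_chat_payload() {
  python3 -c 'import json, sys; print(json.dumps({"message": sys.argv[1], "workspace": "./workspace"}))' "$1"
}

# Read a JSON object on stdin and print its top-level "task_id".
# Assumption: the chat response exposes the task ID under this field name.
extract_task_id() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)["task_id"])'
}

# Send a message, then fetch the events for the resulting task.
send_and_show_events() {
  base="${DUCK_BASE_URL:-http://127.0.0.1:8000}"  # hypothetical override variable
  response="$(curl -fsS -X POST "$base/v1/chat" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_payload "$1")")" || return 1
  task_id="$(printf '%s' "$response" | extract_task_id)" || return 1
  curl -fsS "$base/v1/tasks/$task_id/events"
}
```

With the services running, `send_and_show_events "Briefly say that you are DuckLM"` would post the message and print the event stream for that task, provided the response shape matches the assumption above.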
- Approvals:
curl http://127.0.0.1:8000/v1/approvals/pending
- Stop services:
bash scripts/duck.sh stop
docker compose -f docker-compose.memory.yml down