divi-ai: AI Coding Assistant
Important
divi-ai is experimental. It runs on CPU with small local models, so answers may be inaccurate and knowledge is limited to what was indexed at build time. Always verify code against the official documentation.
divi-ai is a coding assistant for Divi that runs directly in your terminal. It answers questions, generates code examples, and explains APIs — all using a local LLM on your machine. No API keys required. After the first launch (which downloads the model), divi-ai works fully offline.
Installation
pip install "qoro-divi[ai]"
If installation fails due to llama-cpp-python, see
Troubleshooting below.
Choosing a Model
On first launch, an interactive selector lets you pick a model. Choose one before launching so you know what to expect:
| Key | Model | Download | Est. RAM | Context |
|---|---|---|---|---|
| 1.5b | Qwen 2.5 Coder 1.5B | 1.0 GB | ~1.2 GB | 8K |
| 3b | Qwen 2.5 Coder 3B | 1.9 GB | ~2.3 GB | 8K |
| 7b | Qwen 2.5 Coder 7B | 4.5 GB | ~5.4 GB | 16K |
| 14b | Qwen 2.5 Coder 14B | 8.4 GB | ~10.1 GB | 16K |
| e2b | Gemma 4 E2B | 2.9 GB | ~3.5 GB | 8K |
| e4b | Gemma 4 E4B | 4.6 GB | ~5.5 GB | 8K |
The Qwen Coder models are code-specialized and generally give better results for code generation. The Gemma models are general-purpose alternatives that work well for explanations and conceptual questions. Larger models produce better answers but need more RAM and run slower.
Hardware recommendations:
- Apple Silicon with 16+ GB RAM: 7b or 14b
- x86 with 32+ GB RAM: 7b or 14b
- x86 with 16+ GB RAM: e4b or 7b
- Less than 16 GB RAM: 1.5b, 3b, or e2b
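The hardware recommendations above can be written as a small lookup. This is only an illustrative sketch: `recommend_models` is a hypothetical helper, not part of divi-ai, and the caller supplies the RAM figure and the Apple Silicon flag.

```python
def recommend_models(ram_gb: float, apple_silicon: bool = False) -> list[str]:
    """Return the model keys suggested for a machine.

    Hypothetical helper that mirrors the hardware recommendations listed
    above; it is not part of divi-ai.
    """
    if ram_gb < 16:
        return ["1.5b", "3b", "e2b"]
    # Apple Silicon with 16+ GB and x86 with 32+ GB share a recommendation.
    if apple_silicon or ram_gb >= 32:
        return ["7b", "14b"]
    # Remaining case: x86 with at least 16 GB but less than 32 GB.
    return ["e4b", "7b"]


if __name__ == "__main__":
    print(recommend_models(8))
    print(recommend_models(16))
    print(recommend_models(16, apple_silicon=True))
    print(recommend_models(64))
```

Whichever key comes out, the Est. RAM column in the table is still worth checking against the memory that is actually free on the machine.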
First Launch
divi-ai
On the first run:
1. The interactive model selector opens (arrow keys to navigate, Enter to confirm).
2. The selected model is downloaded from HuggingFace (~1–9 GB depending on your choice). This requires an internet connection and may take a few minutes.
3. The model and search index are loaded into memory. This can take 30–60 seconds depending on your hardware.
4. The TUI opens and you can start asking questions.
Subsequent launches skip the download step and are much faster. Models are
cached locally (the exact location is platform-dependent, determined by
platformdirs). Delete model folders from the cache directory to free
disk space.
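Before deleting anything, it can help to see how much space each model folder takes. The sketch below uses only the standard library and assumes you already know the cache directory; the path is passed in by the caller because this page does not state it, and `folder_sizes_gb` is a hypothetical helper, not a divi-ai command.

```python
from pathlib import Path


def folder_sizes_gb(cache_dir: Path) -> dict[str, float]:
    """Map each immediate subfolder of cache_dir to its total size in GB.

    Hypothetical helper; cache_dir must be supplied by the caller.
    """
    sizes = {}
    for entry in sorted(cache_dir.iterdir()):
        if entry.is_dir():
            # Sum the size of every regular file below this subfolder.
            total = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
            sizes[entry.name] = total / 1e9
    return sizes


if __name__ == "__main__":
    import sys

    # Usage: python sizes.py /path/to/cache/dir
    if len(sys.argv) == 2:
        for name, gb in folder_sizes_gb(Path(sys.argv[1])).items():
            print(f"{name}: {gb:.2f} GB")
```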
Using the Chat Interface
Type a question and press Enter to get an answer. The header bar tracks how much of the model’s context window your conversation has used.
- Press Escape to cancel generation mid-stream.
- Use /reset before switching to a new topic to free context.
- Use /retry if the answer seems incomplete or off.
Slash Commands
| Command | Description |
|---|---|
| | Save the last code block to a file (relative to your working directory). Automatically runs a syntax check on the saved file. |
| | Copy the last code block to the clipboard. Requires |
| | Syntax-check all Python code blocks from the last response. |
| /retry | Re-run the query (including retrieval) to get a different response. |
| /reset | Clear conversation history and free context window space. |
| | Clear the screen and reset history. |
| | Exit the TUI. |
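To give a feel for what a syntax check over a response's Python code blocks involves, here is a minimal sketch built on the standard library's `ast` module. It is an assumption about how such a check could work, not divi-ai's actual implementation; `check_python_blocks` and the fence-matching pattern are hypothetical.

```python
import ast
import re
from typing import Optional

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable
BLOCK_RE = re.compile(FENCE + r"(?:python|py)\s*\n(.*?)" + FENCE, re.DOTALL)


def check_python_blocks(response: str) -> list[Optional[str]]:
    """Parse every fenced Python block in a response.

    Returns one entry per block: None if the block parses, otherwise a
    short description of the syntax error. Hypothetical helper.
    """
    results: list[Optional[str]] = []
    for code in BLOCK_RE.findall(response):
        try:
            ast.parse(code)  # parses only; the code is never executed
            results.append(None)
        except SyntaxError as exc:
            results.append(f"line {exc.lineno}: {exc.msg}")
    return results
```

A check like this catches only syntax errors. Code that parses can still call APIs that do not exist, so generated code should still be verified against the official documentation.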
CLI Options
divi-ai [OPTIONS]
- --reselect-model: Forget the saved model preference and re-prompt for selection.
- --top-k N: Number of documentation chunks retrieved per query (default: 8). Higher values give the model more context but use more of the context window. Lower values are faster but may miss relevant information.
- --max-tokens N: Maximum tokens the model can generate per response (default: 1024).
- --debug: Show index loading info and library messages.
- --dev: Developer mode: show retrieved chunks, FAISS scores, sources, and token generation speed after each response.
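The trade-off between --top-k, --max-tokens, and the context window can be illustrated with rough arithmetic. The sketch below makes several assumptions: an 8K window is taken to mean 8192 tokens, each retrieved chunk is given a made-up size of 300 tokens, and `remaining_context` is a hypothetical helper. divi-ai's real accounting may differ.

```python
def remaining_context(
    context_tokens: int,
    top_k: int = 8,
    max_tokens: int = 1024,
    chunk_tokens: int = 300,  # assumed average chunk size, purely illustrative
) -> int:
    """Estimate the tokens left for the question and chat history.

    Reserves space for the retrieved chunks and for the generated answer.
    Hypothetical helper, not part of divi-ai.
    """
    return context_tokens - top_k * chunk_tokens - max_tokens


if __name__ == "__main__":
    print(remaining_context(8192))            # defaults: top_k=8, max_tokens=1024
    print(remaining_context(8192, top_k=16))  # doubling top_k shrinks the remainder
```

Under these assumptions, doubling --top-k on an 8K model roughly halves the room left for conversation history, which is why long sessions benefit from /reset or a 16K model.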
Troubleshooting
- llama-cpp-python fails to install
On Windows this is the most common installation failure. llama-cpp-python has no prebuilt wheels on PyPI, so pip install always downloads the source and compiles it. On Linux and macOS this usually succeeds silently when a C++ toolchain is present; on Windows it frequently fails. Try these in order:

1. Install a prebuilt wheel from abetlen’s index (works on Windows, Linux, and macOS):

pip install "llama-cpp-python==0.3.19" \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu \
  --only-binary=:all:
pip install "qoro-divi[ai]"

The explicit ==0.3.19 is required: PyPI hosts no wheels for this package, so without a version pin pip downloads the latest source release and compiles it. --only-binary=:all: causes pip to report an error immediately if no wheel matches your Python version and architecture, instead of silently compiling from source.

2. If you must build from source, install a C++ toolchain:
- Linux: build-essential (Debian/Ubuntu) or equivalent.
- macOS: Xcode Command Line Tools (xcode-select --install).
- Windows: Visual Studio Build Tools with the Desktop development with C++ workload. Run pip install from an x64 Native Tools Command Prompt for VS so MSVC’s cl.exe is on PATH.
Note
Windows: Strawberry Perl on PATH. If a Windows source build fails and the build log shows C:/Strawberry/c/bin/gcc.exe or any path under C:\Strawberry, CMake is picking up the MinGW compiler bundled with Strawberry Perl instead of MSVC. The vendored llama.cpp sources do not build cleanly with that toolchain. Either remove C:\Strawberry\c\bin from PATH in the current Command Prompt window, or launch an x64 Native Tools Command Prompt for VS before running pip.
- “Context window exceeded” / answers cut off mid-sentence
The conversation has filled the model’s context window. Use /reset to clear history. If this happens frequently, switch to a model with a 16K context window (7b or 14b).
- Slow or unusable on my machine
Try 1.5b or e2b; they run acceptably on most hardware. If even those are too slow, divi-ai may not be practical on your system.
- Answers seem wrong or hallucinated
Try a larger model if your hardware allows it. See the important notice at the top of this page.
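For the Strawberry Perl case described in the note above, a quick way to see whether PATH is affected is to scan it for entries under a Strawberry directory. This is a small sketch, not something divi-ai ships: `strawberry_path_entries` is a hypothetical helper, and it simply looks for a folder named Strawberry in each entry.

```python
import os


def strawberry_path_entries(path_value: str, sep: str = ";") -> list[str]:
    """Return the PATH entries that sit under a Strawberry Perl directory.

    Hypothetical helper. sep defaults to ';', the Windows PATH separator.
    """
    hits = []
    for entry in path_value.split(sep):
        # Normalize slashes and case so C:/Strawberry and c:\strawberry both match.
        normalized = entry.strip().replace("/", "\\").lower()
        if "\\strawberry" in normalized:
            hits.append(entry)
    return hits


if __name__ == "__main__":
    for entry in strawberry_path_entries(os.environ.get("PATH", ""), os.pathsep):
        print("Strawberry entry on PATH:", entry)
```

If the script prints anything on Windows, remove those entries for the current session or use the x64 Native Tools Command Prompt before retrying the build.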
For Contributors
These commands are for Divi contributors rebuilding or evaluating the
search index. Install the AI dependencies first with uv sync --extra ai
to pull in only the AI stack, or uv sync --all-extras if you are also
working on docs, tests, or other areas that need the full development
environment.
python -m divi.ai help # Show commands and workflow overview
python -m divi.ai build # Rebuild the FAISS index from source
python -m divi.ai search # Interactive search against the index
python -m divi.ai inspect # Inspect assembled prompts (no LLM)
python -m divi.ai eval # Run eval queries, save results
python -m divi.ai compare # Compare two eval runs side-by-side
Typical development workflow:
1. Change source code or docs.
2. python -m divi.ai build to rebuild the index.
3. python -m divi.ai search or inspect to verify retrieval quality.
4. divi-ai to test end-to-end.
Note
If you run out of memory during build, reduce the batch size:
python -m divi.ai build --batch-size 4