Agentic Adventures - Side Quest LlamaLauncher Part 5
What TODO!
I have now created a TODO list for the agent to work through, which will make it easier for me to focus on what to add.
The current list looks like this:
# TODO
## Phase 1: Web viewer
[ x] Add ability to set --host HOST and --port PORT this should be below the api section. Default values should be 127.0.0.1 and 8080 respectively.
[ x] Update the central widget so the current UI is in a tab group called Model
[ x] Add another Tab group for Server
[ x] In the server tab group add a QWebEngineView that displays the server from above
## Phase 2 : Context Parameters
[ ] Need to set the context `-c` parameter; this will be the conversation context size. A tool tip should be added to the context combobox to explain what it does.
| Display name | Value passed to `--ctx-size` | Use case |
|---|---:|---|
| Auto (model default) | `0` | Recommended default; uses GGUF model context |
| 2K | `2048` | Very small models / low memory |
| 4K | `4096` | Basic chat, small coding tasks |
| 8K | `8192` | General purpose |
| 16K | `16384` | Better coding/chat history |
| 32K | `32768` | Large files, coding assistants |
| 64K | `65536` | Long documents, repo context |
| 128K | `131072` | Modern long-context models |
[ ] Need to add the other most common parameters used with llama.cpp, including tool tips.
| Parameter | Purpose | Typical Value |
|---|---|---|
| `--temp` | Temperature; randomness of token selection | `0.1–0.4` |
| `--top-k` | Restrict to K highest probability tokens | `20–50` |
| `--top-p` | Nucleus sampling probability cutoff | `0.8–0.95` |
| `--min-p` | Remove very unlikely tokens | `0.05–0.1` |
| `--typical-p` | Select tokens near the “typical” probability distribution | `0.9–1.0` |
| `--repeat-penalty` | Penalise repeated tokens | `1.05–1.15` |
| `--repeat-last-n` | How many previous tokens to check for repetition | `64–256` |
| `--presence-penalty` | Penalise tokens that already appeared | `0–0.5` |
| `--frequency-penalty` | Penalise frequent tokens | `0–0.5` |
| `--mirostat` | Adaptive sampling algorithm | Usually off |
| `--mirostat-lr` | Mirostat learning rate | `0.1` |
| `--mirostat-ent` | Target entropy for Mirostat | `5–7` |
## Phase 3 : Testing
[ ] No tests yet; we need to add them with pytest
## Phase 4 : Configuration save and load
[ ] add the ability to save configuration to a json file
[ ] add the ability to load configuration from a json file
[ ] Use QSettings to save/load the last setup
## Phase 5 : exit management
[ ] If the server is running, ask to quit and stop the server before doing so.
## Phase 6 : exe location
[ ] Add the ability to set the executable location
[ ] Save the executable location in the configuration
## Phase 7 : Optional CLI support
[ ] Add optional CLI support
[ ] Create own terminal for the cli in app
Phase 2 Context Parameters
The first thing I am going to add is the `-c` context parameter. To test the model further, I will manually add the combo box in the position I want, then ask the agent to do the rest. For the first prompt I used:
[@TODO.md (10:32)](file:///Volumes/teaching/Code/LLamaLauncher/TODO.md#L10:32) [@llama_launch.ui](file:///Volumes/teaching/Code/LLamaLauncher/ui/llama_launch.ui) [@main.py](file:///Volumes/teaching/Code/LLamaLauncher/main.py) Add the items in the TODO to the model_contex_size combobox with tool tips for each entry. This should then add the --ctx-size flag to the launch with the value from the 2nd column in the table.
On the first run of this I ran out of context, so let's try again with different parameters.
llama-server --api-key 12345 \
-m ~/teaching/AI_Models/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf \
-ngl 99 -np 1 --flash-attn on -fit off --no-mmap --jinja --ctx-size 65536 \
--temp 0.6 --min-p 0.0 --top-p 0.80 --top-k 20 --repeat-penalty 1.05
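A launcher such as LlamaLauncher ultimately has to assemble this kind of invocation as an argument list. The sketch below is only an illustration of that idea, not the project's actual code: `build_server_command` and its parameter names are hypothetical, and it covers only the key, model, context size and sampling flags from the command above (the remaining flags are left out for brevity).

```python
import os
import shlex


def build_server_command(model_path, ctx_size, sampling, api_key=None):
    """Assemble a llama-server argument list (hypothetical helper)."""
    cmd = ["llama-server"]
    if api_key is not None:
        cmd += ["--api-key", api_key]
    # Expand "~" here: no shell does it for us when a list is passed to a process API.
    cmd += ["-m", os.path.expanduser(model_path)]
    cmd += ["--ctx-size", str(ctx_size)]
    # Each sampling option becomes a "--flag value" pair.
    for flag, value in sampling.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


command = build_server_command(
    "~/teaching/AI_Models/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf",
    ctx_size=65536,
    sampling={
        "temp": 0.6,
        "min-p": 0.0,
        "top-p": 0.80,
        "top-k": 20,
        "repeat-penalty": 1.05,
    },
    api_key="12345",
)

# Render the list as a single shell-safe string for logging or display.
print(shlex.join(command))
```

Keeping the command as a list rather than one long string avoids quoting problems, such as a missing space before a line-continuation backslash gluing two arguments together.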
This is not looking good: the agent keeps getting stuck in a loop (see AgentChat18.md), repeating the same plan over and over.
Let me implement both changes now.
I need to modify `main.py` to:
1. Add a class-level constant for context size options with display names, values, and tooltips
2. Add a `_populate_context_size_combobox()` method that adds items to the combobox with their values as userData and tooltips
3. Call it in `_setup_ui()` after loading the UI
4. Modify `_launch_model()` to read the selected value using `currentData()` and add `--ctx-size` to the command
Let me implement both changes now.
I need to modify `main.py` to:
1. Add a class-level constant for context size options with display names, values, and tooltips
2. Add a `_populate_context_size_combobox()` method that adds items to the combobox with their values as userData and tooltips
3. Call it in `_setup_ui()` after loading the UI
4. Modify `_launch_model()` to read the selected value using `currentData()` and add `--ctx-size` to the command
Let's compact the conversation and try again.
[@Populate Context Size Combobox with Tooltips](zed:///agent/thread/bfd78bfa-7ff3-4150-aaee-b0cd22e070ce?name=Populate+Context+Size+Combobox+with+Tooltips)
please continue you got stuck in a loop
As you can see from AgentChat19.md and AgentChat20.md, I eventually got things to work, and we now have the combo box with the context sizes.
Interestingly, the agent decided to make the `-c 0` entry the default and not actually send that parameter if nothing was selected. I finally made the default context size 16K, as you can see here.
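The behaviour described above can be sketched in plain Python. This is a minimal illustration rather than the code the agent produced: `CONTEXT_SIZES`, `DEFAULT_CONTEXT_LABEL`, `context_size_args` and `default_index` are hypothetical names, and the widget is left out so the sketch runs without Qt. In the real UI the value would presumably be stored with `QComboBox.addItem(label, value)` and read back with `currentData()`, as in the agent's plan.

```python
# (display name, value for --ctx-size, tooltip), taken from the TODO table.
CONTEXT_SIZES = [
    ("Auto (model default)", 0, "Recommended default; uses GGUF model context"),
    ("2K", 2048, "Very small models / low memory"),
    ("4K", 4096, "Basic chat, small coding tasks"),
    ("8K", 8192, "General purpose"),
    ("16K", 16384, "Better coding/chat history"),
    ("32K", 32768, "Large files, coding assistants"),
    ("64K", 65536, "Long documents, repo context"),
    ("128K", 131072, "Modern long-context models"),
]

# Assumption: the entry preselected when the launcher starts.
DEFAULT_CONTEXT_LABEL = "16K"


def default_index():
    """Return the position of the default entry in CONTEXT_SIZES."""
    for index, (label, _value, _tooltip) in enumerate(CONTEXT_SIZES):
        if label == DEFAULT_CONTEXT_LABEL:
            return index
    return 0  # Fall back to "Auto" if the label is ever renamed.


def context_size_args(value):
    """Return the command-line arguments for a selected context size.

    A value of 0 is the "Auto" entry: the flag is skipped entirely,
    mirroring what the agent's version did.
    """
    if not value:
        return []
    return ["--ctx-size", str(value)]


selected_value = CONTEXT_SIZES[default_index()][1]
print(context_size_args(selected_value))
```

Keeping the table in one constant means the display name, the value passed to `--ctx-size` and the tooltip stay together, so adding a size later is a one-line change.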
I need to do some more tidying up, but first I am going to address the issue of context size by looking at some more tools I can add to the mix.