Agentic Adventures - Side Quest LlamaLauncher Part 5

Jun 12, 2026 5 min read

What TODO!

I have now created a TODO list for the agent to work through. This will make it easier for me to focus on what to add.

The current list looks like this.

# TODO

## Phase 1: Web viewer

[ x] Add ability to set --host HOST  and --port PORT this should be below the api section. Default values should be 127.0.0.1 and 8080 respectively.
[ x] Update the central widget so the current UI is in a tab group called Model
[ x] Add another Tab group for Server
[ x] In the server tab group add a QWebEngineView that displays the server from above

## Phase 2 : Context Parameters

[ ] Need to set the context -c parameter; this will be the conversation context size. A tool tip should be added to the context combobox to explain what it does.
| Display name | Value passed to `--ctx-size` | Use case |
|---|---:|---|
| Auto (model default) | `0` | Recommended default; uses GGUF model context |
| 2K | `2048` | Very small models / low memory |
| 4K | `4096` | Basic chat, small coding tasks |
| 8K | `8192` | General purpose |
| 16K | `16384` | Better coding/chat history |
| 32K | `32768` | Large files, coding assistants |
| 64K | `65536` | Long documents, repo context |
| 128K | `131072` | Modern long-context models |

[ ] Need to add the other most common parameters used with llama.cpp, including tool tips.
| Parameter | Purpose | Typical Value |
|---|---|---|
| `--temp` | Temperature; randomness of token selection | `0.1–0.4` |
| `--top-k` | Restrict to K highest probability tokens | `20–50` |
| `--top-p` | Nucleus sampling probability cutoff | `0.8–0.95` |
| `--min-p` | Remove very unlikely tokens | `0.05–0.1` |
| `--typical-p` | Select tokens near the “typical” probability distribution | `0.9–1.0` |
| `--repeat-penalty` | Penalise repeated tokens | `1.05–1.15` |
| `--repeat-last-n` | How many previous tokens to check for repetition | `64–256` |
| `--presence-penalty` | Penalise tokens that already appeared | `0–0.5` |
| `--frequency-penalty` | Penalise frequent tokens | `0–0.5` |
| `--mirostat` | Adaptive sampling algorithm | Usually off |
| `--mirostat-lr` | Mirostat learning rate | `0.1` |
| `--mirostat-ent` | Target entropy for Mirostat | `5–7` |

## Phase 3 : Testing

[ ] No tests yet; we need to add them with pytest

## Phase 4 : Configuration save and load

[ ] add the ability to save configuration to a json file
[ ] add the ability to load configuration from a json file
[ ] Use QSettings to save/load the last setup

## Phase 5 : exit management

[ ] If the server is running, ask to quit and stop the server before doing so.

## Phase 6 : exe location

[ ] Add the ability to set the executable location
[ ] Save the executable location in the configuration

## Phase 7 : Optional CLI support 
[ ] Add optional CLI support 
[ ] Create own terminal for the cli in app

Phase 2 Context Parameters

The first thing I am going to add is the -c context parameter. To test the model a bit more, I will manually add the combo box in the position I want and then ask the agent to do the rest. For the first prompt I used the following.

[@TODO.md (10:32)](file:///Volumes/teaching/Code/LLamaLauncher/TODO.md#L10:32) [@llama_launch.ui](file:///Volumes/teaching/Code/LLamaLauncher/ui/llama_launch.ui) [@main.py](file:///Volumes/teaching/Code/LLamaLauncher/main.py) Add the items in the TODO to the model_contex_size combobox with tool tips for each entry. This should then add the --ctx-size flag to the launch with the value from the 2nd column in the table.

On the first run of this I ran out of context. Let's try again with different params.

llama-server --api-key 12345\
    -m ~/teaching/AI_Models/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf \
     -ngl 99 -np 1 --flash-attn on  -fit off  --no-mmap  --jinja --ctx-size 65536 \
    --temp 0.6 --min-p 0.0 --top-p 0.80 --top-k 20 --repeat-penalty 1.05

This is not looking good: it keeps getting stuck in a loop (see AgentChat18.md).

Let me implement both changes now.

I need to modify `main.py` to:

1. Add a class-level constant for context size options with display names, values, and tooltips
2. Add a `_populate_context_size_combobox()` method that adds items to the combobox with their values as userData and tooltips
3. Call it in `_setup_ui()` after loading the UI
4. Modify `_launch_model()` to read the selected value using `currentData()` and add `--ctx-size` to the command

Let me implement both changes now.

I need to modify `main.py` to:

1. Add a class-level constant for context size options with display names, values, and tooltips
2. Add a `_populate_context_size_combobox()` method that adds items to the combobox with their values as userData and tooltips
3. Call it in `_setup_ui()` after loading the UI
4. Modify `_launch_model()` to read the selected value using `currentData()` and add `--ctx-size` to the command

Let's compact the context and try again.

[@Populate Context Size Combobox with Tooltips](zed:///agent/thread/bfd78bfa-7ff3-4150-aaee-b0cd22e070ce?name=Populate+Context+Size+Combobox+with+Tooltips)
please continue you got stuck in a loop

As you can see from AgentChat19.md and AgentChat20.md, I eventually got things to work, and we now have the combo box with the context sizes.
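For reference, the four-step plan the agent kept repeating can be sketched roughly as below. This is not the code from main.py: the names `CONTEXT_SIZE_OPTIONS`, `populate_context_size_combobox` and `context_size_args` are hypothetical, and the functions are written against any object that offers the `QComboBox` methods `count()`, `addItem()`, `setItemData()` and `currentData()`, so the sketch runs without Qt installed. Skipping the flag when nothing is selected is my assumption about how the launcher should behave.

```python
# Hypothetical sketch of the agent's plan; names are illustrative only.
# (display name, value passed to --ctx-size, tooltip) taken from the TODO table.
CONTEXT_SIZE_OPTIONS = [
    ("Auto (model default)", 0, "Recommended default; uses GGUF model context"),
    ("2K", 2048, "Very small models / low memory"),
    ("4K", 4096, "Basic chat, small coding tasks"),
    ("8K", 8192, "General purpose"),
    ("16K", 16384, "Better coding/chat history"),
    ("32K", 32768, "Large files, coding assistants"),
    ("64K", 65536, "Long documents, repo context"),
    ("128K", 131072, "Modern long-context models"),
]


def populate_context_size_combobox(combo, tooltip_role=3):
    """Add every option to the combo box with its value and tooltip.

    `combo` is expected to behave like a QComboBox. `tooltip_role`
    defaults to 3, the numeric value of Qt.ToolTipRole; in a real Qt
    app you would pass Qt.ToolTipRole itself.
    """
    for label, value, tip in CONTEXT_SIZE_OPTIONS:
        index = combo.count()          # index the new item will receive
        combo.addItem(label, value)    # value is stored as the item's userData
        combo.setItemData(index, tip, tooltip_role)


def context_size_args(combo):
    """Return the extra command-line arguments for the selected size."""
    value = combo.currentData()
    if value is None:                  # assumption: no selection, no flag
        return []
    return ["--ctx-size", str(value)]
```

In the launcher, the list returned by `context_size_args()` would simply be appended to the llama-server argument list before the process is started.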

Interestingly, it decided to make the -c 0 flag the default and to not actually send that parameter if nothing was selected. In the end I made the default context size 16K, as you can see here.

I need to do some more tidying up, but first I am going to address the issue of context size by looking at some more tools I can add to the mix.
