Agentic Adventures -Side Quest LlamaLauncher Part 3

Jun 12, 2026 5 min read

LLamaLauncher updates

In the last session I manually setup the tool to work with my ui_loader and adjusted the overall layout / ui to work.

I’m now going to start adding new elements to the ui and start to work on getting the system to launch. I will treat each new feature as new chat. I have started the server with the same setting as the previous post.

Add the -mmproj option

I want to add an ability to add an optional –mmproj to enable multimodal input this flag also has an optional –no-mmproj-offload argument as well.

The logic around this needs to be that if no model has been chosen the flag and argument are not used. I will attempt to add this feature with the following prompt.

[@main.py](file:///Volumes/teaching/Code/LLamaLauncher/main.py) [@llama_launch.ui](file:///Volumes/teaching/Code/LLamaLauncher/ui/llama_launch.ui) in the model_selector_layout section of the ui add a  new field called "Multi Modal Input" It should have a text field for the model similar to the model_path_edit and a button to choose the file as a .gguf with the file dialog. There is also a tick box for the flag --no-mmproj-offload When clicking the launch button if no model is selected then nothing is passed, if there is a model you should pass --mmproj ModelName.gguf and if enabled --no-mmproj-offload

Let see how it does.

That worked really well, I’m going to get it to move the checkbox and tidy up the ui a little.

move the no_mmproj_offload_checkbox to be on the same line as the rest of the Multi mnodal input

That’s better I can use designer later to tidy up a little more. See AgentChat10.md

Getting ready for launch

There are a number of way we can run an app from python, at it’s simplest level we could use system but that is terrible. Python has the subprocess module which is ideal and we can capture the output to put into the model output panel.

Another approach would be to use QProcess as this integrates into Qt far better.

Lets ask the agent what they suggest, so far I have been using the “write” mode in the model but we can change this in zed.

What approach would you use to the launch the llama.cpp application and capture the output to put it into the output_display?

Oh Wow, Im impressed. AgentChat11.md

Recommended Approach: QProcess

Since you’re already using PySide6, the natural and most robust choice is Qt’s built-in QProcess. Here’s why it beats subprocess.Popen for this case:

Concern	`QProcess`	`subprocess.Popen` + polling
Thread safety	Signals/slots auto-marshall to main thread	Must use `QThread` or `QTimer` to avoid blocking GUI
Non-blocking	Event-loop driven, no busy-waiting	Requires manual async reading loop
Cleanup	Automatic on app close	Need explicit `kill()`/`terminate()` handling
Stderr capture	Separate `readyReadStandardError` signal	Need separate threads or `select()`/`asyncio`

It has also suggested some code to use (as well as a subprocess version). I will switch to write mode and tell it to implement the QProcess version.

Well it produced some code but I get the following error.

Launching: /Volumes/teaching/Code/LLamaLauncher/.venv/bin/python3 -m llama_cpp.server --model Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf --temp 0.3 --top-p 0.9 --top-k 40.0
---

SyntaxError: Non-UTF-8 code starting with '\xcf' in file /Volumes/teaching/Code/LLamaLauncher/.venv/bin/python3 on line 1, but no encoding declared; see https://peps.python.org/pep-0263/ for details


--- Process exited with code 1 ---

I should perhaps have been more specific about the llamer-server installation (will need to set this up better). It has the following

cmd = [
            sys.executable,
            "-m",

So it thinks this is a python tool. I have prompted it to update things and you can see the whole discussion in AgentChat11.md I am running out of context now so will start a new chat.

When loading the model it is only using the short name and not the full path, can we display the short name in the textedit but add as a custom property the full path which will be used when running the model.

This is now fixed and we have a working launcher.

AI2.mov

I can now use this to launch models and experiment ( dogfooding at it’s finest :-) ). AgentChat12.md shows the rest of the process.

Stopping the process.

I think I need to update the logic so the launch button becomes a stop button and stops the process. I am going to go into ask mode again to see if we can stop the process with a message rather than killing it.

When no model is selected the launch button should be deactivated, once a model is selected launch should be made actve. When the launcvh is clicked the name of the button should change to Stop and when pressed again it should stop the server, how would you implement this? Is there a way to stop the process llama-server without killing the process (in the terminal CTRL + C) works

That worked well now have the ability to stop the server (or kill if taking too long). See AgentChat13.md

API Key

Just noticed I forgot the API key command line option so will not work with zed as yet. Let’s add it.

below the --mproj setting add a new flag option called --api-key with a default value of 12345 in the line edit. The line edit should be editable.

That worked well but it used the label –api-key, lets launch a model in the app and get it to change it!

That worked. See AgentChat14.md

What next.

This is working well, I need to make a plan for extra features. I think I will create a TODO list and get the agent to work through it.