I've been exploring the AI ecosystem and trying tools like pi and opencode. I encountered a bug and, instead of relying on old habits, took a new approach.
The problem
There are a lot of things to like about pi; top of the list is the ability to use a wide array of models from various providers. It's easy to enable a provider like opencode: simply add the OPENCODE_API_KEY to the environment, and pi will recognize the variable and add that provider's models to the list shown by the /model command.
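For example (the key value below is a placeholder, not a real key):

```shell
# Hypothetical placeholder; substitute your actual OpenCode key.
# With this set, pi picks up the provider automatically.
export OPENCODE_API_KEY="oc-example-key"
```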
Most of the models I've tried have worked flawlessly, but one did fail: glm-5, a model from Zhipu AI (Z.ai) that was released earlier this month. The model would load normally, and I was able to send regular text messages back and forth, but it broke when it tried to call tools.
The habit
Normally, if I had a problem like this, where a model worked perfectly fine in the official desktop app but not in some independent third-party tool, I would just use the official tool and move on until a fix was eventually pushed.
The other thing I really like about pi is the ability to customize and configure it exactly as I like. I've already created a lot of project-specific commands that have become very useful during development; I'd lose those commands and didn't really want to figure out how to re-implement them in the opencode app.
Something new
Instead of taking the path of least resistance, I decided to see how well I could use the tool to fix itself. I fired up pi with a model that worked (sonnet 4.6) and asked it to help me debug the issue. Sonnet did a great job reasoning about how the API calls were being made, and we went through several rounds of troubleshooting.
We (I'm using the term "we" loosely here) tried to create extensions that called the opencode APIs with specific parameters turned on or off. It was fascinating to watch the agent walk through the installed js code and methodically eliminate the potential root causes. At one point it even created a proxy server to intercept the API calls directly to verify exactly what was being sent.
This is not something I'm used to: normally, if I'm using a product from one provider, I wouldn't expect it to "know" anything about another provider's products. I wouldn't expect a Maytag user manual to tell me anything useful about fixing a Samsung washing machine, for example.
LLMs obviously don't follow that pattern. Given the corpus of data they've been trained on, you can point any one of them at a text-based problem, and it really doesn't matter whether they've been trained specifically on the problem at hand. It's all text that gets loaded into the context, and they are able to reason their way through the issue.
The fix
Here's what the agent reported after we'd resolved the issue:
Root cause: The OpenCode Zen proxy streams GLM-5 tool call chunks with a unique id on every chunk.
The OpenAI streaming spec expects id only on the first chunk of a tool call, with subsequent chunks having no id.
Pi's stream parser at openai-completions.js:199 used id to detect new tool calls, so every chunk was treated as a separate tool call — resulting in 45 "tool calls" with fragmented arguments like {}, "command", "git", etc.
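To make the root cause concrete: the OpenAI streaming format tags each tool-call delta with an index, and only the first chunk of a call carries the id and function name. A parser keyed on index stays correct even when a proxy (incorrectly) stamps a fresh id on every chunk. This is a simplified sketch of that idea, not the actual patch applied to openai-completions.js.

```javascript
// Accumulate streamed tool-call deltas keyed by `index` (per the OpenAI
// streaming format) rather than by `id`, so argument fragments from the
// same call are glued together even if every chunk carries a unique id.
function accumulateToolCalls(chunks) {
  const calls = new Map(); // index -> { id, name, args }
  for (const chunk of chunks) {
    for (const delta of chunk.choices[0]?.delta?.tool_calls ?? []) {
      let call = calls.get(delta.index);
      if (!call) {
        // First chunk for this index: remember the id it arrived with.
        call = { id: delta.id, name: "", args: "" };
        calls.set(delta.index, call);
      }
      if (delta.function?.name) call.name += delta.function.name;
      if (delta.function?.arguments) call.args += delta.function.arguments;
    }
  }
  return [...calls.values()];
}

// Simulated GLM-style chunks: a fresh id on every delta, same index.
const chunks = [
  { choices: [{ delta: { tool_calls: [
    { index: 0, id: "a1", function: { name: "bash", arguments: '{"command":' } },
  ] } }] },
  { choices: [{ delta: { tool_calls: [
    { index: 0, id: "a2", function: { arguments: '"git status"}' } },
  ] } }] },
];
console.log(accumulateToolCalls(chunks));
// One reassembled call instead of two fragments.
```

An id-keyed parser would have produced two separate "calls" here, with arguments split into useless fragments, which matches the 45 fragmented tool calls described above.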
I can say with 100% certainty I would never have spent the time to dig through the pi source, compare it to the opencode source, and trace the problem down until it was resolved, especially not in a language I'm not very familiar with (js) and in an app I've only just started using.
Lessons learned
Lesson 1: open source software is great
None of this would have been possible if I didn't have access to the source code for pi and opencode. Being able to patch the pi code directly to apply the fix was a rewarding experience, and the working implementation in opencode allowed the agent to narrow down exactly what was going wrong in pi. I've been using open source software for a very long time, and I can't recall ever making a change like this.
Lesson 2: context is king
I started the process by asking the agent to help debug pi, and it had no idea what I was talking about. Just pointing the agent at the docs made its troubleshooting much more targeted. After each troubleshooting session I had the agent write its findings to a file, so every new session could pick up where we left off, and it became much faster to narrow down the potential cause.
Lesson 3: agility is important
A lot of this also may not have been possible with a much more complicated app. I'm not sure we would have reached a resolution if we had been trying to debug the opencode desktop app instead. pi being a lightweight TUI meant the agent also didn't have to spend much context trying to understand the application.
The results
In case it's useful to anyone else who wants to use glm-5 with pi until an official fix is released, here are the results of our investigation. Patch file with instructions: https://gist.github.com/epequeno/c1219fb15f18aa7c5c0f49b2f0fe8957