> youcanbuildthings.com
reference from: The opencode Stack

OpenCode Not Working? The 5 Failure Modes

by J Cook · 7 min read

Summary:

  1. A reference for when OpenCode is not working: the 5 documented failure modes.
  2. Each gets a detection signal, root cause, one-line fix, and a prevention lever.
  3. You will leave able to triage a broken session in three minutes instead of spending six hours on Reddit.
  4. Bonus: the verified zod-boundary fix and a copy-paste weekly health-check script.

OpenCode not working is almost never a rotted setup. It is one of five known failure modes, each documented in r/opencodeCLI threads or GitHub issues with a one-line fix. Two posts hit the subreddit the same week: one asking “is opencode getting more broken every day?”, the other a five-star rave about the exact same product. Same product, opposite verdicts. The gap was entirely which bug bit which user, and whether they knew the fix.

[Figure: a symptom-to-fix troubleshooting matrix for the five OpenCode failure modes, each with a cause, fix, and prevention, gated by a weekly health-check.sh.]

What are the five failure modes at a glance?

Start here, then jump to the section that matches your symptom:

| # | Symptom | Cause | Fix | Prevention |
|---|---------|-------|-----|------------|
| 1 | Output quality regressed | Go-tier quantization | Switch model: Kimi K2.6 not K2.5 | Cost-dashboard quality-vs-cost column |
| 2 | /compact times out (120s) | Large context | OmniRoute PR #1363 / shorter sessions | New session every ~50K tokens |
| 3 | Quality drops mid-session | Context window full | Restart, hand off summary | Checkpoint at the 30% mark |
| 4 | Plugin throws “not a Zod schema” | Dual zod instance | peerDependencies / --external zod | Read the plugin’s package.json |
| 5 | Install fails / Windows TLS errors / Chinese output | Install-time edge case | Issue #22055 / WSL2 / LANG=en_US.UTF-8 | Run health-check.sh weekly |

Failure mode 1: output quality regressed

Detection. A fixed benchmark prompt, saved from a clean session against a known repo state, starts producing worse diffs (more thrash, test failures that did not happen last week) with no config change.
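One way to make "worse diffs" measurable is to compare churn between a saved baseline run and the current run. The sketch below assumes each benchmark run was saved as a unified diff (for example with git diff). The helper names and the 2x threshold are illustrative assumptions, not part of OpenCode.

```shell
#!/usr/bin/env bash
# diff_churn FILE: count added plus removed lines in a unified diff.
# Hypothetical helper; assumes the run was saved as a unified diff file.
# Crude: a removed line whose own text starts with "--" is skipped as a header.
diff_churn() {
  # Keep lines starting with + or -, then drop the +++/--- file headers.
  grep -E '^[+-]' "$1" | grep -Ev '^(\+\+\+|---)' | wc -l | tr -d ' '
}

# churn_regressed BASELINE CURRENT: succeed when the current run's churn is
# more than double the baseline. The 2x factor is an arbitrary illustration.
churn_regressed() {
  local base cur
  base=$(diff_churn "$1")
  cur=$(diff_churn "$2")
  [ "$cur" -gt $((base * 2)) ]
}
```

Churn is only a proxy. A run can thrash without growing the diff, so treat a trip of this check as a prompt to read the session, not as a verdict.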

Cause. The OpenCode Go bundled model variant can differ from the upstream provider’s direct API (quantization level, deployment hardware, sometimes a slightly older release). For typical engineering tasks the difference is below the noise floor; for adversarial prompts and research-grade text it is measurable.

Fix. For the specific tasks that need the original variant, switch the model slot (set Kimi K2.6 instead of K2.5), or move just that task to OpenCode Zen, the per-request tier that routes to upstream APIs directly. For 95% of work, stay on Go.

Prevention. Keep the cost dashboard running. Quality shifts often show up alongside cost shifts when a Go-tier model gets re-bundled. The quality-vs-cost column is the early warning.

Failure mode 2: /compact times out

Detection. A long active session (over an hour, more than 30 turns) hits a 500 error on /compact after a 120-second timeout, or hangs until you kill it.

Cause. The community conflates two things, and both need correcting. First, the verbatim “120s timeout” PR is diegosouzapw/OmniRoute PR #1363, in a third-party LLM gateway, not anomalyco/opencode. If you route OpenCode through OmniRoute you inherit the timeout; if you connect directly to the provider, you do not. Second, that PR was closed, not merged, so even OmniRoute users do not get the fix upstream. OpenCode core’s own /compact works fine at normal session lengths.

Fix. Restart the session and start fresh. At the depth where compaction is itself a heavy model call near a timeout, compaction was not going to save the session anyway.

Prevention. Aggressive context discipline. Skills cut raw-file dumps so context fills slower. Start a new session every ~50K tokens rather than waiting for the cliff.

Failure mode 3: quality drops mid-session

Detection. The model repeats itself, re-calls a tool it ran two turns ago, ignores a file it already read, or its plans get less coherent. The activity feed shows redundant tool calls.

Cause. Model attention degrades as the context fills with abandoned plans, half-finished sub-tasks, and tool noise. Past about 30% of the window, signal-to-noise drops and so does output.

Fix. End the session, start a fresh one, hand it a tight one-paragraph summary (must-keep state, next 2-3 sub-tasks). Fresh context beats the bloated session every time. If the drop is on the fixer’s edits specifically, hot-swap the fixer model (Qwen 3.5 Plus to DeepSeek V4 Flash) before touching the orchestrator.
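The handoff can be kept honest by generating it from a fixed shape: one line of must-keep state, then the next few sub-tasks. This is a minimal sketch. The write_handoff name and the file layout are assumptions for illustration, not an OpenCode format.

```shell
#!/usr/bin/env bash
# write_handoff FILE STATE TASK...: write a short handoff note for a fresh session.
# Hypothetical helper; the layout is an illustration only.
write_handoff() {
  local file=$1 state=$2
  shift 2
  {
    printf 'Must-keep state: %s\n' "$state"
    printf 'Next sub-tasks:\n'
    local i=1 task
    for task in "$@"; do
      printf '  %d. %s\n' "$i" "$task"
      i=$((i + 1))
    done
  } > "$file"
}
```

Usage might look like write_handoff handoff.md "auth refactor, tests green" "migrate session store" "delete legacy middleware", after which you paste the file into the new session's first message.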

Prevention. Skills plus the cost dashboard’s input-token-share signal. The disciplined pattern: restart every 30-45 minutes of active work, “done” or not. Checkpoint at the 30% mark.
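The two restart thresholds used in this article (30% of the context window, or roughly 50K tokens) reduce to a small arithmetic check. The sketch below assumes you can read the session's used-token count and the model's window size from somewhere in your setup. How you obtain them is left to the caller, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# context_percent USED WINDOW: integer percentage of the context window in use.
context_percent() {
  echo $(( $1 * 100 / $2 ))
}

# needs_restart USED WINDOW: succeed once usage reaches 30% of the window
# or 50000 tokens, whichever comes first. Token counts come from the caller.
needs_restart() {
  local used=$1 window=$2
  [ "$(context_percent "$used" "$window")" -ge 30 ] || [ "$used" -ge 50000 ]
}
```

For example, 60000 used tokens in a 200000-token window is 30%, so needs_restart 60000 200000 succeeds, while needs_restart 20000 200000 does not.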

Failure mode 4: plugin throws “not a Zod schema”

Detection. Your plugin throws "schema is not a Zod schema" at runtime, or an instanceof check on a zod schema returns false even though both sides obviously use zod.

Cause. Both OpenCode and your plugin bundle their own copy of zod v4. zod v4 uses an instance-identity check (schema._zod.def); a schema produced by the plugin’s copy fails the host’s check across the plugin/host boundary.

Fix. This is the verified fix from oh-my-opencode-slim PR #344. Move zod to a peer dependency:

{
  "peerDependencies": {
    "zod": "^4.0.0"
  }
}

And mark zod external in the build so your bundle does not include it (the host provides it at runtime):

bun build src/index.ts --outdir dist --target node --format esm \
  --external @opencode-ai/plugin --external @opencode-ai/sdk --external zod

PR #344’s verification log: _zod internals 0 occurrences in the built bundle, 1032 tests pass, plugin loads cleanly inside a fresh OpenCode host.
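The "0 occurrences" check from that log is easy to repeat on your own build. This is a sketch, assuming the bundle lands at dist/index.js (an assumption based on the --outdir dist flag above). The count_zod_internals name is hypothetical.

```shell
#!/usr/bin/env bash
# count_zod_internals FILE: number of `_zod` occurrences in a built bundle.
# A non-zero count suggests zod was bundled instead of left external.
count_zod_internals() {
  grep -o '_zod' "$1" | wc -l | tr -d ' '
}
```

Run it as count_zod_internals dist/index.js after each build. Any result other than 0 means the --external zod flag did not take effect for that bundle.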

Prevention. Never bundle a library that uses cross-instance identity checks. Declare it as a peer dependency and read every plugin’s package.json before installing.
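Reading a plugin's package.json can be partly mechanized. The sketch below reports which section declares zod. It is a crude line scanner that assumes pretty-printed JSON with one key per line, and the zod_dep_kind name is hypothetical. For anything beyond a quick look, use a real JSON parser.

```shell
#!/usr/bin/env bash
# zod_dep_kind FILE: print the package.json section(s) that declare zod
# ("peerDependencies", "dependencies", "devDependencies"), or "none".
# Assumes pretty-printed JSON with one key per line; not a real JSON parser.
zod_dep_kind() {
  awk '
    /"peerDependencies"[ \t]*:/ { section = "peerDependencies" }
    /"dependencies"[ \t]*:/     { section = "dependencies" }
    /"devDependencies"[ \t]*:/  { section = "devDependencies" }
    section != "" && /"zod"[ \t]*:/ { print section; found = 1 }
    # A line with a closing brace and no opening brace ends the section.
    index($0, "}") && !index($0, "{") { section = "" }
    END { if (!found) print "none" }
  ' "$1"
}
```

A result of dependencies is the warning sign: a bundler will usually inline that copy unless zod is also marked external.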

Failure mode 5: install fails, Windows, or Chinese output

Detection. Install fails or throws Windows TLS errors, the Windows TUI closes the terminal on exit, or the agent responds in Chinese on Kimi/Qwen models.

Cause. Install-time edge cases: native Windows cmd.exe (issue #22003 / #22055), WSL2 plugin path mismatch, or a non-English system locale Bun reads as the default.

Fix. Match the fix to the symptom:

  • Windows TUI exit closes the terminal → install under WSL2 via curl, never native cmd.exe.
  • WSL2 plugins not loading → install Bun inside WSL, then run opencode plugin <name> from the WSL shell.
  • Chinese output → set LANG=en_US.UTF-8 in the shell profile and add “Respond in English unless I ask otherwise” to the session. The locale fix stops the default; the directive locks the model.
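The locale half of the Chinese-output fix can be applied idempotently so that repeated runs do not stack duplicate lines in the profile. This is a sketch. The function names are hypothetical, and which profile file your shell actually reads (for example ~/.bashrc or ~/.zshrc) is an assumption you need to check.

```shell
#!/usr/bin/env bash
# lang_is_english [VALUE]: succeed when VALUE (default: $LANG) is an en_* locale.
lang_is_english() {
  case "${1:-${LANG:-}}" in
    en_*) return 0 ;;
    *)    return 1 ;;
  esac
}

# ensure_english_lang PROFILE: append the export line to PROFILE unless an
# identical line is already there.
ensure_english_lang() {
  local profile=$1 line='export LANG=en_US.UTF-8'
  grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```

A typical call is lang_is_english || ensure_english_lang "$HOME/.bashrc", followed by opening a new shell so the change takes effect.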

Prevention. Run a weekly health check (next section).

What is the weekly health check?

A ten-minute Monday script, pass/fail per check. Copy-paste:

#!/usr/bin/env bash
# health-check.sh: weekly OpenCode setup verification.
# Keep going after a failed check so every check gets reported.
set +e
PASS=0; FAIL=0
# check LABEL COMMAND: run COMMAND quietly, print PASS or FAIL, update the counters.
check() {
  printf "%-38s " "$1"
  if eval "$2" >/dev/null 2>&1; then echo PASS; PASS=$((PASS+1))
  else echo FAIL; FAIL=$((FAIL+1)); fi
}
check "opencode on PATH"        "command -v opencode"
check "opencode --version OK"   "opencode --version"
check "config parses"           "opencode debug paths"
check "ripgrep present"         "opencode debug rg"
check "skills dir readable"     "test -d ~/.config/opencode/skills"
check "providers resolve"       "opencode providers list"
echo "Summary: $PASS PASS, $FAIL FAIL"
# Exit non-zero if any check failed so a scheduler can flag the run.
[[ $FAIL -gt 0 ]] && exit 1 || exit 0

Drop it at tools/health-check.sh, chmod +x, schedule it for Monday 9am via cron or launchd. It does not catch quality regressions (failure modes 1 and 3 need a human noticing in a real session), but install-state, config-state, and provider-state are exactly the things you forget to verify until they break.
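For the cron route, the Monday 9am schedule is the five-field expression 0 9 * * 1. The sketch below only prints the crontab entry so you can review it before installing. The cron_line name, the script path, and the log path are placeholders, not fixed locations.

```shell
#!/usr/bin/env bash
# cron_line SCRIPT LOG: print a crontab entry that runs SCRIPT every Monday
# at 09:00 and appends its output to LOG. Both paths are placeholders.
cron_line() {
  printf '0 9 * * 1 %s >> %s 2>&1\n' "$1" "$2"
}

# To install (review the printed line first):
#   (crontab -l 2>/dev/null; cron_line "$HOME/tools/health-check.sh" "$HOME/health-check.log") | crontab -
```

Logging the output matters here because cron runs unattended. Without the redirect, a FAIL on Monday morning is easy to miss.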

What should you actually do?

  • If output got worse with no change → failure mode 1. Switch the model slot before you touch anything else.
  • If /compact hangs → failure mode 2. Stop chasing OmniRoute PR #1363. Restart the session.
  • If the agent is talking to itself → failure mode 3. Restart with a one-paragraph handoff. Do not push through it.
  • If you write plugins → failure mode 4. Apply the PR #344 peer-dependency + external-zod fix as your template for any cross-boundary library.
  • Always → schedule health-check.sh weekly so failure mode 5 is caught on a Monday, not mid-deadline.
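The triage list above can also be kept as a lookup next to the health check. This is a sketch only. The symptom keys are hypothetical labels chosen for illustration, and the messages restate the fixes from this article.

```shell
#!/usr/bin/env bash
# triage SYMPTOM: print the matching failure mode and its first move.
# The symptom keys are hypothetical labels, not OpenCode identifiers.
triage() {
  case "$1" in
    quality-regressed) echo "mode 1: switch the model slot" ;;
    compact-timeout)   echo "mode 2: restart the session" ;;
    mid-session-drop)  echo "mode 3: restart with a one-paragraph handoff" ;;
    zod-schema-error)  echo "mode 4: peer dependency plus --external zod" ;;
    install-or-locale) echo "mode 5: WSL2 install or LANG=en_US.UTF-8" ;;
    *) echo "unknown symptom: $1" >&2; return 1 ;;
  esac
}
```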

Bottom line

  • OpenCode is a young open-source project moving fast. The rough edges are real, finite, and documented. Tolerate them for the cost win.
  • Most “it’s broken” is one of five known modes with a one-line fix. The skill is recognizing which one in three minutes, not heroics.
  • The cheapest fix is the weekly health check. Mechanical state drift is the failure you can fully automate away.

Frequently Asked Questions

Why is OpenCode not working after it worked yesterday?

If the same prompt on the same repo produces worse output with no config change, the Go-tier model variant shifted (failure mode 1). Switch the model slot (Kimi K2.6 instead of K2.5) or move that specific task to OpenCode Zen.

Why does OpenCode /compact time out?

The 120-second /compact 500 error is an OmniRoute gateway issue (PR #1363, closed), not OpenCode core. If you route through OmniRoute you inherit it; otherwise your session is just too long. Restart with a fresh context.

How do I stop OpenCode quality dropping mid-session?

Context-fill decay starts around 30% of the window. End the session, hand a one-paragraph summary to a fresh one. Prevent it by restarting every 30-45 minutes or every ~50K tokens.