Evaluation Target: Zhipu GLM-5
Artifact Date: Feb 22, 2026
Protocol Reference: agent-trust.org provenance definition
On Feb 18, 2026, Zhipu claimed performance gains from an "Asynchronous Agent RL" infrastructure built for long-horizon trajectories. Because long-horizon reinforcement learning systematically amplifies hallucination loops and tool misuse absent rigorous oversight, enterprise deployment requires the following minimum safety disclosures.
| Evidentiary Item | Status | Notes |
|---|---|---|
| 1. Public weights released | ❌ MISSING | Claimed "open-source"; no repository available. |
| 2. Open license (e.g., CC-BY, MIT) | ❌ MISSING | License framework entirely opaque. |
| 3. Reproducible eval-harness JSON | ❌ MISSING | Only composite AAI v4.0 metrics published. |
| 4. Per-task benchmark breakdown | ❌ MISSING | No granularity against Claude/GPT suites provided. |
| 5. Reward model availability | ❌ MISSING | Critical for assessing "asynchronous Agent RL". |
| 6. Reward hacking filter documentation | ❌ MISSING | No details on preventing degenerate policies during RL. |
| 7. Red-team vulnerability report | ❌ MISSING | Zero enterprise-grade adversarial disclosures. |
| 8. Hallucination correction metrics | ❌ UNVERIFIABLE | Claimed "comprehensive improvement" without data. |
| 9. FLOP ledger transparency | ❌ MISSING | No arithmetic provided to substantiate the claimed 744B efficiency. |
| 10. System prompts disclosed | ❌ MISSING | Prevents baseline evaluation reproducibility. |
| 11. Training data transparency | ❌ MISSING | "28.5T tokens" origin and curation unknown. |
| 12. Domestic chip latency benchmarks | ❌ UNVERIFIABLE | Requires independent testing of the 7 bespoke kernels. |
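The tally above can be reproduced mechanically. A minimal sketch, assuming a hand-maintained status map (item keys and statuses are transcribed from this table, not from any published Zhipu artifact):

```python
# Disclosure checklist tally for the GLM-5 evaluation.
# Statuses transcribed from the table above; "MISSING" and
# "UNVERIFIABLE" both count as unmet for deployment-trust purposes.

CHECKLIST = {
    "public_weights": "MISSING",
    "open_license": "MISSING",
    "eval_harness_json": "MISSING",
    "per_task_breakdown": "MISSING",
    "reward_model": "MISSING",
    "reward_hacking_filter_docs": "MISSING",
    "red_team_report": "MISSING",
    "hallucination_metrics": "UNVERIFIABLE",
    "flop_ledger": "MISSING",
    "system_prompts": "MISSING",
    "training_data_transparency": "MISSING",
    "domestic_chip_latency": "UNVERIFIABLE",
}

def unmet(checklist: dict[str, str]) -> list[str]:
    """Return the items that fail verification (anything not 'VERIFIED')."""
    return [item for item, status in checklist.items() if status != "VERIFIED"]

failed = unmet(CHECKLIST)
print(f"{len(failed)} of {len(CHECKLIST)} baseline requirements unmet")
# → 12 of 12 baseline requirements unmet
```

Treating UNVERIFIABLE the same as MISSING is a deliberately conservative policy: a claim that cannot be independently tested contributes nothing to a trust baseline.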
Conclusion: The GLM-5 package currently lacks 12 of 12 baseline structural requirements for enterprise agentic deployment trust verification. The claim profile aligns with a liquidity event (pump cycle) rather than an audited open-source milestone.
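For evidentiary item 9, the arithmetic a FLOP ledger would need to expose can be sketched with the common C ≈ 6·N·D training-compute heuristic. The inputs here are assumptions: N reads "744B" as a total parameter count and D uses the claimed "28.5T tokens"; neither figure is substantiated by Zhipu.

```python
# Back-of-envelope training-compute estimate via the standard
# C ≈ 6 * N * D approximation (6 FLOPs per parameter per token,
# a dense-model heuristic).
# N and D below are ASSUMED readings of unverified vendor claims.

N_PARAMS = 744e9      # claimed parameter count (assumption)
D_TOKENS = 28.5e12    # claimed training tokens (unverified)

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate dense-training compute: 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

c = training_flops(N_PARAMS, D_TOKENS)
print(f"≈ {c:.2e} FLOPs")
# → ≈ 1.27e+26 FLOPs
```

Note the caveat: if 744B is a total mixture-of-experts count, the heuristic should use the active-parameter count instead, which would lower the estimate. This is exactly the ambiguity a published FLOP ledger would resolve.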