Chinese language startup Z.ai launches highly effective open supply GLM-4.5 mannequin household with PowerPoint creation

Need smarter insights in your inbox? Join our weekly newsletters to get solely what issues to enterprise AI, knowledge, and safety leaders. Subscribe Now

One other week in the summertime of 2025 has begun, and in a continuation of the development from final week, with it arrives extra highly effective Chinese language open supply AI fashions.

Little-known (no less than to us right here within the West) Chinese language startup Z.ai has launched two new open supply LLMs — GLM-4.5 and GLM-4.5-Air — casting them as go-to options for AI reasoning, agentic habits, and coding.

And in accordance with Z.ai’s weblog submit, the fashions carry out close to the highest of the pack of different proprietary LLM leaders within the U.S.

For instance, the flagship GLM-4.5 matches or outperforms main proprietary fashions like Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Professional on evaluations reminiscent of BrowseComp, AIME24, and SWE-bench Verified, whereas rating third total throughout a dozen aggressive checks.

The AI Influence Sequence Returns to San Francisco – August 5

The following part of AI is right here – are you prepared? Be part of leaders from Block, GSK, and SAP for an unique take a look at how autonomous brokers are reshaping enterprise workflows – from real-time decision-making to end-to-end automation.

Safe your spot now – house is restricted: https://bit.ly/3GuuPLF

Its lighter-weight sibling, GLM-4.5-Air, additionally performs throughout the prime six, providing sturdy outcomes relative to its smaller scale.

Each fashions function twin operation modes: a considering mode for advanced reasoning and power use, and a non-thinking mode for immediate response eventualities. They will automatically generate full PowerPoint shows from a single title or immediate, making them helpful for assembly preparation, training, and inside reporting.

They additional provide artistic writing, emotionally conscious copywriting, and script era to create branded content material for social media and the net. Furthermore, z.ai says they assist digital character improvement and turn-based dialogue techniques for buyer assist, roleplaying, fan engagement, or digital persona storytelling.

Whereas each fashions assist reasoning, coding, and agentic capabilities, GLM-4.5-Air is designed for groups searching for a lighter-weight, extra cost-efficient various with quicker inference and decrease useful resource necessities.

Z.ai additionally lists a number of specialised fashions within the GLM-4.5 household on its API, together with GLM-4.5-X and GLM-4.5-AirX for ultra-fast inference, and GLM-4.5-Flash, a free variant optimized for coding and reasoning duties.

They’re obtainable now to make use of instantly on Z.ai and thru the Z.ai utility programming interface (API) for builders to connect with third-party apps, and their code is obtainable on HuggingFace and ModelScope. The corporate additionally gives a number of integration routes, together with assist for inference by way of vLLM and SGLang.

Licensing and API pricing

GLM-4.5 and GLM-4.5-Air are launched below the Apache 2.0 license, a permissive and commercially pleasant open-source license.

This permits builders and organizations to freely use, modify, self-host, fine-tune, and redistribute the fashions for each analysis and business functions.

For individuals who don’t need to obtain the mannequin code or weights and self-host or deploy on their very own, z.ai’s cloud-based API affords the mannequin for the next costs.

GLM-4.5:
- $0.60 / $2.20 per 1 million enter/output tokens
GLM-4.5-Air:
- $0.20 / $1.10 per 1M enter/output tokens

A CNBC article on the fashions reported that z.ai would cost solely $0.11 / $0.28 per million enter/output tokens, which can also be supported by a Chinese language graphic the corporate posted on its API documentation for the “Air mannequin.”

Nonetheless, this seems to be the case just for inputting as much as 32,000 tokens and outputting 200 tokens at a single time. (Recall tokens are the numerical designations the LLM makes use of to characterize totally different semantic ideas and phrase elements, the LLM’s native language, with every token translating to a phrase or portion of a phrase).

The truth is, the Chinese language graphic reveals way more detailed pricing for each fashions per batches of tokens inputted/outputted. I’ve tried to translate it under:

One other observe: since z.ai relies in China, these within the West who’re centered on knowledge sovereignty will need to due diligence by means of inside insurance policies to pursue utilizing the API, as it might be topic to Chinese language content material restrictions.

Aggressive efficiency on third-party benchmarks, approaching that of main closed/proprietary LLMs

GLM-4.5 ranks third throughout 12 trade benchmarks measuring agentic, reasoning, and coding efficiency—trailing solely OpenAI’s GPT-4 and xAI’s Grok 4. GLM-4.5-Air, its extra compact sibling, lands in sixth place.

In agentic evaluations, GLM-4.5 matches Claude 4 Sonnet in efficiency and exceeds Claude 4 Opus in web-based duties. It achieves a 26.4% accuracy on the BrowseComp benchmark, in comparison with Claude 4 Opus’s 18.8%. Within the reasoning class, it scores competitively on duties reminiscent of MATH 500 (98.2%), AIME24 (91.0%), and GPQA (79.1%).

For coding, GLM-4.5 posts a 64.2% success fee on SWE-bench Verified and 37.5% on Terminal-Bench. In pairwise comparisons, it outperforms Qwen3-Coder with an 80.8% win fee and beats Kimi K2 in 53.9% of duties. Its agentic coding means is enhanced by integration with instruments like Claude Code, Roo Code, and CodeGeex.

The mannequin additionally leads in tool-calling reliability, with a hit fee of 90.6%, edging out Claude 4 Sonnet and the new-ish Kimi K2.

A part of the wave of open supply Chinese language LLMs

The discharge of GLM-4.5 arrives amid a surge of aggressive open-source mannequin launches in China, most notably from Alibaba’s Qwen Workforce.

Within the span of a single week, Qwen launched 4 new open-source LLMs, together with the reasoning-focused Qwen3-235B-A22B-Pondering-2507, which now tops or matches main fashions reminiscent of OpenAI’s o4-mini and Google’s Gemini 2.5 Professional on reasoning benchmarks like AIME25, LiveCodeBench, and GPQA.

This week, Alibaba continued the development with the discharge of Wan 2.2, a strong new open supply video mannequin.

Alibaba’s new fashions are, like z.ai, licensed below Apache 2.0, permitting business utilization, self-hosting, and integration into proprietary techniques.

The broad availability and permissive licensing of Alibaba’s choices and Chinese language startup Moonshot earlier than it with its Kimi K2 mannequin displays an ongoing strategic effort by Chinese language AI corporations to place open-source infrastructure as a viable various to closed U.S.-based fashions.

It additionally locations strain on the U.S.-based mannequin supplier efforts to compete in open supply. Meta has been on a hiring spree after its Llama 4 mannequin household debuted earlier this yr to a combined response from the AI group, together with a healthy dose of criticism for what some AI energy customers noticed as benchmark gaming and inconsistent efficiency.

In the meantime, OpenAI co-founder and CEO Sam Altman lately introduced that OpenAI’s long-awaited and much-hyped frontier open supply LLM — its first since earlier than ChatGPT launched in late 2022 — could be delayed from its initially deliberate July launch to an as-yet unspecified later date.

Structure and coaching classes revealed

GLM-4.5 is constructed with 355 billion whole and 32 billion energetic parameters. Its counterpart, GLM-4.5-Air, affords a lighter-weight design at 106 billion whole and 12 billion energetic parameters.

Each use a Combination-of-Consultants (MoE) structure, optimized with loss-free stability routing, sigmoid gating, and elevated depth for enhanced reasoning.

The self-attention block consists of Grouped-Question Consideration and a better variety of consideration heads. A Multi-Token Prediction (MTP) layer allows speculative decoding throughout inference.

Pre-training spans 22 trillion tokens cut up between general-purpose and code/reasoning corpora. Mid-training provides 1.1 trillion tokens from repo-level code knowledge, artificial reasoning inputs, and long-context/agentic sources.

Z.ai’s post-training course of for GLM-4.5 relied upon a reinforcement studying part powered by its in-house RL infrastructure, slime, which separates knowledge era and mannequin coaching processes to optimize throughput on agentic duties.

Among the many strategies they used have been mixed-precision rollouts and adaptive curriculum studying.
The previous assist the mannequin prepare quicker and extra effectively by utilizing lower-precision math when producing knowledge, with out sacrificing a lot accuracy.

In the meantime, adaptive curriculum studying means the mannequin begins with simpler duties and progressively strikes to more durable ones, serving to it be taught extra advanced duties progressively over time.

GLM-4.5’s structure prioritizes computational effectivity. In response to CNBC, Z.ai CEO Zhang Peng acknowledged that the mannequin runs on simply eight Nvidia H20 GPUs — customized silicon designed for the Chinese language market to adjust to U.S. export controls. That’s roughly half the {hardware} requirement of DeepSeek’s comparable fashions.

Interactive demos

Z.ai highlights full-stack improvement, slide creation, and interactive artifact era as demonstration areas on its weblog submit.

Examples embrace a Flappy Chicken clone, Pokémon Pokédex internet app, and slide decks constructed from structured paperwork or internet queries.

Customers can work together with these options on the Z.ai chat platform or by means of API integration.

Firm background and market place

Z.ai was based in 2019 below the identify Zhipu, and has since grown into one in every of China’s most distinguished AI startups, in accordance with CNBC.

The corporate has raised over $1.5 billion from buyers together with Alibaba, Tencent, Qiming Enterprise Companions, and municipal funds from Hangzhou and Chengdu, with extra backing from Aramco-linked Prosperity7 Ventures.

Its GLM-4.5 launch coincides with the World Synthetic Intelligence Convention in Shanghai, the place a number of Chinese language companies showcased developments. Z.ai was additionally named in a June OpenAI report highlighting Chinese language progress in AI, and has since been added to a U.S. entity record limiting enterprise with American companies.

What it means for enterprise technical decision-makers

For senior AI engineers, knowledge engineers, and AI orchestration leads tasked with constructing, deploying, or scaling language fashions in manufacturing, the GLM-4.5 household’s launch below the Apache 2.0 license presents a significant shift in choices.

The mannequin affords efficiency that rivals prime proprietary techniques throughout reasoning, coding, and agentic benchmarks — but comes with full weight entry, business utilization rights, and versatile deployment paths, together with cloud, personal, or on-prem environments.

For these managing LLM lifecycles — whether or not main mannequin fine-tuning, orchestrating multi-stage pipelines, or integrating fashions with inside instruments — GLM-4.5 and GLM-4.5-Air cut back obstacles to testing and scaling.

The fashions assist customary OpenAI-style interfaces and tool-calling codecs, making it simpler to guage in sandboxed environments or drop into present agent frameworks.

GLM-4.5 additionally helps streaming output, context caching, and structured JSON responses, enabling smoother integration with enterprise techniques and real-time interfaces. For groups constructing autonomous instruments, its deep considering mode gives extra exact management over multi-step reasoning habits.

For groups below price range constraints or these searching for to keep away from vendor lock-in, the pricing construction undercuts main proprietary options like DeepSeek and Kimi K2. This issues for organizations the place utilization quantity, long-context duties, or knowledge sensitivity make open deployment a strategic necessity.

For professionals in AI infrastructure and orchestration, reminiscent of these implementing CI/CD pipelines, monitoring fashions in manufacturing, or managing GPU clusters, GLM-4.5’s assist for vLLM, SGLang, and mixed-precision inference aligns with present finest practices in environment friendly, scalable mannequin serving. Mixed with open-source RL infrastructure (slime) and a modular coaching stack, the mannequin’s design affords flexibility for tuning or extending in domain-specific environments.

In brief, GLM-4.5’s launch offers enterprise groups a viable, high-performing basis mannequin they will management, adapt, and scale, with out being tied to proprietary APIs or pricing buildings. It’s a compelling choice for groups balancing innovation, efficiency, and operational constraints.

Day by day insights on enterprise use circumstances with VB Day by day

If you wish to impress your boss, VB Day by day has you coated. We provide the inside scoop on what corporations are doing with generative AI, from regulatory shifts to sensible deployments, so you’ll be able to share insights for max ROI.

Learn our Privateness Coverage

Thanks for subscribing. Take a look at extra VB newsletters right here.

An error occured.

What's Hot

Warmth maps present half the U.S. below advisories, warnings as excessive temperatures proceed

U.Okay. will acknowledge Palestine as a state except Israel strikes towards ceasefire in Gaza, prime minister says

Google Chrome provides on-line retailer evaluations that will help you know in the event that they’re legit

Chinese language startup Z.ai launches highly effective open supply GLM-4.5 mannequin household with PowerPoint creation

Google Chrome provides on-line retailer evaluations that will help you know in the event that they’re legit

ChatGPT’s Examine Mode Is Right here. It Received’t Repair Schooling’s AI Issues

AMD CEO says its chips made within the US will likely be as much as 20% costlier, however claims that it’ll be definitely worth the worth hike

‘Eat your personal pet food’: How Microsoft popularized one of many yuckiest phrases in tech historical past

Warmth maps present half the U.S. below advisories, warnings as excessive temperatures proceed

U.Okay. will acknowledge Palestine as a state except Israel strikes towards ceasefire in Gaza, prime minister says

Google Chrome provides on-line retailer evaluations that will help you know in the event that they’re legit

Astronomers Shocked As Mysterious Flare Repeats From Similar Black Gap Two Years Later

Latest Posts

Warmth maps present half the U.S. below advisories, warnings as excessive temperatures proceed

U.Okay. will acknowledge Palestine as a state except Israel strikes towards ceasefire in Gaza, prime minister says

Google Chrome provides on-line retailer evaluations that will help you know in the event that they’re legit

What's Hot

Chinese language startup Z.ai launches highly effective open supply GLM-4.5 mannequin household with PowerPoint creation

Licensing and API pricing

Aggressive efficiency on third-party benchmarks, approaching that of main closed/proprietary LLMs

A part of the wave of open supply Chinese language LLMs

Structure and coaching classes revealed

Interactive demos

Firm background and market place

What it means for enterprise technical decision-makers

Related Posts