Meituan LongCat 2.0 Tested: My Honest Review vs GLM 5.2


China’s latest open-source model, Meituan LongCat 2.0, just launched. I put it through hands-on testing against GLM 5.2 — here’s my honest breakdown.

The big story: it was trained entirely without Nvidia chips.

Key takeaways

LongCat 2.0 is a new open-source, 1.6-trillion-parameter Chinese model — and the model behind the free AoAlpha API.
The standout fact: it was trained entirely on Meituan’s own chips, with zero Nvidia hardware.
In my hands-on game-building tests, GLM 5.2 still beat it clearly — LongCat is interesting, but not a switch.

What Is LongCat 2.0?

LongCat 2.0 is a newly released, fully open-source AI model out of China — and it’s actually the model that was quietly powering the free AoAlpha API I’d already been testing with Hermes and Claude Code.

It’s a large model — 1.6 trillion parameters — built with some genuinely interesting architecture: Sparse Attention, Zero Compute Experts, and MIPD. On paper, that’s a serious amount of engineering.

The Biggest Story: No Nvidia At All

Here’s the detail that actually matters more than any benchmark. LongCat 2.0 was trained entirely on Meituan’s own chips — Meituan being China’s rough equivalent of DoorDash — with zero Nvidia hardware involved.

That’s a big deal. We’ve already seen companies like Xiaomi jump into building their own models this year, and now a delivery-app company is training frontier-scale models on its own silicon. Everyone is getting into AI, including companies you’d never expect.

How I Actually Tested It

The API isn’t easy to access without a Chinese setup, and when I tried topping up credits on the site, the top-up itself was broken. So I tested LongCat 2.0 directly through its website chat instead.

I built the same set of demos I use to compare every model on my channel: a crawl/dungeon game, Dragon Realm, a Skyrim-style open-world game, and VoxelCraft. Results were mixed: some outputs were decent, others were buggy, and one demo rendered nothing but a black screen.

The Benchmarks

On paper, LongCat holds its own. On Terminal Bench 2.1, it’s only slightly behind GPT-5.5. On SWE-Bench Pro, it actually outperforms GPT-5.5. Opus 4.8, unsurprisingly, is still crushing everything, including LongCat.

So the raw numbers look respectable. But my own hands-on testing told a different story once I compared it directly against GLM 5.2.

LongCat 2.0 vs GLM 5.2: The Real Verdict

This is the comparison that actually matters, for three reasons: GLM 5.2 is the other major open-source Chinese model, it’s very cheap on its coding plan, and it’s built to be agentic too.

Side by side on every single game demo — the crawl/dungeon game, Dragon Realm, the Skyrim-style world, and VoxelCraft — GLM 5.2 won clearly. GLM 5.2’s outputs were smoother, more playable, and noticeably less buggy. LongCat’s versions felt like an older generation by comparison.

My Honest Take

Would I call LongCat 2.0 the best model I’ve used? No. Is it fun to test and genuinely interesting to see coming out of China? Absolutely. Fair play to Meituan for shipping something like this.

But if you’re choosing between the two open-source Chinese models right now, GLM 5.2 was by far the stronger of the two in everything I tested. I won’t be switching to LongCat any time soon, but I’ll keep testing whatever comes out next.

Where I Actually Run My Models

What I have plugged into production is GLM 5.2, wired directly into Claude Code inside my Agent OS — the mission control dashboard where all my agents work together.

If you want that exact setup — GLM 5.2, Claude Code, and the full Agent OS wired together — it’s all inside my AI Profit Boardroom, with weekly coaching and daily updates on whatever new model actually turns out to be worth using. New to this? Start free with my AI Money Lab.


Why Everyone Is Suddenly Releasing Models

It’s worth zooming out for a second. This year alone we’ve seen companies with zero history in AI — Xiaomi, and now a delivery-app company like Meituan — ship serious, large-scale models. That’s not a coincidence.

It tells you the barrier to training a competent frontier-ish model has dropped enough that companies outside Big Tech can now compete, especially in China where chip access and infrastructure investment are being treated as strategic priorities.

What This Means If You’re Choosing A Model

For anyone deciding what to actually build on, the lesson isn’t “try every new release” — it’s “test before you trust the benchmarks”. LongCat 2.0 looks solid on paper (Terminal Bench, SWE-Bench Pro) but underperformed GLM 5.2 in every real build I tried.

Benchmarks tell you part of the story. Hands-on testing on the actual kind of work you do tells you the rest — and that’s the gap that decides which model is actually worth your time.
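
If you want to keep score on your own side-by-side builds, here is a minimal sketch in Python of the kind of scorecard that works for this. The DemoResult class and the tally and verdict helpers are names made up for illustration, not part of any tool mentioned here. The entries are simply the outcomes reported in this review; swap in your own demos and winners.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoResult:
    """One head-to-head build (hypothetical structure for illustration)."""
    demo: str        # name of the build prompt
    winner: str      # model whose output was more playable
    notes: str = ""  # optional free-form observations


def tally(results):
    """Count wins per model across all demos."""
    return Counter(r.winner for r in results)


def verdict(results):
    """Return the model with the most wins, or None on a tie or empty input."""
    top = tally(results).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


# Outcomes as reported in this review: GLM 5.2 won every demo.
RESULTS = [
    DemoResult("crawl/dungeon game", "GLM 5.2"),
    DemoResult("Dragon Realm", "GLM 5.2"),
    DemoResult("Skyrim-style open world", "GLM 5.2"),
    DemoResult("VoxelCraft", "GLM 5.2"),
]

if __name__ == "__main__":
    for model, wins in tally(RESULTS).items():
        print(f"{model}: {wins}/{len(RESULTS)} demos")
    print("Verdict:", verdict(RESULTS))
```

A win count like this is deliberately crude. It ignores how big the gap was on each demo, so the notes field is there to record things like bugs or a blank render alongside the winner.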

FAQ

What is LongCat 2.0?

A new open-source, 1.6-trillion-parameter Chinese AI model, and the model behind the free AoAlpha API.

Was it really trained without Nvidia?

Yes — Meituan trained it entirely on their own chips, with no Nvidia hardware involved.

Is LongCat 2.0 better than GLM 5.2?

No — in my hands-on testing, GLM 5.2 clearly outperformed LongCat 2.0 across every demo I built.

How do I access LongCat 2.0?

The API is difficult to use without a Chinese setup; testing it via the website chat is the easiest route.

Should I switch to LongCat 2.0?

I wouldn’t — it’s interesting to test, but GLM 5.2 remains the stronger open-source option for now.

The Bottom Line

LongCat 2.0 is a fascinating release, particularly the no-Nvidia training angle, but GLM 5.2 won clearly in my side-by-side tests.

Worth watching, not worth switching to yet — GLM 5.2 stays in my production stack.