RT X Freeze: Grok 4.20 just claimed #1 on IFBench (Artificial Analysis) - the gold standard for instruction following 81% score. Outranking every othe...

Source: Elon Musk on X | Published: 2026-04-12 00:14

RT X Freeze
Grok 4.20 just claimed #1 on IFBench (Artificial Analysis) - the gold standard for instruction following

81% score. Outranking every other model

And here is what that actually means -
When you ask Grok to do something, it doesn't give you a close-enough answer. It doesn't approximate. It doesn't go off-script

It follows your instructions. Precisely. Every time

xAI is not just racing to build the most intelligent AI - they are also building the most reliable one

An AI that actually listens to you...