Test Run #3 Analysis

Comparing model performance on the MARKETS Benchmark.

Overall Avg. Score: 0.539

Best Model: Gemini 2.5 Pro

Highest Model Score: 0.541

Model Scores per Language

© 2026 LLM Benchmarker. All rights reserved.