Test Run #5 Analysis
Comparing model performance on the SCHEMAS Benchmark.
Overall Avg. Score: 0.535
Best Model: GPT O3
Highest Model Score: 0.564