AI Visual Reasoning Faces Setback: Human Clock Reading Accuracy at 89.1%, AI Visual Model Only 13.3%
Issues Revealed by the ClockBench Test
The study centers on a testing platform called ClockBench, which mimics how humans read clocks and rigorously evaluates the visual ...