All data below is auto-generated from raw run outputs. Every game includes the model's full reasoning chain; click any row to inspect it. Raw data: results/ · Paper: DOI 10.5281/zenodo.18771523