Demo paper at EDM 2023.
How do human learners perform on a puzzle video game compared to AI learners?
Accepted to TMLR 2023.
How well do large language models perform on the MATH dataset?
How well do large language models perform on an unnatural in-context learning task?
What are the next steps in teaching formal reasoning to foundation models and putting those models to use?