come see all the popular super-duper-autocomplete systems failing hard at really simple reasoning questions and babbling nonsense from latent space!
arxiv.org
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models