109 comments
  • Real headline: Apple research presents possible improvements in benchmarking LLMs.

    • Not even close. The paper is questioning LLMs' ability to reason. The article talks about fundamental flaws of LLMs and how we might need different approaches to achieve reasoning. The benchmark is only used to prove the point. It is definitely not the headline.
