Phind is now using V7 of their model on their own platform, as they have found that people overall prefer its output to GPT4's. This is extremely impressive because it's not just a random benchmark that can be gamed, but crowdsourced opinion on real tasks.
The one place everything still lags behind GPT4 is question comprehension, but this is still a huge accomplishment.
This is one of the most plausible claims to date because it is supported by anecdotal data from actual use scenarios rather than only by gamed benchmarks.
Another what? Claiming to be better than GPT4? If so, I think this might be one of the most reasonable times it's been claimed, with evidence, albeit anecdotal, from real use cases instead of just gaming a benchmark.