MIT CSAIL researchers used a natural language-based logical inference dataset to create smaller language models that outperformed much larger counterparts.
It's interesting that they were able to get a model with 350M parameters to outperform others with 175B parameters.
Interesting indeed, even though it seems to work only on specific tasks. I definitely support this direction, though. LLMs are getting out of hand (and have been for a while now), having slipped from researchers' grasp into big tech companies'. I think the work that the open source and research community is already doing with the ChatGPT lookalike models is incredible.