Today's Large Language Models are Essentially BS Machines
Today's Large Language Models are Essentially BS Machines - Ryan McGreal
And everyone in tech who has worked on ML before collectively says "yeah that's what we've been trying to tell you". Don't get me wrong, LLMs are a huge leap, but god did it show how greedy corporations are, just immediately jumping to "how quick can we lay people off?". The tech is not to that spec. Yet. It will get there, but goddamn do we need to be demanding some regulations now
The tech is not to that spec. Yet.
I'm not sure it will. At least, not this tech, not this approach to the problem. From my understanding there's fundamentally no comprehension; it's not bugged, broken, or incomplete, it's just not there... it's missing from the design.
I was mostly posting this because the last time LLMs came up, people kept on going on and on about how much their thoughts are like ours and how they know so much information. But as this article makes clear, they have no thoughts and know no information.
In many ways they are simply a mathematical party trick; formulas trained on so much language, they can produce language themselves. But there is no “there” there.
have no thoughts
True
know no information
False. There's plenty of information stored in the models, and plenty of papers that delve into how it's stored, or how to extract or modify it.
I guess you can nitpick over the word "know", and what it means, but as someone else pointed out, we don't actually know what that means in humans anyway. But LLMs do use the information stored in context, they don't simply regurgitate it verbatim. For example (from this article):
If you ask an LLM what's near the Eiffel Tower, it'll list locations in Paris. If you edit its stored information to think the Eiffel Tower is in Rome, it'll actually start suggesting sights in Rome instead.
Sadly we don't even know what "knowing" is, considering human memory changes every time it is accessed. We might just need language and language only. Right now they're testing if generating verbalized trains of thought helps (it might?). The question might change to: Does the sum total of human language have enough consistency to produce behavior we might call consciousness? Can we brute force the Chinese room with enough data?
They are the perfect embodiment of the internet.
They know everything, but understand nothing
I've been unemployed for 7 months. Every online job I see that's been posted for at least 6 hours has over 200 applications. I'm a senior Dev with 30 years experience, and I can't find work.
I'd say generative AI is an existential threat as bad as offshoring was for steel in the early 80s. I'm now left with the prospect of spending the last 20 years of my work life at or near minimum wage.
After all, I can't afford to spend $250,000 on a new bachelor's degree, and a community college degree might get me to $25/hr, and still costs thousands. This is causing impoverishment on a massive scale.
Ignore this threat at your peril.
Your issue sounds more like a capitalism issue. FANG companies lay off thousands of employees to cut costs and prepare for changes in the economy. AI didn't make them lay off all those employees, corporate greed did. Until AI can gather requirements, produce code that is at least 80% accurate, and compile the software itself, it isn't a threat.
Edit fix autocorrect
I'm a senior dev too, and at first I thought the same, but really it's a market downturn. Companies are just afraid to hire right now. I'd look into generative AI, try to understand how it works. That's how I've been spending my time, and yeah, it's intuitive the way they do it but the more you understand how it works the more you realize that it's not ready to take our jobs. Yet. Again maybe someday, but there is a lot of work that needs to be done to get something semi up and running, and the models that Google uses are not going to be usable for every company. (Take a look at all the specialized models already).
Our job never goes away, but it does constantly evolve. This is just another point where we have to learn new skills, and that may be that we all need to be model tuners some day. At the end of the day the user still needs to correctly describe what they want to have happen on the screen, and there are currently no ways to take what they describe into a full piece of software.
Hard to believe a senior dev can't find work. Those positions are the most needed. Also, 25 an hour is 50k a year. Nowhere in the US are senior devs paid that little. I suppose you may not be US based, but your cost for college seems to imply US, albeit at an expensive school.
And everyone in tech who has worked on ML before collectively says “yeah that’s what we’ve been trying to tell you”.
Everybody in tech who has even a passing understanding of the technology was collectively saying that. We understand the limits of the technology and can feel out the bounds easily. But too many of these dumbasses with dollar signs in their eyes are all "to the moon!", tripping and failing as they implement the tech in unreasonable ways.
It was never a factoid machine, like some people wanted to believe. It was always about creatively writing something, and one with only so much attention.
It was never a factoid machine
Funny tidbit about the word "factoid": its original meaning was "an item of unreliable information that is reported and repeated so often that it becomes accepted as fact", but the modern usage is "a brief or trivial item of news or information".
This means that the modern usage of "factoid" is in itself a factoid, and that in the old sense LLMs sort of are factoid machines.
Note that I'm not saying the modern use is wrong. Languages evolve, and words taking on new meanings doesn't mean the new meanings are "wrong" (and surprisingly words changing to mean the opposite of what they used to mean isn't all that uncommon either.)
I disagree, a lot of white collar work is simply writing bullshit.
“has a model of how words relate to each other, but does not have a model of the objects to which the words refer.
It engages in predictive logic, but cannot perform syllogistic logic - reasoning to a logical conclusion from a set of propositions that are assumed to be true”
Is this true of all current LLMs?
Yes, this is how all LLMs function.
does not have a model of the objects to which the words refer
I'm not even sure what this is supposed to be saying. Sounds kind of like a bullshit generator.
Words are encodings of knowledge and their expression and use represent that knowledge, and these machines ingest a repository containing a significant percent of written human communication. It encodes that the words "dog" and "bark" are often used together, but it also encodes that "dog" and "cat" are things that are both "mammals" and "mammals" are "animals", and that the pair of them are much more likely to appear in a human household than a "porpoise". What is this other kind of model of objects that hasn't been in some way represented in all of the internet?
It is not a model of objects. It's a model of words. It doesn't know what those words themselves mean or what they refer to; it doesn't know how they relate together, except that some words are more likely to follow other words. (It doesn't even know what an object is!)
When we say "cat," we think of a cat. If we then talk about a cat, it's because we love cats, or hate them, or want to communicate something about them.
When an LLM says "cat," it has done so because a tokenization process selected it from a chain of word weights.
That's the difference. It doesn't think or reason or feel at all, and that does actually matter.
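To make the "some words are more likely to follow other words" point concrete, here is a minimal sketch of a next-word picker built from nothing but word-pair counts. It is only a stand-in: real LLMs use neural networks over tokens with far longer context, not a lookup table of pairs, and the corpus and function names here are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real LLM trains on vastly more text.
CORPUS = "the dog barks . the dog sleeps . the cat sleeps . the dog barks loudly ."

def train_bigrams(text):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

model = train_bigrams(CORPUS)
print(most_likely_next(model, "the"))       # most frequent word after "the"
print(most_likely_next(model, "porpoise"))  # never seen, so None
```

Nothing in this table says what a dog is. It only records which word tended to come next, which is the distinction being argued about in this thread.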
They're glorified autocompletes. Way too much attention is being given to LLMs in isolation. By themselves: Not a silver bullet.
But when called in a chain . . . eyebrows
Humans are bullshit machines as well.
This is what I find the most amusing about the criticism of LLMs and many other AI systems. People often talk about them as if they're somehow uniquely flawed, while in reality what they're doing isn't that different from what humans do. The biggest difference is that when a human hallucinates it's often obvious, but when ChatGPT does it, it's harder to spot.
A chip off the ol' block, then 🙂
This reminds me of an article about journalism and the internet, from ages ago. A class was asked how they would research a topic (it was some recent political event, I don’t remember). The class confidently answered “the internet.” The professor struggled to get them to understand that wasn’t enough. Yes, there is all kinds of stuff about this event on the internet, but how did it get there? And more importantly, what is missing?
Sure, all the sexy AI stuff gives us goosebumps and sounds great. But how did it get there, and what is missing? Someone somewhere has to do the actual original work first, or it’s just making collages from the same library over and over and over again.
And also it's no replacement for actual research, either on the Internet or in real life.
People assume LLMs are like people, in that they won't simply spout bullshit if they can avoid it. But as this article properly points out, they can and do. You can't really trust anything they output. (At least not without verifying it all first.)
People assume LLMs are like people, in that they won’t simply spout bullshit if they can avoid it.
There are plenty of people who spout bullshit every chance they get.
As with any tool it is how you use it that matters.
Today’s LLMs are capable of fairly amazing stuff.
It’s a BS machine? Sure. Have you read or written stuff for higher education?
You don’t get points for being short and concise, even though you should. You get points for following the BS formula.
You know who else is good at BS?
LLMs. If you manage to provide one with enough meaningful input, it can do a great lot of BS legwork for you.
I see people who overuse it, don’t edit, aren’t critical. Sure. Then you end up with just BS.
But there are plenty of useful applications, like writing boilerplate code (see also Copilot), structuring code, tests, etc.
Is it worth all the hype? Nope.
Some of it? Probably.
No one is saying there are problems with the bots (though I don't understand why you're being so defensive of them -- they have no feelings so describing their limitations doesn't hurt them).
The problem is what humans expect from LLMs and how humans use them. Their purpose is to string words together in pretty ways. Sometimes those ways are also correct. Being aware of what they're designed to do, and their limitations, seems important for using them properly.
These AI systems do make up bullshit often enough that there's even a term for it: Hallucination.
Kind of a euphemistic term, like how religious people made up the word 'faith' to cover for the more honest term: gullible.
hmm i think we need twelve more articles on this
We should feed the ones already made to an LLM and have it write the next 12 for the irony.
What else should they be?? They reflect human language.
And what does that mean about the jobs it can replace?
They can replace the bullshit jobs of which we have many, serving the essential purpose of keeping the people doing them fed and thus the economy and society stable. 🥲
It can replace nothing. It can make the job of e.g. developers easier. And on a small, private scale ML can replace writers and stock photo libraries, if it has support for pictures. However, on a larger scale, both would have massive problems of quality, diversity and copyright. You can't use the output of an ML algorithm for anything you earn money with while there are active cases exploring whether the copyright belongs to the ML itself, the producers of the training data (who probably didn't give anyone consent), no one, or actually you.
They're both BS machines and fact generators. It produced bullshit when asked about him because as far as I can tell he's kind of a nobody, not because it's just a stylistic generator. If he asked about a more prominent person likely to exist more significantly within the training corpus, it would likely be largely accurate. The hallucination problem stems from the system needing to produce a result regardless of whether it has a well trained semantic model for the question.
LLMs encode both the style of language and semantic relationships. For "who is Einstein", both paths are well developed and the result is a reasonable response. For "who is Ryan McGreal", the semantic relationships are weak or non-existent, but the stylistic path is undeterred, leading to the confidently plausible bullshit.
They don't generate facts, as the article says. They choose the next most likely word. Everything is confidently plausible bullshit. That some of it is also true is just luck.
It's obviously not "just" luck. We know LLMs learn a variety of semantic models of varying degrees of correctness. It's just that no individual (inner) model is really that great, and most of them are bad. LLMs aren't reliable or predictable (enough) to constitute a human-trustable source of information, but they're not pure gibberish generators.
That's just not true. Semantic encodings work. It's not like neural networks are some new untested concept. The LLMs have some new tricks under the hood and are way more extensive in their training goal, but they're fundamentally the same thing. All neural networks are mimicry machines enabled and limited by their data, but mimicking largely correct data produces largely correct results when the answer, or an interpolatable answer, exists in the training data. The problem arises when they are asked to go further and further afield from their inputs. Some interpolation and substitutions work, but it gets increasingly unreliable the more niche the answer is.
While the LLM hype has very seriously oversold their abilities, the instinctive backlash to say they're useless is similarly way off-base.
I don't understand shit like that, they are tools, not totally accurate ones but unless you use Bing they do produce a lot of good stuff if used correctly...
I think public-facing they have to be that way, otherwise they would infringe the copyright of their training material. Behind the scenes, I suspect that the wealthy can gain access to AI engines where the random response isn't set so high and they can even fact-check and cite their own training material better. It's really hard to imagine that they can debug these things without having any idea what training material influenced which pattern of associations. I sure don't buy that they don't have tools to trace back to training material.
Right now consumer-facing AI wants you to put in simple prompts and get back a unique term paper each time you ask it the same question.
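For anyone wondering what a "random response" setting could look like: text generators commonly expose a sampling temperature, and I'm assuming that is the knob meant here. Below is a rough sketch of how temperature reshapes the odds of the next word. The candidate words and scores are invented, and this says nothing about what any vendor actually offers behind the scenes.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
candidates = ["Paris", "Rome", "banana"]
scores = [3.0, 2.0, 0.5]

cold = softmax_with_temperature(scores, 0.2)  # nearly always the top word
hot = softmax_with_temperature(scores, 2.0)   # much more spread out

for word, p_cold, p_hot in zip(candidates, cold, hot):
    print(f"{word}: cold={p_cold:.3f} hot={p_hot:.3f}")
```

At the low setting almost all of the probability lands on the top-scoring word, so the same prompt gives nearly the same answer every time. At the high setting the lower-scoring words get picked far more often, which is where the "unique term paper each time" behavior comes from.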