Hallucinating sources
How does this surprise anyone?
LLMs are just pattern recognition machines. You give them a sequence of words and they tell you what is the most statistically likely word to follow based solely on probability, no logic or reasoning.
It's amazing that they get it right 40 % of the time then.
Hilarious that Gemini is so bad. Not like Google had a good starting position on internet search
The only thing Gemini is good for is bringing up sources that don't appear in the regular Google search results. Which only leads to another question: why are those links not in the regular Google search results?
I find the same with Perplexity. It's more of a search assistant for finding some sources that a search engine likely wouldn't. Sometimes its summarized answers are accurate, sometimes it's a jumble of several slightly unrelated sources.
Infinite money, all the data on the internet, and nothing to show for it. I wrote about my experience with Gemini assistant for people who enjoy suffering.
I've genuinely been wondering what the hell the average googler has been up to in the last 5 years. They're killing services, barely developing new features or hardware, and have been talking for so long (as in, they were genuinely at the forefront) about AI and how they're in a unique position to make the most out of data, AI, services, and hardware, then failed spectacularly to keep that advantage, and even more spectacularly to keep up.
I guess they just found some other, more profitable way to exploit that unique position, than to care about the people using their products.
Musk's gork is as stupid as he is! And he claims it's waaaaaayyyyy better than other AI. 🤡🤡🤡
Seriously, TRY and get an AI chat to give an answer without making stuff up. It is impossible. You can tell it "you made that data up, do not do that" ... and it will apologize and say you were right, then make up more dumb shit.
I have found AI to be a terrible primary source. But something I've found very useful is to ask for a detailed response, structured a certain way. Then tell the AI to grade it as a professor would. It actually does a very good job at acknowledging gaps and giving an honest grade then.
AI shouldn't be a primary source but it's great for starting a topic. Similar to talking to someone that's moderately in the know on something you're interested in.
That's because ALL generative AI results, even the correct ones, are "made up". They just exist on a spectrum of coincidental correspondence with reality. I'm still surprised that they manage to get as much right as they do.
Yeah, LLMs are great if you treat them like a tool to create drafts or give you ideas, rather than like an encyclopedia.
I looked up on google at one point what the minimum required depth for a cable running under a building is by NEC code. It told me it was 0 inches. I laughed and called it stupid, wtf do you mean 0 inches?? Upon further research, 0 inches is the correct answer, I felt real stupid after that -_-
You can tell it “you made that data up, do not do that”
I wish people would stop treating these tools as intelligent.
I'll get hate for this but in most tasks people use them for they are pretty dang accurate. I'm talking about frontier models fyi
AI as a search engine is terrible.
Because if you treat it as such, it will just look at the first result, which is usually wrong or has incomplete info.
If you give the AI a source document, then it is amazing as a search engine. But if the source doc is the entire internet.... it's fucking bad.
Shit quality in, shit quality out. And we/corporations have made the internet abundant of shit.
Just for clarity, some kinds of learning algorithms were used in web search before this generative AI boom. I know for a fact that Google used an AI to rank pages for its search before GPT was even a thing.
But you're totally right. Generative models shouldn't be used as search engines.
Is this an ad for Perplexity? I’ve never heard of it, and now I’m googling it. So effective ad if so.
It's making ten billion calculations per second and they're all wrong!
That’s one of my skills as a certified genius. I’m wicked fast at math.
37/2.4 boom 16.38.
Is it right, maybe, maybe not. But I did it fast
Capitalism breeds innovation! Sometimes innovating means summoning... mindless lie demons... Who drink all our water. 🙃
A thousand wrong answers are more innovative than a single correct one.
The core of the scam is making people believe that "novel" is the same as "better".
Yes, having tested this myself it is absolutely correct. Hell, even when it finds something, it's usually a secondary or tertiary source that's nearly unusable-- or even one of those "we did our own research and vaccines cause autism" type sources. It's awful and idiots seem to think otherwise.
You shouldn't use them to keep up with the news. They make that option available because it's wanted, but they shouldn't.
It should only be used to research older data from its original dataset, perhaps adding to it a bit with newer knowledge if you're a specialist in the field.
When you ask the right questions in the right way, you'll get the right answers, or at least mostly - and you should always check the sources after. But it's a specialist's tool at this time. And most people are not specialists.
So this whole "Fuck AI" movement is actually pretty damn stupid. It's good to point out its flaws, try and make people aware and help guide it better into the future.
But it's actually useful, and not going away. You're just using it wrong, and as the tech progresses, ways to use it wrong will decrease. You can't stop progress, humanity will always come up with new things, evolution is designed that way.
Well, no, because what I'm referring to isn't even news, it's research. I'm an adjunct professor and trying to get old articles doesn't even work, even when they're readily available publicly. The linked article here is referencing citations and it doesn't get more citation-y than that. It doesn't change that when you ask differently, either, because LLMs aren't good at that even if tech bros want it to be.
Now, the information itself could be valid, and in basics it usually is. I was at least able to use it to get myself some basic ideas on a subject before ultimately having to browse abstracts for what I need. Still, you need the source if you're doing anything serious, and the best I've got from AI are just authors prevalent in the field, which at least is useful for my own database searches.
I was trying to see if I could sync my entire Calibre ebook library to my Kobo, so I googled it. The dumbass AI result told me to hit the "sync library" button, which doesn't friggin exist...
This is the most common response from AI on search pages when I'm trying to find some kind of setting.
Yeah, even for Google's own operating system.
"To disable Network Notification sounds, do a bunch of shit that doesn't exist anywhere in the settings!"
Orc from Warcraft 1: "Job's done!"
That’s the most infuriating thing.
I’m trying to learn how to do new things, well, basically all the time.
Right now I’m stalled out on a sorta important personal project to teach myself about containers/micro-services/certs in a homelab environment. And what I’m discovering is that I don’t know enough to know I don’t know enough - it used to be that I’d take on an ambitious project, mess up, figure out how to overcome that, then learn by looking at what did work, and do better in the future.
But every technical project lately has gotten to the point where I’m trying to just get something, anything, to work or make sense, but every convincing enough AI generated page sets me back by several days as I troubleshoot the convincing enough steps and find myself realizing they’re referencing YAML settings from apps that aren’t part of the service, that every page directs me to install Python, Node, or whatever other helper app directly on my machine that would normally run in a container (which defeats the purpose of trying to containerize things - some stuff I want to use relies on non-compatible versions/configurations). There’s a very clear disconnect from what I’m seeing and what I’m understanding, and the utter lack of authoritative information/proliferation of useless info has just crippled my ability to identify and resolve the disconnect. It’s honestly soul crushing.
Keep going! It was worse before the internet, slightly better once it started gaining content. When you're ignorant as a stump on a given tech, starting from 0 is hell.
When I began learning SQL I didn't know the search terms I wanted and my questions were too simple to get results. My first script took me 8 hours, for 8 very short lines. A year later I stumbled on that script at work and laughed, all stuff I could write from memory, easily.
Sounds like you need to back up and parse your ambitions into smaller chunks. That's too much to digest at once. You know how to eat an elephant, right?
It resembles that "have you ever tried not being mutant/gay/depressed/etc." classic line.
Gemini fails at software "how to" questions all the time. Maybe 50% of my results are accurate? To be fair, as with most software outfits, Google's own docs are often dated.
Perplexity is not looking bad, IMHO.
Perplexity is the only one I would think of using seriously, and then only when I want it to, say, summarize something I already know.
After which I fact-check it like crazy and hammer at it until it gets things right.
One annoying habit it has is that somewhere in the chain of software before or after the LLM it looks for certain key topics it doesn't want to talk about and either comes out and says it (anything involving violence or crime) or has a visibly canned hot take that it repeats without variance no matter what added information you provide or how much cajoling you try.
At other points it starts into the canned responses, but when you catch it it will try again. Like I frequently want song lyrics translated and each time I supply some that it recognizes as such it throws up a canned response about how it will not be a party to copyright breaking. Then after a few rounds back and forth about how I'm clearly not doing this commercially and am just a fan who wants to understand a song better it will begrudgingly give me the translation.
Then five minutes later in the SAME CONVERSATION it will run through that cycle all over again when I give it another song.
Lather. Rinse. Repeat.
Perplexity is by far the best for searching but still copiously hallucinates.
Perplexity Pro: we take all of the non-answers and give you completely incorrect answers!
Copilot is such garbage. Microsoft swirling the drain on business capabilities that they should be dominating is very on brand.
It’s just asking it to find sources from excerpts. I don’t think this is something they have been trained on with much emphasis, is it?
Yeah, this isn't a general test of "factualness", it's a very narrow and specific subset of capability that many of these AIs were not designed for.
That being said, it does not change my already poor opinion of generative AI.
I’m confused. These are large language models, not search engines?
But they are used like search engines... A lot... That is a huge issue.
If people were using Photoshop to create spreadsheets you don't say Photoshop is terrible spreadsheet software, you say the people are dumb for using the tool for something that it isn't designed for.
People are using LLMs as search engines and then pointing out that they're bad search engines. This is mass user error.
They do have search functionality. For Perplexity it's even the main focus. Yeah, it's hard to stop them from confidently making things up.
Tell that to the companies slowly replacing conventional search with AI.
AI search is a game-changer for those companies. It keeps you on their site instead of clicking away. So they retain your attention, and needn’t share any of the economic benefit with the sources that make it possible.
And when we criticize the quality of the results, who’s gonna hold them accountable for nonsense? “It’s just a tool, after all,” they say. “Caveat emptor!”
Nevermind that they have a financial incentive to yield results that avoid disrespecting your biases, and offer no more than a homeopathic dose of utility — to keep you searching but never finding.
It’s a sprawling problem, that stems from the lack of protections around monopoly power, the attention economy, cribbing off other people’s work, and misinformation.
Your comment is technically correct. “You’re using the wrong tool” is a valid footnote. But it’s not the crux of the issue.
Except Perplexity, which is indeed a search engine.. which might explain why it does so well there.
Google and to some extent Micro$oft (and Amazon) have all sunk hundreds of billions of dollars into this bullshit technidiocy because they want AI to go out and suck up all the data on the Internet and then come back to (google or wherever) and present it as if it's "common knowledge".
Thereby rendering all authoritative (read; human, expensive) sources unnecessary.
Search and making human workers redundant has always been the goal of AI.
AI does not understand what any words mean. AI does not understand what the word "word" means. It was never going to work. It's been an insanity money pit from day one. This simple fact is only now beginning to leak out because they can't hide it anymore.
It's actively destroying their search ability as well.
My 15 year old son got a lesson in not trusting Google search yesterday. He wanted pizza for dinner so I had him call a chain and order it. So he hit the call button on the AI bullshit section and ordered it.
When we got there we found out that every phone number listed on the summary was scrambled. He ordered a pizza at a place 150 miles away.
When you clicked into the webpage or maps the numbers were right. On the AI summary, it was all screwed up.
AI has its uses: I would love to read AI-written books in fantasy games (instead of the 4-page books we currently have) or talk to an AI in the next RPG. Hell, it might even make better randomly generated quests and such things.
You know, places where hallucinations don't matter.
AI as a search engine only makes sense when/if they ever find a way to avoid hallucinations.
Is the plan for AI to give tech plausible deniability when it lies about politics and spreads other mis/disinformation?
Now guess how much power it took for each one of those wrong answers.
The upper limit for AI right now has nothing to do with the coding or with the companies programming it. The upper limit is dictated by the amount of power it takes to generate even simple answers (and it doesn't take any less power to generate wrong answers).
Training a large language model like GPT-3, for example, is estimated to use just under 1,300 megawatt hours (MWh) of electricity; about as much power as consumed annually by 130 US homes. To put that in context, streaming an hour of Netflix requires around 0.8 kWh (0.0008 MWh) of electricity. That means you’d have to watch 1,625,000 hours to consume the same amount of power it takes to train GPT-3.
https://www.theverge.com/24066646/ai-electricity-energy-watts-generative-consumption
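For anyone who wants to check the arithmetic in that quote, here's a rough sketch. It only re-derives numbers from the figures quoted above (1,300 MWh, 0.8 kWh per streamed hour, 130 homes); the function and constant names are made up for illustration, and nothing here is an independent measurement.

```python
# Figures as quoted from the Verge piece above; treated as given, not verified.
GPT3_TRAINING_MWH = 1_300      # estimated energy to train GPT-3
NETFLIX_KWH_PER_HOUR = 0.8     # estimated energy per hour of streaming
HOMES_EQUIVALENT = 130         # US homes powered for a year, as quoted


def streaming_hours_equivalent(training_mwh: float, kwh_per_hour: float) -> float:
    """Hours of streaming that use the same energy as the training run."""
    return training_mwh * 1_000 / kwh_per_hour  # convert MWh to kWh first


def implied_home_mwh_per_year(training_mwh: float, homes: int) -> float:
    """Annual energy use per home implied by the quoted homes comparison."""
    return training_mwh / homes


hours = streaming_hours_equivalent(GPT3_TRAINING_MWH, NETFLIX_KWH_PER_HOUR)
per_home = implied_home_mwh_per_year(GPT3_TRAINING_MWH, HOMES_EQUIVALENT)
print(f"{hours:,.0f} streaming hours")
print(f"{per_home:.1f} MWh per home per year (implied)")
```

So the 1,625,000 hours figure does follow from the quoted inputs, and the homes comparison implies roughly 10 MWh per home per year.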
If the AI wars between powerful billionaire factions in the United States continue, get ready for rolling blackouts.
Time for nuclear to make a comeback.
It's a drop in the bucket compared to what's actually causing damage like vehicles and plane travel.
Estimates for [training and building] Llama 3 are a little above 500,000 kWh[b], a value that is in the ballpark of the energy use of a seven-hour flight of a big airliner.
https://cacm.acm.org/blogcacm/the-energy-footprint-of-humans-and-large-language-models/
That's around 570 average american homes.
That being said, it's a malicious and stupidly formed comparison. It's like comparing the cost of building a house vs staying in a hotel for a night.
The model, once trained, can be constantly re-used and shared. The Llama model has been downloaded millions of times. It would be better to compare it to the cost of making the movie.
An average film production with a budget of $70 million leaves behind a carbon footprint of 3,370 metric tons – that’s the equivalent of powering 656 homes for a year!
The water consumed by data centers is a much bigger concern. They're straining already strained public water systems.
I mean, the tech is changing faster than science can analyze it, but isn't this now outdated?
I don't use AI, but a friend showed me a query that returned the sources, most of which were academic and appeared trustworthy.
I like how when you go pro with perplexity, all you get is more wrong answers
That's not true, it looks like it does improve. More correct and so-so answers.
That's probably why I end up arguing with Gemini. It's constantly lying.
Identifying the source of an article is very different from the common use case for search engines.
1:1 quotes of web pages is something conventional search engines are very good at. But usually you aren't quoting pages 1:1.
AI can be a load of shite but I’ve used it to great success with the Windows keyboard shortcut while I’m playing a game and I’m stuck or want to check something.
Kinda dumb but the act of not having to alt-tab out of the game has actually increased my enjoyment of the hobby.
Go figure, the one providing sources for answers was the most correct...But pretty wild how it basically leaves the others in the dust!
I'd rather have 40% accuracy instead of 100% ads. Search engines are dead...
Edit: Why are you guys hating so much? Do you really prefer to look at traditional search engine results and read all the AI trash that's been SEOed to appear at the top of your results?? At this point I'd rather search with AI.
Also of course both methods have their strengths and weaknesses. You should always verify any answers you get from an LLM ( and the same for traditional search engines)
And one last thing: AI is not pure evil, imo. Yeah, a lot of it is, but searching and data filtering seem like an alright use case to me. It's not like a normal web search is ad-free and unbiased, however many plugins you install.
Use an ad blocker and be diligent in your search. Relying on AI makes you dumber
I have Pi-hole set up, I have a VPN, and I have been using an adblocker for over 10 years. I was talking about the SEO. The search results are poisoned. Whatever you search, it's mostly ads packaged as "blogs" optimized to appear at the top of the search results.
I'm not defending AI here, but I do think that using AI as a knowledge-base lookup tool is a more useful application of the tech.
Do you really prefer to look at traditional search engine results and read all the AI trash that's been SEOed to appear at the top of your results?? At this point I'd rather search with AI.
Nah, I’ll take Door #3: give me a good search engine that filters out that SEO’d garbage, and that doesn’t use AI at all. That’s what I want.
That would be so fucking nice. The Internet has really gotten so much shittier the longer big tech is in charge.
I recently started visiting the forums I used to basically live on prior to reddit, it's refreshing to have (mostly) civil discourse and not be in an echo chamber.
Thinking they aren't going to inject ads into AI at some point is naive.
And how do you know the AI results aren’t advertisements? I honestly prefer them to show me straight up ads rather than disguising them as content.
On Firefox, this even blocks YouTube ads.