Hallucinating sources

paequ2@lemmy.today · 1 month ago

Hallucinating sources

Wilco@lemm.ee · 1 month ago

Seriously, TRY and get an AI chat to give an answer without making stuff up. It is impossible. You can tell it “you made that data up, do not do that” … and it will apologize and say you were right, then make up more dumb shit.

Denvil@lemmy.one · 1 month ago

I looked up on google at one point what the minimum required depth for a cable running under a building is by NEC code. It told me it was 0 inches. I laughed and called it stupid, wtf do you mean 0 inches?? Upon further research, 0 inches is the correct answer, I felt real stupid after that -_-

couch1potato@lemmy.dbzer0.com · 1 month ago

Seems like a depth of 0 inches means you can just lay it on the floor?

yucandu@lemmy.world · 1 month ago

No, it means 50% of the cable must be submerged or buried. Little speed bumps all around.

Denvil@lemmy.one · edit-2 1 month ago

The 0 inches is to the top of the cable or raceway used, so it would have to be at least perfectly flush with the ground. Obviously you can go lower.

Denvil@lemmy.one · 1 month ago

As mentioned with the other guy, 0 inches is the requirement to the top of the cable or raceway used. So at minimum, you’re allowed to be perfectly flush with the ground. Obviously you can and likely would go a little lower, although I don’t have any experience with trenching myself.

Comtief@lemm.ee · 1 month ago

Yeah, LLMs are great if you treat them like a tool to create drafts or give you ideas, rather than like an encyclopedia.

Zetta@mander.xyz · edit-2 1 month ago

I’ll get hate for this but in most tasks people use them for they are pretty dang accurate. I’m talking about frontier models fyi

shalafi@lemmy.world · 1 month ago

Google Gemini gives me solid results, but I stick to strictly factual questions, nothing ambiguous. Got a couple of responses I thought were wrong, turns out I was wrong.

medgremlin@midwest.social · 1 month ago

I got a Firefox plugin to block Gemini results because whenever I look up something for my medical studies, it runs a really high chance of spitting out garbage or outright lies when I really just wanted the google result for the NIH StatPearls article on the thing.

As a medical professional, generative AI and search adjuncts like Gemini only make my job harder.

comfy@lemmy.ml · 1 month ago

You can tell it “you made that data up, do not do that”

I wish people would stop treating these tools as intelligent.

helloworld55@lemm.ee · edit-2 1 month ago

I have found AI to be a terrible primary source. But something I’ve found very useful is to ask for a detailed response, structured a certain way. Then tell the AI to grade it as a professor would. It actually does a very good job at acknowledging gaps and giving an honest grade then.

AI shouldn’t be a primary source but it’s great for starting a topic. Similar to talking to someone that’s moderately in the know on something you’re interested in.

BradleyUffner@lemmy.world · 1 month ago

That’s because ALL generative AI results, even the correct ones, are “made up”. They just exist on a spectrum of coincidental correspondence with reality. I’m still surprised that they manage to get as much right as they do.

Brave Little Hitachi Wand@lemmy.world · 1 month ago

Capitalism breeds innovation! Sometimes innovating means summoning… mindless lie demons… Who drink all our water. 🙃

postmateDumbass@lemmy.world · 1 month ago

A thousand wrong answers are more innovative than a single correct one.

Aceticon@lemmy.dbzer0.com · 1 month ago

The core of the scam is making people believe that “novel” is the same as “better”.

floquant@lemmy.dbzer0.com · 1 month ago

Hilarious that Gemini is so bad. Not like Google had a good starting position on internet search

frezik@midwest.social · 1 month ago

The only thing Gemini is good for is bringing up sources that don’t appear in the regular Google search results. Which only leads to another question: why are those links not in the regular Google search results?

Cort@lemmy.world · 1 month ago

My only guess is that they’re trying to see if de-enshittifying results for AI can make it profitable

Yoga@lemmy.ca · 1 month ago

I was talking about this with a webdev buddy the other day, wondering if webmasters might start optimizing for AI indexing rather than SEO.

Cort@lemmy.world · 1 month ago

That’s an interesting thought. I would wonder if there’s too much change/movement in the ai models, and would think that we won’t see something like that until there’s more stability, or one of the ai models comes out on top of all the others. Right now you’d have to optimize for half a dozen different models, and still be missing a few ‘popular’ ones.

OminousOrange@lemmy.ca · 1 month ago

I find the same with Perplexity. It’s more of a search assistant in finding some sources that a search engine likely wouldn’t. Sometimes its summarized answers are accurate, sometimes it’s a jumble of several slightly unrelated sources.

pyre@lemmy.world · edit-2 1 month ago

Infinite money, all the data on the internet, and nothing to show for it. I wrote about my experience with Gemini assistant for people who enjoy suffering.

floquant@lemmy.dbzer0.com · edit-2 1 month ago

I’ve genuinely been wondering what the hell the average googler has been up to in the last 5 years. They’re killing services, barely developing new features or hardware, and have been talking for so long (as in, they were genuinely at the forefront) about AI and how they’re in a unique position to make the most out of data, AI, services, and hardware, then failed spectacularly to keep that advantage, and even more spectacularly to keep up.

I guess they just found some other, more profitable way to exploit that unique position, than to care about the people using their products.

real_squids@sopuli.xyz · 1 month ago

Amazing

pyre@lemmy.world · 1 month ago

isn’t it? all this tech advancement over the past half century, followed by billions of dollars of investment in a tech that wastes a monumental amount of energy and water to give you the wrong answer to questions even the most basic calculators can answer.

Miles O'Brien@startrek.website · 1 month ago

It’s making ten billion calculations per second and they’re all wrong!

Zron@lemmy.world · 1 month ago

That’s one of my skills as a certified genius. I’m wicked fast at math.

37/2.4 boom 16.38.

Is it right, maybe, maybe not. But I did it fast

shyguyblue@lemmy.world · 1 month ago

I was trying to see if I could sync my entire Calibre ebook library to my kobo, so i googled it. The dumbass AI result told me to hit the “sync library” button, that doesn’t friggin exist…

snooggums@lemmy.world · 1 month ago

This is the most common response from AI on search pages when I’m trying to find some kind of setting.

shyguyblue@lemmy.world · 1 month ago

Yeah, even Google’s own operating system.

“To disable Network Notification sounds, do a bunch of shit that doesn’t exist anywhere in the settings!”

Orc from Warcraft 1: “Job’s done!”

Monument@lemmy.sdf.org · 1 month ago

That’s the most infuriating thing.

I’m trying to learn how to do new things, well, basically all the time.
Right now I’m stalled out on a sorta important personal project to teach myself about containers/micro-services/certs in a homelab environment. And what I’m discovering is that I don’t know enough to know I don’t know enough - it used to be that I’d take on an ambitious project, mess up, figure out how to overcome that, then learn by looking at what did work, and do better in the future.
But every technical project lately has gotten to the point where I’m trying to just get something, anything, to work or make sense, but every convincing enough AI generated page sets me back by several days as I troubleshoot the convincing enough steps and find myself realizing they’re referencing YAML settings from apps that aren’t part of the service, that every page directs me to install Python, Node, or whatever other helper app directly on my machine that would normally run in a container (which defeats the purpose of trying to containerize things - some stuff I want to use relies on non-compatible versions/configurations). There’s a very clear disconnect from what I’m seeing and what I’m understanding, and the utter lack of authoritative information/proliferation of useless info has just crippled my ability to identify and resolve the disconnect. It’s honestly soul crushing.

shalafi@lemmy.world · 1 month ago

Keep going! It was worse before the internet, slightly better once it started gaining content. When you’re ignorant as a stump on a given tech, starting from 0 is hell.

When I began learning SQL I didn’t know the search terms I wanted and my questions were too simple to get results. My first script took me 8 hours, for 8 very short lines. A year later I stumbled on that script at work and laughed, all stuff I could write from memory, easily.

Sounds like you need to back up and parse your ambitions into smaller chunks. That’s too much to digest at once. You know how to eat an elephant, right?

selokichtli@lemmy.ml · 1 month ago

It resembles that “have you ever tried not being mutant/gay/depressed/etc.” classic line.

shalafi@lemmy.world · 1 month ago

Gemini fails at software “how to” questions all the time. Maybe 50% of my results are accurate? To be fair, as with most software outfits, Google’s own docs are often dated.

prototype_g2@lemmy.ml · 1 month ago

How does this surprise anyone?

LLMs are just pattern recognition machines. You give them a sequence of words and they tell you what is the most statistically likely word to follow based solely on probability, no logic or reasoning.
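
To make the "most statistically likely word" idea concrete, here is a toy sketch. It is a plain bigram counter, not how a real LLM works (those use neural networks over tokens and far more context), and the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical training text, chosen only to keep the example tiny.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)

# "the" is followed by "cat" twice, "mat" once and "fish" once.
print(most_likely_next(model, "the"))
```

The counter has no idea whether "cat" is true or sensible here; it only knows it was the most common follower in what it saw.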

Lifter@discuss.tchncs.de · 1 month ago

It’s amazing that they get it right 40 % of the time then.

taiyang@lemmy.world · 1 month ago

Yes, having tested this myself it is absolutely correct. Hell, even when it finds something, it’s usually a secondary or tertiary source that’s nearly unusable-- or even one of those “we did our own research and vaccines cause autism” type sources. It’s awful and idiots seem to think otherwise.

Angelusz@lemmy.world · 1 month ago

You shouldn’t use them to keep up with the news. They make that option available because it’s wanted, but they shouldn’t.

It should only be used to research older data from its original dataset, perhaps adding to it a bit with newer knowledge if you’re a specialist in the field.

When you ask the right questions in the right way, you’ll get the right answers, or at least mostly - and you should always check the sources after. But it’s a specialist’s tool at this time. And most people are not specialists.

So this whole “Fuck AI” movement is actually pretty damn stupid. It’s good to point out its flaws, try and make people aware and help guide it better into the future.

But it’s actually useful, and not going away. You’re just using it wrong, and as the tech progresses, ways to use it wrong will decrease. You can’t stop progress, humanity will always come with new things, evolution is designed that way.

taiyang@lemmy.world · 1 month ago

Well, no, because what I’m referring to isn’t even news, it’s research. I’m an adjunct professor and trying to get old articles doesn’t even work, even when they’re readily available publicly. The linked article here is referencing citations and it doesn’t get more citation-y than that. It doesn’t change that when you ask differently, either, because LLMs aren’t good at that even if tech bros want it to be.

Now, the information itself could be valid, and in basics it usually is. I was at least able to use it to get myself some basic ideas on a subject before ultimately having to browse abstracts for what I need. Still, you need the source if you’re doing anything serious and the best I’ve got from AI are just authors prevalent in the field which at least is useful for my own database searches.

Angelusz@lemmy.world · 1 month ago

I understand your experience and have had it myself. It’s also highly dependent on the model you use. The most recent ChatGPT4.5 for instance, is pretty good at providing citations. The tech is being developed fast.

taiyang@lemmy.world · 1 month ago

I don’t doubt that it’ll get better, although with the black box nature of these types of models, I’m not sure if it’ll ever reach perfection. I understand neural networks, it’s not exactly something you can flip the hood of and take a look at what’s giving you weird results.

Dojan@lemmy.world · 1 month ago

I’m confused. These are large language models, not search engines?

Tartas1995@discuss.tchncs.de · 1 month ago

But they are used like search engines… A lot… That is a huge issue.

FauxLiving@lemmy.world · 1 month ago

If people were using Photoshop to create spreadsheets, you wouldn’t say Photoshop is terrible spreadsheet software, you’d say the people are dumb for using the tool for something that it isn’t designed for.

People are using LLMs as search engines and then pointing out that they’re bad search engines. This is mass user error.

TheBeesKnees@lemmy.sdf.org · 1 month ago

Correction: companies are implementing it into their search engines. Users are just providing feedback.

Ironically, Google’s original non-LLM summary was pretty great. That’s gone now.

erytau@programming.dev · 1 month ago

They do have search functionality. For Perplexity it’s even the main focus. Yeah, it’s hard to stop them from confidently making things up.

Optional@lemmy.world · 1 month ago

Google and to some extent Micro$oft (and Amazon) have all sunk hundreds of billions of dollars into this bullshit technidiocy because they want AI to go out and suck up all the data on the Internet and then come back to (google or wherever) and present it as if it’s “common knowledge”.

Thereby rendering all authoritative (read; human, expensive) sources unnecessary.

Search and making human workers redundant has always been the goal of AI.

AI does not understand what any words mean. AI does not understand what the word “word” means. It was never going to work. It’s been an insanity money pit from day one. This simple fact is only now beginning to leak out because they can’t hide it anymore.

The_v@lemmy.world · 1 month ago

It’s actively destroying their search ability as well.

My 15 year old son got a lesson in not trusting Google search yesterday. He wanted pizza for dinner so I had him call a chain and order it. So he hit the call button on the AI bullshit section and ordered it.

When we got there we found out that every phone number listed on the summary was scrambled. He ordered a pizza at a place 150 miles away.

When you clicked into the webpage or maps the numbers were right. On the AI summary, it was all screwed up.

kibiz0r@midwest.social · 1 month ago

Tell that to the companies slowly replacing conventional search with AI.

AI search is a game-changer for those companies. It keeps you on their site instead of clicking away. So they retain your attention, and needn’t share any of the economic benefit with the sources that make it possible.

And when we criticize the quality of the results, who’s gonna hold them accountable for nonsense? “It’s just a tool, after all,” they say. “Caveat emptor!”

Nevermind that they have a financial incentive to yield results that avoid disrespecting your biases, and offer no more than a homeopathic dose of utility — to keep you searching but never finding.

It’s a sprawling problem, that stems from the lack of protections around monopoly power, the attention economy, cribbing off other people’s work, and misinformation.

Your comment is technically correct. “You’re using the wrong tool” is a valid footnote. But it’s not the crux of the issue.

Comtief@lemm.ee · 1 month ago

Except Perplexity, which is indeed a search engine… which might explain why it does so well there.

Dojan@lemmy.world · edit-2 1 month ago

I’m curious how Kagi would hold up, but the AI BS is entirely opt-in there so maybe they didn’t include it because of that.

Edit: lmao Perplexity is gross. Who would use this instead of an actual search engine?

Comtief@lemm.ee · edit-2 1 month ago

I’ve used it a few times when I struggled to find answers with regular searches and felt like giving up or just wanted to see what it has to say. I took it for a spin for a test right now, asking “Which is the safest LLM service for a company in regards to privacy? ChatGPT, Anthropic or Mistral” and it actually found stuff that I didn’t before when I was looking into it.

danc4498@lemmy.world · 1 month ago

Is this an ad for Perplexity? I’ve never heard of it, and now I’m googling it. So effective ad if so.

ChaoticNeutralCzech@feddit.org · 1 month ago

Would be weird for an ad to bash on the paid tier

danc4498@lemmy.world · 1 month ago

Yeah, it’s one of those “no bad press” kind of things. It’s bashing on AI, but Perplexity actually looks pretty good by comparison.

ChaoticNeutralCzech@feddit.org · edit-2 1 month ago

I’m saying the Perplexity paid tier is about 2x more likely to be confidently wrong than Perplexity

danc4498@lemmy.world · 1 month ago

Oh, right, good point.

Psythik@lemm.ee · 1 month ago

Well it worked cause it convinced me to give it a try. I even installed the app.

Liking it so far. It doesn’t patronize me like Gemini does; it isn’t at all afraid to correct me when I’m wrong.

Comtief@lemm.ee · 1 month ago

Idk but search is what they do, they use regular AI models like ChatGPT or Claude with their own search tool.

selokichtli@lemmy.ml · 1 month ago

Perplexity is not looking bad, IMHO.

shawn1122@lemm.ee · 1 month ago

Perplexity is by far the best for searching but still copiously hallucinates.

1 month ago

Perplexity is the only one I would think of using seriously, and then only when I want it to, say, summarize something I already know.

After which I fact-check it like crazy and hammer at it until it gets things right.

One annoying habit it has is that somewhere in the chain of software before or after the LLM it looks for certain key topics it doesn’t want to talk about and either comes out and says it (anything involving violence or crime) or has a visibly canned hot take that it repeats without variance no matter what added information you provide or how much cajoling you try.

At other points it starts into the canned responses, but when you catch it it will try again. Like I frequently want song lyrics translated and each time I supply some that it recognizes as such it throws up a canned response about how it will not be a party to copyright breaking. Then after a few rounds back and forth about how I’m clearly not doing this commercially and am just a fan who wants to understand a song better it will begrudgingly give me the translation.

Then five minutes later in the SAME CONVERSATION it will run through that cycle all over again when I give it another song.

Lather. Rinse. Repeat.

potpotato@lemmy.world · 1 month ago

Perplexity Pro: we take all of the non-answers and give you completely incorrect answers!

PeteWheeler@lemmy.world · 1 month ago

AI as a search engine is terrible.

Because if you treat it as such, it will just look at the first result, which is usually wrong or has incomplete info.

If you give the AI a source document, then it is amazing as a search engine. But if the source doc is the entire internet… its fucking bad.

Shit quality in, shit quality out. And we/corporations have made the internet abundant with shit.

Rin@lemm.ee · 1 month ago

just for clarity, some kinds of learning algorithms have been used in web searches prior to this generative AI boom. I know for a fact that Google used an AI to rank pages for its search before GPT was even a thing.

But you’re totally right. generative models shouldn’t be used as search engines.

Prehensile_cloaca@lemm.ee · 1 month ago

Copilot is such garbage. Microsoft swirling the drain on business capabilities that they should be dominating is very on brand.

clonedhuman@lemmy.world · 1 month ago

Now guess how much power it took for each one of those wrong answers.

The upper limit for AI right now has nothing to do with the coding or with the companies programming it. The upper limit is dictated by the amount of power it takes to generate even simple answers (and it doesn’t take any less power to generate wrong answers).

Training a large language model like GPT-3, for example, is estimated to use just under 1,300 megawatt hours (MWh) of electricity; about as much power as consumed annually by 130 US homes. To put that in context, streaming an hour of Netflix requires around 0.8 kWh (0.0008 MWh) of electricity. That means you’d have to watch 1,625,000 hours to consume the same amount of power it takes to train GPT-3.

https://www.theverge.com/24066646/ai-electricity-energy-watts-generative-consumption
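
The arithmetic in that quote can be checked with a short sketch. The constants are just the figures as quoted from The Verge (nothing measured independently), and the variable names are made up for illustration.

```python
# Figures as quoted above; treat them as the article's estimates.
GPT3_TRAINING_MWH = 1_300   # "just under 1,300 megawatt hours"
NETFLIX_HOUR_WH = 800       # "around 0.8 kWh" per streamed hour

# Work in watt-hours so the division stays in exact integers.
training_wh = GPT3_TRAINING_MWH * 1_000_000
netflix_hours = training_wh // NETFLIX_HOUR_WH

# Same figure expressed as years of non-stop streaming.
years_of_streaming = netflix_hours / (24 * 365)

print(f"{netflix_hours:,} hours of Netflix")
print(f"about {years_of_streaming:.0f} years of continuous streaming")
```

So the 1,625,000 hours in the quote follows directly from the two quoted estimates; whether those estimates are right is a separate question.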

If the AI wars between powerful billionaire factions in the United States continue, get ready for rolling blackouts.

Grimy@lemmy.world · edit-2 1 month ago

It’s a drop in the bucket compared to what’s actually causing damage like vehicles and plane travel.

Estimates for [training and building] Llama 3 are a little above 500,000 kWh[b], a value that is in the ballpark of the energy use of a seven-hour flight of a big airliner.

https://cacm.acm.org/blogcacm/the-energy-footprint-of-humans-and-large-language-models/

That’s around 570 average american homes.
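
A rough back-of-the-envelope check on the homes figure. The per-home value is an assumption: it is derived from the Verge numbers quoted earlier in the thread (1,300 MWh is about 130 US homes for a year), not from a measured source, and the variable names are made up for illustration.

```python
# Figures quoted in this thread; the per-home value is derived, not measured.
GPT3_TRAINING_KWH = 1_300_000   # The Verge: "just under 1,300 MWh"
GPT3_HOMES_PER_YEAR = 130       # The Verge: "130 US homes" annually
LLAMA3_TRAINING_KWH = 500_000   # CACM: "a little above 500,000 kWh"

# Assumed annual consumption of one home, implied by the Verge figures.
kwh_per_home_year = GPT3_TRAINING_KWH / GPT3_HOMES_PER_YEAR

# How many homes the Llama 3 estimate would power for a full year.
homes_for_a_year = LLAMA3_TRAINING_KWH / kwh_per_home_year

# What each home would have to use for 570 homes to add up to 500,000 kWh.
kwh_per_home_if_570 = LLAMA3_TRAINING_KWH / 570

print(f"{kwh_per_home_year:,.0f} kWh per home per year (assumed)")
print(f"{homes_for_a_year:.0f} homes for a year")
print(f"{kwh_per_home_if_570:.0f} kWh per home if spread over 570 homes")
```

Under that assumption the estimate comes out closer to 50 homes for a year. 570 homes would mean roughly 877 kWh each, which looks more like one month of usage per home than a year.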

That being said, it’s a malicious and stupidly formed comparison. It’s like comparing the cost of building a house vs staying in a hotel for a night.

The model, once trained, can be constantly re-used and shared. The Llama model has been downloaded millions of times. It would be better to compare it to the cost of making the movie.

An average film production with a budget of $70 million leaves behind a carbon footprint of 3,370 metric tons – that’s the equivalent of powering 656 homes for a year!

https://thestarfish.ca/journal/2025/01/understanding-the-environmental-impact-of-film-sets

queermunist she/her@lemmy.ml · 1 month ago

The water consumed by data centers is a much bigger concern. They’re straining already strained public water systems.

fishy@lemmy.today · 1 month ago

Time for nuclear to make a comeback.

Showroom7561@lemmy.ca · 1 month ago

Musk’s gork is as stupid as he is! And he claims it’s waaaaaayyyyy better than other AI. 🤡🤡🤡

argon@lemmy.today · 1 month ago

Identifying the source of an article is very different from the common use case for search engines.

1:1 quotes of web pages are something conventional search engines are very good at. But usually you aren’t quoting pages 1:1.