AI tidies up Wikipedia’s references — and boosts reliability

hedge@beehaw.org · 1 year ago

salarua@sopuli.xyz · edited 10 months ago

Wikipedian here - AI on Wikipedia is actually nothing new. we’ve had a machine learning model identify malicious edits since 2017, and Cluebot (an ML-powered anti-vandalism bot) has been around for even longer than that.

even so, this is pretty exciting. from what i gather, this is a transformer model turned on its side; instead of taking textual data and transforming it, it checks to see if two pieces of textual data could reasonably be transformations of each other. used responsibly, this could really help knock out those [dubious] and [failed verification] tags en masse

bionicjoey@lemmy.ca · 1 year ago

If I’m understanding you correctly, it doesn’t ever edit the actual pages, it just adds flags on certain kinds of content. Is that right?

salarua@sopuli.xyz · 1 year ago

yes. it only surfaces citations that may back up the content better; an editor still has to read the source and approve the change

Hundun@beehaw.org · 1 year ago

Fascinating. As a developer, where can I read more or contribute?

HalJor@beehaw.org · 1 year ago

The aforementioned ClueBot is here: https://en.wikipedia.org/wiki/User:ClueBot_NG

For bots in general, start here: https://en.wikipedia.org/wiki/Wikipedia:Bots

DavidGarcia@feddit.nl · 1 year ago

“AI tiddies up Wikipedia’s references…”

Chozo@kbin.social · 1 year ago

I had to do a double-take on that title, too lmao

NX2@feddit.de · 1 year ago

Holy fuck my dyslexia turned “AI tidies” into something really funny

webghost0101@sopuli.xyz · 1 year ago

you don’t need dyslexia for that, but you are somewhat right that dyslexia is an indication of certain intellectual skills.

I bet you can easily adapt and read the following famous text:

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

This text, which dates from 2003, is incorrect though: no such research was carried out at Cambridge. You can read more on https://en.m.wikipedia.org/wiki/Transposed_letter_effect#Internet_meme
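For anyone curious, the scrambling rule the meme describes (keep the first and last letter of each word, shuffle the rest) is easy to sketch. This is only an illustration in Python: the function names are made up here, and it assumes a "word" is any run of ASCII letters, so punctuation and spaces pass through untouched.

```python
import random
import re


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        # Fewer than two interior letters: there is nothing to reorder.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text: str, seed: int = 0) -> str:
    """Apply scramble_word to every run of ASCII letters in the text."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


if __name__ == "__main__":
    print(scramble_text("The rest can be a total mess and you can still read it"))
```

Short words come out unchanged, which is part of why the original text feels so readable.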

X3I@lemmy.x3i.tech · 1 year ago

Your example has little to do with dyslexia though, has it?

webghost0101@sopuli.xyz · 1 year ago

It does. Dyslexia isn’t just a net negative where you make errors; the errors may be a side effect of a specific, differently wired kind of neurological intelligence.

Being good at “Typoglycemia” isn’t exclusive to people with dyslexia, for sure, but statistically they have an advantage at this sort of thing.

Dyslexia can also overlap with autism, ADHD, and OCD, which is why they are considered part of the bigger neurodivergent family.

Chozo@kbin.social · 1 year ago

This is good news. One problem I’ve always had with using Wikipedia as a research source is that while most of the claims may have citations, those citations will often point to dead links, or to pages that have been updated or edited since the Wikipedia page was written and no longer back up the original claims. There’ve been numerous times I’ve seen multiple citations for a single claim on an article, and every single link the citations point to is either dead or doesn’t actually say what the claim says, at all.

Hopefully this helps to clear up a lot of that mess!

online@lemmy.ml · 1 year ago

There’s a bot that goes through and identifies link rot so editors have a backlog queue of them to go through.

ranandtoldthat@beehaw.org · 1 year ago

You can also check for dead links on the internet archive or archive.is.
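If you want to script that check, here is a rough sketch against the Internet Archive's Wayback availability endpoint. The endpoint URL and the `archived_snapshots` / `closest` response layout are my understanding of that API, so treat them as assumptions and check the current docs; the function names are made up for illustration. I don't know of an equivalent documented endpoint for archive.is, so this only covers the Internet Archive.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed endpoint of the Wayback Machine availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url: str) -> str:
    """Build the query URL that asks whether page_url has an archived copy."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": page_url})


def closest_snapshot(payload: dict) -> Optional[str]:
    """Pull the closest snapshot URL out of a decoded response, if there is one."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Query the API over the network and return a snapshot URL or None."""
    with urllib.request.urlopen(availability_url(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


if __name__ == "__main__":
    print(find_archived_copy("example.com"))
```

The parsing is split from the network call so you can test it on a saved response without hitting the archive.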

Yote.zip@pawb.social · 1 year ago

RE: “should I believe this headline?” I would say yeah, this is a reasonable thing to use AI for. I assume they are not going to let it full-auto massacre all Wikipedia citations, but as long as someone is verifying the replacements the AI generates, this seems like a semi-auto way to clean up citations. My only worry would be that the AI becomes a full replacement for finding sources, in which case people could just start accepting its suggestions as the best answers even when a manual search could find a better source.

The article does say it downranks low-quality sources, but I wonder how often you can type “what I want to be true” into it and have it find a source for nonsense.

Vojtěch Fošnár@beehaw.org · 1 year ago

I have to test this on some article recommending alternative medicine

saigot@lemmy.ca · 1 year ago

So long as the AI can avoid citogenesis