2y ago

Scraped data of 2.6 million Duolingo users released on hacking forum

www.bleepingcomputer.com/news/security/scraped-data-of-26-million-duolingo-users-released-on-hacking-forum/

The scraped data of 2.6 million DuoLingo users was leaked on a hacking forum, allowing threat actors to conduct targeted phishing attacks using the exposed information.

Data Breaches @lemmy.zip

BrikoX @lemmy.zip

87 comments

Oh no. Now they know the aliased email address, unique password, and that I didn't try very hard to learn Spanish.
(please note: this is a joke, I don't see anything about them getting passwords)
- Something to note here - with AI, if you’re using any sort of heuristic for your password, it’s pretty simple to work out a pretty good set of possibilities which makes brute force even easier and puts you at risk across the board.
  Always come up with random passwords that are as random as possible. If there’s a path you took to get to a password, in theory it can be worked backward.
  For example I know some people who only change a single letter when changing their passwords which is ultimately trivial to guess if the old password was compromised (hence the need to change the password or the need to proactively work against this possibility)
  
  I wish more websites allowed random words as passwords instead of forcing numbers and special characters (but not THAT special character, you have to use one of the ones on this list).
  People change their passwords by one letter or digit because they're tied to these restrictive formats. If 5-6 random words were the norm, people would update more than just one character when they need to change passwords.
  "poison navy series ruler handshake papaya" is a fantastic password.
  "Ilovemygrandkids!123" is a horrible password.
  
  That's why I let Bitwarden generate a random 64 character password with special characters and numbers
  
  I use a heuristic to update my main passwords. It's not a character but easily guessable if you see it in plaintext and now you've made me facepalm my actions.
  I only use that for certain things because I use Google Oauth or Bitwarden for most things and you've just woken me up about what could be exposed.
  
  Before letting Bitwarden take over my passwords, I used a phrase of 2-3 words plus a series of numbers and special characters. It was safer than the passwords of anyone I knew at the time. Admittedly it was not the most secure, as I only changed the first part of the 2-3 word phrase and left the last word, numbers, and symbols the same. So if one of those passwords was breached, it wouldn't be too difficult for AI to brute force the missing pieces. So yeah, I don't do that anymore.
  
  That's why correcthorsebatterystaple is the best way to do passwords imo, just 4 random words with a random special character dividing them and a random number tacked onto the end. Good luck brute forcing that or using AI to guess 4 randomly generated words in the correct order.
- Que pecado!
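
To put rough numbers on the random-words suggestion in this thread, here is a minimal Python sketch. It assumes words are picked uniformly at random with a cryptographic RNG. The tiny `WORDS` list and the function names `make_passphrase` and `entropy_bits` are hypothetical and only for illustration; a real generator would draw from a list of several thousand words (the EFF long word list, for example, has 7,776).

```python
import math
import secrets

# Hypothetical sample list for illustration only; far too small for real use.
WORDS = [
    "poison", "navy", "series", "ruler", "handshake", "papaya",
    "correct", "horse", "battery", "staple", "window", "lantern",
]


def make_passphrase(words, count=6, sep=" "):
    """Join `count` words chosen independently and uniformly with a CSPRNG."""
    return sep.join(secrets.choice(words) for _ in range(count))


def entropy_bits(list_size, count):
    """Entropy in bits of `count` uniform picks from `list_size` words."""
    return count * math.log2(list_size)


if __name__ == "__main__":
    print(make_passphrase(WORDS))
    # Six words from a 7,776-word list give about 77.5 bits.
    print(round(entropy_bits(7776, 6), 1))
```

The estimate only holds when the words are truly random. A phrase a person chose, or an old password with one character changed, has far less entropy than its length suggests.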

Next email from duo: give me your credit card details
- "Mi Numero del Seguridad Social es..."

Do the people that release these get paid somehow? Or do they just do it for hacker cred and say fuck these 2.6M people?
- In January 2023, someone was selling the scraped data of 2.6 million DuoLingo users on the now-shutdown Breached hacking forum for $1,500.
  ...
  As first spotted by VX-Underground, the scraped 2.6 million user dataset was released yesterday on a new version of the Breached hacking forum for 8 site credits, worth only $2.13.
  "Today I have uploaded the Duolingo Scrape for you to download, thanks for reading and enjoy!," reads a post on the hacking forum.
  
  HODL, the value will go up again for sure
  
  This part is also, ummm, interesting...
  BleepingComputer has confirmed that this API is still openly available to anyone on the web, even after its abuse was reported to DuoLingo in January.
- They’ll send fake emails where the green owl comes to collect “late fees” for your 216-day streak of missed Spanish lessons.
  
  We've been trying to reach you about your language course's extended warranty...
  
  You'll have to pay with Bed Bath and Beyond gift cards.
- Both.

Oh no, not my German and Japanese scores!!!
I guess the email could become a spam target?? Gmail does a good job sorting that for me.
- They know your email, your name, and that you've taken German and Japanese. Next they use that information to craft a phishing email that only the very stupid would fall for, which fools an alarming number of people. Something like "Hi, this is Duolingo support, and your billing information may have been compromised. Log into this portal with your credit card credentials to confirm that you were not affected."
- They'll know my very poor scores :(.

Damn, they'll know I didn't finish that Spanish lesson the bird bothered me about!
- They'll know I'm ~1800 days into French and still shit at it.
  The shame!
  
  Salut! Enchanté, ça va bien?
  
  Bonjour!
  That means “‘Sup?”
- I hope they don't fucking send me spam.
  
  Depending on how far you got, you might not understand it anyway.

"Scraped" data suggests that it's data available on public profile pages. However, the article also says the dump is a mix of public and non-public info. So which is it, scraped or not? It's an important distinction, because data collection by scraping is technically not a breach.
- Take this with a pinch of salt but what I'm gathering is that it's essentially just taking people's public profiles but the Duolingo api also exposes users' e-mail addresses (and possibly other info) that isn't normally displayed as part of the user's public profile via their app.
  In essence, they're exposing more data than they probably should be and users were not really aware that data was being made public - that's why people are upset about it.
  
  Ok, this makes sense -- in which case the API should not be exposing data that isn't otherwise available on the public profile, so that is significant.
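
The point made above, that an API should not return more than the public profile shows, can be sketched as an allow-list filter. This is a hedged illustration, not Duolingo's actual API or data model: the field names, `PUBLIC_FIELDS`, and `public_view` are all assumptions made up for the example.

```python
# Hypothetical set of fields considered safe to show on a public profile.
PUBLIC_FIELDS = {"username", "name", "streak", "courses"}


def public_view(user_record):
    """Return only allow-listed fields; anything not listed is dropped."""
    return {k: v for k, v in user_record.items() if k in PUBLIC_FIELDS}


if __name__ == "__main__":
    # Hypothetical internal record that includes a private email address.
    record = {
        "username": "owl_fan",
        "name": "Example User",
        "streak": 216,
        "courses": ["German", "Japanese"],
        "email": "user@example.com",
    }
    print(public_view(record))
```

An allow-list fails closed: a field added to the internal record later stays private until someone deliberately adds it to `PUBLIC_FIELDS`.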

estamos jodidos señor búo

I pray for whoever pisses off the duolingo bird

How is that API still up after this has happened?
- I only see this comment, but it says 53 comments. I just want to know why they didn't tell their userbase.
  
  Lemmy and kbin have been having some federation issues lately, which might be why you’re only seeing one comment.
  
  Sometimes that happens for me too in the Liftoff app. But if I reload the comments with "swipe to refresh", then all the others will appear too.
  
  I see the same thing. However if you go to the link to this post on kbin.social, you can see the other comments. It's weird. https://kbin.social/m/technology@lemmy.world/t/371933 Edit: the hyperlink won't display properly in this comment. You have to copy the whole link and paste it in your browser.

Is there a list on what data exactly got leaked, that wasn't public before?
- However, Duolingo did not address the fact that email addresses were also listed in the data, which is not public information.
  From the Article, emphasis by me
  
  Rip my email I use specifically for organizations I don't trust

I'm so glad I switched to duck email. Might as well change it again and block the old email.
- DDG email is AMAZING! I only wish it would have been around before my email got exposed.
  
  Only one thing to do... Start over fresh.
  I just did this a few months ago, and it feels really good to have a proper set-up now, with privacy respecting companies all around.
  
  Why did you choose DDG mail over Addy?
  
  Live learn and share right! 😉

oh non!
