Privacy @lemmy.ml CoderSupreme @programming.dev 2 wk. ago

"just got doxxed to within 15 miles by a vision model, from only a single photo of some random trees. the implications for privacy are terrifying. i had no idea we would get here so soon. holy shit"

87 comments

Geo guessing is related to open source intelligence techniques, and it's pretty easy to get surprisingly good at it.
People who are good at it can take a picture of someone's room and deduce enough about them (sometimes) to be able to get their name, address and phone number.

It being automatic is pretty cool, but you were already leaking the information to anyone interested.

https://www.sans.org/blog/geolocation-resources-for-osint-investigations/

https://youtu.be/p7_2ZA1HHMo?si=O19_7LA3SoyvZEm1
- Here is an alternative Piped link(s):
  
  https://piped.video/p7_2ZA1HHMo?si=O19_7LA3SoyvZEm1
  
  Piped is a privacy-respecting open-source alternative frontend to YouTube.
  
  I'm open-source; check me out at GitHub.
- Yep. If you play geoguessr.com or others you won't find it that surprising.
The tweet: (Is the preview working for you? For me, it’s not).

The game is called geoguessing and those who do this regularly are crazy good at it, taking into account the kind of trees you see, where the sun and shadows are, even the color of the dirt and the pavement.

Tom Scott did something similar and was frightened too: https://www.youtube.com/watch?v=cGqEBvlmFAQ&pp=ygUSdG9tIHNjb3R0IGZvdW5kIHVz
- important second frame for context!
  
  & no it isn't. quite sure twitter broke link previews a long time ago alongside guest accounts.
  
  I didn’t find that in the Twitter UI and wondered why OP thought it was an AI. Thanks for sharing.
  
  Andrew Gao why are you still on the fascist site
  
  which llm does he use
- "A couple of trees..."
  
  And a body of water, and a road, possibly some mountains... (smh)
- The embed works for me
- https://www.youtube.com/@GeoWizard has a couple videos in a series where he guesses historic photo locations quite accurately too.
  
  Here is an alternative Piped link(s):
  
  https://www.piped.video/@GeoWizard
  
This Just In: Most photos uploaded to the internet are not stripped of their metadata, and one of the common things kept in metadata is... (drumroll please)... your GPS coordinates.

This is a lot less interesting than it seems to be at first glance, imho.
- Literally just after talking about how people are spouting confident misinformation on another thread I see this one.
  
  Twitter: Twitter retains minimal EXIF data, primarily focusing on technical details, such as the camera model. GPS data is generally stripped.
  
  Source
  
  Yes, this is a privacy thing, we strip the EXIF data. As long as you're not also adding location to your Tweet (which is optional) then there's no location data associated with the Tweet or the media.
  
  Source (9 years ago)
  
  People replying to a Twitter thread with photos are automatically having the location data stripped.
  
  God, I can't wait for LLMs to automate calling out well intentioned total BS in every single comment on social media eventually. It's increasing at a worrying pace.
  
  Right? And also, that gps data is often not stripped does not vitiate the legitimate concerns that models like these can and will be used to dox people like this. It’s an interesting and novel attack. We can hold multiple things in our heads at once.
  
  I mean.... that's pre-musk information
- I mean, yes, but that's not what they're doing.
  
  https://arxiv.org/abs/2307.05845 https://github.com/LukasHaas/PIGEON
  
  It's a Stanford project that does what it looks like is happening in the screenshot.
- @SnotFlickerman@lemmy.blahaj.zone @CoderSupreme@programming.dev
  
  Some digital cameras and phone cameras can also embed the GPS coordinates in the pixel data so that even if you delete the EXIF metadata the GPS location and device serial number are still present in the image. Many document printers also embed device serial number and other data on printed documents by using nearly invisible dot encodings.
  
  Ah shit. Any easy way to determine if your camera's doing that? Would that normally be in manufacturer specs?
  
  Don't use proprietary camera software then, got it.
  
  That's crazy. Just read this and I'm just mystified
- Software that doesn't store private metadata:
  
  grapheneOS cam
  
  opencamera (not by default!)
  
  KDE spectacle
  
  ~~android~~ GrapheneOS screenshots
  
  android screenshots
  
  I think I have read that on some versions it can store the app's package name in the metadata. Not sure if that counts as private, but if and when it does so, it's good to be aware of.
- Pretty sure Twitter strips it out by default.
  
  What about X?
- I’m sure most people who would put this to the test would strip that data or screen-grab the image to do the same thing… If you know about metadata, so do a large number of other people, mate…
  
  The people would be labeled as a fraud very fast if this wasn’t actually a real thing dude.
- I think Lemmy strips it, right? That's why pictures were uploading sideways for a while?
  
  Lemmy does not remove exif data (unless the code has changed), you need to remove it yourself (also a good practice in general)
- GPS coordinates in metadata isn't common
- Yeah, disable gps metadata in your camera settings. Wondering why it often is default on?
  
  Because people that don't care about privacy find this to be a nice feature.
  
  There are gallery apps that let you sort by location, and it's nice if you want to find the cool thing you saw once again.
- So it has nothing to do with the trees?
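For anyone who wants to act on the metadata point above: a minimal sketch, using only the Python standard library, of dropping the EXIF block (where GPS tags live) from a JPEG before uploading. The function name `strip_exif` is made up for illustration, and the code assumes a well-formed JPEG with no padding bytes between segments. It only removes APP1 segments that start with the `Exif` header, so XMP and other metadata are left alone, and it does nothing about what is visible in the picture itself. A dedicated tool such as exiftool is likely more thorough.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte stream without its EXIF (APP1) segments.

    Illustrative sketch only: assumes every header segment before the
    start-of-scan marker carries a 2-byte big-endian length field.
    """
    if jpeg[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")

    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: the compressed image data follows, copy the rest as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The length field counts itself (2 bytes) plus the payload.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length

    # No start-of-scan marker found: keep whatever trailing bytes remain.
    out += jpeg[pos:]
    return bytes(out)
```

Usage would be something like reading `photo.jpg` in binary mode, passing the bytes to `strip_exif`, and writing the result to a new file (the file name is a placeholder).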
4channers have been doing that for a long time.
this is extremely scary if true. are these algorithms obtainable by every day people? do they work only in heavily photographed areas or do they infer based on things like climate, foliage, etc? I would love some documentation on these tools if anyone has any.
- If I were the dev, I would scrape Google Street View with coords as the data source.
  
  That seems to be how they did it, as they returned a location on a highway, which isn't featured in the picture (the dirt road itself probably wouldn't be on street view).
- https://github.com/LukasHaas/PIGEON
  
  https://arxiv.org/abs/2307.05845
  
  Basically a combination of what the game GeoGuessr does, and public geotagged images to be able to get a decent shot at approximate location for previously unseen areas.
  
  It's more ominous when automated, but with only a little practice it's easy enough for a human to get significantly better.
  
  EDIT: yup, looks like this is the guy from the Twitter: https://andrewgao.dev/ and he's Stanford affiliated with the same department that made the above paper and system.
  
  Are you sure? The paper you linked mentioned the model beating a top GeoGuessr player six times in a row.
- There are tons of machine learning algorithm libraries easily usable by any relatively amateur programmer. Aside from that all they would need is access to a sufficient quantity of geographically tagged photographs to train one with. You could probably scrape a decent corpus from google street view.
  
  The obtainability of any given AI application is directly proportional to the availability of data sets that model the problem. The algorithms are all packed up into user-friendly programs and APIs that are mostly freely available.
  
  It might be easier to train the AI to the specific things Geoguessr players have collected as signs that give away a location instead of letting the AI figure all those out again.
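To make the "geotagged photos as training data" idea above a bit more concrete, here is a rough sketch in Python. It assumes a naive fixed-size latitude/longitude grid as the class label for each photo, which is a simplification for illustration and not necessarily how PIGEON or any other system divides the map. It also uses the haversine formula to score how far a guess lands from the true spot (the "within 15 miles" kind of number). `grid_cell`, `haversine_miles` and the default cell size are all hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, good enough for a rough score


def grid_cell(lat: float, lon: float, cell_deg: float = 1.0) -> tuple[int, int]:
    """Map a coordinate to a coarse grid cell usable as a classification label.

    Assumption for illustration: equal-angle cells of `cell_deg` degrees.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))
```

A scraped photo tagged at some `(lat, lon)` would get `grid_cell(lat, lon)` as its training label, and a model's guess could then be judged with `haversine_miles(guess_lat, guess_lon, true_lat, true_lon)`.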
reminds me of GeoWizard's episodes geolocating vacation photos for fun. this one was insane, similar in detail to the photo in the tweet
It really isn't that hard if anything like a silhouette of mountains is in the background and you have a couple of rough hints that give you an idea where to start or how to narrow down possible locations, no AI needed.
- You're misunderstanding the post. It's not about whether or not someone could guess your location from a picture. It's about the automation thereof. As soon as that is possible it becomes another viable vector to compromise your privacy.
  
  And you misunderstand my point, it always has been a way to compromise your privacy. Privacy matters most in the individual case, with people who know you. If you e.g. share a picture taken at your home (outside or looking out of the window in the background) with a friend online you always had to assume that they could figure out where you lived from that if there were any of those kinds of features in there.
  
  Sure, companies might be able to do it on a larger scale but honestly, AI is just too inefficient for that right now, as in the energy-cost required to apply it to every picture you share just in case your location might be useful isn't worth it yet.
  
  If I ever upload photos publicly, I will add a background blur first
That photo was more than just some trees
what should I do if I was already expecting this level of surveillance
- @match@pawb.social @CoderSupreme@programming.dev
  
  What should you do about surveillance technology? Ask an Amish hacker!
It's just sourcing data from Street View or similar. Not that scary. If it picked you out of a crowd in a randomly sourced image from that area, then it'd be scary.
