Hi all. We've been playing a silly little game with my wife lately: we send each other messages about some topic we never talk about and then wait for ads related to our conversation to start showing up in Instagram. As of the last month, they never fail to show up.
I think you partly may just be biased by happening to notice ads more if they fit the topic you sent to your wife or being lenient in deciding if ads meet the categorization. And if you dwell on an ad because it seems to match, you may get more similar ads.
Here is how I think you could design a more robust (but less fun) experiment:
- Come up with a bunch of topics, write them down on slips of paper, put the paper into a hat
- Each Monday, draw three topics from the hat, send some WhatsApp messages about the first, Messenger messages about the second, and don’t discuss the third. Don’t put the topics back in the hat.
- If you see any ads relating to one of the topics, screenshot them and save the screenshots (e.g. to your computer) with a short note of the topic
- Separately, record which topic went to which platform
- After doing this for a while, go through the screenshots and (each of you and your wife or ideally other people) give a rating for how well the ad matches the topic. To avoid bias, you shouldn’t know which app saw the topic.
- Now work out average ratings / the distribution across the three products (WhatsApp vs Messenger vs none) and compare
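The steps above could be sketched roughly like this. It is a minimal illustration, not a finished analysis: the arm names, `draw_week`, `mean_rating_by_arm`, and the rating scale are all hypothetical choices made for the example.

```python
import random
from statistics import mean

# The three "products" from the protocol; "none" is the undiscussed control.
ARMS = ("whatsapp", "messenger", "none")

def draw_week(hat, rng):
    """Draw three topics from the hat without replacement.

    Returns a dict mapping topic -> arm. The draw order is random,
    so pairing it with ARMS in order is a random assignment.
    """
    picked = rng.sample(hat, 3)
    for topic in picked:
        hat.remove(topic)  # topics never go back in the hat
    return dict(zip(picked, ARMS))

def mean_rating_by_arm(assignments, ratings):
    """Average blinded relevance scores per arm.

    assignments: dict topic -> arm (kept secret from the raters)
    ratings: list of (topic, score) pairs, one per rated screenshot
    """
    by_arm = {arm: [] for arm in ARMS}
    for topic, score in ratings:
        by_arm[assignments[topic]].append(score)
    return {arm: (mean(s) if s else None) for arm, s in by_arm.items()}
```

The point of keeping `assignments` separate from `ratings` is the blinding: raters only ever see a screenshot and a topic, and the arm is joined in afterwards.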
A simpler protocol to realize that the Baader-Meinhof phenomenon is probably what's happening:
- pick said topic, something you never cared about before, talk about it but don't write any messages containing it;
- for 1 month record every ad you see about it;
- send a message about the topic;
- for another month, record every ad you see about it
Comparing the number of occurrences will tell you what is happening.
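As a rough sketch of that comparison, assuming you keep a simple log of the dates on which you saw a matching ad (the function name and the 30-day window are illustrative assumptions, not part of the protocol):

```python
from datetime import date, timedelta

def before_after_counts(sightings, message_day, window_days=30):
    """Count logged ad sightings in equal-length windows around the message.

    sightings: iterable of datetime.date, one per matching ad seen
    message_day: the day the topic was first sent in a message
    Returns (before, after). Sightings outside both windows are ignored.
    """
    start = message_day - timedelta(days=window_days)
    end = message_day + timedelta(days=window_days)
    before = sum(1 for d in sightings if start <= d < message_day)
    after = sum(1 for d in sightings if message_day <= d < end)
    return before, after
```

Using equal windows matters: raw counts are only comparable if both periods are the same length and you were looking equally hard in both.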
It also has to be a topic that advertisers would pay to target you with. You can't talk about something super obscure that advertisers don't care about - like steam engines.
Thanks for this!! Now my email is full of offers to buy historical steam engines, steam engine parts, and engineer hats. Amazon is even advertising a subscription for coal deliveries!
More likely, ads about games on steam.
Your inbox is now full of ads for model trains and hobby stores.
There are assuredly advertisers for steam-engine-adjacent products. Memorabilia, experiences/outings, conventions, models, books, artwork, games.
You're saying to record ads you see about it on TV or something? (Just to eliminate the "My computer is secretly recording me" angle)
The problem is, your smart TV could be spying on you too, if it's capable of voice commands or videoconference. If you discuss sex toys near it, at least some related keywords could make their way to targeted advertising.
My wife and I routinely use ad blockers, private browser windows, browser profiles, and try to use as few ad-supported products as possible. This doesn't stop targeted advertising, I guess because most devices we use connect through the same IP. A couple of days after she starts looking up a city we want to travel to, I'll start receiving ads from airline companies or travel agencies, and even tours/cruises to said city/region. Fighting tracking and spyware is nearly a lost cause unless you become a digital Amish.
Smart TVs in general use IP addresses to try to target devices across households, which is against the privacy policies of a lot of ad tech providers because IPs are not redactable/resettable by an end user.
The best way for small ad tech providers to compete with "big tech" has been to cross lines that the bigger companies won't cross; this is an example of why there are a lot of profitable ad tech companies in the connected TV / video ads space.
Even if you use a VPN, the TV itself likely has a unique ID for ads, so someone just needs to see one request with both the true IP and the unique device ID and then remember that for the future. It's all very shady. TVs are very far behind the level of user control that phones and browsers provide because there's less scrutiny and it's more fragmented across manufacturers (all of which want to get in on ad tech).
You can usually find some opt-out of the identifiers if you dig deep enough into the menus, because multiple laws and regulations require them.
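To make the "see one request and remember it" point concrete, here is a purely hypothetical sketch. It is not any vendor's real system; `seen_ips`, `record_request`, and the device ID are invented for illustration, and the addresses come from the reserved documentation ranges.

```python
from collections import defaultdict

# Hypothetical tracker-side table: ad device ID -> every IP it was ever seen on.
seen_ips = defaultdict(set)

def record_request(device_id, ip):
    """Remember the IP for this device ID and return all IPs linked to it."""
    seen_ips[device_id].add(ip)
    return seen_ips[device_id]

# One request from the home connection is enough to link the device...
record_request("tv-123", "203.0.113.7")
# ...so a later request through a VPN still maps back to the home IP.
linked = record_request("tv-123", "198.51.100.9")
```

The stable device ID is what does the damage here; changing the IP afterwards does not undo a link that was already recorded.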
The first thing I did when helping my parents set up their new LG OLED TVs at xmas was to disable all the ads and tracking. It's exhausting how much pressure they put on you to opt-in, and how many layers there are, constantly implying the TV will be nonfunctional without it.
But sure enough, it works just fine with no ads, no "free tv channels", and no voice functionality.
Have you ever checked back to see if updates had re-enabled some of those or introduced new ones? You'd hope they'd let you know if they started getting ads all the time, but the tracking stuff is much less obvious.
LG will send out updates that require you to accept new license agreements when you turn the tv on next. It’s very obvious about what they are tracking but very obnoxious in pointing it out. The parents that OP refers to probably just click accept all and move on.
We have an LG tv and one of my family members hit accept all after an update and now my remote listens to us. To fix this properly I would need to factory reset which loses all of our streaming settings. I actually don’t because I have a separate ISP only for our TVs so there’s a bit of separation between our streaming use and phone activity
I'm talking about every occurrence that might be pushed, so it's the TV ads, webradio ads, search suggestions, ...
> pick said topic, something you never cared about before, talk about it but don't write any messages containing it;
This does not work. How did you come about the topic? Answer: it was in your brain, because advertising, trends among your peers and social connections, online trends real or astroturfed, etc.
That's why you end up with people thinking their phone is "listening" to them.
The Brain -- famously incapable of organic thought.
If anyone wants to try this, a friend sent me one link to a device called Levo which does “herb oil infusion” aka it lets you make weed brownies easily. I clicked the one link my friend sent and now I get ads for Levo constantly in my YouTube adroll. Though I should say this is obviously on Google’s ad network specifically and I have no idea if this applies to other networks.
> I clicked the one link my friend sent and now I get ads for Levo constantly
Yes, this retargeting is 'expected' and is not surprising. This is completely different from what OP is describing.
No I realize the differences. I am just saying that if someone wants to see if their encrypted app is resulting in ad serves, they could try discussing this product in encrypted chat only, using the methodology described in the comment I was replying to above.
These kind of stories are always fun to analyze using the Socratic method.
-How did you learn about the product?
-Have you ever searched for it?
-Did a friend of yours tell you about it? Do you think they searched for it?
-Are a lot of ads for it playing on TV channels you like? Could instagram know you like those TV channels?
-Is it something your neighbors got? Do you think there has been a spike in shipments of this product to your neighbors?
Eventually people start to “get” that scanning the text of messages is way more helpful for humans than it is for computers. They’ve got other data they can use.
I also have a theory that sometimes when people say "we were talking about <product> and I never even typed it into my phone or anything, and suddenly I started seeing ads for it the next day!" that the person in the story may not have looked up <product>, but someone else in the conversation might have Googled it or browsed an Amazon listing or something and they have some kind of connection in their ad profiles whether it be that they know these 2 people interact a lot, they're in the same geolocation, same wifi network/IP address, etc.
I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause, whether the processing is done on device or they're sending all that data to a server to be processed.
> I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause,
We know our phones commonly listen for "smart assistant" prompts and audio beacons (https://www.nanalyze.com/2017/05/audio-beacons-monitor-smart...), so they don't seem to have any trouble abusing the mic access. Honestly without a whistleblower, there's little hope of really understanding how much data a company collects and what they use it for. At least sometimes we can see it in their own marketing materials. For example, https://advertising.roku.com/resources/blog/insights-analysi... tells us:
"Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing. For example, if a streamer is watching an NFL football game and sees an ad for a hard seltzer, Roku’s ACR will know that the ad has appeared on the TV being watched at that time. In this way, the content on screen is automatically recognized, as the technology’s name indicates. The data then is paired with user profile data to link the account watching with the content they’re watching."
None of the people I know who use those devices knew that was happening, but the info was out there at least. When so many people are watching everything you see and do and say who can ever know what every company is doing or what the source of any one ad is?
> "Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing.
There were users under the impression that Roku was unaware of the content it was displaying? Like 4K snapshot or not, if I know a user is watching an NFL stream, I know that ad played.
> There were users under the impression that Roku was unaware of the content it was displaying?
Sure, they expect Roku would know if they launched Disney+ or Netflix, but not that they would know exactly what movie you were watching or what specific scenes you viewed and for how long. Same with personal videos cast to your screen via roku. It's pretty reasonable they'd know you were streaming content from your other devices, or which apps you were using, but less reasonable that they'd be watching over your shoulder taking notes.
I don't think it can be observed because it's likely a bug. It freaks users out. People uninstall the app and start threads like this.
It's sort of like getting mugged once and then setting up a camera in a bunch of alleys to prove that muggers exist. You can even set up a camera of yourself running into dark alleys every night, but the odds of reproducing a mugging are still extremely low.
There's a certain kind of precision that convinces me it's real though. Precision is common. I look at a book on Amazon, and a FB ad for that book appears.
But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.
> But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.
Aside from random coincidence, I could see this happening if you provided your personal information (especially email) for the loan application. It could have been shared to multiple underlying lenders alongside a data vendor who ultimately provided interest targeting (which can include car models) to an ad network.
Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB.
> Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB
I suspect op may have researched the car model and got retargeted: some ad networks keep track of specific products you've shown interest in (not generic interest-areas like Google ads) and track you via a cookie. You may be visiting a completely different site that uses the same network, and get ads on the exact product you've spent minutes reading about.
Ah thanks, I didn't know that was possible. It did have email.
If it's integrated into such activities, it might actually be a good explanation for all the other similar scenarios blamed on WhatsApp.
This is why incognito mode has the warnings it does about “people can still see your screen”
No matter how secure the platform, if you apply for a loan through it, the loan provider will “know” you want a loan, and happily sell that data.
> the odds of reproducing a mugging are still extremely low
I did that and I got a mugging on camera. The attacker was convicted.
This is facebook. They've been caught recording people and selling that for advertising. They deny it because technically your audio is transcribed, not recorded, and they can send only some keywords back so whole conversations aren't sent back to them.
Plenty of research and news stories about this if you care to search. The speculative part of my comment is about the transcription, which I'm speculating about because of their fervent denials despite the evidence; technically, the wording of their denial statements is correct.
If I had to guess, your whatsapp messages are e2e secured but keywords are sent to facebook when they match some condition. So if you message "happy birthday" to someone, they won't see that but the fact that the keyword "birthday" was found even if the word isn't included is sent to fb. That way they can say they're not snooping your messages.
We did an experiment. We talked about how hard it is to find highlighter yellow nail polish. Nobody in the house is a purchaser of nail polish nor did we do any searches for highlighter or yellow or nail or polish. A day or two later my wife got an Instagram ad for highlighter yellow nail polish. It could have been a coincidence or maybe they were listening.
Or maybe some combination of things we did previously led naturally to thinking about that yellow nail polish. I'm thinking about something like the trick where you ask somebody a bunch of addition problems that have 14 as the answer (what's 10+4? 2+12? 3+9? etc...) then ask them to name a vegetable and they will almost always say carrot.
"I remembered the time I was in my fraternity house at MIT when the idea came into my head completely out of the blue that my grandmother was dead. Right after that, there was a telephone call, just like that. It was for Pete Bernays--my grandmother wasn't dead. So I remembered that, just in case somebody told me a story that ended the other way." -- Richard Feynman, "Surely You're Joking, Mr Feynman"
Someone did this trick with me in the '80s, but the numbers were to sum to 13. I still said "carrot", however. Wish I would have thought to use a different number than 13 when I tried it out on others.
Do you have the setting that turns on end to end encryption? I thought by default it was off (and always off for group chats) ?
Are you thinking of Telegram? There is no such setting in WhatsApp.
Am I missing something? Why carrot?
I’ve always assumed it’s because of the association of 14 with 14 carat gold.
Interestingly, I've always seen it as any number of quick math problems, not just ones that equal 14, and it consistently works too.
12 year old me was not much of a scientist. I don't think I ever tried any number other than 14...
It's an association thing. I thought carrot too. Maybe it's because of the tedious nature of processing. Maybe it's because the orange bar at the top of the screen makes me think carrot.
Ah yes, a bunch of anecdotes in reply.
You’d think that an educated group like this would understand that anecdotes are not sufficient evidence for something like this.
This is a message board about tech... comments aren't welcome anymore - we need evidence to participate?
I think it's interesting when a bunch of people chime in and say "Hey, yeah, I had some crazy thing happen to me, I'm in tech and understand how this stuff works, and there's a very small to zero chance this happened through some other parallel construction by the tech company, they just straight up listened to my conversation and showed me an ad".
This is what kicks off a handful of you to go packet sniffing and write up a blog post looking for this behavior. So yes, evidence is welcome but it doesn't seem like we are quite there yet.
In general I agree, but I think when you are being explicitly asked for a "source" in response to an allegation that it is settled that FB has been "caught recording people," I would prefer to not have anecdotes in reply.
I mean… this is a conversation, not some sort of formal debate? Someone is telling you "hey, this happened to me," and your response isn't "have you considered this other explanation?" but rather "I won't discuss this further unless you do a bunch of research and present the results to me."
I'm happy to continue discussing it (not sure where you are getting the idea that I'm not), but I think it is also fair to point out when someone asks for a source for a claim that something has been proven/caught and instead the replies are a bunch of personal stories where people think something is happening.
To me, that is indicative that, contra the original claim, no such thing has ever been proven.
Is it verboten to say that?
It's not verboten. But, candidly, it is kind of rude. What's the difference between someone at, say, the EFF "proving" something happened by running an experiment and writing about it publicly, and someone on Hacker News doing the same?
I disagree that it is rude to point out something is an anecdote.
The proof has to do with the technical details, not the authority figure posting it. If someone from the EFF wrote a blog post with the same content as these HN posters, I would be similarly dismissive of this as "proof."
They aren't saying "this happened to me"
They're saying "facebook has been caught multiple times doing this", which is not a personal anecdote, but an assertion that proof exists and is available.
So where is it?
I'd prefer to say whatever I want. Must have filed it in the wrong place.
You can say whatever you want, doesn't mean I won't criticize you for it or downvote you.
And I'll flag if you violate HN guidelines, which you have.
Cool! Which ones?
> Edit: Holy fuck there are (paid?) Facebook shills all over this like flies on shit.
From the HN guidelines: 
> Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email firstname.lastname@example.org and we'll look at the data.
"Data" is just the plural of "anecdote", so why would they not be?
> packet sniffing software 24/7 to catch proof
I have to say, the fact that no one has done this makes me doubt it's real.
As hated as Facebook is, there's tons of motivation for people to catch them out with undeniable proof, and yet no one has done it.
Don’t iPhones have an indicator when the mic is recording? Also, this feels like it would be insanely easy to test by capturing the payloads sent to FB; you could even use something like Charles Proxy to do it.
FB having access to microphone makes sense for plenty of other completely innocent reasons (for example, if you can record a video from inside the app).
If this was actually true, I can’t help but feel that someone would have proven it technically by now instead of relying on these types of self experiment and anecdotes, especially given how commonly this is touted.
> Don’t iPhones have an indicator when the mic is recording?
Just tested this out, zero indication that the mic is hot on a recent iPhone with up-to-date software when recording a voice memo.
Edit: There ya go... downvotes for saying the iPhone has no indicator when you record audio.
Works on my iPhone.
The screen was off when the event happened...
1) Does your iPhone still record audio when the screen is off?
2) Can you see the audio indicator when the screen is off?
3) If a background app starts then stops recording audio while the screen is off, would you have an indicator that it recorded audio?
> 3) If a background app starts then stops recording audio while the screen is off, would you have an indicator that it recorded audio?
Yes. iOS displays an indicator if an app has recently used the mic.
> Note: Whenever an app uses the camera (including when the camera and microphone are used together), a green indicator appears. An orange indicator appears at the top of the screen whenever an app uses the microphone without the camera. Also, a message appears at the top of Control Center to inform you when an app has recently used either.
Have you tried looking at the screen while using the Facebook app?
Also, I feel like the goal posts are moving quite fast in one direction.
Goal posts? Is this a competition?
The phone was sitting between two people having a conversation, one of them "swiped it open" meaning it was off to begin with, then was immediately displayed an ad for that conversation, and upon hearing this the tech-savvy person in the house understood what happened, confirmed it with the mic access to facebook in the settings, and then disabled the behavior.
>Goal posts? Is this a competition?
Considering the original claim was "zero indication that the mic is hot" and now it's "zero indication that the mic is hot if the screen is off", I'd say that the goal post has moved considerably.
But if you want to know if Facebook is listening to you through the iPhone microphone, you should probably look at the screen for the indicator. iOS apps can't start recording on their own in the background, there's no API for that. If they are listening to you, they'd have to start the audio session in the foreground, which would allow you to see the indicator.
(Unless you believe that Facebook is using some kind of a private system API for this and is passing through the App Store checks)
Just a few things to note here...
I wrote the original "Wife swipes open the phone" comment, so that's the context you seem to be missing. Sure you can see a little dot on your phone when YOU run some experiment today and look for it, but was that indicator available in the exact situation where the targeted ad was displayed? No.
Also, this incident happened in the past and we know there have been dramatic API changes on both Apple and Facebook products. The limits of the API today don't reflect the capabilities that were available to developers in the past. I doubt Facebook is hacking the App Store process to use hidden APIs. It was probably just available in the past and my wife granted the facebook app complete access to the mic, so they took what they wanted.
I'd make sure to disable that permission today too, just in case.
One last thing is I just opened my iPhone again and hit record. I honestly didn't see the tiny orange pixel at the top of my phone until you pointed it out. I was basically looking for the green video indicator light to show. So I'm technically wrong about NO indication, you're welcome.
The GP didn’t say they were using iPhone. The Facebook app on Android has been known to record audio even when running in the background
That's not possible without permissions these days, same as iOS. In Android 13, background processes have no mic or camera access whatsoever.
It was an iPhone, and there is no indication from the phone when the mic is recording.
Android has a notification now when the mic is recording and has had the ability to deny microphone and lots of other access for a long time now. Thankfully it sounds like iOS is catching up
> My wife and MIL sitting at the table talking about a unique topic with an iPhone running Facebook sitting in front of them
They explicitly said they were using an iPhone?
And don’t forget the battery. A mic recording 24/7 would drain the battery much faster and would not go unnoticed unless specialized hardware is used like the one for “hey siri” and “ok google”.
Doesn't the Facebook app drain your battery?
Try any voice recording app for a few hours, now use the facebook app for the same number of hours. The impact on battery life of a mic actively recording alone is very noticeable, so noticeable that your phone has a special chip just to recognize patterns similar to “hey siri”.
It would not need to record high quality audio and could maybe even take advantage of that same chip? Just thinking out loud here - smaller, crappier audio would also be easier to send back unnoticed (or instead of even recording audio it could be transcribing on the fly to a text file using something super basic and easy with low accuracy)
> This way too I can troll people on the internet when they suspect this is happening and I can say "bUt ThE bAtTeRy LiFe!" to defend Meta: my corporate overlord business daddy.
Please, stop with the sarcasm.
Okay, let’s say they manage to record us without a huge impact on our battery life. Now, how do you send these recordings or even the extracted keywords from a popular app, a client installed on devices controlled by the users and susceptible to reverse engineering and network traffic analysis, without anyone noticing it?
It’s just too much risk and they don’t even need it, see my relevant reply here: https://news.ycombinator.com/item?id=32950204#32953216.
Great question! I'd love a peek at their source code to figure out these answers too!
I swear that comment is sarcasm free.
This isn’t evidence. Even if Facebook was not listening to your conversations, there would be some rate in which you would just randomly be served an ad related to a topic you were discussing. There needs to be evidence that it is happening at a rate too high to be attributable to chance.
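One way to put a number on "too high to be attributable to chance" is a binomial tail probability. This is only a sketch under strong assumptions: each trial (say, one discussed topic) is treated as independent, and the base rate `p` of seeing a related ad anyway has to be estimated separately, for example from topics you never discussed. The function name is illustrative.

```python
from math import comb

def chance_of_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    k: number of trials where a matching ad showed up
    n: total number of trials
    p: assumed probability of a matching ad appearing by chance alone
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

If this probability is small for your observed `k`, coincidence becomes a worse explanation; if it is not, a single striking match tells you very little.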
Sounds like a good way to engineer it... anything to improve the bottom line even if insanely-targeted ads only trickle out to users. How about limiting who sees this feature to also limit the risk of being detected? Maybe just do it once a year to everyone, or never to specific "tech-savvy" users that they have completely profiled.
So if I were walking past a place blaring the piña colada song, I’d see ads for alcohol and umbrellas? If coworkers around me are talking about activities I’m not interested in, I’d see ads for those?
They have far better information that shows I’m not interested in alcohol or extreme sports. Audio in the background is so low-signal that it isn’t worth showing ads based on it.
Even just transcribing speech somewhat accurately is not something that was possible until the last couple of years. Yet this conspiracy theory has been around for a decade or more.
It's ironic that you're asserting this by replying to a parent message which explains why this probably isn't the case.
Yeah, it's probably not a coincidence that your wife is talking about X and is recognised by Facebook to be in a group of people that are interested in X.
Would they argue that the message goes first into a neural network that outputs potential product labels based on the message and that it all happens client-side? That's the only way I see it possible for them not to violate the E2EE.
An important thing you’re missing is the control. You should record every ad.
You need to know if you got 3 topics of ads every day and 1/3 of them are related to that secret topic, OR if you get 300 topics every day and 1/300 are related to that secret topic. If it’s the former, it’s suspicious, if it’s the latter, it’s way less suspicious.
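As a small worked version of that arithmetic, assuming you log the topic of every ad you see and not just the ones that match (the function and the log format are hypothetical):

```python
from collections import Counter

def topic_share(ad_log, topic):
    """Fraction of all logged ads that were about the given topic.

    ad_log: list of topic labels, one entry per ad seen
    """
    counts = Counter(ad_log)
    total = sum(counts.values())
    return counts[topic] / total if total else 0.0

# Same single sighting, very different base rates:
few_ads = topic_share(["secret", "cars", "shoes"], "secret")   # 1 of 3
many_ads = topic_share(["secret"] + ["other"] * 299, "secret")  # 1 of 300
```

The numerator is the same in both cases; only the full log tells you which denominator you are dealing with.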
The control is the topic you pick that you don’t discuss on WhatsApp or messenger. The idea is random differences between topics should average out over many trials.
I still think it’s important to consider the volume of topics that show up that aren’t being explicitly looked for.
I’ve gotten Instagram ads for ketamine and I absolutely am not discussing or searching for it. I probably wouldn’t even notice a random topic if it’s not so absurd. I’m sure there’s tons of topics I don’t even realize I see.
The reason for the control I suggested is to try to counteract the bias people have towards noticing things they recently thought about. I think the question of what adverts people are shown in general is interesting but quite separate.
You can’t come up with the topics yourself either, because the topics you will think up are different based on your demographic / type of person you are, and ad networks basically try to guess that.
You can if you first come up with a list of topics, and then once you have that list, randomly assign each of those topics to one of the three categories.
The idea is that you may discover the topics you don’t talk about that week still come up as much as those you do.
There will be some bias in what they choose to screenshot right? Meaning, the unrelated topic might show up in the feed but they don’t screenshot it because it doesn’t fit the narrative?
Also, what we’re interested in is whether the text changed what was shown. If I saw ads for X last week but didn’t notice them, then spoke to a friend and noticed them and took a screenshot, it would appear to confirm the theory. Even though I was always seeing ads for X.
Ultimately, I don’t think people who are convinced of this theory will change their minds so it’s a moot point.
Yeah, that’s the biggest flaw in the experiment I proposed, I think. This is the reason I try to have the hopefully independent grading of ad-topic-relevancy blinded to which system the topic was communicated over. It may be that one sees many vaguely related ads for the WhatsApp topic due to some selection bias but a similar number of actually related ads.
I call this the gaslighting explanation: “no, it definitely wasn’t the messaging product owned by an advertising behemoth. You must have searched for it somewhere else.” Obviously the OP remembers where they’ve seen the product. If they had seen the product elsewhere, they wouldn’t have started this thread!
(wrt some comments in this thread)
Is it so hard to believe that Meta is snooping on WhatsApp conversations? Meta, a company of unprecedented size that was built on monetizing your private data? A company that's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?
Someone from this community, which generally means educated, tech-literate and sensitive to these topics, shares a perfectly plausible observation, of something that has been experienced as well by plenty of other folks, me included; and then some people come and try to make up the most convoluted explanations (candy boxes from Kazakhstan just happened to be trending that specific day, nothing to see here, move along!) for this phenomenon and try to shift the blame away from Meta. Why do you do this? Are you Meta employees? A PR agency they hired?
It's just baffling. Apparently some people DO want to be abused.
Plot twist: we all get ads about candy boxes from KZ now.
For anyone who has ever worked at a FAANG like company in the last decade, yes, this is actually very hard to believe.
Despite the shady image they have, these companies go to great lengths to avoid doing shady things (because ultimately it’s bad for business). Not to mention the hundreds of tech employees that would have to be involved and keep quiet in this type of “conspiracy”. It’s incredibly unlikely, I truly believe that.
I can imagine you haven't been involved in anything illegal, but I'm sure you're aware of Meta's documented track record of coordinated illegal actions. Do engineering teams just fall head first into a bucket of 2FA phone numbers and start using the data for ad targeting, and nobody bats an eye from the legal department to product managers? Or are they hypnotized to build services for biometric data collection without consent? Nobody does anything nefarious, but their collective actions which benefit the company just end up being illegal, again and again?
The tech companies you work for do often engage in illegal activities, and some of your colleagues are complicit. I'm sure it is an uncomfortable thought for some of you, but this is all part of the public record.
I think there's a natural bias people have to want to NOT see the bad in the organizations that pay them $$$$.
This is certainly true in a lot of finance.
I completely agree (as another employee of FAANG). It's ridiculously hard to do anything against policy once it's set, and trust me, the policies are set. Media overplays a lot of things which just aren't there.
The sad reality is people are very predictable, even with basic data.
The employees obviously are told the functions and APIs that they are implementing have a completely legit use case. That is not hard to believe at all and was the case in Cambridge Analytica scandal, for example.
"bad for business" leads to systems that do unexpected things. For instance, generating identifiers on-device for any image sent, and sending the identifier out-of-band. This helps catch child pornography.
I can imagine the same thing done for text. The text might be encrypted, but interest keywords might be generated on-device and sent out-of-band.
The PRISM "conspiracy" was very shady and involved probably hundreds of employees. And if they have hushed people punching holes for the government, it's not crazy to think some data could leak out into other parts of their pipelines too.
I'm not claiming this is real, but I agree with GP.
Let me start by saying I have no idea if Facebook is reading my encrypted messages or whatever. However, I will say that in my experience, whether something is bad for business if it gets discovered is usually not a concern for large corporations, if the thing being done makes them more money. Because everything is just a balance sheet.
For an example from non-FAANG companies, see illegal dumping of toxic waste by chemical companies, such as DuPont and PFOAs. Despite knowing what they did was illegal, the math works out -- products with PFOAs were something like $1 billion in annual profit, and even when they got caught the fines and legal costs were a fraction of that, spread out over many years.
So I personally believe these companies 100% would do shady shit if it increases their profit margins. And why wouldn't they? There is no room for morals in capitalism, and the drawbacks are slim.
The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.
As others above me have thoroughly explained, there are numerous ways Facebook could figure out what you’re reading about/listening to/viewing on the internet, which ultimately drives what you are chatting with your friends about. Reading your messages would actually be the most difficult and low fidelity way for them to try to mine this information. They can just see your entire browsing history and extract from there, since the majority of website have a tracking cookie that in some way phones home to Facebook.
Seriously? Facebook knows their internal thoughts well enough to guess what topics they would choose when trying to pick something they "never talk about"?
If FB could do that, then FB would realize that these topics are not actually products they are interested in, so they wouldn't be showing ads.
FB can show you one ad per month about some special steam train ride and maybe you’ll scroll past it without a second thought but then maybe one day you’ve been watching a film about the golden age of steam or you’ve been talking to a friend about it and then you see the ad and remember the film or conversation and you think ‘Crikey, how on earth did Facebook come up with that ad!’
Facebook show (many people) a lot of ads and they only need to get lucky a few times for you to think it’s uncanny. All the non-unique times an ad was not relevant will have blurred together and so you won’t easily remember that they were the vast majority of the ads you see. A little bit of feedback (eg if you dwell on the coincidental ad) may cause you to see more related ads.
> The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.
I disagree with this to the extent that I would say the exact opposite is true.
Facebook (and others) have proven time and time again that they cannot correctly predict user behavior by locking out or banning users who actually did nothing wrong (because their algorithms predicted that the user was breaking terms of service or might be planning to). This happens over and over, even in cases not so complex as the "photos of my child to send to my doctor".
But on the flipside, Zuckerberg has been documented saying one thing to the public and exactly the opposite in private. Heck, Facebook has had memos and emails leaked where they talked about how they would say one thing in public (and to regulators) while doing the opposite secretly.
I believe that Facebook cheats and breaks agreements (and laws) in multiple directions all the time, often willfully. They've even been caught cheating their own ad customers by intentionally overstating the effectiveness and target accuracy of their ads.
It's hard to believe because I worked there and worked on this stuff (data and ML side) and know that they aren't.
>I worked there [...] and know that they aren't
I know that, unfortunately, this is what puts bread on your mouth.
But, really? Are you suggesting that Cambridge Analytica didn't happen? Did we all hallucinate that?
You guys jumped the shark already. These attempts at damage control are laughable.
It's not what puts bread in my mouth though. I don't work there now and don't work on anything related to ads or messaging.
CA happened but that has nothing to do with this. The policies that allowed CA to collect data were very public, Zuck enthusiastically talked about the open knowledge graph all the time prior to CA, much to the dismay of many investors. Facebook didn't lie in that case, they misjudged the potential to misuse open data access, and the potential for negative PR as a result.
By analogy, it's like you're the landlord of an apartment building and you don't lock the front door. You put up a huge sign saying "this door is unlocked, everyone is welcome". You sell ads for your building embracing the unlocked door policy. Then somebody walks in and photographs all the tenants through their windows. Suddenly people who didn't care about the unlocked policy are now very angry, and rightly so. But this is completely different from collecting data, lying about it, and operating a massive conspiracy to conceal the data use from literally tens of thousands of employees who would normally be able to see it.
Being educated and tech-literate means that you should try to think more critically than "Facebook bad." You brought up Cambridge Analytica as your scandal of choice, which is the most newsworthy scandal, but the one where Facebook is the least guilty. Everyone had the same access to the APIs that Cambridge Analytica did and Facebook had shut down those APIs before the story broke out. Acting on instinct will only lead to regulation that won't be effective at stopping what you're trying to stop, will cause needless side effects, and will undermine your political credibility to push for changes that solve the important issues.
We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.
Facebook has a deep culture of pathological lying. They lied to the FTC. They lied to WhatsApp and to the EU. They created an Oversight Board and then lied to it.
saying someone lied in the past is not convincing evidence that any arbitrary statement by them is affirmatively a lie
Meta controls the proprietary Whatsapp client software that decrypts your messages and they can have that decrypt and scan the messages for them and send back metrics and how often different words are used.
They can of course also have their app de-crypt and re-encrypt the messages to the key of a requesting third party like police or hired reviewers if certain keywords are used.
Authorities could also have Google or Apple ship a signed tampered Whatsapp binary to any user or group of users, like protestors, that uses a custom seeded random number generator so they can predict all encryption keys generated and no one else, including Meta, will know.
The variant of end to end encryption where third parties control the proprietary software on both ends, is called marketing.
As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:
- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products
Popular theory is they can't see or store your messages, but can analyze them on the client and profile you (e.g. interested in brazil nuts)
What does “conversation” mean in that text?
It can perfectly well mean just the audio exchanges when both parties talk.
Also: E2E does not imply necessarily that they do not know the key.
It's possible that the client blindly fetches a mapping from keyword to ads, saw the keyword client-side, then requested the ad.
You'd indirectly reveal what those keywords are to Meta by which ads are being requested. If an ad for a sex toy is being requested, it's pretty obvious what the two parties are talking about.
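To make the leak concrete, here is a minimal Python sketch of that hypothetical design. Nothing here is known to reflect how WhatsApp or Instagram actually works: the catalog, the ad IDs, and both function names are made up for illustration. The client matches keywords locally and only ever sends ad IDs, yet the server can invert the mapping.

```python
# Hypothetical keyword -> ad mapping that every client downloads in full.
# All names and entries here are made up for illustration.
AD_CATALOG = {
    "brazil nuts": "ad-101",
    "steam engine": "ad-202",
    "baby clothes": "ad-303",
}


def client_pick_ads(message: str, catalog: dict[str, str]) -> list[str]:
    """Runs on the device: match keywords locally and return only ad IDs."""
    text = message.lower()
    return [ad_id for keyword, ad_id in catalog.items() if keyword in text]


def server_infer_topics(requested_ad_ids: list[str], catalog: dict[str, str]) -> list[str]:
    """Runs on the server: invert the catalog to recover the matched keywords."""
    by_ad = {ad_id: keyword for keyword, ad_id in catalog.items()}
    return [by_ad[ad_id] for ad_id in requested_ad_ids if ad_id in by_ad]


requested = client_pick_ads("Shall we buy some Brazil nuts for the trip?", AD_CATALOG)
print(requested)                                   # ['ad-101']
print(server_infer_topics(requested, AD_CATALOG))  # ['brazil nuts']
```

The message text never leaves the device in this sketch, but the set of requested ads is enough to reconstruct the topic, which is the point made above.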
Still, information leaks out of the encrypted channel and the trust is broken.
Prove it. Open source the client and open the server for 3rd party apps to use.
> Is it so hard to believe that Meta is snooping on WhatsApp conversations?
Where's the evidence? I don't know what ethos "Hacker News" is supposed to capture, but surely it's not superstition?
Well, at least some people here are smart enough to know how to run disassemblers and packet captures. Clearly not everyone, but a few tech literate people.
> Is it so hard to believe that Meta is snooping on WhatsApp conversations?
for a lot of people, no
> Meta, a company of unprecedented size that was built over monetizing your private data?
one of many companies, however "meta" does have the advantage that you can opt out of them, mostly.
> A company who's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?
CA is interesting as it started out as an academic study, which was consented fully. CA then went on to scrape people's public profiles, which often included likes, friends, etc. This combined with other opensource information allowed them to claim to have good profiles of lots of people, the PR was strong. Should FB have had such an open graph? probably not. Should they have taken the rap for everything evil on the internet since 2016? no. There are other actors who are much more predatory who we should really be questioning.
> Are you Meta employees?
I think you place far too much faith in a company that is clearly floundering. It's not like it has a master plan to invade your entire life. It's reached its peak, has not managed to find a new product, and is slowly fading.
However, as we all think we are engineers, we should really design a test! but first we need to be mindful of how people are tracked:
1) phone ID. If you are on android, your phone is riddled with markers. Apple, supposedly they are hidden, but I don't believe that they don't leak
2) account: your account is your UUID that tracks what you like.
3) your IP. if you have IPv6, perhaps you are quite easy to track. even on V4 your home IP changes irregularly and can be combined with any of the above to work out that you are the same household.
4) your browser fingerprint. (be that cookies, or some other method)
5) your social graph
1) buy two new phones.
2) do not register them with wifi
3) create all new accounts for TikTok, Gmail, Instagram etc.
4) never log into anything you've created previously, or the fresh accounts on old devices.
5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"
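If you'd rather not trust your finger to be random, the topic draw in step 5 can be sketched in a few lines of Python. This is only a sketch under assumptions: the `plan_rounds` helper is hypothetical, the topic list is a placeholder for whatever you copy out of the offline source, and the silent "control" topic per round is borrowed from the experiment suggested earlier in the thread rather than from the steps above.

```python
import random


def plan_rounds(topics, rounds, seed=None):
    """Draw two distinct topics per round: one to message about, one silent control."""
    if 2 * rounds > len(topics):
        raise ValueError("need at least two unused topics per round")
    rng = random.Random(seed)
    drawn = rng.sample(topics, 2 * rounds)  # sampling without replacement
    return [
        {"round": i + 1, "message_about": drawn[2 * i], "control": drawn[2 * i + 1]}
        for i in range(rounds)
    ]


# Placeholder topics; in practice these would come from the offline source.
topics = ["topic A", "topic B", "topic C", "topic D", "topic E", "topic F"]
for row in plan_rounds(topics, rounds=3, seed=1):
    print(row)
```

Ads seen for the control topic give a rough baseline for how often a "match" turns up by chance.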
> 5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"
Wait... If WhatsApp is really E2EE encrypted, why would any of the other steps be necessary? Dude and his wife can simply pick a page at random from a magazine in a store, never search anything online about it, start talking about it using WhatsApp as if it was something of great interest to them. If they start getting related ads, obviously something shady is going on. There's no need for new phones / new GMail accounts / etc.
> If WhatsApp is really E2EE encrypted, why would any of the other steps be necessary?
because you need to eliminate the chance of profiling by any other means.
Using the same phone as before means that the pre-existing profiles exist, which means that the relationship is already inferred. Because it's trivially easy to track people, you need to eliminate all other variables.
As someone who has actually worked on end to end encryption at Meta, I can tell you I am not aware of anything where the company reads your WhatsApp messages - either in transit or device. The company takes fairly serious measures to ensure it cannot even accidentally infer such contents.
I don't know what is happening in this specific case. Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.
It's hard to convince people at this point because many have lost trust in Meta as a company, and I understand that. But I still find it stunning that so many people are making so many false claims without any actual knowledge to back it up.
Thanks for your explanation.
From using a VPN that logs all incoming and outgoing traffic (NetGuard) on an Android One device, I've noticed that the default Google keyboard gets in touch way too many times with some distant servers. Whereas an open source keyboard from F-Droid, FlorisBoard, does no snooping and gets updated solely through the app store.
The third party keyboard apps are a big question for the OP.
Another consideration, there are companies that track and sell geolocation data. It's "anonymized" but so precise you know the street address a user resides at. It is not a stretch to consider "anonymized" retargeting from keyboard inputs.
I was dismissive of it in the past, as comments voted higher here are. However I've seen enough weird ads show up within minutes of making jokes about obscure topics that I suspect there is something going on.
The piece that might be missing here is third parties collecting signals, "anonymizing" them, and then ads get re-displayed through Facebook, Google, etc. It may not be the major ad platforms doing it directly. In theory this should be harder now with the iOS tracking restrictions.
For the skeptical, consider Avast's Jumpshot. Here millions of users thought they were protecting themselves when their raw browsing stream was being sold live to third parties. They aren't the only company that has done that. https://www.theverge.com/2020/1/30/21115326/avast-jumpshot-s...
Proprietary encryption means users cannot verify or control the keys or the code that generates or uses the keys. The app can exfiltrate the keys or do any keyword processing on behalf of Meta as well, which can include well intentioned features like forwarding plaintext messages containing certain dangerous-seeming words to authorities or theoretically trusted third party review teams. Naturally they could also return -metrics- about frequency of word use back to Meta for ad targeting as well.
I too have been a champion of encryption and privacy at past companies only to have all my work undone and watch all the data become plaintext and abused for marketing by a new acquirer.
The only way end to end encryption solutions can avoid these types of abuses is when the client software is open source and can be audited, reproducibly built, and signed by any interested volunteers from the public for accountability.
Short of that it is really not that much different than TLS with promises Meta will not peek, at least not directly, today.
If they modified the RNG of person A's phone app during a forced stealth update, then shouldn't person B not be able to decrypt the message? Have you ever had an app update to WhatsApp after which you cannot communicate with other people until they are forced to update too? The alternative is that there is a vast internal conspiracy at Meta that hundreds of engineers, and hundreds of ex-engineers, are somehow silent on, which would be using 2 encryption keys, one that law enforcement can read, and one that the other end of the device can read. Isn't it provable that the WhatsApp app is using the operating system level secure PRNG functions? If there was evidence of this, wouldn't it be great for a whistleblower to come out and make a killing shorting Meta's stock? Right now would be the perfect time to kick them while they are down.
> then shouldn't person B not be able to decrypt the message?
The RNG example is a way to create keys that make it trivial for "C in the middle" with the RNG details to extract the contents. They are still valid, just not useful as keys.
The Juniper attack and Dual EC exploit is a good real world example of compromising an RNG for passive decryption, although Dual EC was designed to be like that.
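A toy Python illustration of the general principle, not of Dual EC itself (which hides its backdoor in elliptic-curve constants) and not of anything known about WhatsApp. The `weak_keygen` function and the seed value are hypothetical; the point is only that a key derived from a predictable generator is a perfectly valid key that both ends can use, while anyone who knows the seed can regenerate it.

```python
import random
import secrets


def weak_keygen(seed: int) -> bytes:
    """Key from a seedable, non-cryptographic PRNG (stand-in for a tampered RNG)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(32))


def strong_keygen() -> bytes:
    """Key from the operating system's CSPRNG."""
    return secrets.token_bytes(32)


SEED = 123456  # hypothetical value known only to whoever tampered with the client

victim_key = weak_keygen(SEED)
attacker_key = weak_keygen(SEED)  # computed without any access to the victim's device
print(victim_key == attacker_key)  # True: same seed, same key
print(len(victim_key), len(strong_keygen()))  # 32 32
```

From the outside, the two 32-byte keys look equally random, which is why this kind of compromise is hard to spot from traffic alone.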
Even with end to end encryption couldn't the app at the end also just be aggregating the data (or even transcribing audio) to send over separately?
Yeah though this is more likely to be detected with basic network analysis. Selectively compromising the RNG seed would be much harder to detect without source code.
Meta has repeatedly demonstrated they will do whatever it takes to capture user data. Kid VPNs, in app browsers, etc. Is it any surprise that people are deeply suspicious of any coincidences that arise from using a supposedly private channel?
Given evidence at hand, it is hard to view Meta as anything but a bad actor.
> Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.
Isn't this kind of splitting hairs? Does it matter if text information came from a "side channel"?
It seems like the promise Facebook makes is that 'your communication using WhatsApp is secure,' that's certainly my interpretation of what "end to end encrypted" means. It is a promise of security. That means text is sacred and even text sent to giphy should be privileged from the ad machine.
The question being asked here is not "is it end to end encrypted?" It's "are my communications secure?" End to end encryption is just one element of that security.
The thing is if it's a 3rd-party Android keyboard or similar that logs your messages then there's nothing Meta can do about this.
That's still Facebook's problem, no excuses. Facebook absolutely has the power and resources to lobby google and congress. Security teams at both companies will unequivocally agree that keylogging presents an extremely grave security risk that consumers are unlikely to understand the consequences of and therefore need to be protected from.
Imagine a hapless military professional/politician downloading one.
The problem is one of alignment. Facebook wants to monetize whatsapp and wants the whatsapp data. That's why there was a mass exodus to signal in the first place. Facebook was weakening the protections of the app.
Due to the alignment problem Facebook can't advertise whatsapp as the secure and private choice because they are actively working to make it less secure and private. That's why Brian Acton quit (leaving $$$$$$$ behind) in the first place.
I don’t agree that it is Facebook’s problem but I do think this is probably where a lot of data gets leaked that people don’t realize or think about.
In a perfect world sure Facebook has the power and money to do a lot of things. So do the other megacorps. They don’t do them, and you’re correct it is the misalignment of incentives to do so.
But Facebook doesn’t control what keyboard you use on your phone and if the keyboard is sending every message you type somewhere, they can’t do anything about that and they aren’t lying that they can’t read your messages.
Whether or not you believe that they do in fact harvest the message data is up to you. But certainly people using keyboards that harvest data is very plausible to me as a vector for this stuff.
In the other post in this thread, I link to a website that ostensibly has a method of warning for non standard keyboards. If "e2e" communication is part of a product's marketing, do you think they have a responsibility to warn when that expectation might be violated? What about warning that text sent to giphy may be used for advertisement purposes?
If I were to summarize my entire thoughts on WhatsApp, it's that it advertises security (e2e), while they only make money from violations of the security. The behavior OP expects is exactly the behavior a person would expect from this set of alignments.
If a leak is able to be monetized (even if it is google harvesting keyboard data and selling it back to FB) do you think that would be punished or rewarded?
If this very same post were for signal, I think the response we might expect is concern and investigation, not a response of defense and deflection.
There was an article several weeks ago about how a "special master" tasked with understanding what data Facebook collects on you was stonewalled because "even Facebook don't know what data Facebook collects."
"we don't want to be accountable for any data except the data that's part of the download your data":
> Facebook contended that any data not included in this set was outside the scope of the lawsuit, ignoring the vast quantities of information the company generates through inferences, outside partnerships, and other nonpublic analysis of our habits — parts of the social media site’s inner workings that are obscure to consumers. Briefly, what we think of as “Facebook” is in fact a composite of specialized programs that work together when we upload videos, share photos, or get targeted with advertising. The social network wanted to keep data storage in those nonconsumer parts of Facebook out of court.
> Facebook’s stonewalling has been revealing on its own, providing variations on the same theme: It has amassed so much data on so many billions of people and organized it so confusingly that full transparency is impossible on a technical level.
> The remarks in the hearing echo those found in an internal document leaked to Motherboard earlier this year detailing how the internal engineering dysfunction at Meta, which owns Facebook and Instagram, makes compliance with data privacy laws an impossibility.
Facebook doesn't even want to know if the WhatsApp is leaking data.
That 3rd-party keyboard would also be able to log your Signal messages, so I don't get your point.
If the original post is true and Facebook is leaking message based data into systems that produce ads (3rd party or 1st party), they have a responsibility to diagnose and resolve the issue. Despite their responsibility to do so, they are not aligned with doing so.
Excuses like "the user did something bad" aren't productive.
A warning that the users expectations (secure communications) do not match reality (3rd party keylogged communications) seems like the minimum level of responsibility:
If WhatsApp derived information is being seen in advertisements, it is Facebook's responsibility. It is in Facebook's best (next quarter's profits based) interests to not be responsible.
Are you absolutely sure that this is still the case? You say you "used" to work on it, but modus operandi for these companies is rugpulling protections like this as soon as nobody is looking
I feel quite confident based on first hand knowledge of code, system design, and the many, many privacy reviews we had to go through when building new features to ensure we didn't accidentally log or otherwise infer data we weren't supposed to.
WhatsApp architecture is designed with the assumption that the server could be compromised and yet such an event should not result in any message contents being revealed. Furthermore, the encryption function is designed to ratchet and rotate keys so that a leak of a key at a given point in time would not compromise past and future messages.
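For readers who haven't seen a key ratchet before, here is a deliberately simplified Python sketch of the one-way chain idea. It is an assumption-laden illustration, not WhatsApp's code: the `ratchet` function, the labels `b"chain"` and `b"message"`, and the starting secret are all made up. Each step derives a fresh message key and replaces the chain key with a hash-derived successor, so a chain key leaked at some point cannot be run backwards to recover earlier message keys.

```python
import hashlib
import hmac


def ratchet(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive (next_chain_key, message_key) from the current chain key."""
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    return next_chain_key, message_key


chain_key = hashlib.sha256(b"demo shared secret").digest()  # placeholder secret
message_keys = []
for _ in range(3):
    chain_key, message_key = ratchet(chain_key)
    message_keys.append(message_key)

print(len(set(message_keys)))  # 3: every message gets a different key
```

Note the limit of this sketch: a hash chain alone only protects past messages. Protecting future messages after a leak needs fresh key agreement mixed into the chain, which is not shown here.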
So yes, I have a strong sense of confidence that message contents are not exposed to Meta and, given the bar set by privacy reviews, I don't think Meta would do some backdoor workaround like scraping the contents off the device and sending an unencrypted copy. To be clear, my claims are specifically around message contents and when it comes to certain metadata (ex. the sender/receiver, the names of groups, etc) I don't recall the exact details of how they are treated.
Now, despite the fact that I've said all this and that my knowledge on the matter is fairly recent, I'm not sure I could ever say anything with absolute confidence. The code base is huge and not open source. I obviously have not seen every line of code and as you pointed out, there's always a chance some company policy changes happened without my awareness. So I would say "highly" confident but not "absolutely" confident.
What about spell-check data?
Just because the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.
Meta has control over the app Sue uses. So they could send them to Meta unencrypted in addition to sending them to Joe in an encrypted fashion.
Or they just extract the relevant terms:
Sue->Joe: "Hello Joe, I'm so excited! We are going to have a baby! Let's call it Dingbert. You're not the father! Jim is. I hope you don't mind too much!".
Sue->Meta: "Sue will have a baby"
Insta->Sue: "Check out these cute baby clothes!"
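As a sketch of what "extract the relevant terms" could look like in client code, purely hypothetically: nothing here is taken from any real app, `AD_TERMS` is a made-up vocabulary, and `placeholder_encrypt` is a stand-in that does no real encryption. The only point is the ordering: the tags are computed on the plaintext before it is handed to the encryption step.

```python
from typing import Callable

# Hypothetical advertiser vocabulary shipped with the app (made up for illustration).
AD_TERMS = {"baby", "puzzle", "mortgage"}


def send_message(plaintext: str, encrypt: Callable[[bytes], bytes]) -> tuple[bytes, set[str]]:
    """Return (ciphertext for the recipient, interest tags for a side channel)."""
    words = {word.strip(".,!?'\"").lower() for word in plaintext.split()}
    tags = words & AD_TERMS  # computed on the plaintext, before encryption
    ciphertext = encrypt(plaintext.encode("utf-8"))
    return ciphertext, tags


def placeholder_encrypt(data: bytes) -> bytes:
    """Stand-in for the real E2E encryption step; NOT actual encryption."""
    return data[::-1]


ciphertext, tags = send_message("We are going to have a baby!", placeholder_encrypt)
print(tags)  # {'baby'}
```

The recipient's copy is still "end-to-end encrypted" in this sketch; the tags simply travel separately.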
More so, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.
She probably gave Instagram access to her photo library (not unreasonable for a photo sharing app). That means the Instagram app can scan her latest pictures in the background when it's opened. I think it's more likely that the data was leaked this way.
In case folks don’t know this: on an iPhone you do not need to give an app access to all your photos in order to use photos in the app.
Under Privacy > Photos, you can set “Selected Photos” instead of “All Photos” on a per-app basis.
Then when you go to add a photo to the app, you first go through an iOS prompt to select the photos the app will have access to. Only then do you go through the app’s photo selection dialogue.
I have all my apps set this way (or “None”).
I just did this and the UI is weird and confusing - it looks like I need to statically pick photos in the settings app, which obviously won’t work for day to day use every time I take a photo and want to publish it to instagram.
Not saying it doesn’t work like you say, just saying it doesn’t look like it does.
At least for Telegram each time you go to pick a photo to share, it offers you the chance to "add more photos visible" or you can click Manage.
I assume Instagram and friends would do the same.
I often just take the photo via Telegram instead, which automatically adds it to your photo roll and gives Telegram access to it. It works relatively well.
You can just hit “done” in the settings app and it will close (with no photos selected).
Then on Instagram (for example) when you go to post, you’ll get a message like “you’ve only let Instagram have partial access to your photos - Manage”. Tapping Manage will let you select photos that Instagram can access.
Glad I deleted my Meta apps and only use online FB when I need to.
The other day I noticed the yahoo mail app on iOS was reading my clipboard for no reason. I’m going to start blocking photos on most of my apps.
Instagram is especially malicious with this - it is the only app that REQUIRES access to my microphone for me to post something. They try to do this by having a camera inside instagram (that you can record with which would obviously require mic access) but even to post stuff I have already taken (even just photos) it wants mic access. I usually temporarily give it what it wants, post, then remove again.
Is this something that actually happens (= can anyone prove this by disassembling the app or MITMing the network traffic), or is it just unfounded paranoia?
Considering how easy it is to implement these things without anyone noticing since it's closed source, you have to assume it is happening in any scenario where you need any decent opsec. Even in scenarios where you don't, there's been enough cases of similar things happening with well-known apps and services to be wary.
> Considering how easy it is to implement these things without anyone noticing since it's closed source
You can reverse engineer those things and analyze your network traffic. You can’t have a client in a device controlled by the user, in this case an app, send anything to a server without anyone noticing it.
And frankly, they don’t even need it. Just with your contacts they can link you to your friends and common interests without even you having a facebook account, all you need is friends with a fb/ig account who have linked their accounts to their phones and use whatsapp.
> You can reverse engineer those things and analyze your network traffic.
Yes, there are people who dedicate themselves to reverse engineering apps like this, but they're few and far between, and most of them focus on either the easy fish or security vulns. Considering nobody's building public documentation on the protocols of these apps, I'll have to assume it's hard enough and changes often enough to not be worth the time of people without special monetary interests.
I agree with the rest of your assessment, there's way less "obviously malicious" ways to exfiltrate data about users than literally uploading users' pictures, since for example whatsapp stored unencrypted backups on google drive until very recently, among other things. I'm just trying to shed a light on the fact that apps like this have a lot of ways to accomplish this without raising too many eyebrows.
It should be easy to test, since iOS has a feature called App Privacy Report that lists network and permission access. And no, when you just open the Instagram app it does not access photos. Only when you open the add-to-story page or click on the new post icon does it access them.
> Considering how easy it is to implement these things without anyone noticing since it's closed source
I see you’ve never heard of Jane Manchun Wong...
I imagine the reputational and potential legal consequences would be fairly severe if this sort of privacy invasion were discovered (either by employee leak or reverse engineering). Seems unlikely Meta would take a risk like this.
Back when deep learning was first hitting "mainstream" for object recognition in images, I recall reading that Facebook was using it to look for brand logos and other signs of using a particular product, in your uploaded photos.
Turns out they were also building a database of everyone's face so they could build shadow profiles...
How did she buy the puzzle to begin with.
> my wife sent me a picture of my daughter working on a puzzle.
> her Instagram was showing ads for a store that was selling the same type of puzzle
How did she take the pic ?
I think that's an important question. Did the user take the photo within the app, thereby skipping the camera roll, or did they take the photo and then upload it to WhatsApp from the camera roll? If the latter, then as someone else said, it could be that Instagram had access to the camera roll and decided to serve ads based upon the puzzle.
I have a suspicion as well that this is what they're doing: before the message is encrypted and sent, the app (on your phone) does analysis and picks out keywords relevant for advertising. So they can claim and be technically correct that they are not reading your messages. Although if their algorithm is doing it on your phone, is it... reading?
Or they can say, technically it wasn't a message before it was sent. The dictionary definition even mentions "send".
This is definitely the most likely scenario in my opinion
> Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.
No, that's precisely what End-to-End encryption means.
That holds only for end-to-end encryption with strictly one receiver. When it's touted as a feature without explicitly stating that "all messages are sent only E2E encrypted and only to your receiver", we can't assume only the receiver is getting the message. All traffic might be E2E encrypted between people using their own keys, yet nothing stops Meta from sending a different encrypted payload to their own servers with a key they have access to.
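To make the "extra encrypted payload" idea concrete, here is a rough sketch of a multi-recipient envelope. Everything in it is hypothetical: the names `seal`, `open_sealed` and `third_party` are made up, the cipher is a toy (SHA-256 keystream XOR, not secure), and symmetric keys stand in for real public-key wrapping. It says nothing about what WhatsApp actually does; it only shows that adding one more recipient to an encrypted envelope is structurally trivial.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks. NOT secure."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def seal(message: bytes, recipient_keys: dict[str, bytes]) -> dict:
    """Encrypt once with a random content key, then wrap that key per recipient."""
    content_key = secrets.token_bytes(32)
    return {
        "ciphertext": keystream_xor(content_key, message),
        "wrapped_keys": {
            name: keystream_xor(key, content_key)
            for name, key in recipient_keys.items()
        },
    }


def open_sealed(envelope: dict, name: str, key: bytes) -> bytes:
    """Unwrap the content key with this recipient's key and decrypt the message."""
    content_key = keystream_xor(key, envelope["wrapped_keys"][name])
    return keystream_xor(content_key, envelope["ciphertext"])


# Hypothetical keys: Joe is the intended recipient, third_party is an extra "end".
keys = {"joe": secrets.token_bytes(32), "third_party": secrets.token_bytes(32)}
message = b"we are having a baby"
envelope = seal(message, keys)

print(open_sealed(envelope, "joe", keys["joe"]))
print(open_sealed(envelope, "third_party", keys["third_party"]))
```

Both calls recover the same plaintext, and an observer on the wire only ever sees ciphertext. Whether a given client adds such a recipient can only be settled by inspecting the client.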
Facebook loves to use newspeak, wouldn't surprise me if they applied newspeak to what "end-to-end encryption" means.
So it's end-to-end encrypted, but your data is sent to some "ends" you didn't think it would be sent to? Well, if that's not a good reason to end your usage of WhatsApp, then I don't know what is...
Meta own the proprietary code running at either end of the encrypted pipe. Of course they can.
They can decrypt if someone enables backups, so I see no reason they could not read them indeed.
Signal might be the only app unable to read, but even that, I would not trust.
How would you propose Signal -- or any app for that matter that provides end to end encryption -- encrypt the messages in the first place if they don't have access to the plaintext at some point?
End-to-End means that it can't be read in the middle. It does not mean it can't be read by proprietary clients on either end.
Until there are cybernetic implants, the "ends" are the app running on your phones, which they control.
The quandary of what one allows to run on those implants sounds like a chilling sci-fi novel (chilling not because "but FAANG could read your thoughts!" but because people would absolutely still get them installed).
End-to-End is about the networking, not the end points.
That is the technical definition.
So you're nit-picking over the phrasing of the sentence, but should instead focus on the spirit/meaning behind it.
It's illustrated in their example below that if you say you're having a baby, Meta can send some type of distilled ad keywords to its servers (e.g. `[mother, baby]` if it knows the user is a woman based on their name/profile, but probably more sophisticated than that). The message you sent is still technically end-to-end encrypted, though.
I addressed this just below:
Google can in theory read what is on your screen (assuming you use Android) regardless what app with what encryption you use.
Oh, come on. It's called "end to end" but it isn't. Meta has to read them to provide the service. This is not a new revelation.
I think they are extracting terms. Some of the messages generated ads that were related to a term but not really about the conversation.
> Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.
I think it does, actually: no one except them can read them. If someone else can, then by definition it's not end-to-end encryption.
> End-to-end encryption (E2EE) is a system of communication where only the communicating users can read the messages.
The conversations being e2ee do not affect the app itself from acting on contents. By definition the app needs to know the contents to display it, but it can also update your ad profile. It doesn't even need to send the whole message to meta, just the keywords triggered, or a preprocessed vector defining your interests.
E2ee means only the messages themselves can't be intercepted and read. But if anyone can actually prove fb acting on message contents, I suspect the EU banhammer would be interested.
The application processing the message for the purpose of displaying it is clear.
But if the message is copied, read, analyzed and sent further on behalf of a third party before encryption, then that puts that third party in the middle between the sender and the recipient. A man in the middle directly undermines e2ee: "no one else reads your message".
It doesn't matter if the third party made the messaging app or not. What matters is whether information in your messages is accessible to anyone besides you and the recipient.
E2EE doesn't prevent the app itself from analyzing messages locally, and sending updated interest profiles to meta... which can be a vector of weights or whatever thing they might be using to know what ads to show. If the logic is in the app, the message doesn't leave the app and E2EE is preserved.
This said, analyzing messages for the purpose of ad display is creepy, whatever the way it is done.
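For what it's worth, the on-device scenario described above would take very little code. The sketch below is purely hypothetical: the `AD_CATEGORIES` table and `extract_interests` are invented for illustration, and there is no evidence here that any real app works this way. The point is only that the plaintext never leaves the function while a tiny derived signal does.

```python
import re

# Hypothetical keyword -> ad category table, made up for this example.
AD_CATEGORIES = {
    "baby": "parenting",
    "pregnant": "parenting",
    "puzzle": "toys",
    "apartment": "real_estate",
    "latches": "home_improvement",
}


def extract_interests(plaintext: str) -> set[str]:
    """Map words in a message to coarse ad categories, entirely locally."""
    words = re.findall(r"[a-z]+", plaintext.lower())
    return {AD_CATEGORIES[w] for w in words if w in AD_CATEGORIES}


message = "We're having a baby! Also found a nice apartment."
interests = extract_interests(message)

# Only this small set would need to be uploaded, not the message itself.
print(sorted(interests))  # ['parenting', 'real_estate']
```

A payload like that is a few bytes and would be hard to tell apart from ordinary analytics traffic, which is why the argument over whether it "breaks" E2EE is about definitions rather than about the wire format.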
E2EE most certainly does exclude analyzing messages anywhere for a third party.
Notice that "ends" in "end-to-end" are users, not applications. When an application forwards things to an entity, then that entity becomes an "end" of the conversation. When it displays a message to the user, the way the user wants, then the user is the end. When it processes the message and delivers results to Facebook, the way Facebook wants it, then the application makes Facebook the "third end".
In such scenario, Facebook had intercepted the message, just chose to forward only some extracted information (which may or may not be enough to reconstruct the original). This does not match the definition of "end-to-end encryption".
> Notice that "ends" in "end-to-end" are users, not applications.
That's not right. First, it's technically impossible, since users can't do encryption themselves - it's the application that does it. That's where the e2ee boundary is.
Second, we've got e2ee communication between non-user entities as well. There are servers using, for example, ZeroTier which communicate e2ee through other nodes. Third, applications can definitely send the data to other parties automatically. WhatsApp executing backups as configured does not make it not e2ee.
It's not a distinction between softwares, it's a distinction between agents. I.e. who the software works for.
Whatsapp can't read the message on their servers but they can read it at clients, otherwise they cannot display the messages for users. Likewise, Apple/Google can read them too because they have to in order to render the texts.
This is just redefining terms, then.
We know the app decrypts it to display it. But if the app decrypts it to send it to the parent company, then it is by definition not end to end encrypted anymore.
If the app decrypts it, analyzes it and sends information about the message to the parent company, then the same thing is happening. The parent company is reading the message, INSTEAD of E2E encrypting it. It doesn't matter whether that reading happens on device or on the company's servers. E2E means the company is not reading it.
there was a time when “Unlimited” meant without any limits, but US cell carriers have redefined the term to support their business model.
It’s possible that this data harvesting ad company has redefined what E2E means (to them) to advance their business interests.
>then it is by definition not end to end encrypted anymore.
HTTPS is E2E between the client and the server.
But the problem arises, I think, is when they say they can't read them: "WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp." https://faq.whatsapp.com/general/security-and-privacy/end-to...
> Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.
so what's the point? just inconvenience. better to use telegram at this point.
…and have no encryption at all? (Unless you manually enable it for a given conversation.)
Telegram has encryption (server-client encryption). Whatsapp may have e2e encryption, but then if it sends conversation or part of them to facebook to serve advertising, that's arguably even worse.
Wait, so Telegram, which is known for being able to read all your texts, is worse than WhatsApp where people are speculating that it might read your texts?
Not that I trust WhatsApp (I use Signal) but that's an odd comparison.
You are also speculating that Telegram reads our messages in transit. For sure, unlike WhatsApp, the Telegram client is FOSS (and you can download it from F-Droid).
My guess is they encrypt the message twice, append it, and split it off at their servers. To anyone observing traffic, it looks like normal encrypted traffic AND they can still, if needed, show that everyone has their own key and can encrypt/decrypt their own messages. I don't think they would be brazen enough to send it to themselves in plain text.
In principle yes, in practice no, as this is a statement from the WhatsApp website:
> We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.
That statement was worded carefully
They are saying they don't store or forward your message text, not that your phone doesn't send them topics of interest.
Exactly. Zuckerberg cannot directly read your convo, but the app itself writing down a few keywords of interest and sending them back to Facebook / WhatsApp is not out of the question. And that amount of traffic is so tiny and could be so easily mixed in with everything else.....
Do you really trust TOS like that, though?
Assuming they're not blatantly violating the policy (which I think they've done before), it's pretty easy to weasel out of that statement by only sharing keywords from the conversation, or only sharing the info with advertisers (but not WhatsApp and Meta), or redefining what a "personal conversation" is, or carefully redefining what "end-to-end encryption" means, or ...
There's no transparency, a huge power imbalance, and terrific pressure on WhatsApp/Meta to monetize as much as possible.
Yup, I think it's just some form of analytics that profiles the user.
I've always suspected them of recording conversations, which is also why I think Android has gradually tightened permissions and visibility around speech to text/microphone/camera use.
Looking at this from a reality perspective is not very helpful.
Meta->Joe: “Focus on yourself bro”
Instead of speculating whether something like this could or could not be true, there should be a way to test it scientifically.
* Have pairs of mobile devices set up from factory configuration with WhatsApp and Instagram installed.
* Simulate conversations between each pair from select topics.
* Collect all ads from Instagram after the WhatsApp conversations from each device.
* Categorize ads to broad topics.
* Search for significant bias.
There are probably a lot of factors I'm missing here, and it's probably easy to introduce bias when there is none there. For example it's probably a good idea that a different person categorizes the ads into topics than the person handling the specific phone, otherwise the person might bias the categorization of the ads based on the conversation they had on WhatsApp beforehand. The person categorizing the ads should have no knowledge of the WhatsApp conversation that happened on the phone. The devices should probably be on different networks. There is probably a lot that I am missing here.
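The "search for significant bias" step could be done with a simple permutation test. Here is a minimal sketch, assuming you end up with one number per device (say, how many ads a blind rater judged to match the test topic). The data below is invented and `permutation_test` is just a name I picked; a real analysis would need more devices and a pre-registered metric.

```python
import random


def permutation_test(treated, control, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = sum(treated) / len(treated) - sum(control) / len(control)
    pooled = list(treated) + list(control)
    n_t = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the labels: under the null, group membership is arbitrary.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / (len(pooled) - n_t)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical counts of topic-matching ads per device.
treated = [3, 1, 4, 2, 5, 3]  # devices where the topic was discussed on WhatsApp
control = [2, 1, 3, 2, 4, 3]  # devices where it was not
observed, p_value = permutation_test(treated, control)
print(f"difference in means: {observed:.2f}, p-value: {p_value:.3f}")
```

A small p-value would mean the treated devices saw more matching ads than label shuffling alone would produce. It would not say which channel leaked the topic, which is why the blind rating and separate networks matter.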
The scarier thing to me is when ads match _conversations_ I have with my wife. I told her about this story this morning, and she reminded me about a conversation about stem cell research we had yesterday. I said something along the lines of 'I hope there is a breakthrough soon on regenerating the islets of Langerhans in the pancreas to treat diabetes.' Sure enough, she noticed an article in her Google News feed later that day related to diabetes.
Once or twice may be a coincidence. Maybe. But this happens regularly and with startling specificity.
What could be listening? I'm a technologist like the rest of you. I know apps need permissions to the mic, I know it's not easy for an app to stay in the foreground. Is it my Roku? My smart TV?
Makes one want to go full Richard Stallman.
p.s. my wife just said it would be really funny if Google News showed an article now on people worrying about their tech listening to their conversations. I'll post an update if that happens...
This may or may not apply to the anecdote you shared about your wife, but since these apps know your relative proximal location to your weak/strong social connections, they can also know what your friends may search for, which is an often-used flavor of targeting.
e.g. You and a bunch of friends go to dinner and have a conversation about <topic x>, at least one of those friends googles something about that topic. You later see an ad related to <topic x> because you were targeted based on the search your friend did while they were near you.
If your wife potentially did anything digitally related to the diabetes topic, it's likely that you were targeted based on that.
Again, no idea what happened resultant to the story you shared, but whenever this sort of thing happens to me I try to appeal to Occam's Razor based on how much I know about how this tech works under the hood.
I thought of this. Just to be clear, neither she nor I did any searching on diabetes or anything like that. I can understand this being driven by search, chats, emails, etc. (basically any type of keyboard input). But here, the mode of communication was voice-only.
Yea I've definitely had situations where I've been challenged in this way (in which I definitely was having a voice-only conversation that seemed to be targeted after the fact). It can be vexing.
It was mentioned elsewhere in this thread, but this is where Baader Meinhof may apply. I don't know how many times I've seen a targeted ad that was a "miss" in terms of recency bias. But I absolutely remember every time there was a "hit".
Both situations were targeted based on my digital behavior, but they're playing the volume shooter game: taking as many shots as possible, hoping eventually they score. This could be true in your case. The fact that you and your wife were having a recurring conversation about stem cell research, diabetes, etc. suggests that it's likely this has been in your digital fingerprint at least once in the past (recent or otherwise).
Something I try to do now, when I'm being mindful about it, is note how often I see ads that are definitely in my interest bucket but are completely uncorrelated with any recent conversations I've had. That helps at least establish an anecdotal ratio of hits-to-misses that makes the Orwellian/dystopian much less reasonable on balance.
Occam's razor, in this situation, is that the corporate entity who profits from collected data while deceiving its product into perceiving itself as a "customer" might, in fact, be... collecting data from its product to sell to its customers.
The core tenet of the principle is that the more entities you have to posit in any given postulate, the more assumptions you have to make, the less likely it is to be true.
I was making the point that I have to assume much less to make the statement that companies are not passively recording, tokenizing, and analyzing every conversation users have without their consent.
As creepy as this is whenever it happens, I remain convinced it’s observation bias. Because of how often it doesn’t happen.
Quoth Richard Feynman:
> “You know, the most amazing thing happened to me tonight... I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!”
You were gonna see ads, or news stories, or whatever. You happened to see this one. And it happened to connect to a conversation you were having. Amazing! What magic!?
Well, how many other ads or other times didn’t it connect to a conversation and you just don’t remember because it didn’t feel special? Probably many more times than it did work.
Lately I’ve been trying to find alternate explanations. Hm we were talking about going to a restaurant tomorrow. Why is my girlfriend getting all these restaurant ads on instagram all of a sudden? Oh right, because it’s Thursday, we live in a city, I searched Google for “Good Friday restaurants for a date” during our conversation, and we live on the same IP.
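A back-of-the-envelope version of the "how many other ads" point: if each ad has some small chance of matching a recent conversation by accident, the chance of at least one spooky match grows quickly with the number of ads seen. The numbers below (`p_match`, `ads_per_day`) are assumptions picked for illustration, not measurements.

```python
def p_at_least_one_match(p_match: float, n_ads: int) -> float:
    """Chance that at least one of n independent ads matches by coincidence."""
    return 1 - (1 - p_match) ** n_ads


p_match = 0.002    # assumed: 1 in 500 ads happens to fit a recent conversation
ads_per_day = 100  # assumed: ads seen per day across all apps

p_day = p_at_least_one_match(p_match, ads_per_day)
p_week = p_at_least_one_match(p_match, ads_per_day * 7)
print(f"at least one coincidental match in a day:  {p_day:.0%}")
print(f"at least one coincidental match in a week: {p_week:.0%}")
```

With these made-up inputs, a weekly "they must be listening" moment is more likely than not, and the hundreds of non-matching ads are the ones nobody remembers.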
I love the Feynman reference. Thank you. And this may be right.
However, I also am reminded of Nassim Taleb's Fat Tony, who when asked 'what are the odds of flipping a coin heads 10 times in a row?' responds with 'fuggehdabowdit. it's a hustle.' There's a scientific response, which can be very naive in some ways, and there's the Fat Tony street analysis. As I've gotten older, I tend to value the latter.
> As I've gotten older, I tend to value the latter.
Definitely a good perspective to keep in mind. But at Facebook scale, if a billion people flip an actually fair coin 10 times, around a million of them will get 10 heads. It would be understandable for those people to conclude that the coin is biased, but they'd be wrong.
Right. And 0.1% of them, 1000, will post here on HN how this can't possibly be a coincidence.
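The arithmetic in the two comments above checks out. A quick sketch using exact fractions (the one-billion and 0.1% figures are the ones quoted in the thread):

```python
from fractions import Fraction

people = 1_000_000_000
flips = 10

p_all_heads = Fraction(1, 2) ** flips       # 1/1024 for a fair coin
expected_streaks = people * p_all_heads     # people expected to see 10 heads
posters = expected_streaks * Fraction(1, 1000)  # the 0.1% who post about it

print(float(expected_streaks))  # 976562.5
print(float(posters))           # 976.5625
```

So "around a million" and "1000" are both fair roundings.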
Oh I’m not saying they’re not trying. I’m more saying that they aren’t as good as it sometimes looks.
Humans are very good at finding patterns in happy little accidents. Much better than the ad networks are in making those accidents happen. If the tech was as good as they say, we’d never ever see an ad that wasn’t a hyper relevant instant click.
And it seems unlikely that Apple is selling my iMessage conversations to Facebook. The “we searched for it, created an explicit signal, and forgot because it’s such a habit” explanation seems more likely.
Plus if a simpler method like “hey your location is at a restaurant on most Fridays, why don’t we start pushing restaurant ads on Wednesday” works well enough, why wouldn’t they use it?
I'm sorry, but the Feynman reference may be a little misleading. With respect, it is difficult to imagine Richard Feynman putting "I sent the information to the recipient via the internet, but I hope they didn't read it" in the same category as "no way they could know; must be telepathy."
It’s not “I hope they don’t read it”. It’s:
- they say they don’t.
- developers who work on it say what they’ve worked on doesn’t.
- despite this conversation being a decade old, no one has reverse engineered the client or network traffic to prove it does, afaik.
Google probably gets higher CTR by showing you many different ads that are oddly specific rather than ads that are more general. Once in a while, they will get one right, and it will feel creepy.
> Because of how often it doesn’t happen.
So sweet boxes from Kazakhstan suddenly appearing in your ads is due to observation bias...
I'm seeing a lot of people trying to rationalize and excuse this behavior from Google, but man is it a hard sell on me.
My mother recently remarked in a unique conversation that we have had a box of Golden Grahams cereal for a year and should find a recipe to use it up. She opened her phone to search, and lo and behold, the top recommendation after only two letters, R and e, was "Golden Grahams recipes". Not only had that never been a topic of conversation or search beforehand, nor did she have Google open on her phone, but you may have noticed that "Golden Grahams recipes" doesn't even start with R or e. This sparked a long conversation about how privacy really is something worth fighting for.
My only guess is that Google has the ability to listen in because we use Android phones.
It is totally weird when this kind of thing happens. I attribute it to my husband Googling and clicking on links related to our conversation with IP based tracking. I've found Instagram showing me things more related to my husband's interests if I haven't used it in a while.
It's possible some of it is simply Baader Meinhof https://www.healthline.com/health/baader-meinhof-phenomenon but it could be something deeper (or something forgotten). If my wife asks me about something, I often search or google it almost without thinking, even if I never use any results.
It should be possible to run scientifically sound experiments to show if it's happening.
It just happened right here on HN... clearly HN is spying on your conversations /s
I would set up the free version of NextDNS, add a ton of blockers, link your devices correctly to it, and then see if it still occurs. NextDNS is not going to block a lot of stuff, but if there is a significant change then at least you have an easy way to show it. Setup is relatively quick and easy.
I’m convinced WhatsApp’s e2e is BS. Because multiple times I’ve mentioned something I’ve never even googled and then had Facebook ads for those things show up minutes later.
The most notable one being renting an apartment. I viewed an apartment then sent a message to the agent requesting window grills or latches and then had adverts for that stuff straight away.
Whenever I mention this on HN I get downvoted with lame excuses as to why it happened, but none of them are plausible.
My friend messaged me saying he needed to go buy kitty litter and I get adverts for cat toys and supplies on Facebook despite not even replying to him?
Anyone who believes WhatsApp is really e2e is a fool IMO.
Did you view the apartment with…?
- the agent present
- you both had smartphones on you
- they both had bluetooth data enabled
- they are both signed into location services
- you'd added the agent to your contacts or they add you to theirs
- you took pictures (and didn't strip the EXIF data)
- you exchanged emails with the agent via gmail
- you rang each other via a VOIP service
- you used a map app to find the place
- you found the apartment via an online listing
- you found your current place some multiple of a standard contract length ago (e.g. 1 year)
- your data in aggregate statistically matches that of other people's who also looked for new apartments
The metadata just drips off and they sell it, it's repackaged, bundled up and sold on to others who then target you (personally or as part of a group) in their ad campaigns.
The agent is a friend so I had in my contact list for years.
We exchange listings to apartments to view so absolutely they know I'm looking at apartments and all that stuff.
Getting adverts for apartments is fine because I know I searched and browsed sites etc. The issue is for 2 weeks I looked at apartments. Got some adverts for AirBnb, other property sites, etc.
The particular apartment I looked at, I viewed it, walked around it, etc. Went home, pondered to myself about the apartments I had looked at that day...
Looked at facebook, etc...
Hours later I go to WhatsApp, message the agent, something like.
"I really like apartment X but the only issue is the windows don't have grills and its the 23rd floor, would you mind asking the landlord if we can get grills or window latches? If they are willing to do that then I think I'll take that apartment"
(give or take on the message as it was 3 years ago? That I took it...)
The agent had not read the message yet; in fact he didn't read it until well after 6pm.
I go to facebook (2-3 minutes?) after messaging him, and I have adverts for grills and window latches.
I had not googled them, the only mention ever of grills or latches was a message on whatsapp a few minutes earlier...
Even if they are scanning the messages on the client, as far as I'm concerned it's no longer e2e. They are scanning my personal message, analysing it, building a profile on me, and using it for advertising.
What keyboard do you use?
That is always the one that worries me.....
Stock one on iOS. Never used a custom keyboard (except when I used Android, cos the stock one was terrible ~8 years ago).
I mean the WhatsApp founders literally quit with millions left on the table over "disagreements with facebook senior leadership" and then promptly made a huge donation to Signal. If that wasn't them saying "HEY THEY ARE REMOVING E2E" without violating NDA I don't know what it was.
Whether or not this is a canary for negative impact to the integrity of WhatsApp e2e, this move was enough to move my primary communication to Signal.
Jan Koum not only had unvested shares, he was on the BOARD of Facebook and decided to quit. Along with the donation to Signal, I don't think the magnitude of this action was widely appreciated at the time.
I had a private family Slack instance, and we observed similar things. A question about buying a particular brand of soap would be spoken in the channel and ads for that exact soap would show up minutes later on Facebook. No searches or other activity, just a text query to a person through Slack.
I prefer to think that some of this is chance, or Facebook just guessing really, creepily well. But after the 3rd or 4th time we observed this same scenario play out, trust has eroded, even if I have no evidence as to how this is happening. It's time to change up the tools.
I now self-host an open source chat application, and we have not had another repeat of this kind of creepy ad invasion.
> I prefer to think that some of this is chance
Don't. People are way too tolerant of serial liars, psychopaths and rapacious corporations.
I wonder if there’s some kind of side channel leaking the ad targeting data, for instance if you type something into WhatsApp and they add words that you type into a user-specific autocomplete dictionary (or your lexicographical profile or whatever other aggregate they keep for various business purposes). Then they use those aggregates to target users based on the words they use in conversation without “decrypting their messages” so to speak. Depending on what phone you use, any custom keyboards, etc there may be other ways for typed text to leak.
Yup. Keyboards can read everything. I always turn off all the special features of the keyboard that seem sketchy, even if it's from a legit company like Google.
Isn’t that because they can read sender and recipient? So if you write to someone who does a lot of renting stuff, they believe you might like it too? Just a guess.
In all your scenarios the likely answer, even with great e2e, is that WhatsApp would still know who you messaged, and they might have googled the words at some point.
Your agent might have searched for latches after you messaged them. Your friend likely searched for cat litter before he messaged you.
In each of those cases, E2E may still be active but you’re compromised by the human equivalent of the age old analog hole.
Jane googles topics A,B,C,D
Jane sends unknown message to John
John fits the target profile of topic C
John gets shown ad for topic C
Coincidentally, Jane's unknown message was about topic C, likely because she too believed it to be applicable to John.
Maybe topic A was trending in Jane's area. Jane's friend Sally got ads for topic A. Sally talked to Jane about topic A. Jane searches topic A.
A trend emerged where people who searched topic A very soon afterwards searched topic E.
Jane is very late to hear the news about topic A. She gets shown ads for topic E. Jane thinks advertisers are reading her mind because she was thinking about topic E but didn't talk to anyone about it.
I believe that it's automated, they scan every message with some embedded piece of code (that you can't really verify and have to trust that's harmless), that looks for key words and serve ads. And don't forget that with judicial order you get 24h-real-time access to any whatsapp account in question.
It's e2e encrypted, just they have the keys and use them :)
this is the point. too much evidence against it and not enough evidence for it. i just don't trust facebook, google, apple etc.
They're not in the same category. Two of those are advertisement firms, one is a hardware company.
I don't need to believe anything, it's proven by many scientific studies.
Can you link to some?
1. Nobody is reading your WA messages, the same topics can be learned from your browsing activity or other msgs, eg. by reading your sms texts.
2. Meta is reading your messages directly in-transit, server-side.
3. Meta is not reading your messages server-side, but the Meta apps extract keywords from your conversations and request relevant ads from the ad servers.
4. Another non-Meta app is doing the above.
If 2 is true, then it is not end-to-end encrypted, and I don't think that WhatsApp is lying. They have ways of doing their things without lying, so I don't expect 2 to be true.
I think that 1 is the most plausible, however the original post is about "topics they never talk about", so assuming that WhatsApp is the only channel and they don't leak data in other ways (and there are many other ways to leak data), then 1 becomes unlikely.
3 is the most compatible. All the targeting can be done locally, so no end-to-end unencrypted message leaves the app. The app then sends your topics of interests to Meta.
4 again assuming WhatsApp is the only channel, then there is probably some malware somewhere, and it is unlikely that Meta accepts illegally collected data (they can do it legally, better, and with less potential trouble). There are however a few legitimate apps that can do the above. I am thinking about things like predictive keyboards, accessibility apps (screen readers, ...), backup apps (end-to-end encryption is about transmission, not storage), and the OS itself. I don't think Meta controls any of these, and I don't think they would buy data from them (Google and Apple are competitors after all).
So I would go for an accidental leak (case 1). For example, for the experiment to be meaningful, you shouldn't tell anyone about the test topic before you receive the ads. Or with the WhatsApp app hinting Meta about your topics of interest.
Another thing that would make 1 happen even if they think they're not leaking information over different channels, is the software keyboard. GBoard is google's, and likely has some data collection in one way or another. Similarly, there's a lot of google-related services running with root privileges on stock android phones that could easily snoop on data from various apps. This effect is worsened by other android OEMs, like xiaomi or maybe even samsung, who ship their own invasive services on top.
Disclaimer: I worked at FB, but not on Whatsapp or ads.
I agree, I'm pretty sure 2. is not the case; I just listed it as a theoretical possibility. Despite all the bad press and problems, FB has very (very) high integrity and standards, at least the parts I saw.
Another option would be that META created a small model that could be run client-side and picked the right selection of ads to show elsewhere without even exfiltrating keywords.
Well, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.
Assuming Android: your pictures may be "parsed" by Google once they make it into Google Photos. Also, Meta may think that "parsing" the images in your local Whatsapp folder of Google Photos (or all of your local images) is fair game. Note that I have no clue if this happens, I'm speculating.
That doesn’t even make sense. Why go through the trouble of retargeting based on images? If you took a photo of something you likely already own it.
> If you took a photo of something you likely already own it.
Tell that to Amazon who never fail to recommend me things I just bought.
Hey if you really liked the vacuum cleaner you just bought, you definitely want to buy another one.
You have to believe that Amazon's algorithm is working as intended though, right?
The thesis is that recent vacuum cleaner purchasers are many times more likely (than the average person) to be looking to buy a vacuum cleaner.
Apparently about 20% of Amazon purchases are returned. And most returners are looking for a replacement. Some of the replacement product research is done before the return decision is made, so you get ads even if you have not initiated a return.
As much as Amazon doesn't want you to return your purchase, they really don't want you to buy the replacement somewhere else.
It would be interesting to measure how the ad ratio changes over time. Particularly when you exit the return window, but of course Amazon will know the return-likelihood curve with much greater precision.
Yes, for sure. Now I've got a different vacuum cleaner for every room. Never have I felt cleaner before.
Wait until you find out about whole home vacuums! Then you can turn The Whole House into a vacuum!
Looking through my photos, that's not the case. It's either things I don't own but like, places I enjoyed, or photos of things I want to buy / sent to my wife to buy. In my case, the photos would be a perfect targeting opportunity.
Mindlessly retargeting you based on things you own or have already bought is standard practice in the online ad industry.
Where did your daughter get the puzzle and when if she was just working on it :)
Yes, this is a bit similar to my 3. option, but more sophisticated..
WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp. This is because with end-to-end encryption, your messages are secured with a lock, and only the recipient and you have the special key needed to unlock and read them. All of this happens automatically: no need to turn on any special settings to secure your messages.
> End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp.
This does not exclude an algorithm running in the sender's/recipient's app from scanning the content and sending suggestions to ad servers.
I thought that too, but if that is the case then it should be relatively easy to find this hidden functionality by decompiling the APK and exploring it.
> only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp
I guess the key lies in "what is sent" in the above statement. The casual reader might reasonably interpret as "no-one except the intended recipient can see _what I type_". But it doesn't say that. It only covers what gets _sent_. It doesn't say anything about what happens to the content outside specifically _sending_ it to the other party(ies).
Even if that interpretation was correct in a broader sense, which I don't believe it is, certainly in this context of e2ee it isn't correct.
The more obvious explanation is that before or after the e2ee (i.e within the app itself), an algorithm scans the content, categorizes it and sends this to Meta/Facebook.
In this scenario, *Nobody* has read the content other than the person you're communicating with.
The local app running on your phone might even legally be considered "you" since it is running on your device under your user agent. I realize that is a bit of a nefarious take but I could see that being the case.....
Maybe Meta are not as trustworthy as I imagined.
Stop calling them Meta. It‘s Facebook.
don't die on this hill, it's all arbitrary and won't make a difference
Comcast rebranded as Xfinity because they knew everyone hated Comcast. Now everyone also hates Xfinity. It's fine.
So lying by omission?
Can I not train a text classifier on encrypted text?
Basically, let the AI figure out what ads get clicked the most for a given string of encrypted 24h window of chat history. Eventually, the AI is going to hit on its “Rosetta Stone”, even without ever formally decrypting the text, much less any human reading it.
With millions of conversations happening on WhatsApp, why shouldn’t that be possible?
And it’s not even a breach, technically, because nothing ever got decrypted, and the similarity vector generated by the AI have, per se, nothing to do with the content of the conversation or the individual that sent them. Run the same training algorithm again and they’d look completely different! Hence they can’t possibly be “personal data” in the sense of the law.
No. Encryption means the data is scrambled. Essentially unrecognizable from noise, save perhaps for some headers.
If you can discern meaning from noise, then your theory would work. But discerning meaning from random noise is obviously impossible (i.e. what if there is no meaning?).
If you leak more information than you say, then the encryption is worthless. Harmful, even, because you think you have protection when you do not.
It doesn't matter how many conversations are going. With private key encryption, what a phrase encrypts to for one person would be different than the next. It would have to be trained solely on your conversation. Encryption is also dependent on all the text before it as well as text in the same block, so it would have to be the beginning of the message with no metadata to throw off alignment saying the 16-byte phrase so many times that it could pick up a difference. I'm pretty confident it's impossible to get anything useful out of that.
If any algorithm can get even one bit of data about the plaintext then the encryption is broken by definition.
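A rough way to see the point about per-key ciphertexts is a toy sketch. The cipher below is a made-up construction for illustration (an HMAC-SHA256 keystream XORed with the message); it is an assumption of this example and not the protocol WhatsApp uses. The names `keystream`, `encrypt`, `alice_key` and `bob_key` are hypothetical.

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: concatenated HMAC-SHA256(key, nonce || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream; decryption is the same operation."""
    ks = keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))


message = b"thinking about buying lawn furniture"
alice_key = secrets.token_bytes(32)  # one conversation's key
bob_key = secrets.token_bytes(32)    # a different conversation's key
nonce = secrets.token_bytes(12)

ct_alice = encrypt(alice_key, nonce, message)
ct_bob = encrypt(bob_key, nonce, message)

# Same plaintext, different keys: the two hex strings share no visible pattern.
print(ct_alice.hex())
print(ct_bob.hex())
# Only the matching key recovers the message.
print(encrypt(alice_key, nonce, ct_alice))
```

In this toy the ciphertext has the same length as the plaintext, so length is still visible to an observer even though the content is not.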
IANAL of course, but you should read the ToS carefully and you will probably find something that allows them to read your messages anyway.
e2e encryption doesn't forbid to read the messages as you type or read them or read a screenshot of the screen or whatever they can do inside an app :P
They were caught activating your camera by "error" a while ago https://www.macrumors.com/2019/11/12/facebook-bug-camera-bac...
As per the experiment you did...
We did the same experiment with a female friend a while ago. We started talking about her pregnancy (a topic we never touched, as she was single and of course not pregnant) in a group chat, specifically targeting her. Sure enough, after a couple of days her fb and instagram were full of strolley ads (but not ours) :)
We are seeing the same thing. More so, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.
The more basic explanation is that she has been served ads for puzzles like that for a while, based on previous history and maybe retargeting from the company and you guys only notice the ad because it's more salient after sending a message with the puzzle in it.
There's no reason you'd have noticed an ad about the puzzle in the wash of content and other types of ads.
putting on my tinfoil hat
yup, do you think all this image recognition stuff is made for fun? Companies want to use it to read the pics on our phones and profile us
Was the picture taken with WhatsApp, or with the camera app? Do you have Google Photos installed? I think Google Photos classifies images, maybe also for advertising?
She used the WhatsApp camera app. We have Google photos too, but we have it configured so that WhatsApp images don't get uploaded.
Are group chats also E2E encrypted?
One explanation I've heard for mysterious "We were talking about it in person but nothing else" ads, is that if you were connecting to the internet from the same Wi-Fi access point or IP address as someone else that did a web search on the topic or visited websites on the topic, it has connected you by way of shared internet connection.
Is it possible something like that happened?
In general, while anything is possible, my own Occam's razor calculation is that if someone does have a way to get through ostensibly end-to-end encrypted messages, it's going to be government actors saving it for law enforcement/national security purposes. They wouldn't "waste" it on ad targeting. And if it were being secretly used for ad targeting, so many people would know about it, people who aren't disciplined military bound by law to secrecy, that it would be quite likely to get out and no longer be secret.
My thought exactly. As an experiment I would compare talking about new topics in person only vs. on WhatsApp. Actually there is a huge explosion of variations you can actually test: verbal communication with and without devices present (in case you also suspect audio recording), WhatsApp communication but no verbal, combinations, same network and different networks, etc.
Being on the same network is not even necessary. Meta can still see who talked to whom at what time, and that would be sufficient to correlate the interests of both individuals.
How We Work With Other Meta Companies
As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:
- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products; and
So most likely WhatsApp on your phone includes an engine that reads all your incoming messages and tells Meta that you are interested in some topic X based on your recent messaging history. Meta is not per se breaking the E2E encryption, but their app contains a backdoor that reports some topic-level information back to Meta that could be used to deduce what you are talking about without totally breaking the confidentiality of your correspondence.
(All this is just a guess based on OP's report and the above quote.)
What's scarier than secretly reading messages is the idea that we are being manipulated into believing that we thought of the "random item" all on our own, instead of it being cleverly triggered by a series of manipulative ads or posts from friends.
Or, a similar idea is that ad companies don't really need to know anything about you so long as all your friends are "unprotected".
For example, you may pick "lawn furniture" as your "totally random" item to test WhatsApp. What you don't remember is that a good friend mentioned lawn furniture to you 3 days ago and just did 14 web searches on Google and FB marketplace to find some. They have strong metadata ties to you, so you get served ads on that topic too.
I am 99% sure Meta/Facebook have secretly broken WhatsApp e2e encryption by adding a second key to all users.
I have security code change notifications enabled, and around November 4, 2021 a large number of my unrelated contacts suddenly had security code changes. There wasn’t any media reporting at the time, but I remember some others mentioning it on Reddit (would love if anyone here can scroll back in their message history and look for security code changes around the same time - maybe we can finally shine some light on this).
Since then I have assumed they are flat out lying about the fact that “not even WhatsApp can read your messages” (direct quote from the iOS app).
Also note that both iMessage and WhatsApp strongly encourage you to enable iCloud backups, which are not e2e encrypted and readable by Apple (Apple only claim backups are “encrypted” and that messages are “e2e” encrypted):
At least Apple are not flat out lying like Meta, but they are still being incredibly deceptive with their marketing.
Use Signal if you care about e2e encryption. Everything else is a marketing sleight of hand.
"Also note that both iMessage and WhatsApp strongly encourage you to enable iCloud backups, which are not e2e encrypted and readable by Apple"
-> That's not completely true (at least for WhatsApp): It is possible to enable a e2e encrypted backup right in the chat-backup menu.
You are right, I should have said that WhatsApp messages are “not e2e encrypted by default”.
However, I still believe Facebook holds a second decryption key for all messages, which they rolled out along with their web access product as described above. So they are not e2e encrypted by any reasonable interpretation of the phrase.
I am not aware of any way to e2e encrypt iCloud backups, so the vast majority of “e2e encrypted” iMessage messages are readable by Apple.
Assuming you're using a phone, is there a "keyboard app" that could be intercepting things before WhatsApp? other endpoint issues with security? Not that I'd be surprised to see a big company flatly lying about their product, but because they're so big I suspect you need to work hard to eliminate other possibilities before taking them to court.
No keyboard app. We use the regular Android Keyboard provided by Google.
But doesn’t Gboard send your typing data to Google?
Yes, it does. Also, DNS blocking will not work with many GApps phoning home.
Even Swiftkey keyboard (bought by microsoft) sends back telemetry to MSFT. Try a keyboard from Fdroid, but it may not be as feature rich.
But this is Instagram and WhatsApp. Are they sharing data about our conversations openly?
An interesting experiment would be the same thing you are doing but isolated in a note taking app to discard the google keyboard you’re using. Also, it would be interesting if you can use e.g.: proxyman (available directly on iOS), or some proxy on your PC to intercept your network traffic and then try to reproduce while blocking/allowing some domains. Especially, blocking all google domains, then facebook domains, etc. If you have a Pi-hole set-up doing that at dns level may be easier.
Edit. Another idea: try to reproduce while disabling predictive text on your keyboard.
Who is they? Google or meta?
Google are definitely collecting data from gboard.
That may not be directly shared with Meta but is likely to get indirectly shared through overlapping advertising identifiers. They won't be openly sharing your text, but they will be scanning it and flagging you as having an interest in something in your text or something related to what you said, then sharing that with advertisers.
surely that just completely subverts e2e encryption??
So it's clear: your Gboard is harvesting all the input you type and then selling this data to the highest bidder. Nothing to do with WhatsApp, just good old Google harvesting everything for ads.
Facebook has an extensive history of grabbing people's data, lying about it, being caught and fined, apologising, then doing it again. So it's absolutely in character.
I'm sorry, but this feels like a highly irresponsible FUD post to me. (And I am not a fan of Facebook in any way at all, so let's put that out of the way.)
For years and years and years, there have been people claiming their voice assistant (for example) is listening in on their conversation to show ads, and so forth. And it's always anecdote, never any hard data.
And the thing is, if this were the case, it would be relatively easy to prove with a controlled experiment that other people can replicate. And yet, somehow, magically that never happens.
Sure, Google used to algorithmically read your Gmail to show you relevant ads, but they were totally open about that, and then they stopped because it weirded people out anyways.
If Facebook were mining Whatsapp messages for ad topics, they'd probably be as open about it as Google was, out of pure self-interest. Because right now so much of their advertising is about how Whatsapp is trustworthy because it's E2EE etc. So if they were secretly analyzing messages, it would blow up the reputation of their main marketing message. There's a good chance it would be business suicide for Whatsapp. A profit-driven company probably isn't going to take that risk.
To be honest, this post feels social-engineered by a messaging competitor or something. I'm not saying it is, but the personal touch ("silly little game with my wife"), the innocent questioning ("Is... or am I missing something silly?"), and the total lack of any objective evidence (e.g. screenshots of messages and ads) are all HUGE red flags.
If Meta really is doing this, it's pretty easy to prove with hard data, and that's going to become a front-page news story on the New York Times. The fact that that hasn't happened leads me to think it's much more likely there's nothing here.
Are you trying to say that FB was never caught on privacy scandals before? Did it blow their reputation?
Not like this -- stuff like Cambridge Analytica revealed major lapses in security, but none of them ever contradicted their main marketing.
To the contrary, WhatsApp advertising E2EE so publicly can be seen as actively trying to get away from past scandals and building trust.
So the idea that they would be undermining themselves at the same time doesn't make a whole lot of business sense.
I'm going to say the same thing I always say when it comes to E2EE.
E2EE does not mean anything in a world where both ends are owned by the transport layer.
I'm not saying they're doing anything wrong, you could be mistaken and information can be exfiltrated some other way.
But: Either you trust the transport layer or you don't. Saying "E2EE means the transport doesn't have to be trusted" while running a nigh-impossible-to-reverse-engineer binary on both ends, distributed by the network --- *is* trusting the network.
I think Meta is reading your messages locally on your device and showing you personalized ads from the messages that are actually on your device. It's not uploaded to Meta's servers and not in any way breaking the e2ee, because your device is one of the 2 ends. If you don't use Facebook or Instagram on your phone then no personalized ads are shown.
Everything above is supposition from something I vaguely remember but not 100% sure.
Whatsapp isn't open source. How do you know that the messages are actually e2e encrypted?
Decompilation. Reverse engineering. Network monitoring. Third-party attestations like https://research.nccgroup.com/wp-content/uploads/2021/10/NCC.... The lack of whistleblowers from within Meta itself. There are hundreds of employees working on this product.
LOL no need for elaborate MITM efforts if both endpoints are completely closed source and totally powned and corrupted.
Of course that reads backwards just as well, no need to implement complicated "end to end encryption" if both endpoints are hopelessly powned.
Every few months someone thinks they've proven that these companies are recording us via our phones, scanning our messages secretly, etc. What's more likely: that these companies are breaking the law yet keeping it so secret that not one engineer who worked on these features has spoken out, after all the whistleblowers that have come forward, or that you're so predictable they can guess what you're wanting/thinking from the content they already serve you?
For a few years, WhatsApp "e2ee" messages were stored in plain text on your Google drive backup (that's how it worked on Android. Don't know about apple). This was even stated in their FAQ.
This was as part of a FB/GOOG deal where the storage for WhatsApp backups did not count for your Google drive quota.
Recently the backups did finally become encrypted as well. With a key known to the WhatsApp app. (On Android, stored in a file called "key" in the apps local storage)
However, when you restore the backup, where does the key come from? From the WhatsApp servers, obviously.
So still, FB and GOOG together still have full access to your daily backed up messages.
And the free storage deal is still there, of course.
Please do correct me if I'm wrong and you know better.
I understand your point. Their track record regarding handling users data is bad, so egregious that in Europe laws are being crafted to force them to keep users' interests at heart.
Let me tell you about another third party that is not mentioned here. The telecom operator. Once, in 2019, I was reinstalling WhatsApp on my smartphone. I checked the Internet traffic records (outbound) with curiosity. This led me to finding out that my phone was reaching out to *whatsapp.com (Meta's servers) and Belgacom (Belgium's telephony provider). So, my phone's data was being routed through the parent company's devices and some other third party services like a national telecom company. I don't know if it's still the case nowadays, but since that day, I have more worries about their encryption and data relay practices.
I can attest I've seen this behaviour and played this game with a friend. We concocted obscure conversation points e.g "Flamingo statues" and not long after would get ads in the right ballpark of relevance on Instagram. Hard to know if it's nefarious as it could be mere coincidence or confirmation bias.
Tangential aside: it still confounds me where the business opportunity of WhatsApp resides for Meta if they "can't" get access to the data.
It's a shame this kind of thing is so hard to prove, otherwise it would be all over the media. People will write it off as 'coincidence'. "Perhaps you looked for or discussed it elsewhere".
What happened with Skype before was that Microsoft would ping any links from their servers, so it was really easy to prove it by generating a new web server, publishing it nowhere and then mentioning it in a chat. This caused some publicity and they stopped the practice. Skype didn't guarantee E2EE at that time though.
But perhaps you could do a similar 'clean room' exercise to prove it. I don't think they would break the E2E by the way but perhaps there is something calling home in the app itself.
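The Skype-style test described above could be set up roughly like this: generate an unguessable URL, mention it only in the chat, and later search your own web server's access log for it. This is a minimal sketch; the base URL, the log format and the helper names `make_canary_url` and `hits_for_token` are assumptions for illustration, not any real service's API.

```python
import secrets


def make_canary_url(base: str) -> tuple[str, str]:
    """Return (token, url) where the token is random and published nowhere."""
    token = secrets.token_urlsafe(16)
    return token, f"{base.rstrip('/')}/{token}"


def hits_for_token(log_lines: list[str], token: str) -> list[str]:
    """Return every access-log line that mentions the canary token."""
    return [line for line in log_lines if token in line]


# Hypothetical host: replace with a server you control.
token, url = make_canary_url("https://example.invalid/canary")
print("Send only this link in the chat:", url)

# Hypothetical log lines, for illustration only.
sample_log = [
    '203.0.113.7 "GET /index.html" 200',
    f'198.51.100.9 "GET /canary/{token}" 404',
]
for line in hits_for_token(sample_log, token):
    print("canary fetched:", line)
```

A hit only shows that something fetched the link. Link previews generated by the sender's or recipient's own client would also fetch it, so those would have to be ruled out before blaming a server.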
It's not hard to prove at all. Randomized controlled study organized over the phone so it's subject to legal protection from surveillance, unlike random FB messages.
If 200 people send preselected product communique over brand new devices and are 800% more likely to be targeted by ads for those or related products than a control group of the same size, then we have enough for a legal case and discovery.
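One way to check whether a difference between two groups of 200 is more than chance is a one-sided Fisher exact test. The sketch below uses only the standard library; the counts (45 of 200 versus 5 of 200) are made-up numbers chosen to illustrate a large gap, not data from any study, and `fisher_one_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_one_sided(x1: int, n1: int, x2: int, n2: int) -> float:
    """P(treated group shows >= x1 targeted people) if targeting is unrelated to group.

    x1 of n1 treated participants and x2 of n2 controls saw a matching ad.
    Uses the hypergeometric distribution with all margins fixed.
    """
    total = n1 + n2
    targeted = x1 + x2
    upper = min(targeted, n1)
    tail = sum(
        comb(targeted, k) * comb(total - targeted, n1 - k)
        for k in range(x1, upper + 1)
    )
    return tail / comb(total, n1)


# Hypothetical counts, for illustration only.
treated_hits, treated_n = 45, 200
control_hits, control_n = 5, 200

relative_risk = (treated_hits / treated_n) / (control_hits / control_n)
p_value = fisher_one_sided(treated_hits, treated_n, control_hits, control_n)
print(f"relative risk: {relative_risk:.1f}")
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value would only say the groups differ. It would not say which component (app, keyboard, OS, network) leaked the topic.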
That's still a lot harder than to send 1 link by 1 person and immediately find it grabbed :)
It really should be airtight if they're not going to weasel out of it.
I kinda doubt Meta is doing it but on the other hand they do seem to be in heavy weather right now.
It's incredibly easy to prove.
The reason it's not all over the media is precisely because it doesn't hold up.
Anybody who's dedicated to this can select truly obscure terms, fully document their private chats and full internet usage and ads shown, and show whether this effect is actually happening.
The reason we don't hear about this is because the snooping doesn't appear to be happening. So you just get a bunch of people sometimes claiming it "seems like" ads are coming from private chats, because coincidences do happen statistically so it will always happen to some degree to some people.
The reason it's not all over the media is simply because the phenomenon doesn't appear to exist, not because it's hard to prove.
"WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp"
This has always been disingenuous.
WhatsApp control the client, the client displays the unencrypted message, ergo WhatsApp can read the message.
It provably does when it interprets links and does a web page preview card.
Also... that is highly likely leaking your advert profile as even if the preview didn't then any visit to the website is outside of WhatsApp and is now tied to your IP, browser cookies, etc.
All of the above can be true without end-to-end encryption being broken or otherwise defeated on the server side.
Could it just be confirmation bias? Or could it be that the correlation is the other way around: you see something online, (unconsciously) pick that as your topic with your partner, and the ads show up because of wherever you first saw the topic.
The best way to solve this would be to pick three topics ahead of time. If you're worried about the phone's microphone, do so by pen and paper.
Then randomly select one of those three topics to discuss on WhatsApp.
Finally, keep an objective eye out for ads about all three topics. Ideally, log every single ad you see.
Better to have a computer pick random topics.
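A minimal sketch of letting a computer do the picking and the counting, under stated assumptions: the topic list is illustrative, the ad log is made up, and the substring match in `tally` is a crude stand-in for a human judging whether an ad fits a topic.

```python
import secrets
from collections import Counter


def assign_topics(topics: list[str]) -> tuple[str, list[str]]:
    """Pick one topic to discuss on WhatsApp; the rest are undiscussed controls."""
    rng = secrets.SystemRandom()  # OS randomness, so the choice is not yours
    chosen = rng.choice(topics)
    controls = [t for t in topics if t != chosen]
    return chosen, controls


def tally(ad_log: list[str], topics: list[str]) -> Counter:
    """Count logged ads whose text mentions each topic (naive substring match)."""
    counts = Counter({t: 0 for t in topics})
    for ad_text in ad_log:
        for topic in topics:
            if topic in ad_text.lower():
                counts[topic] += 1
    return counts


# Illustrative topics, written down before the experiment starts.
topics = ["flamingo statues", "lawn furniture", "steam engines"]
chosen, controls = assign_topics(topics)
print("discuss on WhatsApp:", chosen)
print("controls (never discussed):", controls)

# Hypothetical ad log, for illustration only.
sample_log = ["Buy Flamingo Statues today", "Cheap flights to Lisbon"]
print(tally(sample_log, topics))
```

If the chosen topic does not show up more often than the controls over many rounds, the effect is probably in the noticing rather than in the ads.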
While that's possible, the more plausible explanation is Meta reading your messages imo.
Is it? I feel like this would be easily reproducible and would bite them in the ass big time.
Whenever you post a link in Whatsapp, the app tries to create a card from the metadata available in the link. The procedure to make the card presumably goes through fb servers and is outside the e2e. Could they be using that?
Another hypothesis is that you are taking other steps such as searching for that topic so that you can send something to your wife those extra steps might be enabling tracking.
I often wonder this about unsolicited media text messages (which, for me at least, are almost all political) and whether they might use the iMessage automatic “preview” function to track whether or not a message has been opened/viewed. As far as I’m aware it’s not a feature you can turn off on iOS.
I’m pretty sure the iMessage previews are generated by the sender and sent with the message itself - the recipient doesn’t parse the link.
Can easily be proven by sending someone a link that only you can access (hosted on the LAN which the recipient can’t reach) and seeing if the recipient still sees a preview.
consider the following scenario
type a message -> msg encrypted -> msg sent -> msg received -> msg decrypted -> msg viewed in app
Then consider the following:
type a message -> msg encrypted -> msg sent -> msg received -> msg decrypted -> app scans content and sends classification to ads server -> msg viewed in app
Both are end-to-end encrypted.
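The two scenarios can be sketched in a few lines. This is an illustration of the commenter's argument only, not a claim about what WhatsApp's client does: the one-time-pad XOR stands in for real encryption, and `KEYWORDS`, `classify` and `deliver` are hypothetical names invented for this example.

```python
import secrets

# Hypothetical keyword-to-interest table, for illustration only.
KEYWORDS = {"tire": "automotive", "puzzle": "toys"}


def xor(data: bytes, pad: bytes) -> bytes:
    """XOR two equal-length byte strings (toy stand-in for encryption)."""
    return bytes(a ^ b for a, b in zip(data, pad))


def classify(text: str) -> list[str]:
    """Return coarse interest labels found in the decrypted text."""
    lowered = text.lower()
    return sorted({label for word, label in KEYWORDS.items() if word in lowered})


def deliver(plaintext: str, scan_on_device: bool) -> tuple[bytes, str, list[str]]:
    """Simulate type -> encrypt -> send -> receive -> decrypt -> (optional scan)."""
    raw = plaintext.encode()
    pad = secrets.token_bytes(len(raw))   # key shared only by the two endpoints
    on_wire = xor(raw, pad)               # all the relay server ever sees
    received = xor(on_wire, pad).decode() # recipient's device decrypts
    side_channel = classify(received) if scan_on_device else []
    return on_wire, received, side_channel


for scan in (False, True):
    on_wire, received, labels = deliver("Need new tires for the car", scan)
    print("scan:", scan, "| wire:", on_wire.hex()[:16], "... | labels sent:", labels)
```

In both runs the wire carries only ciphertext, so a network capture alone could not tell the two scenarios apart; the difference is in what the endpoint app reports separately.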
This is likely the explanation. I put Signal on a phone I initially used exclusively for work stuff, and the other day chatted about some car tires. Next thing we know, tire ads in the mobile browser. My main PC (which has linked Signal installed) got generic ads when I looked online, suggesting one of the apps is grabbing info from the keyboard.
Well, yeah. Of course lots of companies are targeting us for ads based on data that we lightheartedly assumed was private. It's even possible that they could do this without violating their ever-changing privacy policies. For example, they could say that the programs that review your messages don't retain any personally identifiable information. It might even be true. (Although there's no auditing of this and little or no accountability.)
Online ads ... if you've ever paid for one you know they are desperately in need of targeting. We consumers provide our info directly to folks who sell ads, under terms and conditions that we don't understand. Of course they're making use of this free resource. They'd have to be idiots not to.
So ... with respect, the wave of denial in the comments here ... 10 years ago, that would have seemed "naive but understandable." Today it's just weird. Almost like some kind of absurdist comedy. It's totally disconnected from the world we actually live in.
The app that you type your message into is controlled by Meta. They can do whatever they want with what you type and THEN encrypt the message and send it to the other person.
OMG, no, it's "END TO END." That means WHATEVER I NEED IT TO MEAN. Even if the provider promised they ARE using my messages to target ads, THAT NEEDS TO NOT BE HAPPENING.
Irony notice. This message contains irony.
What I desperately want at this point in time:
A button next to every ad that says "why me?" that details every byte of my data scraped to generate that specific ad. Was it a GPS location from an hour ago? Did you scan through my photos? Did you figure something out from my youtube watch history? TELL ME!
The small blue AdChoices triangle logo in the corner of some ads, if clicked, will sometimes tell you some of that information. Like who served the ad, whether the targeting was from the content of the page or an audience segment, etc.
That’s technically required by the GDPR - any data controller must give you all the data they hold about you upon request and be able to review or explain any automated decision. This would include ad targeting data.
The problem is that the GDPR isn’t enforced enough even at a very basic level, let alone technicalities like this.
WhatsApp make no specific claims about who this encryption is keeping you safe from. And they also require you to agree that they can use your information and interactions for their legitimate business needs. I mean, WhatsApp is standing right there when you stuff the message into the box regardless of how safe the package is in transit once it's left your phone. And consider basically every 'enhancement' to security or privacy around Facebook was done under duress for years. Pre-acquisition WhatsApp is a different story, but that story is ancient history.
I didn't agree to the recent WhatsApp or Facebook TOS, so I no longer have their products on my devices. I suggest you do the same, or just sit back and enjoy the specialised, relevant, targeted ads, but think twice before each send.
I'd ask myself another set of questions:
- can they extract information from conversations and exfiltrate it in a stealthy or obfuscated enough way so that they won't be noticed or have plausible deniability
- do they have incentives to do so (assuming the absence of liability described above)
- do they have a track record on related topics that makes you confident in the fact that they wouldn't act that way
My answers being yes - yes - no, the question of 'do they listen to target the ads they try to make me display' is pretty irrelevant to me. I can't trust them not to nor check reliably if they do.
If you try to address a different question such as 'do they really encrypt reliably to protect your conversations from being snooped on without their authorization', the threat analysis may differ. In that case they have incentives aligned with yours and are probably faithfully trying to effectively protect your/their data.
At the end, I'd estimate the probability of the scenario and how I value the consequent loss of privacy. Then accept/mitigate/refuse the risk accordingly.
What types of phones are you and your wife using? My personal theory is that a lot of these situations are actually the phone itself (or a third party keyboard app, apps copying clipboard content, etc) doing the “spying” and not WhatsApp somehow bypassing the E2E encryption.
It is unlikely that Meta is able to decrypt the messages on their side. The WhatsApp desktop client is actually very crippled by the fact that it can’t just ask someone for the messages, it is another participant in end to end encryption and needs the shared keys. If they had defeated end to end encryption, they would first have implemented the desktop client in a more functional and easier way.
But it is possible for the client itself to build a map of advertising id -> interests and send that over to meta separately. This would be similar to one of chrome's proposals.
Can someone working on WhatsApp weigh in on this or are there very restrictive NDA's? I would expect it's an interpretation of the TOS, so WhatsApp should be able to communicate if this is a possibility or not.
I work on a competing product (not for any of the named companies in this thread).
I don't think there is any fault with the e2e encryption. Humans are very prone to seeing causality where there is none, and to accidentally leaking their thoughts into the search box.
There could also be leaks with the clipboard, photo gallery or keyboard - all things that freemium apps love to scan in the background. The way real-time-bidding ad markets work, anyone that the data leaks to can influence ad ranking - doesn't have to be FB/Meta.
If you did a true blind study, I think you'd find no link.
For example, start with a list of 1,000 product images and accompanying text. Select 2 at random each day. Flip a coin to decide which to send (keep the other as control). Cover the screen so the user can't see what they've sent/received. Then, a few days later ask the user to select which product they think they sent.
I'd bet that even after months of doing this, there will be no finding of a leak.
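For anyone who wants to actually run this, here is a rough sketch of the randomization and scoring in Python. Everything named here is my own assumption for illustration: the placeholder catalogue, the function names, and the choice of a one-sided exact binomial test to check whether the user's guesses beat a coin flip.

```python
import math
import secrets

def draw_trial(catalogue):
    """Pick two distinct products and flip a coin to decide which one is sent."""
    rng = secrets.SystemRandom()  # OS entropy instead of a human "random" pick
    first, second = rng.sample(catalogue, 2)
    if rng.random() < 0.5:
        return {"sent": first, "control": second}
    return {"sent": second, "control": first}

def p_value_at_least(correct, trials):
    """One-sided exact binomial p-value: probability of at least `correct`
    right guesses in `trials` attempts if every guess is a fair coin flip."""
    hits = sum(math.comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

# Placeholder catalogue; a real run would use product images and text.
catalogue = [f"product-{i}" for i in range(1000)]
trial = draw_trial(catalogue)
print(trial)

# Hypothetical outcome: the user identified the sent product 60 times out of 100.
print(p_value_at_least(60, 100))
```

A small p-value would mean the ads are telling the user something about what was sent; a value near 0.5 or above is what you would expect with no leak.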
guess nobody from meta could comment on this without legal writing the answer for them
Why would they want to even if they could? Nobody would believe their answer.
The old joke is that if you want to increase the quality of the ads you see, google rolex and Rolls-Royce for a week
Another idea: test a null hypothesis. Block with your pi-hole or a proxy all the facebook traffic so the messages won't work. Then reproduce with the same exact behavior with your wife (it would be better if you both could get a clean identity and then repeat the exact same conversations and topics, but getting a clean identity is not that easy) so all the external factors are the same except the facebook connection itself. It could be even more granular by trying to block just enough so it won't send the messages, allowing facebook ad trackers, etc.
If you try this several times, with the messages not going through, and the corresponding ads still show up, you can be sure that it is not because they are reading your messages. That does not rule out the possibility that they might read them, but at least you can be 100% sure that your ads are not showing up because of your messages in this case.
If they do show up you could then try discarding other factors like client-side keyword analysis e.g.: talking about very generic things which are not useful to ad trackers like "how you going?", etc (ie. awkward elevator conversations), but it is harder to test a null hypothesis for client-side keyword analysis.
just pointing out - pihole can't block web traffic, only the DNS lookups. If server IP addresses are hard-coded, there's nothing pihole can do.
Yeah, I recommended Pi-hole because it is easier to set up than a proxy, and it is very easy to see if it works: just try to send messages. The last time I tried it you could block whatsapp traffic by just blocking the domain at DNS level.
Of course, if it does not work, use a proxy.
I've told you again and again that if you value privacy you don't use WhatsApp, you use Signal.
Personally I use Telegram which works fine for me and I have taken a fair amount of flak for saying it is a better choice than WhatsApp.
I'll still try again: There is more to security than protocols and algorithms. If you value your privacy, don't use a free messenger from a company with a long record of sleazy behaviour.
Well, even end-to-end encrypted in WhatsApp doesn't eliminate metadata that is being collected about you. Even then, unless you are using a keyboard that is specifically not gathering information, the keyboard you type your message with might be sending keywords/metadata forward. Combine that with the rest of the information being collected about you and whatever you speak about, isn't so secret anymore.
It's very easy to blame an application, but the problem with the modern ecosystem is that it's all very interconnected. Signal makes a point of having a setting that sends a request to the keyboard to disable personalized learning, but even that is a request. There isn't a guarantee that it complies.
Companies that deal with data will not use a single source of information, but a huge variety of sources and your smartphone is like a huge vacuum that is pulling in everything it can gather from you through any means possible.
Lastly, it could also be observation bias as others have mentioned, but to truly be able to regain control, you would need to take a variety of steps to make this change.
Pcaps or it didn't happen.
People have been claiming this for years, and yet we have never seen actual evidence. I completely understand being creeped out by the surveillance shops, and I've seen coincidences that weirded me out.
But if this is going on, then there is network traffic about it. And busting FB with real proof of audio surveillance would be a massive feather in some researcher's cap.
I don't buy it.
try an external source of true randomness for choosing your test topics. choices that seem random to you may be totally predictable.
i know that's wild, but also often true. humans are bad at randomness. there may be no direct leak at all of your test topics, they might just be guessable based on everything that is known about you, people like you and things you've been presented or looked at.
You also have to factor in confirmation bias-type effects where when you are looking for something everything seems related. If you are seeing dozens of ads a day on Instagram and suddenly you have some "random" topic in your head you will mentally connect them.
Maybe this could be counteracted by something like:
1. Generate multiple random topics and only send one across WhatsApp. Count "related" ads for each.
2. For every other random topic don't send it across WhatsApp and see if you still find "related" ads.
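One way to compare the two tallies from the steps above is a permutation test. This is only a sketch under my own assumptions: the counts are made up, the function name is hypothetical, and the test is one-sided (it asks whether sent topics get more "related" ads than unsent ones).

```python
import random
from statistics import mean

def permutation_p_value(sent_counts, unsent_counts, rounds=10_000, seed=0):
    """Share of random relabelings whose difference in mean ad count
    (sent minus unsent) is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(sent_counts) - mean(unsent_counts)
    pooled = list(sent_counts) + list(unsent_counts)
    n_sent = len(sent_counts)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_sent]) - mean(pooled[n_sent:])
        if diff >= observed:
            extreme += 1
    return extreme / rounds

# Made-up tallies of "related" ads per topic, for illustration only.
sent = [3, 1, 4, 2, 3]
unsent = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2]
print(permutation_p_value(sent, unsent))
```

If the sent topics really attract more ads, only a small share of shuffles will match or beat the observed gap. If the share is large, the "related" ads are just as common for topics that never touched WhatsApp.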
that would work. i would be utterly shocked if an experiment like that still found related ads. (you'd probably also want a prior on the baseline prevalence of various topics.)
assuming that's all true, good news is that they're not doing any totally illegal spying. bad news is, they don't have to.
> If you are seeing dozens of ads a day on Instagram and suddenly you have some "random" topic in your head you will mentally connect them.
ahh yes. incidentally that sort of behavior is often linked with the onset of poor mental health. there are some interesting questions around the limits of personalization and impacts on mental health. if the machine behaves like magic, specifically on a personal level, does that encourage magical thinking?
When they say that only "you" have access to read the encrypted message (at the end) could they be being disingenuous with that interpretation?
Take this scenario for example:
1. E2E is not broken in any way by Meta/Whatsapp. In this scenario only both WhatsApp clients (and thus you and the other person) have access to the messages. This is required for you to even read the messages in the first place.
2. The WhatsApp local instance is running on YOUR device under YOUR username / digital identity. From a legal perspective is it possible that since the app is running under your username that it is also considered "you" ?
3. If number 2 is true then it might give the local WhatsApp instance legal shield to read and do anything it wishes (locally) with the message content. And then of course this could be sent separately back to Meta/Whatsapp in a very small format easily mixed in with other traffic.
I keep being told this is a conspiracy and that several people checked on this and found no evidence of them spying on you, but I still have a hard time believing they're not doing something really sketchy.
The case that was the final straw for me was when I was chatting with my partner and remembered a funny song from my childhood, so I opened YouTube on Safari and showed it to her. A couple of minutes later, she opens Instagram (on her own phone) and the first "follow suggestion" is the artist from the song. She had never heard of this song before, much less of the artist (who is not famous at all).
I would understand if everything happened on my phone/accounts, but the suggestion was on her phone and account. I don't think they're literally listening to you, but there's definitely a GPS-based user relationship table somewhere which reflects what you do to everyone they think has some connection to you and is physically close to you.
The reply all podcast has a nice episode on this, #109. Whether your phone is listening to your conversations.
They conclude that ads are shown to people you know. So you search for a product and then your partner gets ads for this product too, as they know you spend a lot of time together through other tracking methods.
> definitely a GPS-based user relationship table somewhere which reflects what you do to everyone they think has some connection to you
I've seen this when I am sharing IP addresses with someone because we're both connected to the same WiFi. Especially if you block meta from being able to get any data about you but the other person doesn't. Meta then gets data from that other person because it can't see anything from you.
It helps when I'm looking for a gift for my partner - I can see what she's been looking at recently because it's most of my adverts in Facebook.
Pretty sure this can be explained without GPS, if you were both using the same internet connection (e.g. home WiFi NAT’ed through the same IP from your ISP).
You are both on the same wifi
Just because they’re on the same wifi doesn’t mean FB gets to break the SSL connection between OP’s (separate) client and YT to snoop on details of the video being played, to then suggest actions on their competing IG platform, on a separate device.
First of all: I don't think you're crazy. I also don't think WhatsApp is the one leaking your data.
My counter argument: I use WhatsApp all the time and nothing I talk about on it ever shows up in my ads. A hefty amount of adblock may help here, as does the fact I live in the EU where the worst tracking is illegal.
Something on your phone is probably leaking data. Most suspect are third party keyboards, accessibility apps, apps with access to your photos and videos, or even Google Assistant. Third party keyboards can easily track what you're typing, accessibility apps can parse what you're saying or typing, and Google Assistant will take a screenshot of your current screen when you invoke it.
Other options are clipboard scanning (i.e. on older operating systems) and perhaps link preview services breaking out of e2e.
Finding what app is selling your information is difficult. For starters, you don't know which device is leaking. Ad companies are smart enough to see the connection between you and your wife. Her search results alone can probably make ads appear on your device!
Also consider the Baader-Meinhof phenomenon. You can only track special topics if you track the topics of all ads and apply some statistical analysis. If you get blasted with ads all day, you'll notice the ones that you're on the lookout for. Pausing your scrolling through the app to take a screenshot will then reinforce the e-stalkers' algorithms.
If you have two old phones lying around, try repeating this trick with phones that are completely wiped, without any Google account logged in, with firewalls to block anything but WhatsApp from talking to the internet. I bet you'll find that those devices won't generate ads.
Why do I think that? For starters, enthusiasts decompile and analyse WhatsApp APK files all the time, in search of rumours and beta features to report about on tech news sites. If at some point WhatsApp added a secondary information channel about your messages (whose encryption is reasonably proven), reporters would've made a HUGE story out of it. A single line of decompiled code can send tech outlets into a frenzy of Meta accusations and let loose the EU's regulatory commissions for lying to customers. It'd be the scoop of the year!
Personally, I think "Google's keyboard or Instagram's gallery scanner is leaking my data" is a lot more likely than "WhatsApp has never been analysed enough to find the magic leaking code".
As suggested by others, I would also recommend going one step further and turn this into a proper experiment, e.g.:
- systematically record how often this happens, in contrast to other ads, and for each of the random topics
- record for each random topic if you have also mentioned it anywhere else, e.g. in a Google search or some other digital media
- make the choice of random topics more random, i.e. not depending on current moods (which might be biased through subtle, external nudges)
These are of course just pointers, and by no means a proper experimental setup.
I'm aware that this might take the fun out of your playful approach. However, you might be surprised by the results, in whatever direction. Also, it would give you a much more grounded foundation for further discussion. Of course you can just keep doing it the current, less tedious way. I'm only suggesting it because you seem to be interested in the topic and it might be more satisfying for yourselves to turn this into a little citizen science project.
WhatsApp embeds google analytics into your client, which means that it does tag all your messages while you are viewing them (and also tags them with each open/close of the conversation).
If you don't know this already, use App Warden to remove spyware handlers on Android and use RethinkDNS to block their ad domains.
What does "tag" imply? What exact data is supposedly taken by Google?
Every time I buy a new car, there suddenly are a lot of the same model everywhere I look.
Are you sending each other links or just mentioning the ads in text?
If this is just in text—and I'm definitely not defending Meta here—could it also be that the ads you see have got us so figured out already? The topic you choose to talk about may be influenced or seeded by your environment (online/offline), and one thing leads to the other almost deterministically.
Here's an experiment: try rolling a die a few times or using a random number generator to pick one word or more from a list like the EFF wordlists, and then talk about that exclusively.
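If you'd rather let the computer roll the dice, something like this would do. It is a sketch with assumptions: the 36-word list is a placeholder for two dice, the function names are mine, and a real run would load a published list instead (the EFF long list, for instance, is sized for five dice, i.e. 6^5 = 7776 words).

```python
import secrets

def roll_dice(n=5):
    """Roll n six-sided dice using the OS entropy source."""
    return [secrets.randbelow(6) + 1 for _ in range(n)]

def rolls_to_index(rolls):
    """Map dice rolls to a 0-based index by reading them as base-6 digits."""
    index = 0
    for r in rolls:
        index = index * 6 + (r - 1)
    return index

def pick_word(wordlist, n_dice):
    """Pick one word; the list must hold exactly 6 ** n_dice entries
    so that every word is equally likely."""
    if len(wordlist) != 6 ** n_dice:
        raise ValueError("wordlist length must equal 6 ** n_dice")
    return wordlist[rolls_to_index(roll_dice(n_dice))]

# Placeholder 36-word list (two dice), for illustration only.
words = [f"topic{i:02d}" for i in range(36)]
print(pick_word(words, 2))
```

The point of the strict length check is that every word gets exactly one dice combination, so your mood or recent browsing has no say in what comes up.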
I see a lot of tinfoil theories in the comments but no one mentioning "url unfurl" capabilities. You send a link, it's encrypted end-to-end, it arrives to the other side, and the link is unfurled for display. Bam, Facebook knows you sent that link.
It could be the keyboard on your phone that's essentially keylogging everything you type and sending it to advertisers. I have no idea whether this may have really happened, but it's certainly feasible.
My understanding was always that whatsapp controlled/has access to the key, so they can decrypt anything anyway. It wouldn't be surprising that "end to end" means that it includes you, your partner, and whatsapp.
it wouldn't be surprising that whatsapp gleans info from your comms and builds a profile of you, from which ads get injected. whatsapp is not selling your actual comms, but the likelihood you'd be interested in certain things/products. sort of like how the three names supposedly only store metadata of your calls, not the actual call.
Recently I was booking a trip to Finland, so my booking app showed me suggested destinations to other nearby countries like Estonia. That's normal.
Then, my friend asked me where do I want to go the most if I am to go scuba diving. I answered "Philippines". My friend then said "Maldives is also great". We never searched for anything, just casual conversation. A few minutes later I look at my booking app, guess what were the top suggestions - Maldives, followed by Philippines. Must be coincidence.
"End to end encrypted" on proprietary code base from META of all companies, with no ability to publicly audit?
I am to this day baffled by the gullibility of people believing that WhatsApp is E2E encrypted.
Messages sent are encrypted, but what about the keyboards. I know Samsung logs all key strokes for ad purposes. Also WhatsApp backups maybe? Are they encrypted? What if another app is reading off of them. Or screen time apps, most of which are selling your data and need permission to read everything on your screen to block apps and content
If the confidentiality of your messaging is a concern, you shouldn't be using whatsapp anyway or most closed source software. There's mostly no point in speculating, because it will be hard to verify the extent of information leakage from the vendor anyway.
It would be better to use signal or element, something that tries to solve the key exchange problem. And if you are even more concerned, run their respective server software on your own hardware. Then you can inspect what goes in and out.
Please note that it could also be the keyboard app sharing stuff.
I don't think end-to-end encryption means what most people in this thread think it means. Yes, it's encrypted between devices. However, when I open Whatsapp on a new device, it pulls all my message history for the last 5 years. I wouldn't be able to do so without the keys. So to facilitate that, they must be storing the keys and sending them over on their end.
And if they have the keys then they can still read your messages!
I'm pretty sure even the mics are listening to what we say since for years it's been happening that when I discuss something with my girlfriend ads about it start to show up. We don't even use WhatsApp, didn't search about the said company on Google or Facebook, nothing. We just discuss something about it with our phones at our side and suddenly ads start showing up.
The fact they are spying on WhatsApp messages isn't really surprising.
There have been a couple of studies on the topic that observed network traffic and were never able to find anything. So despite the persistence of the rumors, so far this remains a conspiracy theory.
Speech to text isn't difficult to do and within the capabilities of Meta. I even wrote a basic proof of concept on what it might look like: https://github.com/smuzani/soundrecorder
It's not too hard to transcribe speech into a few KB of text, or summarize it further on the client, and then upload that together with the rest of the data.
I'd be very interested in seeing those studies and whether they match up.
Sure, it's end-to-end encrypted, but it means nothing when Signal, Meta, Apple, Google etc. are conveniently keeping a copy of the encryption key.
They probably have the key, but decrypting takes server time and why bother? They are the ones who encrypt it and decrypt it. Just scan the plain text. Which they must do in order to encrypt the message.
In my case, I observed this was happening because of the keyboard I was using (Gboard): generally they keep track of all the words we are typing (for dictionary or suggestion purposes??) and that is used for ad targeting.
Let me share a plausible scenario which may be a bit on the conspiracy theory side :) Yes, WhatsApp uses a secure protocol which prevents anyone, even WhatsApp servers, from reading your messages. However, the WhatsApp mobile app, developed by Facebook, can see all your unencrypted text while it is typed or shown on the screen. So the ad targeting may be happening then.
I would love to see a control channel in this experiment here.
Pick two topics every time. Send one topic to your wife on Whatsapp. Write paper messages to your wife about the other topic and give it to her.
Track how often you see advertisements for topics in both channels. A significant difference in any one channel will be worth sharing.
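To decide whether a difference between the two channels is "significant", a one-sided Fisher exact test on a 2x2 table is one option. Treat this as a sketch under my own assumptions: the function name and the example counts are invented, and it only tests the direction "WhatsApp topics get matched by ads more often than paper topics".

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """Probability of seeing at least `a` ad-matched topics in channel 1
    if the channel makes no difference (hypergeometric tail).

    Table layout:
        channel 1 (e.g. WhatsApp): a matched, b unmatched
        channel 2 (e.g. paper):    c matched, d unmatched
    """
    row1, row2 = a + b, c + d
    matched = a + c
    denom = comb(row1 + row2, matched)
    upper = min(row1, matched)
    tail = sum(comb(row1, k) * comb(row2, matched - k)
               for k in range(a, upper + 1))
    return tail / denom

# Invented example: 20 topics per channel; ads matched 9 WhatsApp topics
# and 7 paper topics.
print(fisher_one_sided(9, 11, 7, 13))
```

With small counts like these, the test also makes it obvious how many rounds you need before any difference could stand out from noise.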
Prediction: they will still see ads for the topics shared in writing because the surveillance being done is based on recording the inner speech of everyone with microwave imaging and machine learning.
Encryption is useless against advanced covert radars. Big Tech knows and benefits by lying by omission.
"E2EE" -> "And during encryption we extract/share non-identifying personal information that we sell to ad networks (keywords or vectors from your conversations). But don't worry, there's no way to identify you based on this data."
Ok, so you have the data now. You seem to be a person who cares about privacy, so why do you still have a WhatsApp account?
Meta owns the code. You type messages into a little text box, Meta (the whatsapp app) takes that plain text and encrypts it. There's no way that they do not have access to each and every single message you type for enough time to collect data on you.
And, given it's Meta, there's no way they are not doing this.
That doesn't mean that they are collecting data related to the messages themselves; if you can get proof of that, you'll make a ton of money! Whatsapp doesn't operate in a regulatory vacuum, and they would be in big trouble if they broke their E2EE/privacy promise.
I realize that real life is more complicated than this, but if you tell companies "privacy is important to people" and "you will make more money by reading their messages", it is safe to assume companies will assure you that your data is private while using it to make money.
How do you know you have picked the topic at random? Maybe you've subconsciously seen adverts for that thing and that is why it is salient to you. Once you write it down, it becomes actually conscious and you start noticing the ads.
I had a video call with my mum on signal from her android phone to my macbook. Then I got very specific video recommendation in YouTube about our conversation within the same day. Her android is oppo. Could it be leaking the signal call and then cross match me with the phone numbers to my google account?
Yes, it’s possible. If you have an android device, it’s possible your device (or a ‘bad app’) is sending signals too.
An easy way to test is by talking about something (but not researching it) that’ll put you “in market” for advertising targeting.
You must say things to show you’re in-market. Eg. “I really want to buy a new AWD pick up truck” — include some brand names and specifics, which will give the spy device more confidence about the ads to show you, something like “Toyota Tundras seem better than the Ford F150, but maybe I should get the Hummer EV truck”.
Try variations of this “conversation” a few times over 2 days to help give the ML model confidence. Then, monitor ads on social media, pre-roll ads and new ephemeral recommendation tiles on YT that suddenly feature videos on this topic.
More likely something on your end was listening, if you’re the one who got the ad.
My wife and I play a similar game .. except we don't actually exchange messages, we just talk in private about something. Next we'll try just thinking about it, and see if that works too!
Most users activate WhatsApp's suggested "backup" feature which uploads all chats and those are most likely not E2E encrypted. And of course Meta could just directly read and upload stuff from the app, they probably buried a vague clause for that somewhere in their agreement.
Those backups are sent to Google and Apple, not to Meta.
They are E2E encrypted now in recent versions.
No, you have the option, disabled by default, to enable backup encryption. And there are two ways to enable it: one with a short password, in which case Meta can always decrypt the backup themselves, and another with a 64-digit key, which AFAIK is unaudited.
This is just https://www.wired.com/story/facebooks-listening-smartphone-m... in a different medium.
I’ve seen this with telegram on iPhone. A few weeks ago I had a drastic argument with a female friend on telegram and immediately Instagram (meta) started showing me breakup memes and Rumi quotes. There were other recent examples. I don’t know what to make of it.
Both E2E encryption of messages and data processing for ads can happen at the same time. The app can do it on your phone and just send keywords to a server. In fact, this is the absolute best way for apps like these to do it.
You and your wife can both install Signal and play the same game. Then you can discard that Facebook is snooping on your messages. And you can think of bigger conspiracies which is always fun.
I always wondered how the business decision is made between charging a dollar per year for no information sharing vs. being free with some information sharing. Does anyone know of a messaging app that uses such a model?
It's probably not WhatsApp. It's maybe your keyboard. Gboard (default on many Android phones) for example is sending words by default back to Google if you do not turn it off.
IMO keyboard apps, ISP, clipboard fiends are to be blamed for the most part
WhatsApp is end to end encrypted, but it can still detect what you are typing and can use that info to target you. It just can’t detect anything once you click send.
I was under the impression that Google keyboard uploads everything you type to the cloud for 'product improvement' purposes. Why do you think WhatsApp is to blame?
Call me paranoid, but what about stuff that you just say when around your phone? This has happened to me a few times (ads showing up about relevant obscure topics), so I'm wondering.
I think that's paranoid. Looking at the state of speech recognition, they have a hard time even when actively listening to a clean source. Extending on this, they have more productive, lower hanging fruits than this.
Just uninstall the apps.
Uninstall Google too while you're at it. Click.
Use antitrust to break up the tech conglomerates into individual companies and so avoid collusion of this sort?
There might still be collusion but it'd likely be far more transparent.
The message might be indeed end-to-end encrypted but the local WhatsApp installation could extract keywords as you type from the message and send them to Meta.
The android keyboard is also capable of reading everything you type, especially if you install a third party one.
Meta can still run on-device code to extract meaning from messages and pass on the metadata to the Ads backend.
make something up.
send messages extolling the utility of brrlftz. discuss how everybody not taking advantage of brrlftz will miss out. let your SO know that you need as much brrlftz as can be produced and delivered.
keep your eye out for cheap imitations offered to you.
How are you coordinating the topics? Maybe by voice within earshot of a phone with Meta apps?
Your silly game is what watchdogs should be playing.
Sadly, they just never seem to be up to the task.
> Is WhatsApp scanning personal messages to target their ads as we are noticing?
Three letters...begins with Y
I literally mentioned Vitamix in a single WhatsApp call and started getting Vitamix ads on IG. Note that I've never searched for a blender, and I don't really want one; this is the only occurrence of Vitamix in... years of my conversations.
So, I'm guessing that they not only read your messages, but also run speech-to-text on your calls and serve relevant ads.
A lot of users back up the WhatsApp encrypted messages to Google Drive unencrypted.
If it is all encrypted, how come the police always have access to WhatsApp conversations?
We do not retain your messages in the ordinary course of providing our Services to you. Instead, your messages are stored on your device and not typically stored on our servers. Once your messages are delivered, they are deleted from our servers. The following scenarios describe circumstances where we may store your messages in the course of delivering them:
Undelivered Messages. If a message cannot be delivered immediately (for example, if the recipient is offline), we keep it in encrypted form on our servers for up to 30 days as we try to deliver it. If a message is still undelivered after 30 days, we delete it.
Media Forwarding. When a user forwards media within a message, we store that media temporarily in encrypted form on our servers to aid in more efficient delivery of additional forwards.
We offer end-to-end encryption for our Services. End-to-end encryption means that your messages are encrypted to protect against us and third parties from reading them. Learn more about end-to-end encryption and how businesses communicate with you on WhatsApp.
> End-to-end encryption means that your messages are encrypted to protect against us
So, they say the protection is there once the encryption has been applied. They say nothing about what happens to the content before or after that on the end user's devices. That handling is however covered by other legitimate use clauses in the privacy statement. This covers keyword scanning for targeted ads (so a defence lawyer will say at some point.)
Yeah, that could be tested by reverse engineering the app and analyzing your network traffic to see if there are requests leaking keywords
It's hard to encrypt "all". "End to end" encrypted would mean that a message is encrypted while in transit. "At rest" encryption would mean that the plain text message is not saved on storage, just the cypher. Software however can behave in a way that all of these are true, yet, the plain text is readable by a third party. For example, the app can send the plain text after you decrypted it. Or the encryption could have multiple keys that can decrypt; you can have a private key for yourself, and the authorities could have a master private key that decrypts all. Your phone could have a middleman between the touchscreen and the application, and so, your "keystrokes" can be stored and sent somewhere, for example, by your keyboard application.
That's nothing. I talked to my wife in bed about sex and then immediately received an email selling viagra and cialis. It's crazy, but I think gmail must have put a microphone in my pillow. It's the only way.
I was sleepy so when I woke up I tore apart the pillow after brushing my teeth but there was nothing there. They must have taken it out surreptitiously when I was in the bathroom.
I know it wasn't my wife because I put her phone in a Faraday cage to stop her from using the Internet when I'm home. Unless... now that I think about it. Unless she's secretly working for Facebook. She said a friend of hers could get her a Portal webcam. No one buys those things unless they're working for Facebook!
Yeah but tough luck trying to prove it.
Fuel the monopoly, suffer the monopoly.
Sounds like confirmation bias?
Which keyboard are you using?
they are using gboard, so it's pretty clear nothing to do with whatsapp but google doing google things
What was the topic?
Have you considered the possibility that Facebook is listening to you via the chips Mark Zuckerberg added to the COVID vaccine that he forced you to take?
Just do your own research, man...
From first-hand authority, some encrypted/secure apps have client-side feature detection. Leaking features doesn’t count as violating a standard. May be running something like a named entity recognizer or keyword/category recognizer that is “sufficiently” anonymized. This on photos too; of course this can be adjusted for device parameters, battery availability, and geolocation. I cannot speak to WhatsApp specifically, but note this directly from at least one other popular messaging app. I would absolutely assume no encryption exists. Rats in the opera house.
All these corporation apologists... why wouldn't Meta/Fecesbook do this? You can read about corporate over-reach on a daily basis - and that's just what we find out about. Here are some of today's privacy headlines on The Register, right now:
Significant customer data exposed in attack on Australian telco - Subscribers have questions – like 'When were you going to tell us?'
Boeing to pay SEC $200m to settle charges it misled investors over 737 MAX safety - Ex-CEO also on the hook for $1m after skipping over known software issues
Privacy watchdog steps up fight against Europol's hoarding of personal data - If you could stop storing records on people unconnected to any crimes, that would be great
Meta accused of breaking the law by secretly tracking iPhone users - Ad goliath reckons complaint is meritless – but it would, wouldn't it?
Federal agencies buying Americans' internet data challenged by US senators - Maybe we don't want to go with the netflow, man
.. and I'm only halfway through!
Did you think the rule of law is there to protect you? Do you think corporations won't break the law to get access to information? Have you learnt nothing?!
It's very unlikely. Not because FB is a kewl company with super kewl morals.
If you look at it through the lens of game theory, the employees are extremely incentivised socially, ethically and financially to leak it.
First, they didn't break E2E, because it's very hard to do that without people knowing.
So now we are talking about a "soft break" where they search/send keywords before E2E kicks in. They wouldn't be able to do that without quite a few employees knowing.
Let alone those super nerds who spend insane amounts of time reverse engineering these apps and spoofing network requests just to see wassup.
Completely agree, there's a ton of incentive for any security researcher to find E2EE leaks on the world's leading messaging app. In addition to that, Meta would be in a ton of trouble from a legal point of view if they did.
The easiest explanation would be your keyboard leaking data. What keyboard do you and your wife use?
There is an app called TransferChain. https://transferchain.io They say everything is encrypted and no one can read, modify or sell the data because it is heavily encrypted and the metadata is on a blockchain. It is a blockchain project and seems very legit. Does anyone have any comments about it? Could you guys give me some feedback, please? My Meta apps all read and analyze my data :( Thanks.