Closed Bug 1105376 Opened 10 years ago Closed 9 years ago

Support negative adjacency of tiles to not show a tile in the context of a visible site

Categories

(Content Services Graveyard :: Tiles, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Mardak, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [story])

Some clients wouldn't want an enhanced tile shown in the context of undesired sites. It sounds like this blacklist will be on a per-tile basis? The server needs to allow for this blacklist, and the client needs to decide when an enhanced tile can be shown.
Depends on: 1105381
We'll want to support negative adjacency for sure for suggested tiles, and if supporting it for enhanced and directory is not too much more effort, we should try to cover those too.
Summary: Allow enhanced tiles to be conditionally shown → Support negative adjacency of tiles to not show a tile in the context of a visible site
Blocks: 1140185
(In reply to Tanvi Vyas from dev.planning reply)
> I'm not sure you can get around leaking data about tiles that would have been
> shown next to the movie trailer (but are hidden). Perhaps this isn't
> something Tiles should be asked to support. Do publishers guarantee that
> certain ads won't also show up on webpages next to the movie trailer ad?
> For example, a steakhouse ad next to an ad promoting vegetarianism?

jterry, do you know what a steakhouse advertiser would do in the context of a vegetarianism website? I would assume that advertiser just wouldn't run ads on that website in the first place. But maybe a better scenario is that the steakhouse wants to advertise on a major publisher, and there happens to be a whole section on food, so it would seem like a good place to put the steakhouse ad. But perhaps there's a specific article talking about vegetarianism. In either case, I would guess the advertiser sees it as wasted dollars. But is not showing a movie trailer next to an illegal file sharing site more about protecting the brand image?
Flags: needinfo?(jterry)
The steakhouse / vegetarian example was the best example I could think of quickly. But publisher pages have lots of ads. There are bound to be two ads that conflict with each other. Do publishers and ad networks have heuristics to make sure that doesn't happen? My guess is that they do not. The illegal file sharing site came from the user's history right? As a top visited site? Why would showing a movie trailer ad damage the movie producer's brand?
There are companies in the adtech world that provide services that help advertisers not run in "dangerous" places. For example, an airline wouldn't want to run ads in the context of an article about plane crashes. I don't think there's technology to check what other ads are running on the page, as a given site could use two completely different networks. Or maybe an even more common example is that ads are placed independently of whatever user comments might appear on the page. jterry can provide a better explanation of why an advertiser would want negative adjacency in the first place. I did a quick search and found this article that does mention your exact problem of two different ads conflicting: http://www.adweek.com/news/advertising-branding/problem-online-advertising-there-are-so-many-problems-139071 "Regardless, advertisers need to be diligent about protecting their brands from sites they aren’t OK with, despite using technology designed to prohibit undesirable adjacencies."
For v1 of the negative adjacency feature, we will filter out adult-related content from appearing with Suggested Tiles. In subsequent versions, we will expand the scope to include other common blacklist topics such as illegal drugs, alcohol and tobacco, illegal gambling, guns and weapons, and other content topics that do not align with the Mozilla Acceptable Use Policy: https://www.mozilla.org/en-US/about/legal/acceptable-use/ I'll create a sub-bug defining negative adjacency related to adult content.
Depends on: 1159884
maksik was asking for my opinion on the implementation of this feature. First, here are my thoughts on the feature in general: It seems like both relevancy and adjacency are getting conflated in the discussion, but there are subtle differences. I personally think improving relevance is more important to users and advertisers, so I would prioritize that over adjacency. The steakhouse example is more a matter of relevancy, in that we shouldn't be showing an enhanced tile to a user at all if vegetarian sites are frecent for the user (regardless of whether they're on the new tab page at all). Neither the user nor the advertiser benefits if we show the steakhouse tile in this case. I'm not really convinced that adjacency would have much impact in the context of tiles, since I think that users consider each tile as an independent page, and I'm not sure it's worth the large effort (in software development and list maintenance) to implement negative adjacency. Unlike relevancy, negative adjacency doesn't benefit users, only advertiser interests, if at all.
Adding a few more relevant points from the conversation:
1) We do not want to store porn sites in the clear on the user's machine: if a user's computer is scanned by the TSA at the airport, the user may be in trouble without even knowing why.
2) One solution is to hash the domains of bad sites and create a massive JSON object. The client hashes the domains of newtab-viable history tiles and checks that JSON object for inclusion. The size of the JSON object is not trivial if the blacklist of hashes is long.
3) Due to potential size issues, the use of the "safe browsing" infrastructure should be considered. Relevant links are below:
https://mxr.mozilla.org/mozilla-central/source/netwerk/base/nsIURIClassifier.idl
https://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/SafeBrowsing.jsm#150
"Safe browsing" already downloads a long list of attack sites, stores and hashes it, and provides efficient lookup to parties wanting to check a URL against "unsafe sites". However, if we do remote updates, we may need to clear this with privacy, since remote updates and tile reporting make possible a privacy attack where Mozilla feeds an evil blacklist (for example, a single-site blacklist) to clients, then configures a tile to avoid negative adjacency, and watches clients stop reporting tile views.
-----------------------------
This feels like a bug that may be really involved; we may want to rethink whether this needs to be in 39. We should check the size the object may take.
3) They would like us to use the "safe browsing" infrastructure to download lists, which makes the scheme updatable, hence a potential privacy leak.
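To make the hashed-list idea from point 2 concrete, here is a minimal sketch of how a client might check visible new-tab domains against a list of hashed domains. It is only an illustration under stated assumptions: the function names, the choice of SHA-256, the normalization step, and the `.example` domains are all hypothetical and not taken from the Tiles implementation.

```python
import hashlib


def hash_domain(domain: str) -> str:
    """Return the hex SHA-256 digest of a normalized domain (assumed scheme)."""
    normalized = domain.strip().lower().rstrip(".")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_hashed_blacklist(domains):
    """Hypothetical server-side step: ship only hashes, never plain-text domains."""
    return {hash_domain(d) for d in domains}


def can_show_tile(visible_domains, hashed_blacklist) -> bool:
    """Hide the tile if any visible new-tab site hashes into the blacklist."""
    return not any(hash_domain(d) in hashed_blacklist for d in visible_domains)


# Placeholder domains for illustration only.
blacklist = build_hashed_blacklist(["undesired.example", "blocked.example"])
print(can_show_tile(["news.example", "mail.example"], blacklist))
print(can_show_tile(["news.example", "Undesired.Example"], blacklist))
```

Note that plain hashing only obscures the list; anyone can test a guessed domain against it, and the set grows linearly with the number of entries, which is the size concern raised above.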
(In reply to Matthew N. [:MattN] from comment #6)
> It seems like both relevancy and adjacency are getting conflated in the
> discussion but there are subtle differences.

We are getting at relevance with the Suggested Tiles functionality where we try to do a positive match based on browsing behavior. This discussion around negative adjacency is primarily for the advertiser's benefit (or more for Mozilla's benefit in that the advertiser would not work with us if we don't support it).
(In reply to Ed Lee :Mardak from comment #8)
> This discussion around negative adjacency is primarily for the advertiser's
> benefit (or more for Mozilla's benefit in that the advertiser would not work
> with us if we don't support it).

I still think we should push back on this since we're in a different situation than most online ads where advertising is beside content. In our case the ad itself is content and I think each tile is considered separate by most users. i.e. I can understand that a major brand wouldn't want their ad appearing on an adult website but appearing beside one in a tile is different (and much less of a problem) IMO.
Maxim already mentioned most of the technical aspects that we discussed in comment 7.

Re: 1) I definitely don't think we should store undesired sites in plain text on the user's hard drive, since the list could be misunderstood to imply that a user had visited or had some interest in those sites.

For 2 & 3) I definitely think we should consider re-using the Safe Browsing infrastructure like we did for tracking protection, as it's already optimized for this type of data, is already accessible to JS, supports multiple classification lists, and gives out-of-cycle updates for free. It also supports chunking of the data for download and partial updates.
* We have already implemented our own server: https://github.com/mozilla-services/shavar
* I believe we're following this format: https://developers.google.com/safe-browsing/developers_guide_v3
* We can use these APIs from JS already: https://mxr.mozilla.org/mozilla-central/source/netwerk/base/nsIURIClassifier.idl
* See the tracking protection related code in https://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/SafeBrowsing.jsm for an idea of what you would do for each undesired content list.
I propose to run a telemetry experiment to test the merit of having porn-related data in FX, even in the form of a Bloom filter or hashed list of offensive sites. My arguments are:
1) As a user, I do not want to have anything to do with porn, much less have my browser keep a list of porn sites (in whatever form), or check whether I visited a porn site and make decisions based on that. And certainly NOT send any information to a backend server that may reflect whether I visited a porn site.
2) Even if we know that our scheme is benign, privacy protecting, and provides deniability, this feature may still cause a shit storm when the general community discovers that FX is watching people doing porn. I feel this danger is not yet quantified, nor have we discussed or measured the potential reaction of Mozillians to it.
3) The gain of this feature could be non-existent. Porn tiles are VERY embarrassing: I would expect users to immediately block them if they appear. Which raises the point: do we know the merit of implementing such a controversial feature? If 99.9% of FX users never have porn sites in newtab, the feature is UNNEEDED.
So, let's run a telemetry experiment protected by RAPPOR that counts newtabs from a hashed list of porn domains. If the count is minimal, we simply report to advertisers that porn tiles are extremely uncommon in newtab, hence there is no need for negative adjacency.
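For readers unfamiliar with how a privacy-protected count of this kind could work, the sketch below uses basic randomized response, the building block that RAPPOR extends. It is not RAPPOR itself, which adds Bloom-filter encoding and both permanent and instantaneous noise. All names, the 1% true rate, and the 0.5 truth probability are assumptions for illustration only.

```python
import random


def randomized_report(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """With probability p_truth report the true bit, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth: float) -> float:
    """Invert the noise model: observed = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


# Simulated population; the 1% rate is an arbitrary assumption, not real data.
rng = random.Random(0)
assumed_rate = 0.01
population = [rng.random() < assumed_rate for _ in range(100_000)]
reports = [randomized_report(bit, 0.5, rng) for bit in population]
estimated_rate = estimate_true_rate(reports, 0.5)
print(f"estimated rate: {estimated_rate:.4f}")
```

Each individual report is deniable because it may be a coin flip, yet the aggregate estimate converges on the true rate as the number of reports grows.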
(In reply to maxim zhilyaev from comment #11)
> I propose to run telemetry experiment to test the merit of having porn
> related data in FX

That's fine, but we need to implement something in the meantime. We already have existing telemetry data that we can use to estimate the number of impressions that resulted in a porn site being shown on new tab.

> 1) as a user I do not want to have anything to do with porn, much less my
> browser keeping list of porn sites

That's nice, but the "anything to do with porn" data exists in the browser already. Safe browsing has data that is related. If you want to get more theoretical, any search engine that is part of Firefox is quite related to porn. You could argue that private browsing is quite related to porn, or the browser itself, being able to navigate to any website including porn.

> 2) even if we know that our scheme is benign, privacy protecting and
> provides deniability, still this feature may cause a shit storm when general
> community discovered that FX is watching people doing porn

Would you rather the blacklist not be solely porn sites? We can work on messaging if necessary. To be clear, users who browse to porn sites are nowhere close to invisible -- their activity is quite available to 1st party and many 3rd party servers. That doesn't make it okay, but there probably isn't much gained/lost in perceived privacy.

> 3) The gain of this feature could be non-existent. Porn tiles are VERY
> embarrassing: I would expect users to immediately block them if they appear.

Existing data shows that people do not block them immediately. I'm quite confident there are users who have porn tiles pinned. There are plenty of Firefox features that only trigger for a small percentage of users -- basically any feature behind a pref, and anything that happens in the exceptional case, but an implementation needs to handle it.
As mentioned before, implementing this feature is primarily for Mozilla's benefit, and at this point it is quite critical for Suggested Tiles success and for overall goals.
(In reply to Ed Lee :Mardak from comment #12)
> As mentioned before, implementing this feature is primarily for Mozilla's benefit, and at
> this point it is quite critical for Suggested Tiles success and for overall
> goals.

We need to be careful not to implement a feature that benefits Mozilla but is detrimental to our users. (If I'm not mistaken) one of the goals of the tiles project is to show advertisers that it is possible to place privacy-preserving ads, and to hope that some advertisers learn from our techniques and start adopting more privacy-preserving practices. Are advertisers not understanding the distinction between adjacent tiles and in-content ads? Adjacent tiles are similar to adjacent ads, which (as noted in comment 4) is something that ad agencies can't control. Is this a critical feature for many advertisers, or just one or two that we can live without?

Alternatively, is there a way to not show the tile while not reporting back to our servers that we didn't show it? Perhaps we could account for this by assuming that X tile placement attempts actually resulted in 0.9*X actual tile placements.
We aren't reporting back that we aren't showing a tile, so if you believe that's the source of "detrimental to our users," then it shouldn't be an issue here.
(In reply to Tanvi Vyas [:tanvi] from comment #13)
> Alternatively, is there a way to not show the tile while not reporting back
> to our servers that we didn't show it? Perhaps we could account for this by
> assuming that X tile placement attempts actually resulted in 0.9*X actual
> tile placements.

Never mind, I see this is covered in the dev-planning thread.
(In reply to Ed Lee :Mardak from comment #14)
> We aren't reporting back that we aren't showing a tile, so if you believe
> that's the source of "detrimental to our users," then it shouldn't be an
> issue here.

Yes, that is what I was referring to. If we aren't reporting back to the server, then we aren't leaking data about the user's browsing habits! Using a Bloom filter and the Safe Browsing API sound like great ideas!
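As a rough illustration of the Bloom filter idea mentioned above, the sketch below shows a compact membership structure that stores no domain names. The class, its sizing parameters, and the salted SHA-256 hashing are assumptions made for this example, not the structure used by Firefox or Safe Browsing.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive one bit position per hash by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Placeholder domains for illustration only.
undesired = BloomFilter(num_bits=1024, num_hashes=3)
for domain in ["undesired.example", "blocked.example"]:
    undesired.add(domain)

print("undesired.example" in undesired)
```

The trade-off is that a false positive would occasionally hide a tile next to a site that is not actually on the list, in exchange for a fixed-size structure that does not list any site in plain text.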
All bugs this feature/[story] bug depends on have been fixed. There are some followups, e.g., switching to safebrowsing in bug 1164303.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Flags: needinfo?(jterry)