Friday, July 29, 2011


The Workshop on Internet Tracking, Advertising and Privacy (WiTAP, if you get the pun) took place at Stanford on July 22.

Dan Boneh (whom the organizers credited with quarterbacking this meeting) opened the workshop and introduced the amazing crypto+security group at Stanford, along with its seminar series, mailing list, and links to course notes and certificate course programs.

Bala described his work on busting privacy issues via a series of papers, going back to '06, that collect and analyze data. He said that people don't read papers, but they do read the WSJ. He talked about things getting worse with each paper, from web to social to mobile. Users assume they interact with first parties (web sites, social networks, telephone companies), but if you dig deeper, for a variety of reasons, second, third and other parties intervene in the conversation between the user and the first party. One reason for this leakage is that first parties outsource data collection and analysis for operations, mining, revenue, whatever. He talked about the mechanics of leakage: Flash cookies that are hard to delete and that respawn; HTTP headers that leak the referer, cookie and request URL; external applications; mobile social networks that leak presence, location and device ID; etc. Beyond leakage, the next development may be that aggregators join across data sources and, beyond that, fingerprint browsers. What can one do? Bala pointed users to the FTC 2010 report on privacy for more awareness. He also pointed to Acquisti's work on separating offline and online identities, as well as on the economics of privacy. Among the questions: the research community knows about these leakages, so why doesn't the public follow? A: Maybe the tap in the UK is a turning point. Another: privacy expectations are culture specific. A: A study has to correct for countries (China, Korea, US).
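One of the leakage mechanics above, the Referer header, can be illustrated with a small sketch. Everything here is hypothetical and made up for illustration (the URLs, the parameter names, the `tracker_id` cookie, both helper functions), and the header construction is heavily simplified. The point is only that whatever the first party puts in the page URL is readable by any third party whose resource is embedded in that page.

```python
from urllib.parse import urlsplit, parse_qs


def third_party_request_headers(page_url, cookie=None):
    """Simplified view of the headers a browser attaches when a page
    embeds a third-party resource (hypothetical helper)."""
    headers = {"Referer": page_url}
    if cookie:
        headers["Cookie"] = cookie
    return headers


def leaked_fields(headers):
    """What the third party can read out of the Referer header alone."""
    parts = urlsplit(headers.get("Referer", ""))
    query = parse_qs(parts.query)
    return {
        "first_party": parts.netloc,
        "path": parts.path,
        "params": {name: values[0] for name, values in query.items()},
    }


# Made-up first-party page URL that carries a user ID and a search term.
page = "http://social.example/profile?user_id=12345&q=diabetes"
headers = third_party_request_headers(page, cookie="tracker_id=abc")
print(leaked_fields(headers))
```

Combined with the third party's own cookie, the same tracker can tie such fields together across every site that embeds it.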

Ed Felten spoke next. He mentioned his personal goal to "write a few lines of code every day" (in my case, when I was a graduate student, the mantra was "learn one new lemma a day"), but talked mainly about his policy work at the FTC on monitoring businesses. He explained how the FTC can enforce: deceptive use (misleading claims or use by companies) is easy to regulate, unfair use is harder to define, and in general, before an issue can come up for FTC enforcement, researchers have to go beyond research and build support for their case among the general population.

One of the anticipated talks was by Omar Tawakol of BlueKai. He spoke about how context and sites are no longer the key in Internet advertising, and user data is the new fuel. He stated the premise of his company: data is too much in the shadows, and consumers are smart and can handle the complexity of data. He also described how they support opt-out cookies, open source and so on. How much will users pay to go ad-free? Little. He made a technical argument that the plumbing you need for (a) user conversion tracking, where one measures whether a user actually bought/converted some time after seeing an ad, (b) retargeting, the most effective form of advertising after search, where a user seen at one place is targeted elsewhere, and (c) frequency capping, where an advertiser limits how many times a user sees an ad, is all the same. (a) and (c) are very legitimate uses for advertisers. Does that justify (b) too? Without the 3rd party cookies behind (a)-(c), in particular (b), he claimed that 70-80% of the 7B spend on display ads will be at risk. The tradeoff of free content vs ads is not fair because users don't know what they are trading for the free content; if you aren't paying for it, you aren't the customer, you are the product; why don't we turn off targeting?; why not turn off ads altogether? He said these were all good questions. He tried out a technical argument that if 50% of users turn off ads, the others end up paying for the freeloaders' content, and that is not fair. Not clear this argument flies. He said BlueKai has a dashboard so users know what is known about them. He further pointed out that k-anonymity of a dataset does not help quantify the damage when two or more datasets are combined; he argued for some "smart noise" that will obfuscate the data and still keep it useful for advertising. Questions: the common pipe may be avoided by doing (a) and (c) at the client browser, without doing (b); what is the marginal value of tracking users?; etc.
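The "same plumbing" argument for (a)-(c) can be made concrete with a toy sketch. This is my own illustration, not anything shown in the talk: the `CookieLog` class, the event kinds and the example sites are all hypothetical. The assumption is simply that a third party keeps one event log per cookie ID, and that all three uses are queries over that one log.

```python
from collections import defaultdict


class CookieLog:
    """Toy event log keyed by a third-party cookie ID (hypothetical)."""

    def __init__(self):
        self.events = defaultdict(list)  # cookie_id -> [(kind, site, item)]

    def record(self, cookie_id, kind, site, item):
        self.events[cookie_id].append((kind, site, item))

    def _history(self, cookie_id):
        # .get avoids creating empty entries for unknown cookies
        return self.events.get(cookie_id, [])

    def converted(self, cookie_id, item):
        """(a) conversion tracking: a purchase after an ad for the item."""
        seen_ad = False
        for kind, _site, seen_item in self._history(cookie_id):
            if seen_item != item:
                continue
            if kind == "ad_view":
                seen_ad = True
            elif kind == "purchase" and seen_ad:
                return True
        return False

    def retarget_candidates(self, cookie_id):
        """(b) retargeting: items viewed somewhere but not yet bought."""
        viewed, bought = set(), set()
        for kind, _site, item in self._history(cookie_id):
            if kind == "page_view":
                viewed.add(item)
            elif kind == "purchase":
                bought.add(item)
        return sorted(viewed - bought)

    def under_cap(self, cookie_id, item, cap):
        """(c) frequency capping: has the ad been shown fewer than cap times?"""
        shown = sum(1 for kind, _site, seen_item in self._history(cookie_id)
                    if kind == "ad_view" and seen_item == item)
        return shown < cap


log = CookieLog()
log.record("c1", "page_view", "shoes.example", "boots")
log.record("c1", "ad_view", "news.example", "boots")
log.record("c1", "ad_view", "blog.example", "boots")
print(log.retarget_candidates("c1"))       # ['boots']
print(log.under_cap("c1", "boots", cap=2))  # False
log.record("c1", "purchase", "shoes.example", "boots")
print(log.converted("c1", "boots"))        # True
```

In this sketch, (a) and (c) only ever look at events for a single ad, which is roughly why the audience suggestion of doing them in the client browser is plausible, while (b) is the query that needs the cross-site history.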

Russell Glass of Bizo spoke about targeting business professionals. He also pointed out that it takes a lot of ads to make a small amount of $'s (margins are small in data/display ads). He argued that the industry certainly needed regulation, but that regulation will not provide the "trust" which is sorely needed.

Dean Hachamovitch spoke about Internet Explorer and argued it was good at blocking malware and pointed to for consumer protection. He spoke about the list data structure behind how users can specify track/don't-track sites. Knowing he was talking to a research audience, he posed some directions: how to describe and visualize information flow between sites; how to generate, validate and value tracking lists for curators; etc. Questions: couldn't multiple tracking lists conflict with each other by whitelisting and blacklisting the same site? Q: Do you see visualizing information across sites as a browser problem? He wanted a more general solution across the vagaries of the browser.
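The audience question about conflicting lists is easy to state in code. Below is a minimal sketch, assuming each tracking list is just a mapping from domain to "allow" or "block"; the list names, the domains and the `allow_wins` tie-breaking policy are all hypothetical, and this is not a description of how Internet Explorer actually stores or resolves its lists.

```python
def find_conflicts(lists):
    """Domains that one list allows and another list blocks.

    `lists` maps a list name to {domain: "allow" | "block"}.
    """
    verdicts = {}
    for list_name, rules in lists.items():
        for domain, action in rules.items():
            entry = verdicts.setdefault(domain, {"allow": [], "block": []})
            entry[action].append(list_name)
    return {d: v for d, v in verdicts.items() if v["allow"] and v["block"]}


def resolve(domain, lists, allow_wins=True):
    """Pick one verdict for a domain; `allow_wins` is an assumed policy knob."""
    actions = {rules[domain] for rules in lists.values() if domain in rules}
    if not actions:
        return None  # no list has an opinion on this domain
    if len(actions) == 2:
        return "allow" if allow_wins else "block"
    return actions.pop()


lists = {
    "curator_a": {"ads.example": "block", "cdn.example": "allow"},
    "curator_b": {"ads.example": "allow", "tracker.example": "block"},
}
print(find_conflicts(lists))
print(resolve("ads.example", lists))
```

Whatever tie-breaking rule is chosen, a user who subscribes to several curated lists inherits it silently, which is part of what makes validating and valuing lists an interesting research problem.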

Narayanan spoke about Adnostic, where behavioral profiling and targeting take place in the user's browser. The ad network remains agnostic to the user's interests. Jonathan Mayer spoke zealously about the Do Not Track feature (8 byte request in HTTP header?). Paul Francis spoke about Privad, a privacy-preserving way to see and be shown ads. Nina Taft did a great job of describing the research goals of the new Technicolor Labs in Palo Alto (great location, people and interns!). She had math on her slides, describing differentially private SVM algorithms. She also posed problems: how to model users in a home entertainment scenario; how to do social recommendation at home where users share resources; etc.
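Going back to the parenthetical about the size of the Do Not Track header: assuming the header takes the form `DNT: 1`, the full header line including the trailing CRLF does come to 8 bytes. A minimal sketch follows; the helper names are my own, and the server-side check is purely illustrative, since nothing in the header itself forces a server to honor it.

```python
def with_do_not_track(headers):
    """Return a copy of the request headers with the DNT header set."""
    out = dict(headers)
    out["DNT"] = "1"
    return out


def header_line(name, value):
    """Serialize one header the way it goes on the wire in HTTP/1.1."""
    return f"{name}: {value}\r\n".encode("ascii")


def honors_dnt(headers):
    """Illustrative server-side check: did the user ask not to be tracked?"""
    return headers.get("DNT") == "1"


request = with_do_not_track({"Host": "news.example"})
print(header_line("DNT", request["DNT"]))       # b'DNT: 1\r\n'
print(len(header_line("DNT", request["DNT"])))  # 8
print(honors_dnt(request))                      # True
```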

Altogether, a good set of talks, a very large audience (200+?), and an important set of topics. I enjoyed the workshop.


