Sunday, December 21, 2014

Setting 2014 down

Happy holidays everyone! I hope you set 2014 down gently and meet 2015 with energy.

Spurred by an offline conversation, let me add: Some people make the days of their lives and their instant decisions sound difficult, some infuse them with gravitas, yet others with visions of achievement, drama, seriousness, etc. I work very hard to package my grimed fingernails, sleepless sumping and sweat of research, work and relationships, submerge the package, stand on top of it, and sketch aperçus of fun, art, smiles and puzzles. That is what I am, every year.

Monday, December 08, 2014

On Urban Planning and Story Telling

When I was in University (it doesn't matter which prefecture), I enrolled for a class in Urban Planning. Now I can imagine many reasons why I might have done that: I was 20 years old, and I focused more on easy grades and good-looking fellow students than on learning. Whatever the reason, I didn't really make it to the classroom all semester except once. That day, I was drinking my tea in the student center as usual, reading the Monkey King, and puzzling over not being able to recall the monkey's name. I happened to talk to a student. She was easy on the eyes, our conversation flowed, and before I knew it, I was accompanying her to her class, which coincidentally turned out to be Urban Planning. Her father was a government official in charge of the local Dept of Buildings, and she really cared about Urban Planning. I don't know why, but to this day I remember what happened in the class. The professor taught us about zoning (how buildings have to be set back a fixed amount from the street) and water runoff (how to build catch basins and French drains to capture runoff from neighbors).

I told this to my friend Haruki in college, and he later told me he wrote a short story about it. I didn't think I had much of a story, but I read Haruki's and, you know, he is a real writer. He can imagine things I can't even contemplate; his story was creative and went places my mind couldn't be dragged. In the end, it was not my story at all. It could only have come out of Haruki's mind.

But my story continues. Years later, I bought a place that needed a lot of work. I could easily build my own fence because I knew what the setback was, and I built a catch basin too and watched the runoff from my neighbor's property.

Sunday, December 07, 2014

Data Science and Online Ads: Panel in NYCE 2014

Thanks to Arash Asadpour, Mohammad Hossein Bateni and Alex Slivkins for organizing the 2014 NY Area CS and Econ (NYCE) day in NY. I will let the organizers blog about the day; it was a superb program.

I organized a panel on Data Science in Online Ads. The panelists are stars and represented a constellation of perspectives in this complex ecosystem.

  • Matt Curcio is from Neustar (via Aggregate Knowledge). He builds and supports a neutral data platform for advertisers to gather and analyze ads data. He is remarkably broad, using everything from data streaming to data privacy in this work. He spoke about the challenge of getting data scientists to collaborate, no matter which company they work for.
  • Chris Wiggins is now the Chief Data Scientist at NY Times. He spoke about data products at NYT and estimating the Long Term Value (LTV) of users. He also talked about placing house ads as an example of reinforcement learning.
  • Aparna Pappu runs AdX at Google. She focused on AdX and described the goal of fair transfer of value from advertisers to publishers. She mentioned many specific data issues: there are gaps in their data since they don't observe all online events; data viz is hard; there is asymmetry of info since advertisers may know more about users than any specific publisher; AdX cannot share data equally with all parties; and finally, she spoke about the great diversity of data they have, which makes it hard to find natural segmentations of publishers and advertisers.
  • Paul Barford is now the Chief Scientist at Comscore, Inc., following their acquisition of his Mdot. He described his consulting experience at BIM, a publisher network, which led him to the problem of detecting fraudulent clicks. He said that when systems are complex and there is money involved, there are bad actors, i.e., fraud is a problem. Further, he quoted that data science starts with measurement, and it is hard to gather data from ads platforms.
  • Neal Richter is now the CTO at Rubicon Project. He spoke about how ad sales are becoming automated, and said that a challenge in the petabyte-scale analyses he does on 200B transactions a day is making the analyses and conclusions explainable to others, including biz folks.
  • Catherine Williams is Head of Data Sciences at AppNexus. She described AppNexus as the largest independent (non-FB/GOOG) programmatic media co., one not involved with PII. AppNexus has a performance marketplace, which is nice. She spoke about the challenge of suitable incentives for various types of content and stretched us to consider freedom of speech issues when we emphasize one type of content over the others.
  • Claudia Perlich is Chief Scientist at Dstillery. She spoke about the predictive modeling and machine learning challenges in prospecting for ad targets. In particular, she pointed out that it is not as much about predicting if you will buy X as it is about convincing/converting you to buy X. She quipped that, going by their lat/long data of users, 30% of the US population travels above the speed of sound! She also talked about not chasing artificial metrics when improving ad platforms, and the challenges of getting performance data from networks.
  • Jon Krohn is a Data Scientist in the orbit of Omnicom, a large media company. He started with the observation that you need data to spend money well, and went to the board to draw the "river of money" from advertisers to agencies and media companies like his, and eventually to publishers, with dollars dwindling along the way as 20-30 hands touch the transaction.
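Claudia's speed-of-sound quip is really a data-quality check, and it is simple to write down. Here is a minimal sketch, assuming each user's pings arrive as (timestamp in seconds, latitude, longitude) tuples sorted by time. The function names, the tuple layout and the 343 m/s threshold are illustrative assumptions, not a description of what Dstillery does.

```python
import math

SPEED_OF_SOUND_M_S = 343.0    # assumed threshold: sound in dry air at about 20 C
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def supersonic_hops(pings):
    """Return (i, j, speed_m_s) for consecutive pings implying faster-than-sound travel.

    pings is a hypothetical layout: a time-sorted list of
    (timestamp_seconds, latitude, longitude) tuples for one user.
    """
    flagged = []
    for i in range(1, len(pings)):
        t0, lat0, lon0 = pings[i - 1]
        t1, lat1, lon1 = pings[i]
        dt = t1 - t0
        if dt <= 0:
            continue  # duplicate or out-of-order timestamps carry no speed information
        speed = haversine_m(lat0, lon0, lat1, lon1) / dt
        if speed > SPEED_OF_SOUND_M_S:
            flagged.append((i - 1, i, speed))
    return flagged


# Made-up example: New York, then Los Angeles ten minutes later, then a short walk.
pings = [
    (0, 40.7128, -74.0060),
    (600, 34.0522, -118.2437),
    (4200, 34.0600, -118.2437),
]
hops = supersonic_hops(pings)
```

A flagged hop may say more about how the location was inferred than about the user, which I take to be the point of the quip.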
I summarized their presentations. Discussions ensued:
  • Costis Maglaras asked, is the ad market going to be like DJ with small transaction costs or like Christie's with an XX% cut? Goods in ads are ephemeral, valued differently by different parties and can't be retraded, so it is not clear that financial analogies apply.
  • Vahab Mirrokni asked, is the ad market converging to reservations/allocation or auctions? Catherine mentioned that platforms like AppNexus are supporting many different types of markets from reservations to private packages/deals to RTB and performance. 
  • I asked if a large distributed ML package that searches automatically over models and parameters will suffice for the ad business. No, because information is not complete, players may not be rational, there is no single objective to optimize, the signal is weak, the targets keep moving, etc.
  • I asked why more microeconomic concepts, like substitutable goods, haven't penetrated ad markets. This is because publishers don't think their inventory is substitutable, and there are handcuffs around who owns data and around privacy issues, so data permissions don't let this info be usable. Paul Barford said Comscore is an exception when it comes to data, and he was willing to work with academics on data access.
I now have the formula for a great panel: recruit great professionals, let them go, and sit back. I enjoyed the panel immensely. I hope researchers connect with the folks above; there is a lot we can gain. It was good to sneak into the NY academic scene, if only briefly.