Sunday, August 27, 2017

Data Science and Social Good

FYI: Bloomberg folks are organizing a day of activities on Data for Good on Sept 24, 2017.


(Sublinear) Time Meets Space

Maybe this post should be called "property testing meets streaming". For more than a decade now, we have managed to put together meetings with both property testing folks (with methods running in time sublinear in the input size) and streaming folks (with methods using space sublinear in the input size), and in many of these meetings, researchers debate the nuances and results of these communities (the usual: algorithms vs complexity, for each vs for all, yada yada yada). Obviously each community has learned from the other, but there have not been many results that formally relate the underlying concerns.

Recently, Christian, Pan, Morteza and I showed: ".. for bounded degree graphs, any property that is constant-query testable in the adjacency list model can be tested with constant space in a single-pass in random order streams." This is a human-sized hop towards establishing formal connections.  Hope more hops and leaps will ensue. 


Some Insta Humor

The left photo tells you what NY restaurants have to deal with: artists painting amazing pieces over lunch. The right photo tells you the height of human aspiration, found on a sidewalk in the Mission, SF.

Friday, August 18, 2017

Heavy Hitters Continued

Recently I revisited the classical heavy hitters problem on streams (I think I can call it "classical"; I remember the meeting where we chose to call overwhelmingly large items heavy hitters, in the shadow of the baseball scandals of the early 2000s), but now looked at the high-dimensional version, where each stream item is a d-dimensional vector.

There is a d^2 space bound that is inherent, and we manage to circumvent it by using a graphical model (in our case, Naive Bayes) on the dimensions. This is in general a fruitful direction, I think: using a dependency model on the dimensions to half-circle around the lower bounds we get with high-dimensional analyses. Our paper is on arXiv, joint work with Brano and Hoa.
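For readers who have not seen the classical one-dimensional problem, here is a minimal sketch of the Misra-Gries summary, a standard textbook algorithm for heavy hitters. It is not the method from the paper mentioned above, and it says nothing about the d-dimensional or Naive Bayes setting; the function name `misra_gries` and the parameter `k` are just illustrative choices.

```python
def misra_gries(stream, k):
    """Summarize a stream using at most k - 1 counters.

    Every item occurring more than n / k times in a stream of length n
    is guaranteed to survive in the returned dictionary. The stored
    counts underestimate the true frequencies by at most n / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: count this occurrence.
            counters[item] += 1
        elif len(counters) < k - 1:
            # There is a free counter: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every tracked item and
            # drop those whose count reaches zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

Calling `misra_gries(["a", "b", "a", "c", "a", "d", "a", "b"], 3)` keeps only `"a"`, the one item that occurs more than 8/3 times. The surviving keys are candidates only; a second pass over the data would be needed to confirm exact frequencies.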

This work was motivated by a collaboration with Adobe, which needs to analyze high-dimensional web/ad analytics data. That is covered in the Ad Exchanger article, as part of the coverage of the university funding program that Anil Kamath spearheads at Adobe (and does a superb job of eliciting very specific projects, but drawn from a wide set of areas).


Thursday, August 17, 2017

Artificially Intelligent Assistants, KDD Panel. What happened.

Andrew Tomkins and I cohosted the KDD 2017 panel on AI Assistants. The panelists were Usama Fayyad, Larry Heck, Deepak Agarwal and Bing Liu, representing the spectrum from entrepreneurial to academic, including corporate research, in many cases individually.
  • For AI assistant technology today, talk about either a current success story or a shortcoming in the market. (Success stories for "head", goal-oriented bots: Siri, Alexa, Cortana, Google Assistant, DuerOS (Baidu); for chit-chat: the WeChat bot Xiaoice.) Need: the need for AI assistants by a broad set of companies is a given/assumed → broad engagement/interest by many companies to build/leverage AI assistant technology.
  • “Help me plan a trip” versus “Connect me to my travel agent.” Does the user talk with one “butler” agent or with many specialized agents? Follow-on: What critical standards are required to enable assistants to operate effectively across multiple sensory domains? How are we doing at developing these standards?
  • What implications will artificially intelligent assistants have for teens?  How about in the workplace?  Follow-on:  what are the expectations for task-focused versus chit-chat assistants?
  • How should a user be able to empower AI assistants to operate on the user’s behalf?
    1. “I’m sorry, Joe’s not available right now.”
    2. “I booked the show you wanted.  It’s 7:30 tonight.”
    3. “I found a great date for you this Saturday night.  Dress nice.”
    4. “BTW, I bought you a house.”
  • What are the research bottlenecks? (Dialogue understanding and large scale availability of suitable data, information elicitation methods, classical AI open issues like common sense reasoning, etc.)
  • What role should assistants take in your life?
    1. Servant: does as you say
    2. Friend: emotional support
    3. Mentor/therapist: provides guidance
    4. Psychological role: assistant as id, ego, super-ego
    5. Deity: superhuman being with power over your fortunes, who leads the way even if we don't see the rationale, or even punishes.
  • You can only optimize what you can measure.  For these different roles of an assistant, what are the right metrics?  
  • If an assistant is built by multiple companies loosely cooperating and employing black-box learning techniques, why would its goals be perfectly aligned with yours? What goals should we expect?
  • Could there be an assistant for your pet, wild animals and plants?
  • Which portrayal of AI assistants from literature or film do you find most compelling?
  • What is needed to unleash a technological ecosystem around AI assistants and what are the challenges? 
  • Last question to conclude the discussion: 1 prediction for the future of artificially intelligent assistants that you would like to be correct about and 1 that you wish you were wrong about. 
There was a lively discussion, with people wondering about the implications of AI assistants, among other things.

In parallel, we had the following ads shown on the screen: 
  • Do you want to make $1000 a week from the comfort of your home? Get
  • Save AI-Assistant lives! Get
  • Are you bored as a parent? Wait for
  • Do you want someone to do award-winning research for you? Get
  • Need VC funding? Get


Monday, August 14, 2017

Artificially Intelligent Assistants: KDD 2017 panel

I agreed a while ago, so despite physical health-related challenges, I will travel to cohost a KDD 2017 panel on Artificially Intelligent Assistants with the immaculate Andrew Tomkins. We have great panelists; check out the official site.