Monday, September 04, 2006

KDD 2006

I went to the KDD conference in Philly. It is a mixed community, with people from databases, data mining, statistics, applied mathematics, and an occasional algorithmus, so it is difficult to get a sense of the central challenges or the holy grail of open problems. The conference seems to have succeeded in turning people towards real data sets and public competitions with public data sets, which is encouraging. Andrew Moore gave a plenary talk: he was his usual engaging self, with anecdotes of algorithms that scale to very large datasets using simple, elegant techniques, provable or not. I first heard him at a National Academy of Sciences meeting on data streams, and he continues to get great scalability. The industrial presence was strong: Yahoo, MSN and Google were firmly represented.

The main themes seemed to be privacy in data analysis, scaling and applying machine learning solutions to data mining, social network analysis such as finding communities, and new data sources (blogs, wikis) and the unique problems they pose. The best student paper seemed to focus on speeding up Johnson-Lindenstrauss, à la Achlioptas's work, but it was mainly heuristics and did not seem to refer to Ailon and Chazelle.

Btw, for contrast, I think the SIAM Conference on Data Mining is less database-centric and draws more statisticians and applied mathematicians, but that too may change.


Blogger thespecial said...

Hello, I am a student in Korea majoring in industrial engineering at university. I am thinking about attending KDD 2008, and I am worried it might be difficult for me to understand. Even after reading all of the information on the official website, it's hard to figure out how technical it is. (I'm only in my third year as an undergraduate.) Could you also tell me how the conference works? Thanks a lot! ahh, let you know my email or you can also reply by this comment.

8:16 PM  
