To theoretical computer scientists, random sampling comes naturally. We think it is simple, elegant even, and useful. In practice, random sampling presents problems. Walk into a cafe in NY and 90% or more of the people have Macs (and look stylish); walk into an airport and 90% or more use Windows machines (and look much less chic). Getting an estimate of the market share of these products from such "samples", by themselves or jointly, is (doable but) difficult.

Worse: I have spent the past year or so putting sampling primitives into a database engine. Different users need different kinds of samples (e.g., biased, distinct, fixed-size, persistent, over different attributes, with or without replacement); no single sample meets all these requirements, and maintaining many different samples is a systems nightmare. On top of that, users want the same answer if they rerun a query! Take all of this into account, and putting sampling primitives into a database becomes a research problem.
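To make the "same answer on rerun" requirement concrete, here is a minimal sketch (in Python, with a hypothetical `in_sample` helper; not the API of any particular engine) of one standard trick: hash-based Bernoulli sampling. Instead of flipping fresh coins per row, membership in the sample is a deterministic function of the row's key and a fixed seed, so rerunning the query selects exactly the same rows.

```python
import hashlib

def in_sample(key: str, rate: float, seed: str = "query-42") -> bool:
    """Deterministic Bernoulli sample: include a row iff the hash of
    (seed + key) maps to a uniform value below the sampling rate.
    The same seed always selects the same rows, so reruns agree."""
    h = hashlib.sha256((seed + key).encode()).digest()
    # Map the first 8 bytes of the hash to a uniform value in [0, 1).
    u = int.from_bytes(h[:8], "big") / 2**64
    return u < rate

rows = [f"user_{i}" for i in range(1000)]
sample = [r for r in rows if in_sample(r, 0.1)]
resample = [r for r in rows if in_sample(r, 0.1)]
assert sample == resample  # persistent: identical on rerun
```

Note the trade-off: this gives a persistent sample cheaply, but it is only one point in the design space above; fixed-size, distinct, or biased samples each need different machinery.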