[lbo-talk] science without theory

Charles Brown charlesb at cncl.ci.detroit.mi.us
Fri Jul 11 12:20:52 PDT 2008


shag

http://www.kk.org/thetechnium/archives/2008/06/the_google_way.php

this article above is being discussed on another list. haven't had time for more than a skim, but something tells me that 1. this has probably been discussed before (that is, goog didn't invent it) and/or 2. the article's authors are confused as to how theory construction actually proceeds in the sciences.

i'm too busy lately, so with any luck something interesting will ferment from the lbo collective brain trust.

shag

There's a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things. The traditional way of doing science entails constructing a hypothesis to match observed data or to solicit new data. Here's a bunch of observations; what theory explains the data sufficiently so that we can predict the next observation?

It may turn out that tremendously large volumes of data are sufficient to skip the theory part in order to make a predicted observation. Google was one of the first to notice this. For instance, take Google's spell checker. When you misspell a word when googling, Google suggests the proper spelling. How does it know this? How does it predict the correctly spelled word? It is not because it has a theory of good spelling, or has mastered spelling rules. In fact Google knows nothing about spelling rules at all.

Instead Google operates a very large dataset of observations which show that for any given spelling of a word, x number of people say "yes" when asked if they meant to spell word "y." Google's spelling engine consists entirely of these datapoints, rather than any notion of what correct English spelling is. That is why the same system can correct spelling in any language.

^^^^ CB: How about this: the theory in this case is that lots of people misspell words in the same way?
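
The spell-checker description quoted above can be made concrete with a rough sketch. This is only an illustration of the idea as the article states it, not Google's actual system: the class name, the `min_votes` threshold, and the sample log entries are all made up for the example. The only thing the code "knows" is how many people said "yes" to a given suggestion for a given typed string, which is more or less the assumption CB points at.

```python
from collections import Counter, defaultdict


class SuggestionLog:
    """Hypothetical store of 'did you mean y?' answers, keyed by what was typed."""

    def __init__(self):
        # typed string -> Counter of suggestions that users accepted
        self.accepted = defaultdict(Counter)

    def record(self, typed, suggested, said_yes):
        # Only "yes" answers count as evidence; no spelling rules are consulted.
        if said_yes:
            self.accepted[typed][suggested] += 1

    def suggest(self, typed, min_votes=2):
        # Return the most frequently accepted suggestion, if there is enough evidence.
        votes = self.accepted.get(typed)
        if not votes:
            return None
        word, count = votes.most_common(1)[0]
        return word if count >= min_votes else None


# Made-up observations: (typed, suggested, said_yes)
observations = [
    ("recieve", "receive", True),
    ("recieve", "receive", True),
    ("recieve", "relieve", False),
    ("teh", "the", True),
    ("teh", "the", True),
    ("teh", "ten", True),
    ("noveau", "nouveau", True),  # nothing here is specific to English
    ("noveau", "nouveau", True),
]

log = SuggestionLog()
for typed, suggested, said_yes in observations:
    log.record(typed, suggested, said_yes)

print(log.suggest("recieve"))
print(log.suggest("teh"))
print(log.suggest("noveau"))
print(log.suggest("xyzzy"))
```

Note that the sketch still builds in a claim about the world: a suggestion is returned only because many people made the same mistake and accepted the same fix. With no repeated observations for a string, it has nothing to say.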



