April 2006 – Robin's Interesting Thoughts

This Interesting Thought came when I was thinking about the research I did this past summer. I was working on the problem of automatically finding sentence boundaries in a string of text. The big issue was that in a string of hundreds of sentences, there are on the order of 2^100 ways to place the sentence boundaries — only one of which is correct!! In other words, placing boundaries exhibits exponential complexity.

Our solution to this problem was to take high-probability boundaries as given. Such placements might be incorrect, but usually they aren’t, and we were willing to sacrifice this little bit of precision, because then we only had to consider multiple possibilities for the boundaries in between these high-probability boundaries. In mathematical terms, if the high-probability boundaries split the text into 10 segments of approximately 10 sentences each, the number of possibilities is only 10 times 2^10: about 10,000 possibilities, easy for a computer to consider. This is FAR, FAR smaller than the original 2^100 (roughly 1.3 × 10^30), which is so big it is essentially impossible to comprehend. 2^100 is approximately the number of words that would be produced if every human being who EVER lived produced an entirely new academic research library every SECOND of their life.
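To make the arithmetic concrete, here is a small Python sketch of the two counts. The function names are made up for illustration, and it assumes each candidate position is simply "boundary" or "not a boundary", which is a simplification of the real problem.

```python
# Count boundary placements with and without fixing high-probability boundaries.
# Assumption: every candidate position is a yes/no choice (boundary or not).

def placements_full(n_positions: int) -> int:
    """All ways to place boundaries when every position is undecided."""
    return 2 ** n_positions

def placements_split(n_segments: int, positions_per_segment: int) -> int:
    """Segments are searched independently, so their costs add up
    instead of multiplying together."""
    return n_segments * 2 ** positions_per_segment

full = placements_full(100)
split = placements_split(10, 10)

print(full)   # 1267650600228229401496703205376
print(split)  # 10240
```

The whole trick is in that one change from multiplying to adding: ten independent searches of 1,024 options each, rather than one search over every combination at once.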

But the point is, it occurred to me that all of science might be framed as a process of isolating a few variables from the vast network of interconnected variables that form the world. It is impossible to analyze more than a few variables at a time because, if you assume that most variables affect most others, the number of ways those “effects” (links between variables) can combine grows exponentially.

From this viewpoint you could say that if you want to make progress in science, you need to find a way to break down the problem you are solving into component parts with few enough interactions to analyze. If such small, useful units are found, you can then use them to build back up to the full size of the problem, where building back up has nice linear growth. This makes it possible to actually make progress. You just can’t get anywhere with 2^100 possibilities. Start writing those research libraries…
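The "break it down, then build back up" idea can be sketched in a few lines of Python. This is only a toy, not the method from the research: `toy_score` is a hypothetical stand-in for whatever model would rate a placement, and the segments are assumed to be fully independent of one another.

```python
from itertools import product

def best_placement(segment_len, score):
    """Brute-force every boundary placement inside one small segment."""
    best, best_score, tried = None, float("-inf"), 0
    for placement in product((0, 1), repeat=segment_len):
        tried += 1
        s = score(placement)
        if s > best_score:
            best, best_score = placement, s
    return best, tried

def solve_by_segments(n_segments, segment_len, score):
    """Solve each segment on its own, then stitch the answers together.
    The total work grows linearly with the number of segments."""
    solution, total_tried = [], 0
    for _ in range(n_segments):
        best, tried = best_placement(segment_len, score)
        solution.extend(best)
        total_tried += tried
    return solution, total_tried

def toy_score(placement):
    """Hypothetical score: pretend the right answer is a boundary at
    every third position, and count how many positions agree with that."""
    return sum(1 for i, bit in enumerate(placement) if (bit == 1) == (i % 3 == 2))

solution, tried = solve_by_segments(10, 10, toy_score)
print(len(solution), tried)  # 100 10240
```

Doubling the number of segments here only doubles the work, whereas doubling the length of a single segment squares it.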

Science and Exponential Growth

I swear I had an Interesting Thought the other day…