DATA MINING
Imagine that you had a banquet table covered with pennies. You are looking only for 1944 pennies struck at the Denver Mint. En masse, the pennies all look the same. One morning you decide to start looking for the 1944D pennies. After an hour or so, you just cannot do it anymore. You are not even sure that you haven’t missed one.
A geek friend says he can find at least half of the 1944Ds in minutes.
He sets up a high-resolution camera at a good height above the table. This camera scans the entire table in seconds. A program holding 360 reference patterns of the 1944D checks each frame against them. Each frame is 600 x 600 pixels (360,000 pixels).
On finding a 1944D, the computer spits out the X, Y coordinates of the 1944D.
Some of the pennies are Canadian. Some of the pennies are slugs. Some of the pennies are German 5 pfennig coins. Some of the pennies are old beyond recognition. The computer is not looking for those, doesn’t care about those, and they aren’t worth looking for in the first place.
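For the curious, here is a rough sketch of what the geek friend's program might be doing. It is only an illustration under my own assumptions: the frame is a tiny grid of numbers instead of 600 x 600 pixels, there are two patterns instead of 360, and a "match" means every cell is identical. The names find_targets, matches_at, TOY_FRAME, and TOY_PATTERNS are made up for this example; a real system would use a proper image library and tolerate imperfect matches.

```python
from typing import List, Tuple

Grid = List[List[int]]


def matches_at(frame: Grid, pattern: Grid, x: int, y: int) -> bool:
    """Return True if the pattern matches the frame exactly with its top-left at (x, y)."""
    for dy, row in enumerate(pattern):
        for dx, value in enumerate(row):
            if frame[y + dy][x + dx] != value:
                return False
    return True


def find_targets(frame: Grid, patterns: List[Grid]) -> List[Tuple[int, int]]:
    """Slide every pattern over the frame and collect the (x, y) of each exact match."""
    hits = set()
    height, width = len(frame), len(frame[0])
    for pattern in patterns:
        p_height, p_width = len(pattern), len(pattern[0])
        for y in range(height - p_height + 1):
            for x in range(width - p_width + 1):
                if matches_at(frame, pattern, x, y):
                    hits.add((x, y))
    # Anything that matches no pattern (slugs, foreign coins) is simply never reported.
    return sorted(hits)


# Hypothetical toy data: the 2 x 2 block [[1, 9], [4, 4]] stands in for a 1944D,
# and the second pattern is the same block turned upside down.
TOY_PATTERNS: List[Grid] = [
    [[1, 9], [4, 4]],
    [[4, 4], [9, 1]],
]

TOY_FRAME: Grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 9, 0, 0, 0],
    [0, 4, 4, 0, 7, 7],  # the 7s play the part of a Canadian penny
    [0, 0, 0, 0, 7, 7],
    [0, 0, 0, 4, 4, 0],
    [0, 0, 0, 9, 1, 0],
]

if __name__ == "__main__":
    for x, y in find_targets(TOY_FRAME, TOY_PATTERNS):
        print(f"1944D found at X={x}, Y={y}")
```

Notice that the program never asks what the 7s are. It reports the coordinates of what it was told to look for and passes over everything else.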
Now you understand the concept of “data mining.”
- - - - - - - - - -
When I was a geophysicist looking for buried unexploded ordnance, I would have missed a wooden box of buried gold coins, or diamonds stashed in a plastic container. My job was not to look for treasure.
Suppose the Corps of Engineers said, “We are not going to clear former ranges anymore. Just turn the property over to the public and let them take the risk.”
Suppose the idiots convince the NSA to stop connecting the dots.