Let me start with how much space one terabyte is. According to this webpage, one terabyte is about 1024 gigabytes, so that would be four 256GB disks if you buy them from BestBuy, or two of the 512GB disks currently selling at Fry’s. For a home enthusiast/hobbyist, buying two 512GB disks would be the right choice for a homebrew PC box. Stick them in the two bays, plug in the cables and let Ubuntu take care of it. That’s pretty much it, got it squared away.
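Just to double-check the disk math above, here is a tiny Python sketch. It assumes the 1024GB-per-terabyte figure from the paragraph and takes the disk sizes at face value (real disks usually format to a bit less than the label says); the function name `disks_needed` is made up for illustration.

```python
import math

GB_PER_TB = 1024  # figure used in this post (binary convention)


def disks_needed(target_gb: int, disk_gb: int) -> int:
    """Smallest number of equal-size disks whose combined capacity covers target_gb."""
    return math.ceil(target_gb / disk_gb)


# Disk sizes mentioned in this post, in GB.
for size in (160, 256, 512, 1024):
    print(f"{size}GB disks needed for 1TB: {disks_needed(GB_PER_TB, size)}")
```

Rounding up matters for sizes that do not divide evenly into 1024GB, such as the 160GB disks.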
I’m going to the next point now, and that is about data sourcing. The web is basically a network of data: a vast collection of whatever-you-wanna-call-it is sitting right there on the web. It’s just there, wow! Imagine the wealth of information you can get from that vast sea, ocean or even celestial data space. Man! That is simply awesome, mind-blowing just thinking about that great amount of data. So yeah, data is up there waiting to be mined. That is the source, a pure, unadulterated, wide-open wealth of knowledge right at your doorstep.
Here’s my third point: I’m going to relate the first paragraph to the second paragraph, and it will go something like a data processing system on your home machine. All I’ve got is one terabyte of empty space, waiting to be populated with data collected from the Great Web. I’m guessing one terabyte is enough to perform a simple experiment and generate a very interesting report, which may or may not have any value to anyone except me. The resulting output definitely has huge potential, because I believe in this truism that “the perfect data is the one you have never seen yet.” Casting a big wide net over the web and hauling the catch into a one terabyte space for processing will definitely capture that hidden gem. The most important part of the process is performing this thing called synthesis, which would refine the data even further into a cleaner version.
This is my closing for this entry. Some of the tools are already in place; I just got them working yesterday, enough to proceed and carry on to the next level of testing. I may have to cough up some dough for a 1024GB disk, though, as they are not cheap. Even a 500GB disk is still pretty expensive compared to a 160GB one. It is definitely quite an investment for the small experiment I’d like to perform. Drive and Redland will be the ones doing the heavy lifting.