Draft number two of this post.
How big is data? Human minds often have trouble encompassing more than a few orders of magnitude at a time. When the amount of something goes from thousands to millions to billions, it's difficult to keep the whole picture in your grasp. This is particularly true of data sizes, which vary by many orders of magnitude even within a single computing device.
I sat down a few years ago and tried to come up with an analogy comparing the gap between the largest and smallest useful units of data to the gap between the largest and smallest animals on the earth. Unfortunately, animals don't have nearly enough range of sizes to accomplish this. More recently, I realized I could use length as a proxy for data size and give a fairly good idea of how big things are.
The bottom of the scale is 8 kilobytes, or 8000 bytes. This was picked for both historical and practical reasons. ENIAC, the first electronic digital computer, only had registers and didn't have anything like a "memory". However, its immediate successor did; the EDVAC, a room-sized computer, had a memory of about 8k in the 1950s. The Apollo Guidance Computer (a super-miniaturized computer built in the mid-1960s, about the size of a small suitcase) had an erasable memory of 2000 15-bit words, or about 4k. The level-1 cache of a single integer core of an AMD Interlagos processor is 16k. So across the three, which cover roughly 50 years of computing hardware, 8k is a reasonable starting point for the smallest amount of data storage worth bothering with. In our length analogy, we'll make 8k equal to 1 inch, 2.5 cm, or about the width of a person's thumb at the first knuckle.
A high-density 3.5-inch floppy disk is 1.44 MB. The analogous length is 180 inches, or about the length of a mid-sized sedan.
A digital photo is usually 2 to 3 MB; the analogous length is about that of a bus or a truck.
A CD is 650 MB. The analogous length is about 81,000 inches, 6700 feet, or a mile and a quarter.
A DVD is 4.7 GB; the analogous length is 588,000 inches, or 9.3 miles. That's roughly two-thirds the length of Manhattan island; you can drive that in 10 minutes on a highway.
As of this writing, the middle storage option of both the iPad Mini and the iPhone is 32 GB; this is also a fairly common size of SD card. The analogous length is 4 million inches, or 63 miles; about an hour's drive.
A reasonable hard drive in a new laptop is 250 GB. The analogous length is 31 million inches, which is about 500 miles. That's roughly the distance between Boston and Washington DC, or between London and Edinburgh; about what you can drive in a car in a very long day.
About the biggest single hard drive that you can buy in a retail store today (as of early 2013) is 4 terabytes. The analogous length is 500 million inches or 7900 miles, or about 1/3 of the circumference of the earth.
So the ratio between the smallest useful amount of data and the largest reasonable single chunk of data is the same as the ratio between the width of someone's thumb and 1/3 of the way around the earth. There you have it.
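For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes decimal units (1 kB = 1000 bytes, as in the 8000-byte figure above); the names `BYTES_PER_INCH`, `analogous_inches`, and `describe` are made up for this illustration.

```python
# Sketch of the length analogy: 8 kB of data corresponds to 1 inch.
# Assumes decimal units (1 kB = 1000 bytes); all names are illustrative.

BYTES_PER_INCH = 8_000
INCHES_PER_FOOT = 12
INCHES_PER_MILE = 63_360  # 5280 feet * 12 inches

# Sizes in bytes, taken from the examples in the post.
EXAMPLES = {
    "3.5-inch floppy (1.44 MB)": 1_440_000,
    "CD (650 MB)": 650_000_000,
    "DVD (4.7 GB)": 4_700_000_000,
    "SD card (32 GB)": 32_000_000_000,
    "Laptop drive (250 GB)": 250_000_000_000,
    "Big retail drive (4 TB)": 4_000_000_000_000,
}


def analogous_inches(size_bytes):
    """Convert a data size in bytes to its analogous length in inches."""
    return size_bytes / BYTES_PER_INCH


def describe(size_bytes):
    """Format the analogous length in inches plus feet or miles."""
    inches = analogous_inches(size_bytes)
    if inches >= INCHES_PER_MILE:
        return f"{inches:,.0f} inches ({inches / INCHES_PER_MILE:,.1f} miles)"
    return f"{inches:,.0f} inches ({inches / INCHES_PER_FOOT:,.1f} feet)"


if __name__ == "__main__":
    for name, size in EXAMPLES.items():
        print(f"{name}: {describe(size)}")
```

Changing `BYTES_PER_INCH` rescales the whole analogy, which is handy if you'd rather start from a different "smallest useful" size.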
I'm a pack rat; this comes as no surprise to anyone who's ever been in my house or garage or office. I tend to collect things that I want to use later. Often the net effect is that I end up having to throw stuff out later, but frequently enough, if I want something, I can find the one I used to have and put it to work.
I'm like that with physical things and with data. I have data disks that I was using regularly almost 25 years ago. So this blog post by data preservationist Jason Scott about the end of floppies came as a bit of a rude shock. I'd been carrying lots of old floppies with stuff that I used to use, and it never occurred to me that I wouldn't be able to just throw the disk in a machine and read it.
I read that post last fall, when it was more than a year old. That really scared me, so I've been sort of vaguely trying to get set up to pull data off of disks since then. I had the setup ready about a month ago but since then I've been rather busy. Today my wife is in Knoxville and my plane is in Lexington and broken, so I had time to start seriously feeding floppies into the machine.
The results were actually pretty good. The 5 1/4-inch floppies did better; there were only a few that I couldn't get a full image from. The 3.5-inch disks didn't do as well, but I was able to get full images off of about 2/3 of them, and partial images off of several more. I have 10 or so that just don't want to read in the drive, so I may work harder on those, or I may not.
Most of what I'd like to preserve are game files, and I got at least one clean copy of all of those. The other thing is images. A lot of the photos that I took in graduate school were on floppies, because the camera that I used was a Sony Mavica that recorded onto floppies. I'm pleased that I pulled images from several of those disks, so I have images of stuff that I haven't seen in 10 years.
Feeding disks into the machine
read read read...
Here's me, much younger and thinner with more hair. This is on the
balcony of my last graduate school apartment. The laptop is the one
my group bought me. It's an original iBook. I lived in this
apartment from August of 2000 to July of 2002.
I didn't realize I had any photos of this car. This is my 1986 Ford
Escort EXP that I drove from the fall of 1996 to the spring of 2002.
Fantastic road car. It died the day that I interviewed for my
(current) job. It's parked in front of a storage unit. I'm about to
leave for the big summer run of my graduate school experiment in the
spring of 2000.
Apparently, even then I was prone to taking photos of the highway.
This is probably driving towards West Virginia.
These 11 cards (and the corresponding ones in the other crate) were
what made my graduate school experiment possible. These cards
implement a wire-OR of the results of 352 comparators and put the
result out the orange cable on the right. My biggest worry between
1998 and 2000 was making sure that this piece of equipment did its
job.
Here's me in the Hall B control room at Jefferson Lab sometime in the
summer of 2000. I find this photo terribly amusing; I'm doing stuff on
my laptop surrounded by larger and much more capable screens. So that
hasn't changed at least. :-)
This post is actually a draft. The real post will probably also be a separate page on the site. I'm putting it here to check to make sure the formatting is right and stuff.
How big is data? Human minds often have trouble encompassing more than a few orders of magnitude at a time. When the amount of something goes from thousands to millions to billions, it's difficult to keep the whole picture in your grasp. This is particularly true of data sizes, which vary by many orders of magnitude even within a single computing device.
I sat down a few years ago and tried to come up with an analogy comparing the gap between the largest and smallest useful units of data to the gap between the largest and smallest animals on the earth. Unfortunately, animals don't have nearly enough range of sizes to accomplish this. More recently, I realized I could use length as a proxy for data size and give a fairly good idea of how big things are.
The bottom of the scale is 8 kilobytes, or 8000 bytes. This was picked for both historical and practical reasons. ENIAC, the first electronic digital computer, only had registers and didn't have anything like a "memory". However, its immediate successor did; the EDVAC, a room-sized computer, had a memory of about 8k in the 1950s. The Apollo Guidance Computer (a super-miniaturized computer built in the mid-1960s, about the size of a small suitcase) had an erasable memory of 2000 15-bit words, or about 4k. The level-1 cache of a single integer core of an AMD Interlagos processor is 16k. So across the three, which cover roughly 50 years of computing hardware, 8k is a reasonable starting point for the smallest amount of data storage worth bothering with. In our length analogy, we'll make 8k equal to 1 inch, 2.5 cm, or about the width of a person's thumb at the first knuckle.
A high-density 3.5-inch floppy disk is 1.44 MB. The analogous length is 180 inches, or about the length of a mid-sized sedan.
A digital photo is usually 2 to 3 MB; the analogous length is about that of a bus or a truck.
A CD is 650 MB. The analogous length is about 81,000 inches, 6700 feet, or about a mile and a quarter.
A DVD is 4.7 GB; the analogous length is 588,000 inches, or 9.3 miles. That's roughly two-thirds the length of Manhattan island.
A reasonable hard drive in a new laptop is 250 GB. The analogous length is 31 million inches, which is about 500 miles. That's roughly the distance between Boston and Washington DC, or between London and Edinburgh.
About the biggest single hard drive that you can buy in a retail store today (as of early 2013) is 4 terabytes. The analogous length is 500 million inches or 7900 miles, or about 1/3 of the circumference of the earth.
So the ratio between the smallest useful amount of data and the largest reasonable single chunk of data is the same as the ratio between the width of someone's thumb and 1/3 of the way around the earth. There you have it.