People had previously grown used to the notion that there must be one central arbiter that oversees all transactions on a network: a Mainframe. This model has an obvious weakness: when the Mainframe goes down, the whole system is unusable. Then again, if there is only a single important point of failure, you can pay some people a lot of money to sit there and fix problems as soon as they happen (and hopefully ensure that the problems never happen in the first place). Unfortunately, no amount of on-call staff can fix a direct hit from a nuclear bomb, so a different model was needed.
The ARPANET provided this model by removing the central server. It's like a system in which everyone hands mail to a friend, who passes it to a friend, who passes it to the recipient. While at first this might seem a little odd or inefficient, it means that it's a lot harder for someone to stop the flow of mail to you (or the flow of mail in general). Instead of simply bombing the post office, they now have to assassinate each and every one of your friends to prevent you from getting mail. Going back to the real world, there would be no single point of failure which the Russians could bomb to take down our communications.
It was a revolutionary, strange way of thinking about things. To this day, some people don't understand it and ask questions like "Where is the server that runs the Internet?" or even "Where is the Internet?" It's hard to grasp that there is no such server: every machine on the Internet is a part of the Internet.
A quick fix is to employ a large number of servers configured exactly the same way, so that if one goes down, traffic is quickly diverted to the others. Work is equally distributed amongst these servers by use of a "load balancer". This solves a few problems, but what if your server cluster is in California and the network link from California to New Zealand is getting bogged down? While the long-term answer is to invest in a faster connection to New Zealand, the short-term way to solve this problem is to put a server cluster in New Zealand. This sort of rapid expansion can quickly get expensive to deploy and manage. Some bright kids from MIT figured this out a few years ago and cobbled together what is now one of the fastest-growing companies out there: Akamai. (Hawaiian for "cool", if you're wondering.)
Akamai has already gone through the trouble of buying several thousand servers and putting them in network closets all around the world. The idea is that you can hand off delivery of the parts of your site that don't change much (the pictures, the movies, etc.) to Akamai, and they'll take care of making sure that your readership can always quickly access your content. Cute idea. "Cool," even.
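Underneath both the server cluster and Akamai's network is the same simple trick: spread incoming requests across many interchangeable machines. Here is a minimal sketch, in Python, of the round-robin strategy a basic load balancer might use. The host names are made up, and a real balancer also watches each server's health and load:

    from itertools import cycle

    # Hypothetical pool of identically configured servers.
    SERVERS = ["www1.example.com", "www2.example.com", "www3.example.com"]

    class RoundRobinBalancer:
        """Hand out servers in rotation so no one machine does all the work."""

        def __init__(self, servers):
            self.pool = cycle(servers)

        def pick(self):
            return next(self.pool)

    balancer = RoundRobinBalancer(SERVERS)
    for request in range(5):
        print(f"request {request} -> {balancer.pick()}")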
Distributed services lead to higher data availability. The more machines distribute your content, and the more places they distribute it from, the more people will be able to access your content quickly. It's a straightforward idea. This notion of distributing work is also useful for distributing computation...
It's important to note that fast chips are disproportionately expensive: if you want ten times the processing power of a top-of-the-line consumer PC, the cheapest way to get it is usually not to buy one machine that's ten times faster, but to buy ten top-of-the-line consumer PCs. People have understood the general concept for a long, long time: wire together a bunch of processors to get a very, very fast machine. It's called massively parallel processing (MPP) and is pretty much how all of the supercomputers of yore (and of today!) work.
What's new is that it's now possible to do this with off-the-shelf PCs. Software (such as Beowulf) has been developed that makes it easy to have a cluster of PCs act like one very fast machine. Sites that previously deployed very expensive custom supercomputers are actively investigating massively distributed commodity hardware to serve their computing needs. That would be remarkable as-is, but the concept of distributing computing cycles has gone even farther than clumps of commodity hardware: it's gone into the home.
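Before following it into the home, it's worth seeing how small the core idea is: split a big job into chunks and run the chunks on separate processors at once. A real Beowulf cluster coordinates separate machines over a network (typically with message-passing libraries); the Python sketch below fakes it with the multiple CPUs of a single box, and the "work" is just an arbitrary sum of squares:

    from multiprocessing import Pool

    def crunch(chunk):
        # Stand-in for real work: a sum of squares over one slice of the job.
        return sum(n * n for n in chunk)

    if __name__ == "__main__":
        total, step = 10_000_000, 1_000_000
        chunks = [range(i, i + step) for i in range(0, total, step)]
        with Pool() as pool:                     # one worker per CPU by default
            partials = pool.map(crunch, chunks)  # farm the chunks out in parallel
        print(sum(partials))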
Some clever programmers took the software used for analyzing the data returned by the Arecibo antenna (the largest radio telescope on Earth), put some pretty graphics on it, got it to act as a screensaver, and put it on the Web. Several hundred thousand people downloaded it and ran it. While they're away from their computers, this pretty screensaver crunches through vast quantities of data, searching for patterns in the signals. In this way, the SETI@home project (as of this writing) has a "virtual computer" that is computing 13.5 trillion floating-point operations per second.
(I feel I should also mention distributed.net, which spends its time using people's spare computing power to crack cryptographic ciphers. Their "virtual computer" is currently cracking a 64-bit cipher known as RC5 at the rate of 130 billion keys per second.)
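Both projects rest on the same pattern: a central server carves an enormous search space into small, self-contained work units, hands one to each volunteer machine, and collects whatever comes back. A toy version of that pattern follows; the numbers and the key test are stand-ins, shrunk so the search finishes instantly:

    BLOCK_SIZE = 2 ** 12        # keys per work unit; real blocks are much larger
    KEYSPACE = 2 ** 24          # toy-sized; a 64-bit cipher has 2**64 possible keys

    def work_units():
        """Server side: carve the keyspace into disjoint blocks."""
        for start in range(0, KEYSPACE, BLOCK_SIZE):
            yield start, start + BLOCK_SIZE

    def crunch(block, is_right_key):
        """Client side: try every key in the assigned block."""
        start, end = block
        for key in range(start, end):
            if is_right_key(key):
                return key      # report the hit back to the server
        return None             # block exhausted; ask for another

    SECRET = 123_456            # pretend this is the key we're hunting for
    for block in work_units():
        hit = crunch(block, lambda key: key == SECRET)
        if hit is not None:
            print(f"key found: {hit}")
            break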
Napster is one of the first and best examples of end users acting as distributed servers. When you install Napster, it asks you where your MP3 files are. You tell it, and it makes a list of the MP3 files you have, how long each song is, and at what quality each was recorded. It then uploads this list (but not the songs) to a central server. In this way, the central server has a whole bunch of lists. It knows who has what music, so you can ask the server who has songs by Nirvana and then contact those users directly (while your Beck tunes are possibly getting served to some Scandinavian with a predilection for American music). This model allows information (in this case, MP3 files) to be rapidly and efficiently served to thousands of users.
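The two halves of that design are easy to sketch: the client scans its music directory, and the server keeps a searchable index of who offered what. A toy version in Python, where the field layout and the matching rule are invented for illustration (the real Napster protocol records more, such as each song's length and bitrate):

    import os

    def scan_library(directory):
        """Client side: list the local MP3s to report to the server."""
        return [name for name in os.listdir(directory)
                if name.lower().endswith(".mp3")]

    # Server side: remember who offered which files, and where to reach them.
    index = {}

    def register(user, address, filenames):
        index[user] = (address, filenames)

    def search(term):
        """Return (user, address, filename) for every match; the download
        itself then happens directly between the two users' machines."""
        term = term.lower()
        return [(user, address, name)
                for user, (address, names) in index.items()
                for name in names if term in name.lower()]

    register("joe", "203.0.113.7", ["Nirvana - Lithium.mp3", "Beck - Loser.mp3"])
    print(search("nirvana"))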
The problem with it is both technical and legal. There is a single point of failure: Napster's servers. While there is more than one server (the client asks a "meta-server" which server it should connect to), they are all owned by Napster. These servers, unfortunately, do not share their file lists among themselves; as a result, you can only share files with (and see the files of) the other users who happen to be connected to the same server you are. Napster is currently being sued by the RIAA for acting as a medium for distributing illegal MP3 files. While it is true that Napster can easily be used to distribute MP3 files illegally, Napster itself never copies the bits for its users; it's more like a Kinko's that happens to be used by subversives than a distributor of illegal copies.
If you are a Napster user, you should be worried about this lawsuit, because if the RIAA succeeds, it will probably seek to shut down Napster's servers, thus theoretically shutting down the whole Napster network. The legal precedent set by an anti-Napster ruling could then be used to close down any Napster clones in short order. Boom. Game over, no more illegal music.
This is a "virtual Internet" of sorts in which links are not physical (a wire from you to me) but logical (I know you). Data flows through this "web of friendship" in such a way that it looks like you are only talking with your friends, when really you are talking to your friends' friends, and so forth.
Since there's no central server around which Gnutella revolves, AOL's shutdown of the project didn't actually stop Gnutella from working. A growing user base of several thousand souls (myself included) uses the product on a daily basis to share files of all types, from music to movies to programs. At last check, there were about 2,200 people using it, sharing 1.5 terabytes of information. Wow.
There's no way to shut it down. There is no organization to sue to stop it. There is no server to unplug that would bring the network tumbling down. As long as at least two people are running the software, the network is up and running.
What is enabling this now? Well, computers are, unsurprisingly, getting faster every year. The average desktop that's sold to Joe User for doing word processing, email, and Web browsing can, when properly configured, deliver hundreds of thousands of email messages a day, serve millions of Web pages, route Internet traffic for tens of thousands of users, or serve gigabytes of files a day. (Joe probably isn't aware of this and will still kick it when Word takes five minutes to load.) His hard drive could store 100,000 Web sites each having ten or so pages, email for 1,000 users, and a few thousand of his favorite songs. Furthermore, if Joe has DSL or a cable line, he's got a static IP (an address on the Internet that doesn't change often, if at all), is almost always connected to the Internet, and is online at high speed.
In short, Average Joe's computer resembles one of the best Internet servers of yesteryear.
If thousands of Joes end up running "community" applications like Gnutella, they can take advantage of their connectivity, disk space, and computing power, and new "co-hosting" services will spring up like popcorn in the microwave.
David Weekly is a senior majoring in Computer Science at Stanford University. A programmer since the age of 5 and a veteran of the MP3 scene, he's working on graduating in June and generally figuring life out. He offers a tip of the hat to Kevin Doran for inspiring him to write these ideas down.