As Fiber to the home (15-30 megabit speeds) and Cable/DSL (1-6 megabit speeds) become more common, some servers are having trouble maxing out a user's download pipe. One way to increase performance is to download from multiple resources at once. This is mainly useful for large files.
Mirrors are confusing to an inexperienced Web user. The Fedora Project has 110 mirror sites in North America alone. Which do you choose? Which has all the files you want? Which is quickest?
In this case, not all mirrors carry all files. Some might not have all large ISOs (the Fedora Core 4 DVD image is around 2.5 gigabytes), or might only carry a subset of files (some kernel.org mirrors only have .tar.gz or .bz2 files, some have both). Or they might just be out of sync. That means you have to navigate through them to find out if they really have the file you need.
This is basically a usability problem. With some downloads, complications arise from users needing to select their Operating System, language, and location. I hope to make things easier.
Mirrors are great. We need to keep using them, but we need a better, more automatic way to use them. Peer-to-Peer (P2P) in general and BitTorrent specifically are amazing. They let individuals share their bandwidth and distribute files that would otherwise cost too much to serve through traditional server-to-client downloads.
But... P2P and regular hyperlinks are not that reliable. A hyperlink is one link to a file. If that file is gone or moved, or the server is temporarily down, that's it. 404 Error. You can search by filename, but there is no unique identifier to find that file again on the Web. P2P sharing is ephemeral. Most files are not available constantly or for the long term. I'm sure everyone has found a .torrent that they really want, but that no one is sharing any more. A BitTorrent download will not complete unless at least one 100% seed (someone with the full file) is sharing; it will sit at 99.9% until one shows up. There is no fallback plan.
I have been working on a file format called MetaLink that bundles the various methods (P2P/HTTP/FTP) of downloading files in order to improve usability, performance, reliability, and efficiency over one P2P method or a regular hyperlink. One of the main goals is to make the download process simpler for the end user. I hope this format will be found useful by Free and Open Source software projects.
Performance is increased because you download from multiple resources at the same time. Reliability is greater because there are multiple avenues or alternate locations to get a file. Hyperlinks have a single point of failure. Metalinks do not; every resource would have to go down at the same time for a file to be unavailable. And it is more efficient because it spreads the load more evenly across multiple resources (P2P or Web/FTP servers) by segmenting (a.k.a. multi-threading or accelerating) downloads. That means that different portions of each file are downloaded from separate servers.
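To make segmenting a bit more concrete, here is a minimal Python sketch of how a client might carve one file into byte ranges and hand them out to several sources. The names plan_segments and range_header, the 1 MiB default segment size, the round-robin assignment, and the mirror URLs are all assumptions for illustration; nothing in the format prescribes them, and a real client would more likely weigh sources by speed and retry failed segments elsewhere.

```python
def plan_segments(file_size, sources, segment_size=1024 * 1024):
    """Assign consecutive byte ranges of a file to sources in round-robin order.

    Returns a list of (source, first_byte, last_byte) tuples. Both byte
    offsets are inclusive, matching the HTTP Range header convention.
    """
    if file_size < 0 or segment_size <= 0:
        raise ValueError("file_size must be >= 0 and segment_size > 0")
    if not sources:
        raise ValueError("at least one source is required")
    plan = []
    for index, first in enumerate(range(0, file_size, segment_size)):
        # The final segment may be shorter than segment_size.
        last = min(first + segment_size, file_size) - 1
        plan.append((sources[index % len(sources)], first, last))
    return plan


def range_header(first, last):
    """Build the HTTP Range header value for one segment."""
    return f"bytes={first}-{last}"


if __name__ == "__main__":
    # Hypothetical mirrors; the size is the one from the OpenOffice.org example.
    mirrors = ["http://mirror-a.example/file", "http://mirror-b.example/file"]
    for source, first, last in plan_segments(109237237, mirrors, 32 * 1024 * 1024):
        print(source, range_header(first, last))
```

Each tuple tells the client which source to ask for which bytes; the segments can then be fetched in parallel and written into place in the output file.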
The minimum requirement for integrating Metalink into a program is that the program already supports segmented downloads. A client should also have a way to check MD5 and SHA-1 sums. And if it has BitTorrent and other P2P methods (ed2k links, magnet links, Gnutella) built in, even better. The perfect client will be able to share and access files across many P2P networks.
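Checking the sums is the easy part. As a rough sketch, assuming Python's standard hashlib module and two hypothetical helpers named file_digests and matches, a client could compute both digests in a single pass over the downloaded file and compare them with the published values:

```python
import hashlib


def file_digests(path, chunk_size=64 * 1024):
    """Return the MD5 and SHA-1 hex digests of a file, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for multi-gigabyte ISOs.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


def matches(path, expected):
    """Check a file against published sums.

    `expected` maps "md5" and/or "sha1" to hex strings; the comparison
    ignores letter case. Other algorithm names are rejected.
    """
    actual = file_digests(path)
    unknown = set(expected) - set(actual)
    if unknown:
        raise ValueError(f"unsupported checksum types: {sorted(unknown)}")
    return all(actual[name] == value.lower() for name, value in expected.items())
```

With the example below, a client would call something like matches("OOo_2.0.1_LinuxIntel_install.tar.gz", {"md5": "e0d123e5f316bef78bfdf5a008837577"}) once the download finishes.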
A few clients are implementing MetaLink right now and should be available shortly.
Here is an example MetaLink for OpenOffice.org 2.0 with links for a BitTorrent .torrent, magnet, ed2k, FTP, and HTTP. A really useful MetaLink will include combinations for different Operating Systems and languages.
<?xml version="1.0" encoding="UTF-8"?>
<metalink version="2.0" xmlns="http://www.m3talink.org/" origin="http://www.openoffice.org/mmm/OpenOffice.org-2.0.1.metalink" type="static" pubdate="2005-12-21-22:07:22" refreshdate="2005-12-23-03:24:18">
  <files>
    <file name="OOo_2.0.1_LinuxIntel_install.tar.gz">
      <identity>OpenOffice.org</identity>
      <version>2.0.1</version>
      <description>OpenOffice.org 2.0.1 - free office suite</description>
      <tags>OpenOffice.org, office suite, OpenDocument, open source</tags>
      <language>en-US</language>
      <os>Linux-x86</os>
      <size>109237237</size>
      <verification>
        <md5>e0d123e5f316bef78bfdf5a008837577</md5>
      </verification>
      <publisher>
        <name>OpenOffice.org</name>
        <url>http://www.openoffice.org/</url>
      </publisher>
      <license>
        <name>LGPL</name>
        <url>http://www.gnu.org/copyleft/lesser.html</url>
      </license>
      <copyright>Copyright 2000-2005 Sun Microsystems Inc.</copyright>
      <resources>
        <magnet>
          <url>magnet:?xt=urn:sha1:TWTEVOAO2IIEV67QT2ZITTXHXEUR4EXD&amp;xt=urn:kzhash:07b7760f1c05440c779479b50dd9dd5d96708cf47b7cef1181058119637ff20ab7d38af0&amp;xt=urn:tree:tiger:VKFOQ3RETGBCLWOJAMX53EQR4OWNV7CUEOAVY6Q&amp;xt=urn:ed2k:8966658d3b75ff12e1260371ad257098&amp;xl=109237237&amp;dn=OpenOffice.org_2.0.1_LinuxIntel_install.tar.gz&amp;xs=http://ftp.snt.utwente.nl/pub/software/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz</url>
          <preference>90</preference>
        </magnet>
        <ed2k>
          <url>ed2k://|file|OpenOffice.org_2.0.1_LinuxIntel_install.tar.gz|109237237|8966658D3B75FF12E1260371AD257098|h=3JVTR3O2DYGSBYCDCHKBOBXL2IJ6A3H3|s=http://ftp.snt.utwente.nl/pub/software/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz|/</url>
          <preference>90</preference>
        </ed2k>
        <bittorrent>
          <torrent>
            <url>http://borft.student.utwente.nl:6969/file?info_hash=%53%13%06%4e%30%c4%1e%e2%6f%e2%b0%24%8f%1b%e7%1e%97%ae%ec%ca</url>
          </torrent>
          <preference>100</preference>
        </bittorrent>
        <http>
          <url>http://mirrors.isc.org/pub/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz</url>
          <location>US</location>
          <preference>80</preference>
        </http>
        <ftp>
          <url>ftp://ftp.ussg.iu.edu/pub/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz</url>
          <location>US</location>
          <preference>20</preference>
        </ftp>
        <http>
          <url>http://mirrors.ibiblio.org/pub/mirrors/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz</url>
          <location>US</location>
          <preference>20</preference>
        </http>
        <ftp>
          <url>ftp://openofficeorg.secsup.org/pub/software/openoffice/stable/2.0.1/OOo_2.0.1_LinuxIntel_install.tar.gz</url>
          <location>US</location>
          <preference>40</preference>
        </ftp>
      </resources>
    </file>
  </files>
</metalink>
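For anyone wondering how a client might read a file like this, here is a small Python sketch using the standard xml.etree.ElementTree module. It pulls every resource out of the resources element and orders them by preference. The element names and namespace come from the example above; the function name ranked_resources, the cut-down SAMPLE document with its placeholder URLs, and the idea that a higher preference value means "try this first" are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute in the example.
NS = "{http://www.m3talink.org/}"

# Cut-down sample with hypothetical placeholder URLs.
SAMPLE = """<metalink version="2.0" xmlns="http://www.m3talink.org/">
  <files>
    <file name="example.tar.gz">
      <resources>
        <ftp>
          <url>ftp://mirror.example/example.tar.gz</url>
          <preference>20</preference>
        </ftp>
        <bittorrent>
          <torrent>
            <url>http://tracker.example/example.torrent</url>
          </torrent>
          <preference>100</preference>
        </bittorrent>
        <http>
          <url>http://mirror.example/example.tar.gz</url>
          <location>US</location>
          <preference>80</preference>
        </http>
      </resources>
    </file>
  </files>
</metalink>"""


def ranked_resources(metalink_xml):
    """Return (preference, kind, url) tuples, highest preference first."""
    root = ET.fromstring(metalink_xml)
    ranked = []
    for resources in root.iter(NS + "resources"):
        for resource in resources:
            # Strip the namespace so kind is "http", "ftp", "bittorrent", ...
            kind = resource.tag[len(NS):] if resource.tag.startswith(NS) else resource.tag
            # The URL may be nested (bittorrent/torrent/url), so search descendants.
            url = resource.find(".//" + NS + "url")
            preference = resource.find(NS + "preference")
            if url is None or not url.text:
                continue
            has_score = preference is not None and preference.text
            score = int(preference.text) if has_score else 0
            ranked.append((score, kind, url.text.strip()))
    # Assumption: higher preference wins; ties keep document order (stable sort).
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


if __name__ == "__main__":
    for score, kind, url in ranked_resources(SAMPLE):
        print(score, kind, url)
```

A client could walk this list from the top, starting segment downloads from the best few sources and falling back to the rest if any of them fail.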
The goal is simplicity. A user will click this one .metalink, and the client will download the file in segments from P2P and mirrors. After the download is complete, the client will compare the file's checksum against the one listed in the .metalink to verify that the download matches the original.
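So, to sum up, the benefits over traditional methods are a simpler download process for the end user, faster downloads, greater reliability, and a more even spread of load across mirrors and P2P sources.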
I'd be interested in any comments you have.