Teach-SA reads mail from designated maildir folders (one for spam, one for ham) and feeds it to SpamAssassin for Bayesian learning and for submission to various spam detection schemes. It is a good way to add supervised training on top of SpamAssassin's unsupervised training (Bayes auto-learning) while reducing the training-related admin workload to nearly zero. It fits any setup that stores mail as maildir, and could trivially be modified to work with mbox-based systems. It was originally developed so that users could report spam by moving it to a specific folder in their own IMAP mailbox.
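As a rough illustration of the Bayesian learning half of that workflow, here is a minimal sketch and not Teach-SA's actual code. It assumes the standard maildir layout (`cur/` and `new/` subdirectories) and SpamAssassin's `sa-learn` command with its `--spam` and `--ham` options. The function names and the folder paths in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def message_dirs(folder):
    """Return the cur/ and new/ subdirectories of a maildir folder that hold messages."""
    found = []
    for name in ("cur", "new"):
        subdir = Path(folder) / name
        if subdir.is_dir() and any(p.is_file() for p in subdir.iterdir()):
            found.append(subdir)
    return found


def teach(spam_folder, ham_folder, run=subprocess.run):
    """Feed both maildir folders to sa-learn and return the commands that were run.

    `run` is injectable so the selection logic can be exercised without
    SpamAssassin installed.
    """
    commands = []
    for kind, folder in (("spam", spam_folder), ("ham", ham_folder)):
        for subdir in message_dirs(folder):
            cmd = ["sa-learn", f"--{kind}", str(subdir)]
            run(cmd, check=True)
            commands.append(cmd)
    return commands
```

Run from cron, a call such as `teach("/home/alice/Maildir/.Spam", "/home/alice/Maildir/.Ham")` (paths made up for the example) would pick up whatever users have moved into those folders since the last run.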
Go Awstats automates the production of Awstats Apache log reports at a site where multiple virtual hosts exist. It always updates the reports for the current month and the current year and only produces other reports if they do not exist, which saves on CPU. The regeneration of a report can be forced by simply erasing it.
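The selection rule described above can be sketched as follows. This is only an illustration of the logic, not Go Awstats itself: the report file naming (`awstats.YYYY-MM.html` and `awstats.YYYY.html`) and both function names are assumptions made for the example.

```python
from datetime import date
from pathlib import Path


def report_path(report_dir, year, month=None):
    """Hypothetical naming scheme: one file per month, one per year."""
    name = f"awstats.{year}.html" if month is None else f"awstats.{year}-{month:02d}.html"
    return Path(report_dir) / name


def reports_to_build(report_dir, periods, today):
    """Return the (year, month) periods whose report must be (re)generated.

    A month of None stands for the yearly report. Reports for the current
    month and the current year are always rebuilt; any other report is
    built only if its file is missing.
    """
    todo = []
    for year, month in periods:
        is_current = year == today.year and (month is None or month == today.month)
        if is_current or not report_path(report_dir, year, month).exists():
            todo.append((year, month))
    return todo
```

Deleting a report file makes its period show up in the result again on the next run, which is the "force by erasing" behaviour.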
Great concept, but please do not reinvent the wheel.
For example, you could reuse the work done at the HURD project. I think the concept of a translator implemented in that system might well interest you: http://www.gnu.org/software/hurd/whatis/translator.html. In addition, you would eventually benefit from a microkernel foundation.
Wrappers of all kinds would make it possible to build your new namespace, because they let existing software be driven through your new way of naming things. Then, one distant day in a hazy future, when enough developers have toyed with the wrapper implementation of your concept, maybe they will begin to contribute implementations of the lower levels of your model.
I particularly love the idea of a universal namespace, but if you really want to make it happen, you stand a better chance by building on existing foundations. I think the core added value of your idea resides in the namespace. You don't actually need to implement anything else to make it valuable: just design the best namespace in the world, make it usable by building wrappers that control other things with that language, and you will soon find that people build things that natively understand your new way of naming things.
Why use this?
I configured all the machines on my network to use the SQUID proxy on my DMZ for both FTP and HTTP. Since they all apt-get from the same Debian mirrors (give or take a handful of unofficial archives), SQUID handles all the caching with very little tweaking (a larger maximum file size, more time until cached entries become stale, etc.). I found that method a more intuitive way to solve the resource sharing problem.
Several reasons make it especially efficient:
- packages in the cache have a limited lifespan. Therefore, a mirror built out of requests is only valid until the next package upgrade.
- a typical set-up only selects a fraction of the available packages, and even fewer when the number of supported hosts is small.
- using a general-purpose caching program such as SQUID limits the additional complexity, and users do not have to change a line in their setup, provided they have set the proxy environment variables correctly.
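On the client side, the only moving part is the proxy environment. Here is a minimal sketch, assuming apt picks up the conventional `http_proxy` and `ftp_proxy` variables and that the proxy listens on Squid's default port 3128. The function name and the host name in the usage note are hypothetical.

```python
import os


def proxy_environment(proxy_host, proxy_port=3128, base_env=None):
    """Return a copy of the environment with HTTP and FTP traffic sent through the proxy."""
    env = dict(os.environ if base_env is None else base_env)
    proxy_url = f"http://{proxy_host}:{proxy_port}/"
    env["http_proxy"] = proxy_url  # used for http:// package sources
    env["ftp_proxy"] = proxy_url   # used for ftp:// package sources
    return env
```

A script could then run `subprocess.run(["apt-get", "update"], env=proxy_environment("proxy.dmz.example"))`; exporting the same two variables in a shell profile has the same effect.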
As usual, there's more than one way to do it!