GQuiz is a generic question/answer drilling program. You provide a program that, given a single filename on the command line, displays the question and answer; gquiz randomizes the order, prevents immediate repetition, and temporarily eliminates questions you've answered correctly "enough" times in a row. The author uses GQuiz with cgoban to study go, the Asian strategy game, but it could just as well be used to memorize States and their Capitals, foreign language vocabulary, etc.
fallback-reboot is a last resort, when you need to remotely reboot a computer. It attempts to maximize its ability to get the job done by completely avoiding touching the hard disk; it opens no files, and it locks itself into memory to avoid swapping/paging. It also does not fork or exec. It includes optional cryptography.
Newbies to Unix (and some experts too) often have trouble porting applications to a Unix the software wasn't originally intended for. Many of these problems take the form of unresolved externals. find-sym attempts to provide suggestions for libraries and header files to include to eliminate the unresolved externals.
Reblock is a tool that has three main, largely independent purposes. It can ensure that data transferred over a network is reassembled into an appropriate series of equally-sized blocks (for example, prior to writing to tape), give a throughput measure of how fast the data is moving, and give an estimate of what percentage of the transfer is complete as well as an estimated completion time. If the file size cannot be determined using fstat, it is possible to provide an estimate of the size with the -e switch.
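The reblocking idea can be sketched in a few lines of Python. This is only an illustration, not Reblock's own code: the function name reblock and its parameters are hypothetical, and the sketch emits a short final block as-is, since the description does not say how the real tool handles a trailing partial block.

```python
def reblock(chunks, block_size):
    """Regroup arbitrarily sized byte chunks into block_size-byte blocks.

    Hypothetical helper for illustration. The final block may be shorter
    than block_size if the input does not divide evenly.
    """
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= block_size:
            yield bytes(buffer[:block_size])
            del buffer[:block_size]
    if buffer:
        yield bytes(buffer)
```

Network reads typically return whatever has arrived so far, so the chunks passed in would vary in size, while a consumer such as a tape drive sees uniform blocks.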
maxtime is a tool that accepts a number of seconds and a shell command, and runs the shell command as a subprocess. If the subprocess runs for more than the specified time, it will attempt to kill the subprocess. If the subprocess exits as expected, the program returns its exit status. Otherwise, the program exits with a sentinel status and leaves the stuck subprocess with some deadly signals pending.
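A minimal Python sketch of the same behavior, assuming a POSIX shell is available. The function name run_with_limit, the sentinel value 254, and the choice of SIGTERM followed by SIGKILL are all assumptions made for illustration; the description does not say which sentinel or signals maxtime actually uses.

```python
import subprocess

TIMEOUT_STATUS = 254  # hypothetical sentinel; maxtime's real value is not stated


def run_with_limit(seconds, command):
    """Run a shell command; return its exit status, or TIMEOUT_STATUS on timeout."""
    proc = subprocess.Popen(command, shell=True)
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Ask politely first (SIGTERM), then insist (SIGKILL).
        proc.terminate()
        try:
            proc.wait(timeout=1)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        return TIMEOUT_STATUS
```

Note that with shell=True the signals go to the shell itself, so a grandchild started by the shell may outlive it; a more thorough version would signal the whole process group.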
env-search is a tool that can be used to examine a situation where a program runs correctly under one account but not another. By saving the environments of the two users in files and writing a program that can detect whether the program is running correctly or not (just a grep wrapper in most situations), it attempts to find which environment variable, if any, is causing the problem. It also has rudimentary support for handling multiple environment variable problems.
notify-when-up periodically polls something and notifies you when its state changes. It can poll a port on a host and let you know when it starts accepting connections, or run a command on the current machine and let you know when it returns true or false. The purpose is to be a lightweight adjunct to something like Nagios, for those times when you're rebooting a system. Notices can be messages to the tty, X popup windows, curses popup windows, or email messages.
dup-label copies a label from one disk on a Solaris system to another. If the source disk's ASCII label was "foo" and the destination disk's ASCII label was "bar", dup-label duplicates the partitioning but produces an ASCII label of "foo, was: bar". This may prove less puzzling to future admins than other current offerings, which can leave them with a disk that has Seagate written on the outside but says Maxtor in Sun format.
slowdown is a program that contends with I/O-hungry processes. The "nice" program does a good job of handling CPU priorities, but doesn't help much when you have a process that is moving tons of data; other processes can continue to starve for I/O, making a system painful to use, as during a backup, while tripwire is running, etc. slowdown manages another process by sleeping for a user-specified number of seconds or fractions of seconds, each time some data is moved using, for example, read(), write(), send(), recv(), etc.
The "converting binary" suite of programs includes "endian" to determine the "byte sex" of a machine, "byte-swap" to swap bytes between little-endian and big-endian orders, "real-ascii-real" to convert from native reals to ASCII and then back again, an example program that converts a file of variable length records, and "strip-fortran-framing" which can strip off some of the Fortran runtime's record framing for the benefit of language runtimes that do not assume such framing.
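The first two operations in the suite can be illustrated in Python. This is a sketch of the general technique, not the suite's code; the function names native_byte_order and swap_bytes are hypothetical.

```python
import sys


def native_byte_order():
    """Return 'little' or 'big' for the machine running this code."""
    return sys.byteorder


def swap_bytes(data, width):
    """Reverse the byte order of each width-byte word in data."""
    if len(data) % width != 0:
        raise ValueError("data length must be a multiple of the word width")
    return b"".join(
        data[i:i + width][::-1] for i in range(0, len(data), width)
    )
```

For example, swap_bytes(b"\x01\x02\x03\x04", 2) swaps each 16-bit word independently, while a width of 4 reverses the whole 32-bit word.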
equivs2 can relatively (or very, depending on options and input size) quickly divide a series of files into equivalence classes. It's suitable for use with a very large number of files and/or very large files. It can perform its task strictly in terms of cryptographic digests, or use cryptographic digests backed up at the end with full byte-for-byte comparisons. It also uses some heuristics like comparing the beginnings of files and the device number and inode number and file sizes. Further, it knows to only read in the parts that it needs "so far", to avoid a huge inhale at the beginning of the run, instead trying an initial method and falling back to subsequent methods as needed.
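The "cheap test first, expensive test only as needed" strategy might look like the following Python sketch. It is not equivs2's implementation: the helper names, the 4096-byte prefix, and the use of SHA-256 are assumptions, and the sketch omits the device/inode shortcut and the optional byte-for-byte confirmation.

```python
import hashlib
import os
from collections import defaultdict


def _digest(path, limit=None):
    """SHA-256 of the first `limit` bytes of a file (the whole file if None)."""
    h = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while True:
            n = 65536 if remaining is None else min(65536, remaining)
            if n == 0:
                break
            chunk = f.read(n)
            if not chunk:
                break
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return h.digest()


def _refine(groups, key):
    """Split each group of 2+ files by key; singletons need no more work."""
    refined = []
    for group in groups:
        if len(group) < 2:
            refined.append(group)
            continue
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        refined.extend(buckets.values())
    return refined


def equivalence_classes(paths):
    """Group paths by content: size first, then a prefix digest, then a full digest."""
    groups = [list(paths)]
    groups = _refine(groups, os.path.getsize)
    groups = _refine(groups, lambda p: _digest(p, 4096))
    groups = _refine(groups, _digest)
    return groups
```

Because singleton groups are skipped at every stage, a file with a unique size is never opened at all, which is where most of the savings on large inputs would come from.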
When reading a lot of data from a disk or network filesystem, e.g. during a backup, buffer cache pollution can be a substantial performance problem for other processes on the machines involved. odirect is intended to provide a convenient way of avoiding that on systems that offer an O_DIRECT value for open(). The project provides a C library, a C++ wrapper, and SWIG code to make it easy to use from your favorite scripting language.
merging-uids merges one or more files in /etc/passwd format. You give it a list of n password files on the command line (leftmost varies least in the output) and a series of n-1 scripts that will be used by sed. In return, you get a new password file and a series of UIDs that need to be rearranged on-disk with chowns.
try-copying-up-to-n-times takes a list of filenames, creates a database of those filenames associated with counts, and tries to copy each of those files up to "count" number of times before giving up. If enough files have problems in a row, it decides the filesystem is broken, and stops processing so you can restart the fileserver and pick up again just past where you left off (after n attempts at each troublesome file).
Implicit Queuing System is a queuing system with no qsub command nor qsub analog. It sits in the background watching for CPU-hogging processes and adds them to an ultra-batch queue. Processes in this queue are run for lengthy timeslices, such as a minute at a time. The number of processes that run concurrently is admin-selectable. It's not good at getting processes out of VM, but it is good at getting processes off your CPU(s).
pypty is a tty logger aimed at heavy script(1) users who like to (or would like to start to) log everything they do on important systems. It creates one (or two, if you ask for timing data) file(s) per day. The distribution also includes "script-replay", which is somewhat like the traditional scriptreplay - that is, it's for replaying tty logs - but it does not require timing data and lets you step forward and back in the log.
highest is a program that efficiently finds the m highest (or lowest) numbers in a list of numbers on stdin. The traditional way of computing this using GNU sort should have a running time of O(n log n), where n is the number of numbers to check. highest should have a running time of O(n log m), where m is the number of numbers you want to keep. A graph comparing the performance of highest to that of GNU sort is provided.
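One common way to reach O(n log m) is a bounded min-heap, sketched below in Python. Whether highest itself uses a heap is an assumption; the description only states the running time. The function name mirrors the tool's name for readability.

```python
import heapq


def highest(numbers, m):
    """Return the m largest values from an iterable, largest first."""
    if m <= 0:
        return []
    heap = []  # min-heap holding the m largest values seen so far
    for x in numbers:
        if len(heap) < m:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest kept value, so it takes that value's place.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)
```

Each input costs at most one O(log m) heap operation, and only m values are ever held in memory, which matters when reading a long stream from stdin.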
DRS Looper is a program for running a list of commands, a certain number of commands at a time. It facilitates a variety of forms of parameterization, including hostnames for ssh or rsync. It features good error checking. Output is saved in one file per command, optionally interleaving output to stdout. It can exit on the first success or failure, or run all commands irrespective of their exit status.
treap.py is a treap implementation for Python. A treap is a hybrid of a binary search tree and a binary heap that is self-balancing and is O(log2(n)) on average for most operations, including deleting a value, inserting a value, finding the least value, and finding the greatest value. This particular treap implementation looks like a dictionary to the caller, but it also supports getting an ordered list (forward or reverse) in O(n) time. The code is available as pure Python (should run on about any Python implementation supporting generators, but was tested on CPython 2.6) or as part Python and part Cython for performance. The version with Cython should run on CPython or Unladen Swallow, but was only tested on CPython 2.6.
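For readers unfamiliar with the data structure, here is a minimal treap insert and in-order traversal in modern Python. It is a textbook-style sketch, not treap.py's code or API: the Node class and the insert and in_order functions are hypothetical names, and deletion is omitted for brevity.

```python
import random


class Node:
    """One treap node: ordered by key like a search tree, by priority like a heap."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.priority = random.random()  # random priority keeps the tree balanced on average
        self.left = None
        self.right = None


def _rotate_right(node):
    left = node.left
    node.left = left.right
    left.right = node
    return left


def _rotate_left(node):
    right = node.right
    node.right = right.left
    right.left = node
    return right


def insert(node, key, value):
    """Insert or update key in the subtree rooted at node; return the new root."""
    if node is None:
        return Node(key, value)
    if key == node.key:
        node.value = value
    elif key < node.key:
        node.left = insert(node.left, key, value)
        if node.left.priority > node.priority:
            node = _rotate_right(node)  # restore the heap property
    else:
        node.right = insert(node.right, key, value)
        if node.right.priority > node.priority:
            node = _rotate_left(node)
    return node


def in_order(node):
    """Yield (key, value) pairs in ascending key order, in O(n) total time."""
    if node is not None:
        yield from in_order(node.left)
        yield node.key, node.value
        yield from in_order(node.right)
```

Usage: start with root = None, then call root = insert(root, key, value) for each item; list(in_order(root)) gives the sorted pairs.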
zip-center accepts n ZIP codes on the command line. It converts these ZIP codes to latitude and longitude, averages them separately, finds a ZIP code whose center is nearest the lat/lon center point, and computes the distance of each input zip code from the resulting centrally-located zip code in miles (taking into account the variable width of longitudinal differences). This allows a caller of the script to do things like pick a centrally-located venue for a meeting given the ZIP codes of the prospective attendees.
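The steps above can be sketched in Python as follows. This is an illustration under stated assumptions: the haversine formula is one standard way to account for longitude lines converging toward the poles, but the description does not say which formula zip-center uses. The function names, the Earth radius constant, and the table argument (a caller-supplied mapping from code to a latitude/longitude pair) are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def central_code(inputs, table):
    """Return (nearest_code, distances) for the given input codes.

    table maps every known code to (lat, lon); inputs is a list of codes
    that must all appear in table.
    """
    # Average latitude and longitude separately, as the description says.
    lat = sum(table[c][0] for c in inputs) / len(inputs)
    lon = sum(table[c][1] for c in inputs) / len(inputs)
    center = (lat, lon)
    nearest = min(table, key=lambda c: haversine_miles(table[c], center))
    distances = {c: haversine_miles(table[c], table[nearest]) for c in inputs}
    return nearest, distances
```

Averaging longitudes directly misbehaves for inputs that straddle the 180th meridian, which is not a concern for ZIP codes in the contiguous United States.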
gprog is a basic GUI pipe meter that shows the percentage complete as data moves through a Unix pipe. It is very fast because it uses a dual process design with a cache oblivious algorithm for self-tuning. Also, the presentation is largely decoupled from the transfer, so that the GUI won't slow down the transfer.
Backshift is a deduplicating (variable-sized, content-based blocks), compressing (xz or bz2) backup program. Full saves and incrementals are pretty indistinct other than the amount of data transmitted, somewhat like with "rsync --link-dest" but without the huge number of hardlinks. It deduplicates large file content at a granularity of about 2 megabytes on average; files smaller than about 2 megabytes tend to be stored as a single unique copy each.