The Ex-Crawler Project is divided into three subprojects. The main part is the Ex-Crawler daemon server, a highly configurable and flexible Web crawler written in Java. It comes with its own socket server, with which you can manage the server, users, distributed grid/volunteer computing, and much more. Crawled information is stored in a database (Currently MySQL, PostgreSQL, and MSSQL are supported). The second part is a graphical (Java Swing) distributed grid/volunteer computing client, including user computer state detection, based on JADIF Project. The Web search engine is written in PHP. It comes with a Content Management System, user language detection and multi-language support, and templates using Smarty, including an application framework that is partly forked from Joomla 1.5, so that Joomla components can be adapted quickly.
Java distributed framework is a framework for distributed grid and / or volunteer computing. It's divided into a server and client library. You can create new or implement it into existing applications in no time; you don't need knowledge about network connections, sockets, etc. The Framework does almost everything automatically. It provides secure automatic client <-> server communications, unique IDs, automatic resending of jobs to new clients if needed, user stats, and much more. The client framework supports the detection of the computer's user state (idling, away, online, etc.). It also offers many other useful features and helpers for developing a distributed client application.
Makeflow is a workflow engine for executing large complex applications on clusters, clouds, and grids. It can be used to drive several different distributed computing systems, including Condor, SGE, and the included Work Queue system. It does not require a distributed filesystem, so you can use it to harness whatever collection of machines you have available. It is typically used for scaling up data-intensive scientific applications to hundreds or thousands of cores.
POP-C++ is a comprehensive object-oriented system for developing applications in large distributed computing infrastructures such as Grid, P2P or Clouds. It consists of a programming suite (language, compiler) and a run-time system for running POP-C++ applications. The POP-C++ language is a minimal extension of C++ that implements the parallel object model with the integration of resource requirements into distributed objects. This extension is as close as possible to standard C++ so that programmers can easily learn POP-C++ and so that existing C++ libraries can be parallelized using POP-C++ without too much effort. The POP-C++ run-time is an object-oriented open design that aims at integrating different distributed computing tool kits into an infrastructure for executing requirement-driven object-oriented applications. It uses objects to serve objects: the system provides services for executing remote objects.
Parrot and Chirp are user-level tools that make it easy to rapidly deploy wide area filesystems. Parrot is the client component: it transparently attaches to unmodified applications, and redirects their system calls to various remote servers. A variety of controls can be applied to modify the namespace and resources available to the application. Chirp is the server component: it allows an ordinary user to easily export and share storage across the wide area with a single command. A rich access control system allows users to mix and match multiple authentication types. Parrot and Chirp are most useful in the context of large scale distributed systems such as clusters, clouds, and grids where one may have limited permissions to install software.
dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP), or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited for the data parallel (SIMD) paradigm where a computation is evaluated with different (large) datasets independently (similar to Hadoop, MapReduce, Parallel Python). dispy features include automatic distribution of dependencies (files, Python functions, classes, modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, sharing of computation resources if desired, and more.
Wisecracker is a high performance distributed cryptanalysis framework that leverages GPUs and multiple CPUs. It allows security researchers to write their own cryptanalysis tools that can distribute brute-force cryptanalysis work across multiple systems with multiple multi-core processors and GPUs. Security researchers can also use the sample tools provided out-of-the-box. The differentiating aspect of Wisecracker is that it uses OpenCL and MPI together to distribute the work across multiple systems, each having multiple CPUs and/or GPUs.
Hados stores files in a cluster of servers. Its goal is to handle high availability by storing copies of the same file on several nodes. It provides RESTFUL APIs to easily store, check, or retrieve files. Using the cluster APIs, you can retrieve files from whichever node hosts them. To avoid any single point of failure, it is possible to apply a request to any node of the cluster; there is no master node.