advice wanted: help choose development framework for a simple distributed computing project
FT
2004-09-12 22:36:47 UTC
Greetings,
I would appreciate opinions on which of the publicly available code bases
would be best suited for my project. I would like to utilize a small
number of idle computers in my department to run a number-crunching job
at a low priority. The computation lends itself ideally to distributed
computing; one might say that the domain decomposition is trivial. The
computational task is divided neatly into small independent chunks. I am
envisioning running a single executable that would reside on each of the
PCs used and would seldom change, so there is no need for an elaborate
scheme to push the executables to the PCs. As I said, there is a small
number of them, and all of them are physically accessible. For each
computation chunk, both the input and output are a fairly small amount
of information, on the order of 16 KB, while the execution of one chunk
takes ~10 seconds on a typical Pentium III PC. The computation itself is
a floating-point-heavy scientific computation and is CPU-bound with a
small memory footprint (I am not sure whether these details are
relevant, but here they are). I am thinking of having a server computer
take computation
requests, check which of the PCs are available to perform the work,
split the work, distribute the input data to the PCs, collect the output
data, and ideally have the intelligence to resubmit the work to other
PCs if some fail to deliver the output in the expected time. The key
word is simplicity: I am looking for a framework that would allow me to
make rapid progress and that does not need most of the features,
flexibility, and security that a sophisticated Internet-based
multiplatform project would need. All the computers available are
Intel-based PCs and run Windows 2000 or Windows XP. I'd like to use
open source software. Exact license type is not important, GPL is OK --
this is purely an internal computing project and the derivative code is
never meant to be sold or otherwise distributed outside of the
organization. Also, I would not want to learn any exotic programming
languages just for this project, so I would like to stick to C, C++, or
perhaps Python or shell scripts. I looked at directories like
http://www.aspenleaf.com/distributed/distrib-devel.html
http://directory.google.com/Top/Computers/Computer_Science/Distributed_Computing/Platforms/
and suspect that my answer is in there somewhere, although most of the
impressive projects listed look like overkill for my purposes.
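
To make the server-side bookkeeping described above concrete, here is a
rough Python sketch of handing out chunks and resubmitting those that are
not returned in time. The class name ChunkScheduler, its method names, and
the 60-second default timeout are all hypothetical choices for
illustration; the clock is passed in only so the logic can be exercised
without waiting.

```python
import time
from collections import deque


class ChunkScheduler:
    """Tracks which chunks are waiting, handed out, or finished (illustrative only)."""

    def __init__(self, chunk_ids, timeout=60.0, clock=time.monotonic):
        self.pending = deque(chunk_ids)   # chunks nobody is working on
        self.in_flight = {}               # chunk id -> (worker name, deadline)
        self.results = {}                 # chunk id -> output
        self.timeout = timeout            # assumed value; a chunk takes ~10 s
        self.clock = clock

    def requeue_expired(self):
        """Put chunks whose deadline has passed back on the pending queue."""
        now = self.clock()
        expired = [c for c, (_worker, deadline) in self.in_flight.items()
                   if deadline <= now]
        for chunk_id in expired:
            del self.in_flight[chunk_id]
            self.pending.append(chunk_id)

    def next_chunk(self, worker):
        """Return a chunk id for this worker, or None if nothing is waiting."""
        self.requeue_expired()
        if not self.pending:
            return None
        chunk_id = self.pending.popleft()
        self.in_flight[chunk_id] = (worker, self.clock() + self.timeout)
        return chunk_id

    def submit(self, chunk_id, output):
        """Record a result; a late duplicate for a finished chunk is ignored."""
        self.in_flight.pop(chunk_id, None)
        if chunk_id in self.pending:      # result arrived after being requeued
            self.pending.remove(chunk_id)
        self.results.setdefault(chunk_id, output)

    def finished(self):
        return not self.pending and not self.in_flight
```

How the chunks actually travel to the PCs (files, sockets, or something
else) is deliberately left out of this sketch.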

Thanks in advance,
Fedor T.
I'd appreciate responses by e-mail. Please remove all uppercase
characters in the address.
Martin 53N 1W
2004-09-13 00:54:03 UTC
FT wrote:
[...]
would be best suited for my project. I would like to utilize a small
number of idle computers in my department to run a number-crunching job
at a low priority. The computation lends itself to distributed computing
[...]

If you have direct control of all the machines and there are no concerns
about security or cooperation then:

The simplest solution is just to use network file shares and coordinate
actions with standard lock file techniques. The host computers poll to
see if there is a data file to work on... To ease server loading, have
the server pull and push all data files. The hosts just poll their local
filestore.
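
A minimal Python sketch of the lock-file idea, as one possible reading of
the above: a worker scans a directory for input files, claims one by
creating a lock file with an exclusive-create call, and publishes the
result with a rename. The file suffixes (.in, .out, .lock, .tmp), the
function names, and the compute() placeholder are all made up for
illustration. Whether exclusive create is truly atomic on a given network
share depends on the file system and should be checked before relying on
it.

```python
import os
from pathlib import Path


def try_claim(chunk: Path) -> bool:
    """Create '<chunk>.lock' exclusively; False means another worker has it."""
    lock = chunk.with_name(chunk.name + ".lock")
    try:
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record who holds the lock, for debugging
    return True


def compute(data: bytes) -> bytes:
    """Placeholder for the real number-crunching step (hypothetical)."""
    return data[::-1]


def poll_once(work_dir: Path) -> int:
    """Process every unclaimed '*.in' file once; return how many were done."""
    done = 0
    for chunk in sorted(work_dir.glob("*.in")):
        out = chunk.with_suffix(".out")
        if out.exists() or not try_claim(chunk):
            continue
        tmp = chunk.with_suffix(".tmp")
        tmp.write_bytes(compute(chunk.read_bytes()))
        os.replace(tmp, out)  # the result appears under its final name in one step
        done += 1
    return done
```

A worker would call poll_once() in a loop with a short sleep in between.
Stale lock files left behind by a crashed host are not handled here.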

Slightly more elaborate and efficient would be to do the same but
utilising fixed pipes (network sockets).
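
A rough Python sketch of the socket variant, under assumptions of my own:
each message is a 4-byte big-endian length followed by the payload, and
the server hands exactly one chunk to each worker that connects. The
function names and the framing are illustrative, not any established
protocol, and there is no timeout or retry handling here.

```python
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    """Send one length-prefixed message."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection")
        buf += part
    return buf


def recv_msg(sock) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def serve_chunks(listener, chunks, results) -> None:
    """Server side: give each connecting worker one chunk, store its answer."""
    for index, chunk in enumerate(chunks):
        conn, _addr = listener.accept()
        with conn:
            send_msg(conn, chunk)
            results[index] = recv_msg(conn)


def work_once(host, port, compute) -> None:
    """Worker side: fetch one chunk, run compute() on it, send the result back."""
    with socket.create_connection((host, port)) as sock:
        send_msg(sock, compute(recv_msg(sock)))
```

With ~16 KB per chunk and ~10 seconds of computation each, one short
connection per chunk should be cheap enough; keeping a connection open
per host would be a further refinement.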


A more elaborate and expansive solution would be to take up BOINC!

Let us know how you progress.

Good luck,
Martin
--
---------- OS? What's that?!
- Martin - To most people, "Operating System" is unknown & strange.
- 53N 1W - Mandrake 10.0.1 GNU Linux
---------- http://www.mandrakelinux.com/en-gb/concept.php3