Discussion:
*Real* Distributed Computing
Michael N. Christoff
2005-08-10 05:15:07 UTC
Hi.

Sorry about the multi-posting, but I wanted to get a general opinion of
those online. The groups I chose (while on the surface they may not seem
to have a lot in common with distributed computing) all intersect with
the idea of collaboratively solving problems in an environment where one
may not have a complete set of information - where there is no 'global'
entity that sees the global state of the system, i.e. where each
participating entity's view of the world is relative to the other
entities in direct proximity to it. Anyway, on to the original post:

I was wondering if anyone in this group is looking into the theoretical
side of distributed computing. I keep reading definitions of DC to the
effect of "DC is a method of computing where one a) breaks a large
problem into smaller parts, b) distributes the partial workloads to a
set of processes that compute the partial solutions, and c) finally
recombines the partial solutions into a total solution" - e.g. SETI,
etc...

This is an *example* of distributed computing - a particular, and very
straightforward, use of many computers - but it is by no means a
definition of DC.

Does anyone here look into things such as decision tasks (the renaming
problem, k-set agreement, consensus) or any theoretical papers in the field
(FLP, the asynchronous computability theorem, the relationship between
algebraic topology and distributed computing, dihomotopy, ditopological
homology groups, various calculi for concurrent systems, fine-grained
concurrency, fault-tolerance, provable impossibility of certain tasks,
computability in asynchronous, semi-synchronous and synchronous networks,
worst-case message complexity of distributed tasks, randomized
distributed algorithms, self-stabilization, shared-memory vs.
message-passing models, Byzantine failures, crash failures, omission
failures, etc.)?

Check out the following link to get a small peek into what I mean:

http://scholar.google.com/scholar?q=algebraic.topology+distributed.computing&ie=UTF-8&oe=UTF-8&hl=en&btnG=Search

I'd be interested to know if these things are still entirely a part of
academia or whether there are people out there who discuss these things in
their spare time.


Thanks,



Mike N. Christoff
Edward A. Feustel
2005-08-10 10:27:15 UTC
Post by Michael N. Christoff
Hi.
Sorry about the multi-posting, but I wanted to get a general opinion of
those online. The groups I chose (while on the surface may not seem like
they would have a lot in common with distributed computing) all intersect
with the idea of collaboratively solving problems in an environment where
one may not have a complete set of information. Where there is no 'global'
entity that sees the global state of the system - ie: where each
participating entity's view of the world is relative to the other entities
I was wondering if anyone in this group is looking into the theoretical side
of distributed computing. I keep reading definitions of DC to the effect of
"DC is a method of computing where one a) breaks a large problem into
smaller parts, b) distributes the partial workloads to a set of processes
that compute the partial solutions, c) finally recombining the partial
solutions into a total solution". ie: SETI, etc...
This is an *example* of distributed computing - a particular, and very
straight-forward use of many computers, but is by no means a definition of
DC.
Does anyone here look into things such as decision tasks (the renaming
problem, k-set agreement, consensus) or any theoretical papers in the field
(FLP, the asynchronous computability theorem, the relationship between
algebraic topology and distributed computing, dihomotopy, ditopological
homology groups, various calculi for concurrent systems, fine-grained
concurrency, fault-tolerance, provable impossibility of certain tasks,
computability in asynchronous, semi-synchronous and synchronous networks,
worst-case message complexity of distributed tasks, randomized distributed
algorithms, self-stabilization, shared-memory vs. message passing models,
fault-tolerance, Byzantine failures, crash failures, omission failures, etc,
etc, etc...)?
http://scholar.google.com/scholar?q=algebraic.topology+distributed.computing&ie=UTF-8&oe=UTF-8&hl=en&btnG=Search
I'd be interested to know if these things are still entirely a part of
academia or whether there are people out there who discuss these things in
their spare time.
Thanks,
Mike N. Christoff
You left out discussion of many practical engineering issues of
implementation, including provisions for security, discovery, and other
issues described in the Reference Model for Open Distributed Processing
(RM-ODP), an ISO/ITU standard. Without an appropriate implementation,
many of the theoretical subjects described cannot be put into practice.
Ed
Michael N. Christoff
2005-08-14 01:12:24 UTC

[snip]
Post by Edward A. Feustel
You left out discussion of many practical engineering issues of
implementation including
provisions for security, discovery, and other issues described in the
Reference Model for
Open Distributed Processing, an ISO/ITU standard. Without an appropriate
implementation, many
of the theoretical subjects described cannot be implemented.
Ed
Hi Ed.

The types of problems I'm thinking of seem to be a bit more 'low-level'.
For example, problems such as 'leader election' in anonymous networks
and 'consensus' (which are heavily studied in theoretical DC) are things
that may be employed by the higher-level distributed services you
describe above, resulting in a more robust/fault-tolerant system
overall. So although what you say is true about the theoretical
sometimes needing an appropriate implementation first, it's often the
case that the implementation benefits from the results of the
theoretical. For example, there are many results in theoretical DC
where proofs of lower bounds on the message complexity of certain tasks
also produce real-world distributed algorithms that solve the task as
fast as it is provably solvable.

Thanks,

Mike
Zorro
2005-08-11 07:55:26 UTC
Replying to Mike N. Christoff.

The link was quite refreshing. Thanks.

A desirable result in mathematics is the existence and uniqueness of a
solution whenever we face the discovery of an algorithm for finding
solutions (differential equations, linear systems, etc.). In my
lower-level undergrad courses, I usually gave the students the
following example.

Suppose you have two equations in two unknowns, and the only way you
know to find a solution is by guessing and trial. If you at least knew
a way of determining whether there is a solution, and that the solution
is unique, you would then continue to try, and once you found a pair of
numbers that satisfied the equations you would be content.

Some of the articles at the link you posted (I will have to spend more
time on them) prove interesting results along the lines of the example
I mentioned. Once we are aware of such results, we can try to find a
solution, though in this case uniqueness is not the issue. Here, we
need to know what can or cannot be accomplished, and what the limits of
the various formulations (shared memory, message passing, etc.) are.

At some point, we need to use such results to provide an
infrastructure. That is, there is more than mathematics here. We have a
theoretical solution (as in mathematics), and then we need a
computational model for that theoretical solution. That is where CORBA
etc. come in (as AndyW nicely puts it). With regard to this, my view of
(concurrent) distributed applications (on heterogeneous systems) is
very different from that of DSOM (and DCOM). It has taken me decades to
translate the theory into a computational model.

The link was quite refreshing. Not sure whether you were asking
something?

Regards,
***@ZHMicro.com
http://www.zhmicro.com
http://distributed-software.blogspot.com
http://groups-beta.google.com/group/computerlangZ?lnk=la&hl=en
Michael N. Christoff
2005-08-14 01:31:43 UTC
Post by Zorro
Replying to Mike N. Christoff.
The link was quite refreshing. Thanks.
A desirable result in mathematics is the existence and uniqueness of
solution whenever we are facing the discovery of an algorithm for
finding solutions (differential equations, linear systems, etc.). In my
lower undergrad courses, I usually gave them the following example.
Suppose you have two equations in two unknowns, and the only way you
know to find a solution is by guessing and trial. At least if you knew
a way of determining whether there is a solution, and that the solution
is unique, you would then continue to try and once you found a pair of
numbers that satisfied the equations you would be content.
Some of the articles at the link you have posted (I will have to spend
more time on them) prove interesting results along the lines of the
example I mentioned. Once we are aware of such results, we can try to
find a solution, though in this case the uniqueness is not the issue.
Here, we need to know what can or cannot be accomplished, and what are
the limits etc. of various formulations (shared memory, message
passing, etc).
At some point, we need to use such results to provide an
infrastructure. That is, there is more than mathematics here. We have a
theoretical solution (as in mathematics) and then we need a
computational model for the theoretical solution. That is where CORBA
etc come in (as AndyW nicely puts it). With regard to this, my view of
(concurrent) distributed applications (on heterogeneous systems) is
very different from DSOM (and DCOM). It has taken me decades to
translate the theory into a computational model.
Hi Zorro.

I'm glad you enjoyed the links. With regard to theory vs. practice -
this will always be an issue. However, as I mentioned to Ed, the tasks
that are currently being explored, at least in the literature I have
read, have more to do with the provable robustness of more basic tasks
upon which larger-grained infrastructure can be built.

But perhaps it would be better for those who are interested to read for
themselves about the aims of theoretical DC. I've provided a much more
distilled list of papers than the scholar.google.com link I posted last
time.

Here is, IMO, a good sampling of some of the most important papers in the
field (and by 'the field of DC', I mean the small part of it that I have
been looking into). I believe the papers are in chronological order and
should help to illustrate how the field evolved from models based on graph
theory, to models that employ tools from algebraic topology.

[1] Impossibility of distributed consensus with one faulty process
---
(The famous 'FLP' paper)
http://theory.lcs.mit.edu/tds/papers/Lynch/jacm85.pdf

[2] Extended Impossibility Results for Asynchronous Complete Networks
Unfortunately - could not locate a free copy online - though the ACM Portal
has it

[3] A Combinatorial Characterization of the Distributed Tasks which are
Solvable in the Presence of One Faulty Processor
---
http://citeseer.ist.psu.edu/context/80502/0

[4] The Topological Structure of Asynchronous Computability
---
(This was the 2004 Gödel Prize-winning paper)

[5] Distributed Computing - A Glimmer of a Theory
---
http://citeseer.ist.psu.edu/334587.html

[6] Hundreds of Impossibility Results for Distributed Computing
---
http://citeseer.ist.psu.edu/fich03hundreds.html

I would *especially* like to direct your attention to [6], as it is a
recent and well-written overall survey of the field. [5] is also a very
good survey, with more of a focus on fine-grained concurrency (e.g.
using point-set and algebraic topology in deadlock detection, and other
problems). Finally, [4], if you can locate it online (I know it's
there, but I have had trouble finding it), is a *very* important paper,
but it is technically quite 'heavy'.
Post by Zorro
The link was quite refreshing. Not sure whether you were asking
something?
Just trying to get a feel for the amount of interest there is in these
topics. It seems like a big part of getting people more excited about
the theory will be illustrating how it can be practically exploited.

Thanks for your time,

Mike
David C. DiNucci
2005-09-29 19:03:40 UTC
Sorry to wait so long to respond to this, but I was hoping to have time
to do it justice. As I still don't, I'll at least give it a "quick and
dirty" response.
Post by Michael N. Christoff
Post by Zorro
Replying to Mike N. Christoff.
snip
Post by Michael N. Christoff
Post by Zorro
At some point, we need to use such results to provide an
infrastructure. That is, there is more than mathematics here. We have a
theoretical solution (as in mathematics) and then we need a
computational model for the theoretical solution. That is where CORBA
etc come in (as AndyW nicely puts it). With regard to this, my view of
(concurrent) distributed applications (on heterogeneous systems) is
very different from DSOM (and DCOM). It has taken me decades to
translate the theory into a computational model.
Hi Zorro.
I'm glad you enjoyed the links. With regard to theory vs practice - this
will always be an issue. However, as I mentioned to Ed, the tasks that are
currently being explored, at least in the literature I have read, have more
to do with the provable robustness of more basic tasks upon which large
grained infrastructure can be built.
Many theoretical findings in fact are proofs that certain things
*cannot* be done. Unfortunately, it's usually very easy to misconstrue
the result by ignoring or forgetting some of the assumptions,
potentially leading "the smart ones" away from very practical and
efficient real-world solutions. In my own dissertation work, I recall a
paper that was often summarized as "the smoker's problem cannot be
solved using semaphores alone". This paper did provide insight that
using low-level constructs like locks and messages can be very clumsy
for some problems, but the fact that my higher-level constructs are
still usually built from these lower-level constructs, even while
addressing issues like "the smoker's problem", obviates the significance
of the theoretical result itself for me.

I firmly believe in the ability of theory to provide insight and deeper
understanding while we attempt to prove or disprove various results, but
I also believe that theoretical results pay off only when they give us
useful answers to real-world problems. What bothers me most is people
who view our current practice as immutable, and theoretical results as
just providing tools to explain why they are inadequate in various ways.
If the theory doesn't motivate solutions without those inadequacies,
it's not telling us where to put our time, but only where not to put it.
Post by Michael N. Christoff
But perhaps, it would be better for those who are interested to read for
themselves the aims of theoretical DC. I've provided a much more distilled
list of papers than the link to scholar.google.com I posted last time.
Here is, IMO, a good sampling of some of the most important papers in the
field (and by 'the field of DC', I mean the small part of it that I have
been looking into). I believe the papers are in chronological order and
should help to illustrate how the field evolved from models based on graph
theory, to models that employ tools from algebraic topology.
Thanks for the list. It's good to keep up with what various camps are
focussing on.
Post by Michael N. Christoff
[1] Impossibility of distributed consensus with one faulty process
---
(The famous 'FLP' paper)
http://theory.lcs.mit.edu/tds/papers/Lynch/jacm85.pdf
This paper gives an example of a finding which appears devastating, but
which I believe to be of limited practical utility. The same reasoning
could be used to say that some old friends on a conference call can
never decide where to host the poker game next Thursday, because if one
of them doesn't respond for a length of time, the others can't tell
whether he's dozed off, in which case they should wait for him to wake
up, or whether he's dead, in which case waiting wouldn't pay off.
There are at least two things that make such a finding impractical.
First, if one or more die, there's a large chance that the still-living
would have priorities other than continuing with their task - e.g.
still coming up with a spot at which to play poker. Second is the
statement:

"We also assume that processes do not have access to synchronized
clocks, so algorithms based on time-outs, for example, cannot be used."

Just because processes don't have synchronized clocks does not mean
that they cannot independently detect the passage of similar amounts of
time. If Fred dozed off for two hours and then came back with his
recommendation of a poker site, the rest of the guys might still be
waiting, or might have already taken the "he must be dead" exception,
or might have concluded that he was not allowed to alter the decision
they made in his absence, but to say that they're not allowed to act
based on the passage of time is unrealistic.
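
To make that concrete, here is a minimal sketch (in Python; the names
and the `poll` primitive are illustrative stand-ins, not any particular
system's API) of taking the "he must be dead" exception using only a
local monotonic clock - no clock synchronization with the peer assumed:

import time

SUSPECT_AFTER = 5.0   # seconds of local silence before suspecting a peer

def await_reply(poll, timeout=SUSPECT_AFTER):
    """Poll for a reply; return it, or None on a local timeout.

    `poll` is any zero-argument function that returns a reply if one
    has arrived, or None.  Only this process's own clock is consulted.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        reply = poll()
        if reply is not None:
            return reply
        time.sleep(0.05)               # don't busy-wait
    return None                        # take the "he must be dead" branch

Note that a None result does not prove the peer is dead - it may just
be slow - which is exactly the ambiguity the impossibility argument
trades on; the timeout only bounds how long it can stall us.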

To their credit, the authors mention the potential shortcomings of
their findings in their conclusion.
Post by Michael N. Christoff
Just trying to get a feel for the amount of interest there is in these
topics. It seems like a big part of getting people more excited in the
theory will be to illustrate how it can be practically exploited.
Right.

I am currently writing a design principles document for my ScalPL
(Scalable Planning Language, pron. "scalpel") work, and though
significant portions are a rephrasing of my PhD dissertation, and I'm
citing many other models and techniques, from Petri Nets to Turing
Machines to data parallelism and partial orders to structured
programming, I'm using very few theoretical results. Most of these
citations just help to provide insight into design decisions. Likewise,
I just read a book named "On Intelligence" where Jeff Hawkins does
similarly. In each case, most of the citations are of the form "if you
phrase (or depict) a problem in this way, then the solution becomes
more apparent". That might even be followed with "...and, in fact, the
solution has been proven in that context." Theoretical findings that
run counter to that, i.e. "if you look at it this way, you can see that
there's no way to do this thing (like deciding on a poker locale) that
people do every day", seem more like Zeno's Paradox or divide-by-zero
proofs that 1=0 - fun puzzles which may provide no more insight than we
already had.

-Dave
d***@hotmail.com
2005-09-30 06:57:43 UTC
Just a note. I am beginning to regret the title of my post. I hope it
does not give the wrong impression about my respect for the practice of
DC (vs. the theory of DC). It was initially due to frustration over the
popular definition of "DC = SETI", as opposed to "SETI is a type of DC
system".
Post by David C. DiNucci
Sorry to wait so long to respond to this, but I was hoping to have time
to give it justice. As I still don't, I'll at least give it a "quick
and dirty" response.
Post by Michael N. Christoff
Post by Zorro
Replying to Mike N. Christoff.
snip
Post by Michael N. Christoff
Post by Zorro
At some point, we need to use such results to provide an
infrastructure. That is, there is more than mathematics here. We have a
theoretical solution (as in mathematics) and then we need a
computational model for the theoretical solution. That is where CORBA
etc come in (as AndyW nicely puts it). With regard to this, my view of
(concurrent) distributed applications (on heterogeneous systems) is
very different from DSOM (and DCOM). It has taken me decades to
translate the theory into a computational model.
Hi Zorro.
I'm glad you enjoyed the links. With regard to theory vs practice - this
will always be an issue. However, as I mentioned to Ed, the tasks that are
currently being explored, at least in the literature I have read, have more
to do with the provable robustness of more basic tasks upon which large
grained infrastructure can be built.
Many theoretical findings in fact are proofs that certain things
*cannot* be done. Unfortunately, it's usually very easy to misconstrue
the result by ignoring or forgetting some of the assumptions,
potentially leading "the smart ones" away from very practical and
efficient real-world solutions. In my own dissertation work, I recall a
paper that was often summarized as "the smoker's problem cannot be
solved using semaphores alone". This paper did provide insight that
using low-level constructs like locks and messages can be very clumsy
for some problems, but the fact that my higher-level constructs are
still usually built from these lower-level constructs, even while
addressing issues like "the smoker's problem", obviates the significance
of the theoretical result itself for me.
I firmly believe in the ability of theory to provide insight and deeper
understanding while we attempt to prove or disprove various results, but
I also believe that theoretical results pay off only when they give us
useful answers to real-world problems. What bothers me most is people
who view our current practice as immutable, and theoretical results as
just providing tools to explain why they are inadequate in various ways.
If the theory doesn't motivate solutions without those inadequacies,
it's not telling us where to put our time, but only where not to put it.
True. But there is value in knowing where not to put one's time (like
the halting problem in classical theoretical CS). See below for more on
this.
Post by David C. DiNucci
Post by Michael N. Christoff
But perhaps, it would be better for those who are interested to read for
themselves the aims of theoretical DC. I've provided a much more distilled
list of papers than the link to scholar.google.com I posted last time.
Here is, IMO, a good sampling of some of the most important papers in the
field (and by 'the field of DC', I mean the small part of it that I have
been looking into). I believe the papers are in chronological order and
should help to illustrate how the field evolved from models based on graph
theory, to models that employ tools from algebraic topology.
Thanks for the list. It's good to keep up with what various camps are
focussing on.
Post by Michael N. Christoff
[1] Impossibility of distributed consensus with one faulty process
---
(The famous 'FLP' paper)
http://theory.lcs.mit.edu/tds/papers/Lynch/jacm85.pdf
This paper gives an example of a finding which appears devastating, but
which I believe to be of limited practical utility. The same reasoning
could be used to say that some old friends on a conference call can
never decide where to host the poker game next Thursday, because if one
of them doesn't respond for a length of time, the others can't tell
whether he's dozed off,
Right.
Post by David C. DiNucci
in which case they should wait for him to wake
up,
Not sure about what you may mean here in terms of what you feel the
processors 'should' and 'should not' do as dictated by the assumptions
of the proof. However, we are both writing very informally.
Post by David C. DiNucci
or whether he's dead, in which case waiting wouldn't pay off.
There are at least two things that make such a finding impractical.
First, if one or more die, there's a large chance that the still-living
would have priorities other than continuing with their task. [snip]
"We also assume that processes do not have access to synchronized
clocks, so algorithms based on time-outs, for example, cannot be used."
Just because processes don't have synchronized clocks does not mean that
they cannot independently detect the passage of similar amounts of time.
If Fred dozed off for two hours and then came back with his
recommendation of a poker site, the rest of the guys might still be
waiting, or might have already taken the "he must be dead" exception, or
might have concluded that he was not allowed to alter the decision they
made in his absence, but to say that they're not allowed to act based on
the passage of time is unrealistic
Well, in this case we would not be dealing with an asynchronous system,
but a 'semi-synchronous' system where we can use time-outs (i.e. the
'must-be-dead exception') as a type of pseudo failure detector.

[on a side note: you may want to check out the idea of 'logical clocks'
- a very interesting idea http://en.wikipedia.org/wiki/Logical_clocks]
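
For anyone who doesn't want to chase the link: a minimal sketch of a
Lamport logical clock in Python (my illustration, not code from any
paper cited here). The invariant is that if event a causally precedes
event b, then L(a) < L(b):

class LamportClock:
    """Minimal Lamport logical clock (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # any internal event advances the local counter
        self.time += 1
        return self.time

    def send(self):
        # sending is itself an event; the message carries the new stamp
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # receiving advances past both local history and the sender's
        self.time = max(self.time, msg_time) + 1
        return self.time

# p sends to q: the receive is always stamped later than the send.
p, q = LamportClock(), LamportClock()
stamp = p.send()            # stamp == 1
print(q.receive(stamp))     # prints 2, so L(send) < L(receive)

(The converse does not hold: L(a) < L(b) does not imply that a caused
b - that is what vector clocks refine.)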

But check out the paper "Unifying Synchronous and Asynchronous Message
Passing Models". The authors use combinatorial topology to unify
asynchronous, semi-synchronous and synchronous message passing models
under a construction they call a 'pseudosphere'. Using this model they
were able to derive the first lower bound on wait-free k-set agreement
in the semi-synchronous model - a model where must-be-dead exceptions
can be used (to use your parlance again). Also, in the paper "Towards
a Topological Characterization of Asynchronous Complexity" the authors
show a one-to-one relationship between the tight lower and upper bounds
of a task in an asynchronous system and a specific 'topological
property' of the associated 'protocol complex'. Also, work is being
done, with some success, that uses homotopy theory (part of algebraic
topology again) in deadlock detection in fine-grained concurrent
systems.

So it's not merely impossibility proofs, but also possibility proofs,
tight upper and lower bounds, etc...

But back to FLP for a sec... Part of the importance of the FLP result
was that it led to a complete characterization of the decision tasks
that are solvable in an asynchronous system. This characterization is
constructive, meaning a positive result (i.e. possibility) not only
proves the existence of a protocol to solve the task, but immediately
allows one to essentially 'plug the task into' a generic protocol that
actually solves it.

I also see a benefit in moving from 'hunches' about why some systems
seem to always run into sections where they fail, to a point where we
are actually proving that these 'critical sections' do in fact exist.
As they state in the 1983 FLP paper: "...every commit protocol has such
a window [of vulnerability where the system can fail], confirming a
widely believed tenet in the folklore".

Some other things they note:

a) "We do not consider Byzantine* failures, and we assume that the
message system is reliable: it
delivers all messages correctly and exactly once."

[*Byzantine failure: "a failure in which faulty processes might go
completely haywire, perhaps even sending messages according to some
malevolent plan"]

So the result holds even if the mesaging system is totally reliable.

b) "For the purpose of the impossibility proof, we require only that
_some_ process eventually make a decision." (emphasis mine)

So the result holds even though they consider the task 'solved' if,
out of a hundred processes, only one eventually makes a decision. To
use your example, all they require is that at least one of your poker
buddies eventually chooses a place to play poker. They show that even
this is not possible. If you are the only one of your buddies who
decides, you do not have to worry about making a decision that
conflicts with one of their decisions. What they show is that if a
process is to decide a single output, it must eventually reach a state
where only one decision is possible from that point on. They then
construct a legal execution in which the process continually ends up in
a state that leads to two possible decision values, and hence it never
decides.

c) "Of course, any protocol can be overwhelmed by faults that are too
frequent or too severe, so the best that one can hope for is a protocol
that is tolerant to a prescribed number of expected faults. In this
paper, we show the surprising result that no completely asynchronous
consensus protocol can tolerate even a single unannounced process
death."

The thing to remember is that there can be executions of the protocol
where _more_ than one process fails and an asynchronous system can
handle it (i.e. still decide the task). What they illuminate are the
exact conditions under which a single failure can keep all processors
from deciding, regardless of how many live processors are still
running.

Finally, it was the FLP paper that first explicitly used 'state
indistinguishability' in a proof - a key idea in more than just DC.
Post by David C. DiNucci
To their benefit, the authors mention the potential shortcomings of
their findings in their conclusion.
Post by Michael N. Christoff
Just trying to get a feel for the amount of interest there is in these
topics. It seems like a big part of getting people more excited in the
theory will be to illustrate how it can be practically exploited.
Right.
I am currently writing a design principles document for my ScalPL
(Scalable Planning Language, pron "scalpel") work, and though
significant portions are a rephrasing of my PhD dissertation, and I'm
citing many other models and techniques, from Petri Nets to Turing
Machines to data parallelism and partial orders to structured
programming, I'm using very few theoretical results. Most of these
citations just help to provide insight to design decisions. Likewise, I
just read a book named "On Intelligence" where Jeff Hawkins does
similarly. In each case, most of the citations are of the form "if you
phrase (or depict) a problem in this way, then the solution becomes more
apparent". That might even be followed with "...and, in fact, the
solution has been proven in that context." Theoretical findings that
run counter to that, i.e. "if you look at it this way, you can see that
there's no way to do this thing (like deciding on a poker locale) that
people do every day" seem more like Zeno's Paradox or divide-by-zero
proofs that 1=0 --fun puzzles which may provide no more insight than we
already had.
-Dave
As I've mentioned, these theoretical results can provide tangible
payoffs, such as lower bounds on algorithm message complexity. E.g.: no
one bothers trying to write sub-O(n*log(n)) comparison-based sorting
algorithms anymore.

Obviously if a result provides no insight, its value is questionable.
But I don't believe that impossibility results (of any kind) are
_inherently_ uninsightful just because they may not immediately lead to
applications.

The impossibility of faster-than-light travel is a cornerstone of the
theory of relativity, for example. I understand the theory vs. practice
thing, but for something as complex and as inherently mathematical as
distributed computing, it seems that at some point we will have no
choice but to employ more esoteric math to aid us. DC is just another
example of a 'complex system' in my book.

[http://en.wikipedia.org/wiki/Complex_system]


-mike
David C. DiNucci
2005-10-02 04:08:10 UTC
Post by d***@hotmail.com
Just a note. I am beginning to regret the title of my post. I hope it
does not give the wrong impression about my respect for the practice of
DC (vs. the theory of DC). It was initially due to frustration over the
popular definition of "DC = SETI", as opposed to "SETI is a type of DC
system".
I wondered. In this particular case, a large population of speakers and
listeners came seemingly out of nowhere and began using the term
differently (or at least more narrowly) than those before. I expect
their numbers will diminish. Then again, some people (in this very
group, I think) have suggested that my own definition of the term is not
fully consistent with that used before my time in the field (e.g. 70s
and 80s). C'est la vie.
Post by d***@hotmail.com
Post by David C. DiNucci
I firmly believe in the ability of theory to provide insight and deeper
understanding while we attempt to prove or disprove various results, but
I also believe that theoretical results pay off only when they give us
useful answers to real-world problems. What bothers me most is people
who view our current practice as immutable, and theoretical results as
just providing tools to explain why they are inadequate in various ways.
If the theory doesn't motivate solutions without those inadequacies,
it's not telling us where to put our time, but only where not to put it.
True. But there is value in knowing where not to put one's time in
(like the halting problem in classical theoretical CS). See below for
more on this.
Sure, but the halting problem is not at all obscure--it refers to the
computation we all do every day, and there are no ways of hiding the
effects of the halting problem by building better tools or more layers.
We can make decidable computational models, but they are also less
powerful. If you want to claim an analog for layering communication in
real-world systems (see below), or if you think you already are, I'd be
more willing to consider some of your cited papers as important.

snip
Post by d***@hotmail.com
Post by David C. DiNucci
in which case they should wait for him to wake
up,
Not sure about what you may mean here in terms of what you feel the
processors 'should' and 'should not' do as dictated by the assumptions
of the proof. However, we are both writing very informally.
Right, maybe I should have said "in which case they may have wanted to
wait for him to wake up".

snip
Post by d***@hotmail.com
Well, in this case we would not be dealing with an asynchronous system,
but a 'semi-synchronous' system where we can use time-outs (i.e. the
'must-be-dead exception') as a type of pseudo failure detector.
By this definition, it sounds like the only truly asynchronous systems
would be those with subsystems in radically different inertial frames,
or at least ones where message delivery time is completely unbounded.
They are so rare, I'm not sure anybody but a theorist wastes time
trying to reason about them.
Post by d***@hotmail.com
[on a side note: you may want to check out the idea of 'logical clocks'
- a very interesting idea http://en.wikipedia.org/wiki/Logical_clocks]
I am very familiar with logical clocks, and that work could be used as
another example of taking existing confusing practice (in this case,
describing distributed events as a sequence), considering that practice
as immutable, and developing some reasoning to help iron out just a
little bit of the confusion. In this case, logical clocks help to
expose just a little bit of the causal partial ordering of events, by
assigning a logical time L to each event such that if event a precedes
event b in the p.o. then L(a) < L(b).

But this raises the question: why are we using sequences to describe
these partially-ordered event traces in the first place? Why are we
expressing the systems that generate these event traces as sequential
processes (i.e. wound-up event sequences) with arcane event constraints
between them (e.g. in the form of locks or messages)? Why aren't we
expressing them in terms of folded-up event partial orderings, like a
Petri Net, the same way that we describe sequential events as wound-up
sequences, as sequential algorithms/programs?

At some level, deep down, of course the answer is "because we don't
currently build computers that support Petri-Net-like semantics in
hardware" (and may never). So, fine, there are a few people close to
the hardware who may need to deal with low-level issues to implement
those higher-level semantics and/or reliable communication, but the
reason they do so is so the rest of us won't have to care. Those people
should (and do) at least use the resources at their disposal...like a
clock on every processor that ticks off at about the same rate as the
clocks on the other processors, and hardware that can deliver messages
in some bounded time (as measured by those clocks) or be considered broken.
Post by d***@hotmail.com
But check out the paper "Unifying Synchronous and Asynchronous Message
Passing Models". The authors use combinatorial topology
And it uses combinatorial topology because it deals with global states
of distributed systems. At that point, it becomes completely useless to
me, because I have no interest in reasoning about, or even postulating
the existence of, the global state of a distributed system.
Post by d***@hotmail.com
to unify
asynchronous, semi-synchronous and synchronous message passing models
So, someone first makes an ad hoc categorization of message passing
models, and then someone unifies them. Though this is not the best
example, it does fit with the theme of some theorists solving problems
that they themselves have devised rather than problems that the real
world is having.

I am not going to claim that this work, or some of the other work you've
cited, is completely useless for everyone, but it does look like it
would be a waste of my time to explore it much further, in part because
I see no need to ever deal with either asynchronous or synchronous
message passing models (by the definitions you are using), especially
ones where the death of a process cannot be detected, and in part
because I have no desire or need to deal with this somewhat illusory
concept of a global distributed state.

big snip
Post by d***@hotmail.com
But back to FLP for a sec... Part of the importance of the FLP result
was that it led to a complete characterization of the decision tasks
that are solvable in an asynchronous system. This characterization is
constructive, meaning a positive result (ie: possibility) does not only
prove the existence of a protocol to solve the task, but immediately
allows one to essentially 'plug the task into' a generic protocol that
actually solves it.
Good. Now give me a real-world example of an asynchronous system, or
better, an asynchronous system that cannot be fairly easily converted to
a semi-synchronous system.

big snip
Post by d***@hotmail.com
Finally, it was the FLP paper that first explicitly used 'state
indistinguishability' in a proof - a key idea in more than just DC.
In that case, this paper marks a wrong turn that some researchers have
apparently still not recovered from. Its description of a configuration
as a global system state, and of a step as going from one configuration
to another, is an approach borrowed from old sequential models like the
TM, and it is no surprise that applying it to fields like DC yields the
sorts of complexity that set researchers down these paths from which
they'll seemingly never return.

snip
Post by d***@hotmail.com
The impossibility of faster than light travel is a cornerstone of the
theory of relativity, for example. I understand the theory vs practice
thing, but for something as complex and as inherently mathematical as
distributed computing, it seems that at some point we will have no
choice but to employ more esoteric math to aid us. DC is just another
example of a 'complex system' in my book.
I only wish that the majority of the distributed systems camp had
taken issues like relativity a little more seriously. Why try to regard
the global system state when there's no entity that experiences that
state? When each element of the system can only perceive its
environment via signals (messages, memory) that take time to traverse
space? And for synchronous models, why pretend that these signals don't
take such time? Answer: because it feels easier to get one's one-track
mind around this one big thing than lots of little things. It's an
illusion.

Reasoning tools, like partial orders and atomic transactions and Petri
Nets, have long been available to get us away from global system states
and combinatorial state explosion, and to help achieve non-interference
and modularity and causal reasoning, even using our one-track minds.
These tools, like those you've cited, also came from theoretical
insight, but it was insight about real-world problems which motivated
real-world solutions, instead of insight about manufactured problems
which motivated only more theory.

Over a decade ago in my own dissertation, I drew upon many of those
tools to devise a computational model (F-Nets) to address many of the
shortcomings of reasoning about global distributed state, and I have
since been building the necessary software engineering structure and
tools around it (Software Cabling and ScalPL). F-Nets' semantics can be
expressed as 5 fairly common-sense axioms on their executions (which are
expressed as partial orderings). Reading your post, I find it
extremely frustrating to consider that today's theoreticians might be
almost completely ignoring such productive directions to spend their
time instead on complex, unrealistic, and wrong-headed pursuits.

[If it sounds like I'm getting too worked up about this and completely
dismissing your arguments, don't be too sure that I'm not playing
devil's advocate.]

-- Dave
http://www.elepar.com/
http://www.elepar.com/Passworded/references.html
Tim Tyler
2005-08-14 10:27:34 UTC
Post by Michael N. Christoff
Sorry about the multi-posting, but I wanted to get a general opinion of
those online. The groups I chose (while on the surface may not seem like
they would have a lot in common with distributed computing) all intersect
with the idea of collaboratively solving problems in an environment where
one may not have a complete set of information. Where there is no 'global'
entity that sees the global state of the system - ie: where each
participating entity's view of the world is relative to the other entities
I was wondering if anyone in this group is looking into the theoretical side
of distributed computing.
The theoretical side would get a boost from better availability of
parallel hardware.

Humanity seems to be stuck in the vicious circle where:

* No parallel hardware exists - because no parallel software exists to
take advantage of it, and so it doesn't pay hardware engineers to
build it.

* No parallel software exists - because there's no parallel hardware
to run it on - so extra effort in developing it is pointless.

IMO, it's got to the point where this is a pretty embarrassing state
of affairs - indicating a lack of planning and a communications failure
between the software guys and the hardware guys.

The attempts of hardware engineers to get better parallelism by
developing "VLIW" architectures are the most obvious embarrassment.

I reckon the best place to break the circle is by developing new
computer programming languages that make it easier to write software
that can get better performance on parallel machines.

Threads and "fork" don't cut it. Apart from HDLs, there's been little
effort towards making concurrent programming more popular - aside from
Erlang. ISTM that what's needed is something more like a system where
every inter-object message with no return value can be sent
asynchronously at the flick of a modifier - and those with return
values can be dealt with using return-value handlers - and half-decent
simulations of parallel environments exist to test under.
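
You can fake a crude version of that today with a thread pool - e.g.
in Python (a toy sketch of the shape of the idea, not a proposal for
the actual language): a message with no return value becomes a
fire-and-forget submit, and one with a return value gets a handler
attached to its future.

from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

def log(msg):            # no return value: can be fire-and-forget
    print("log:", msg)

def square(x):           # has a return value
    return x * x

pool.submit(log, "started")        # asynchronous; nobody waits on it

fut = pool.submit(square, 7)       # asynchronous call...
fut.add_done_callback(             # ...with a return-value handler
    lambda f: print("square ->", f.result()))

pool.shutdown(wait=True)           # let pending work and handlers finish

The missing piece, as you say, is making this the default at "the
flick of a modifier" rather than an explicit library call.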

If we had something like that, people might start using it, and the
vicious circle above might start to break down.
--
__________
|im |yler http://timtyler.org/ ***@tt1lock.org Remove lock to reply.
A.L.
2005-08-24 17:11:52 UTC
Post by Tim Tyler
* No parallel software exists - because there's no parallel hardware
to run it on - so extra effort in developing it is pointless.
No parallel software exists because software-so-called-"engineers"
fresh from a 5-day MS "boot camp" cannot understand how to program
parallel systems. The "program first, think later" paradigm (a.k.a.
the Extreme Programming "methodology") is not very useful for
programming distributed and parallel systems.

Once I was working for quite a big company (telecom) where they were
developing a distributed system. The "system" was hanging after about
15 seconds and they were attributing this to "bad luck". I suggested
concurrency control using semaphores. I was reprimanded and fired. I
got a letter stating that "he suggested using so-called semaphores,
which is a scientific aberration that can be interesting for writing
conference papers, but there is no room for such concepts in a serious
business environment" (I was working as a consultant there).

I have it framed and hanging on the wall.

This was the same company where I got a memo saying that "we cannot
order SCSI drives, because SCSI is not on the list of our preferred
vendors", written by a freshly graduated hot-shot from the Business
Computing Department of one of the top ten universities, with an entry
salary of 80K/year plus bonus.

Until software "engineers" become Software Engineers with a solid
education and mathematical background, parallel and distributed
computing will keep being a "scientific aberration".

A.L.
Tim Tyler
2005-08-24 20:50:49 UTC
Post by A.L.
Post by Tim Tyler
* No parallel software exists - because there's no parallel hardware
to run it on - so extra effort in developing it is pointless.
No parallel software exists because software-so-called-"engineers"
fresh from a 5-day MS "boot camp" cannot understand how to program
parallel systems. The "program first, think later" paradigm (a.k.a.
the Extreme Programming "methodology") is not very useful for
programming distributed and parallel systems.
Not entirely fair - not all programmers cut their teeth in MS "boot camp"
- and while I've seen some dogmatic stupidity from the XP camp, some good
things have emerged from there as well - and "Program first, think later"
isn't very accurate.

The lack of parallel hardware does have a real impact. It's frustrating
to build a parallel system - and then find that it will run slightly more
slowly than a simple serial version would have managed on 99% of your
clients' computers.
--
__________
|im |yler http://timtyler.org/ ***@tt1lock.org Remove lock to reply.
A.L.
2005-08-24 21:33:48 UTC
Post by Tim Tyler
Post by A.L.
Post by Tim Tyler
* No parallel software exists - because there's no parallel hardware
to run it on - so extra effort in developing it is pointless.
No parallel software exists because software-so-called-"engineers"
fresh from a 5-day MS "boot camp" cannot understand how to program
parallel systems. The "program first, think later" paradigm (a.k.a.
the Extreme Programming "methodology") is not very useful for
programming distributed and parallel systems.
Not entirely fair - not all programmers cut their teeth in MS "boot camp"
- and while I've seen some dogmatic stupidity from the XP camp, some good
things have emerged from there as well - and "Program first, think later"
isn't very accurate.
Don't question this...
Post by Tim Tyler
The lack of parallel hardware does have a real impact. It's frustrating
to build a parallel system - and then find that it will run slightly more
slowly than a simple serial version would have managed on 99% of your
clients' computers.
There IS parallel hardware. Multi-megabuck supercomputers are not
within my reach, but clusters can be easily constructed or purchased;
see

http://www.microway.com/clusters.htm

The problem is that to use the power of multiple processors it is
necessary to develop parallel algorithms (or parallelize existing
ones), and this is quite difficult. Moreover, there are some intrinsic
limitations regarding the speedup you can get. Some class of problems I
am currently working on (from the quite practical field of supply chain
optimization) cannot have a speedup better than the square root of the
number of processors, and this is the upper limit. There is a strict
mathematical proof, and to do this some mathematics was needed.
Programming is not enough. Not knowing this in advance can produce
disappointment and the opinion that "parallel computing stinks".
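
To illustrate with ordinary arithmetic why such an analysis is worth
doing up front (Amdahl's law here is just a generic stand-in, not the
problem-specific sqrt(P) proof mentioned above):

import math

def amdahl_speedup(p, serial_fraction):
    """Best-case speedup on p processors when `serial_fraction` of
    the work cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

for p in (2, 4, 10, 100):
    print(p, round(amdahl_speedup(p, 0.05), 1), round(math.sqrt(p), 1))

# With just 5% serial work, 100 processors give at most ~16.8x; a
# sqrt(P)-limited algorithm gives at most 10x.  Either way, far from
# the naive 100x - better to know that before buying the cluster.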

An additional trouble with clusters is that connection bandwidth is
limited, and in many parallel algorithms this is the factor that
strongly limits the speedup that can be achieved. At the other end, SMP
computers with a shared bus have really bad properties when used for
parallel computing.

Despite all these problems, I have developed several (strictly
speaking, 3) commercial applications that scale well up to about 10
processors. This gives enough power to solve quite large practical
problems in supply chain management.

The pity is that the transputer architecture is gone. With their
super-fast links, transputers made it easy to do amazing things. Well,
that was in the era when the fastest PC had a 50 MHz clock...

Anyway, I don't think that there is no parallel hardware, or that the
lack of hardware is the problem. Parallel hardware exists, but the
difficulties are somewhere else, specifically in a) designing parallel
algorithms or parallelizing existing ones, b) performing analysis to
determine the achievable speedup and the limitations of parallel
execution, c) mapping the algorithm onto parallel hardware, and d)
using a proper programming methodology.

All of the above is NOT the subject of the average curriculum of the
average college (to say nothing of boot camps and self-learners), and
in my opinion this is the main obstacle to making parallel computing
more widely used.

A.L.
d***@hotmail.com
2005-08-23 02:41:43 UTC
P.S. I don't mean to minimize the engineering side of things. I just
think that that side of the equation is already well represented on the
internet, as far as I can see.
From: anon
Sent: Wednesday, August 10, 2005 2:16 PM
To: Michael N. Christoff
Subject: Your contribution to comp.theory / comp.distributed
Dear Mike,
I chose the direct way to contact you (not via newsgroup) since
(1) I don't want to be spammed
and do not know precisely how to defend myself,
(2) my answer is so incomplete that I do not dare to publish it.
(Is your trick REMOVETHIS appropriate?)
To summarize, I believe that you are pointing to one of
the biggest gaps in (at least: theoretical) computer science.
This feeling (nothing more than a feeling) derives from my experiences
as a developer of "mobile distributed systems"
which I had to fit with my mathematical education.
I made quite a lot of inquiries
to identify some theories which would help people like me.
But I failed.
I'll now look at Algebraic Topology following your hints,
but this will take some time.
(In my Math books, this field is not even mentioned!
But clearly, I know a little bit of Algebra and of Topology.)
Maybe, it helps me, but will it also help
the non-mathematician? What are the concepts
behind Algebraic Topology which might bring the practitioner
to a BETTER understanding of distributed systems?
Or do they have to study modern Math?
Yours
anon
PS: The fact that, up to now, there are only two answers altogether
to your contribution is illuminating!
Hi.

One of the problems is that algebraic topology is hard stuff. Really
hard. You end up doing a lot of stuff and sitting there wondering where
it's all leading. But eventually it all starts to make sense. It's just
that it takes a lot of time and a lot of effort, and in the meantime
you're not writing any 'cool frameworks'. I understand how that can
suck, and it is probably a turn-off to the 90% of people who just want
to have fun, write their own systems and try to perfect them through
trial and error. The other 9% will do serious engineering work in
designing these systems, and the remainder will enjoy working in
theory. My personal belief is that distributed computing is just
another example of a 'complex system'. I imagine it would be like
telling someone from the past who fashioned sailing ships that
manipulating a bunch of symbols on a piece of paper could possibly help
them design a better vessel. Or that the very laws of nature could be
well predicted by this same seemingly entirely unrelated endeavour
(math). But who knows, right? Maybe all breakthroughs in DC will be the
result of nothing more than trial, error and elbow-grease.

Anyhow. I initially wrote my email out of a desire to meet people to
discuss such things with. I really don't have the energy to try to
convince people that this field of research isn't one of the "biggest
gaps in theoretical computer science". If it piques your interest,
cool. If not, then c'est la vie. You're always going to have the
adamant anti-theory guys and the adamant pro-theory guys. I hope I can
remain comfortably in the centre.

But in leaving, I'll throw in an entirely rudimentary example of the
idea of 'similarity', or 'indistinguishability'. Imagine the simple
task of deciding whether the majority of an odd number of processes
(greater than 2) was given an input of 0 or 1. So each process starts
with a private input (a 0 or 1); then, after a finite number of steps,
all non-faulty processes must output 1 if the majority were given the
input 1, and 0 if the majority were given 0. They communicate by
passing messages over a truly asynchronous network (e.g. UDP packets
over the internet). By asynchronous I mean that messages can be
delayed for arbitrary amounts of time.

So let's say we have 3 processes a, b, c with inputs 0, 0, 1
respectively. a, b and c should all output 0 at the end of the
protocol. Easy. Also easy to figure out is that if their inputs were
0, 1, 1 (respectively) then they should all output 1. Now what if
process b 'fell asleep'? I.e. its messages were extremely delayed, or
the computer it was being hosted on started disk defragmenting and
slowed down to a crawl, etc... Then a and c would eventually have to
make a decision (since they must decide in a finite amount of time).
But what decision do they make? What if they decide 1, and b 'wakes
up' and announces its input was a 0 - then they made the wrong
decision. But if they decide 0 and b wakes up announcing its input was
a 1, then they have also made a bad decision. The problem is that if b
is arbitrarily delayed, a and c alone cannot distinguish between the
initial conditions 0,0,1 and 0,1,1. Hence there will always be valid
runs of the protocol where the distributed algorithm makes the wrong
decision.
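
The argument is mechanical enough to check in a few lines of code (my
sketch, purely illustrative - the 'view' here is just the inputs the
awake processes can see when b's messages never arrive):

def view(inputs, awake):
    """What the awake processes can jointly observe if everyone
    else's messages are delayed forever: only their own inputs."""
    return {p: v for p, v in inputs.items() if p in awake}

run1 = {'a': 0, 'b': 0, 'c': 1}   # majority input is 0
run2 = {'a': 0, 'b': 1, 'c': 1}   # majority input is 1

# With b silent, a and c see exactly the same thing in both runs:
print(view(run1, {'a', 'c'}) == view(run2, {'a', 'c'}))   # True

Since the required outputs differ (0 vs. 1) but the views are
identical, no rule that a and c follow on their own can be correct in
both runs.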

One of the things you can do with algebraic topology is determine, in
a system of N processes, whether two states are indistinguishable to
N-1 processes. Well, you can do that with graph theory. But graph
theory will only let you see similarity results in the face of up to
one faulty processor (f = 1). With algebraic topology ("simplicial
complexes" in particular - an n-dimensional extension of graphs) one
can determine whether, for example, 3 global states are
indistinguishable to 4 processors, or 2 global states are
indistinguishable to 6 processors, etc... This allows you to prove
many impossibility results for almost arbitrary numbers of failures.
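
At the most naive level the construction starts from a very simple
computation: render a global state as a set of (process, local view)
vertices, and the processes that cannot tell two global states apart
are exactly the vertices on their shared face. A toy rendering (mine,
and a drastic simplification - real protocol complexes track whole
communication histories, not just inputs):

def shared_face(s1, s2):
    """Processes whose (process, local-view) vertices lie in both
    global states, i.e. the processes that can't tell s1 from s2."""
    return {p for (p, v) in set(s1.items()) & set(s2.items())}

s1 = {'a': 0, 'b': 0, 'c': 1}
s2 = {'a': 0, 'b': 1, 'c': 1}
print(shared_face(s1, s2))    # {'a', 'c'}: indistinguishable to a and c

The same face computation, run over many states and many subsets of
processes, is what lets one pose questions like "are these 3 global
states indistinguishable to 4 of the processors?".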

There's way too much to write about in one go. I'm just shaving the
tip of the iceberg.

But anyway, as I said: if it piques your interest, great. If not,
there's not much I can do about it. Good luck.


-mike
Michael N. Christoff
2005-08-23 02:15:33 UTC
Permalink
Sent: Wednesday, August 10, 2005 2:16 PM
To: Michael N. Christoff
Subject: Your contribution to comp.theory / comp.distributed
Dear Mike,
I chose the direct way to contact you (not via newsgroup) since
(1) I don't want to be spammed
and do not know precisely how to defend myself,
(2) my answer is so incomplete that I dare to publish it.
(Is your trick REMOVETHIS appropriate?)
To summarize, I believe that you are pointing to one of
the biggest gaps in (at least: theoretical) computer science.
This feeling (nothing more than a feeling) derives from my experiences
as a developer of "mobile distributed systems"
which I had to fit with my mathematical education.
I did quite a lot of inquiries
to identify some theories which would help people like me.
But I failed.
I'll now look at Algebraic Topology following your hints,
but this will take some time.
(In my Math books, this field is not even mentioned!
But clearly, I know a little bit of Algebra and of Topology.)
Maybe, it helps me, but will it also help
the non-mathematician? What are the concepts
behind Algebraic Topology which might bring the practicioner
to a BETTER understanding of distributed systems?
Or do they have to study modern Math?
Yours
PS: The fact that, up to now, there are only two answers altogether
to your contribution is illuminating!
Hi.

One of the problems is that algebraic topology is hard stuff. Really hard.
You end up doing a lot of work and sit there wondering where it's all
leading. But eventually it all starts to make sense. It's just that it
takes a lot of time and a lot of effort, and in the meantime you're not
writing any 'cool frameworks'. I understand how that can suck and is
probably a turn-off to the 99% of people who just want to have fun, write
their own systems and try to perfect them through trial and error. My
personal belief is that distributed computing is just another example of a
'complex system'. I imagine it would be like telling someone from the past
who fashioned sailing ships that manipulating a bunch of symbols on a piece
of paper could possibly help them design a better vessel. Or that the very
laws of nature could be well predicted by this same seemingly entirely
unrelated endeavour (math). But who knows, right? Maybe all breakthroughs
in DC will be the result of nothing more than trial, error and elbow-grease.

Anyhow. I initially wrote my email out of a desire to meet people to
discuss such things with. I really don't have the energy to try to
convince people that this field of research isn't one of the "biggest
gaps in theoretical computer science". If it piques your interest, cool.
If not, then c'est la vie. You're always going to have the adamant
anti-theory guys and adamant pro-theory guys. I hope I can remain
comfortably in the centre.

But in leaving, I'll throw in an entirely rudimentary example of the idea of
'similarity', or 'indistinguishability'. Imagine the simple task of
deciding whether the majority of an odd number of processes (greater than 2)
was given an input of 0 or 1. So each process starts with a private input
(a 0 or 1), then after a finite number of steps, all non-faulty processes
must output 1 if the majority were given the input 1, and 0 if the majority
were given 0. They communicate by passing messages over a truly
asynchronous network (e.g., UDP packets over the internet). By
asynchronous I mean that messages could be delayed for arbitrary amounts of
time.
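
The task itself is trivial to state. Here is a rough sketch of the
specification in code (my own throwaway encoding, nothing standard):

  def majority_task_output(inputs):
      # inputs: a tuple of private 0/1 bits, one per process, odd length;
      # every non-faulty process must eventually output this value
      return 1 if sum(inputs) > len(inputs) // 2 else 0

  assert majority_task_output((0, 0, 1)) == 0
  assert majority_task_output((0, 1, 1)) == 1

The hard part is not computing the majority; it is meeting this
specification when messages can be delayed without bound.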

So let's say we have 3 processes: a,b,c with inputs 0,0,1 respectively.
a,b,c should all output 0 at the end of the protocol. Easy. Also easy to
figure out is that if their inputs were 0,1,1 (respectively) then they
should all output 1. Now what if processor b 'fell asleep'? ie: its
messages were extremely delayed, or the computer it was being hosted on
started disk defragmenting and it slowed down to a crawl, etc... Then a and
c would eventually have to make a decision (since they must decide in a
finite amount of time). But what decision do they make? What if they
decide 1 and b 'wakes up' and announces its input was a 0 - then they made
the wrong decision. But if they decide 0 and b wakes up announcing its
input was a 1, then they have also made a bad decision. The problem is that
if b is arbitrarily delayed, a and c alone cannot distinguish between the
initial conditions 0,0,1 and 0,1,1. Hence there will always be valid runs of
the protocol where the distributed algorithm makes the wrong decision.
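
You can see the trap in two lines (toy encoding again: while b is silent,
everything a and c can ever learn boils down to their own two inputs):

  def view_without_b(inputs):
      a, b, c = inputs
      return (a, c)  # all that a and c can observe while b says nothing

  print(view_without_b((0, 0, 1)) == view_without_b((0, 1, 1)))  # True

Identical views force identical outputs, yet the two initial conditions
demand different outputs.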

One of the things you can do with algebraic topology is determine, in a
system of N processes, whether two states are indistinguishable to N-1
processes. Well, you can do that with graph theory. But graph theory will
only let you see similarity results in the face of up to one faulty
processor (f = 1). With algebraic topology ("simplicial complexes" in
particular - an n-dimensional extension of graphs) one can determine
whether, for example, 3 global states are indistinguishable to 4 processors,
or 2 global states are indistinguishable to 6 processors, etc... This
allows you to prove many impossibility results for almost arbitrary numbers
of failures.
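
To make "sharing a face" concrete, here is a toy sketch (again my own
encoding): a global state becomes a simplex, i.e. a set of
(process, input) vertices, and two states are indistinguishable to a set
of processes exactly when their simplexes agree on the face spanned by
those processes.

  def simplex(inputs):
      # one vertex per process: (process id, that process's input)
      return frozenset(enumerate(inputs))

  def indistinguishable_to(state1, state2, procs):
      face1 = {(p, v) for (p, v) in simplex(state1) if p in procs}
      face2 = {(p, v) for (p, v) in simplex(state2) if p in procs}
      return face1 == face2

  # 4 of these 6 processors cannot tell the two global states apart:
  print(indistinguishable_to((0, 0, 1, 1, 0, 1),
                             (0, 0, 1, 1, 1, 0),
                             {0, 1, 2, 3}))  # True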

There's way too much to write about in one go. I'm just scratching the
tip of the iceberg.

But anyways, as I said: if it piques your interest, great. If not,
there's not much I can do about it. Good luck.


-mike
Markus Redeker
2005-08-27 09:03:48 UTC
Permalink
Post by d***@hotmail.com
One of the things you can do with algebraic topology is determine, in a
system of N processes, whether two states are indistinguishable to N-1
processes. Well, you can do that with graph theory. But graph theory will
only let you see similarity results in the face of up to one faulty
processor (f = 1). With algebraic topology ("simplicial complexes" in
particular - an n-dimensional extension of graphs) one can determine
whether, for example, 3 global states are indistinguishable to 4
processors, or 2 global states are indistinguishable to 6 processors,
etc... This allows you to prove many impossibility results for almost
arbitrary numbers of failures.
What concepts from algebraic topology do you use? I could think of homology
groups, since the zeroth homology group counts the number of connected
components of a space. But how do the other homology groups come in (if that
is the method you use)?
--
Markus Redeker Hamburg, Germany
Michael N. Christoff
2005-09-04 10:24:34 UTC
Permalink
Post by Markus Redeker
Post by d***@hotmail.com
One of the things you can do with algebraic topology is determine, in a
system of N processes, whether two states are indistinguishable to N-1
processes. Well, you can do that with graph theory. But graph theory will
only let you see similarity results in the face of up to one faulty
processor (f = 1). With algebraic topology ("simplicial complexes" in
particular - an n-dimensional extension of graphs) one can determine
whether, for example, 3 global states are indistinguishable to 4
processors, or 2 global states are indistinguishable to 6 processors,
etc... This allows you to prove many impossibility results for almost
arbitrary numbers of failures.
What concepts from algebraic topology do you use? I could think of homology
groups, since the zeroth homology group counts the number of connected
components of a space. But how do the other homology groups come in (if that
is the method you use)?
They are used to determine, for example, if the input complex of a protocol
has any 'holes' below a certain dimension.
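
For instance, the rank of the zeroth homology group is just the number of
connected components, which you can compute from the 1-skeleton alone. A
toy sketch (plain union-find, my own code):

  def h0_rank(vertices, edges):
      parent = {v: v for v in vertices}
      def find(v):
          while parent[v] != v:
              v = parent[v]
          return v
      for u, w in edges:
          parent[find(u)] = find(w)
      return len({find(v) for v in vertices})

  # Two triangles glued at vertex 3: one component, so rank(H0) = 1.
  print(h0_rank({1, 2, 3, 4, 5},
                [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]))  # 1

The higher groups take more machinery (ranks of boundary matrices), but
the idea is the same: they certify whether holes of a given dimension
exist in the complex.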

-mike
Kent Paul Dolan
2005-11-03 00:23:43 UTC
Permalink
Some habits are just too horrid to endure in silence:

http://dictionary.reference.com/search?q=pique

HTH

xanthian.
--
Posted via Mailgate.ORG Server - http://www.Mailgate.ORG
David C. DiNucci
2005-11-03 09:14:19 UTC
Permalink
Post by Kent Paul Dolan
http://dictionary.reference.com/search?q=pique
Piquing discussion about distributed computing in the comp.distributed
news group would hardly have been a crime, even assuming it had been
more successful.

-Dave
[followups set to comp.distributed]

Dmitry A. Kazakov
2005-08-24 19:06:34 UTC
Permalink
...They communicate by passing messages over a truly
asynchronous network (e.g., UDP packets over the internet). By
asynchronous I mean that messages could be delayed for arbitrary amounts of
time.
May I ask what is the reason to assume that the network is fully
asynchronous?

[Much of the time in our conversations with customers of the middleware we
sell is spent discussing why UDP is unsuitable for their needs, and
why the middleware uses TCP/IP instead. Yet people keep on asking...]
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
d***@hotmail.com
2005-09-04 11:50:24 UTC
Permalink
[sorry if this appears multiple times]
Post by Dmitry A. Kazakov
...They communicate by passing messages over a truly
asynchronous network (e.g., UDP packets over the internet). By
asynchronous I mean that messages could be delayed for arbitrary
amounts of time.
May I ask what is the reason to assume that the network is fully
asynchronous?
You would assume this if the system you wish to model is truly
asynchronous. Other systems may be synchronous or semi-synchronous.
Other things that will affect the assumptions you make include whether
you are using a message-passing system or a shared-memory multiprocessor
system. It really depends on what you are trying to accomplish.
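
One common way (my paraphrase of the standard model, not a quote from any
paper) to capture full asynchrony in code is an adversarial scheduler that
picks which in-flight message gets delivered next, with no bound on how
long anything can sit in the network:

  import random

  def run(processes, initial_msgs, steps, rng=random.Random(0)):
      pending = list(initial_msgs)            # (dest, payload) pairs in flight
      for _ in range(steps):
          if not pending:
              break
          i = rng.randrange(len(pending))     # the adversary's choice
          dest, payload = pending.pop(i)
          pending.extend(processes[dest](payload))  # handler emits new messages
      return pending

  procs = {
      "a": lambda m: [("b", "seen: " + m)],   # a forwards everything to b
      "b": lambda m: [],                      # b absorbs everything
  }
  leftover = run(procs, [("a", "x"), ("a", "y")], steps=10)

A protocol is correct in this model only if it works for every possible
schedule, which is exactly the quantifier that makes impossibility proofs
bite.
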
Post by Dmitry A. Kazakov
[Much of the time in our conversations with customers of the middleware we
sell is spent discussing why UDP is unsuitable for their needs, and
why the middleware uses TCP/IP instead. Yet people keep on asking...]
The suitability of TCP/IP vs UDP is a function of what your application
does and the environment in which it is expected to run. There are many
applications for which UDP is unsuitable when compared to TCP/IP (as
seems to be the case with your application) and vice versa.
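
The mechanical difference is small even though the consequences are not.
A minimal sketch (Python standard library only; host and port are
placeholders):

  import socket

  def send_tcp(host, port, data):
      # ordered, retransmitted byte stream; the kernel hides loss from you
      with socket.create_connection((host, port)) as s:
          s.sendall(data)

  def send_udp(host, port, data):
      # independent datagrams: may be dropped, duplicated or reordered,
      # which is what makes UDP a natural fit for the asynchronous model
      s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      try:
          s.sendto(data, (host, port))
      finally:
          s.close()
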
-mike