K. Diks, A. Malinowski, A. Pelc, Reliable token dispersal with random faults,
*Parallel Processing Letters* 4 (1994), pp. 417-427.

Abstract:

An unrestricted supply of identical tokens is stored in one node of an *n*-node network, called the source. In a unit of time each node of the network that holds tokens can send at most one token to at most one other node. Links and/or nodes of the network fail independently with constant probability. An attempt to send a token to a faulty node or via a faulty link does not succeed (the token remains at the sending node). The aim of token dispersal is to visit every fault-free node by at least one token. We present token dispersal algorithms with optimal order of time complexity and probability of correctness
exceeding *1 - 1/n* .

B. S. Chlebus, K. Diks, A. Pelc, Sparse networks supporting efficient reliable broadcasting,
*Nordic Journal of Computing* 1 (1994), pp. 332-345.

A preliminary version appeared in: Proc. 20th International Colloquium on Automata, Languages and Programming ICALP'93, Lund, Sweden, July 1993, LNCS 700, pp. 388-397.

Abstract:

Broadcasting concerns transmitting information from a node of a communication network to all other nodes. We consider this problem assuming that links and nodes of the network fail independently with given probabilities
* p < 1 * and * q < 1*, respectively. For a positive constant
*e*, broadcasting in an *n*-node network is said to be *e*-safe, if source information is transmitted to all fault-free nodes with probability at least *1- n ^{-e}*. For any

P. Berman, A. Pelc, Reliable distributed diagnosis for multiprocessor systems with random faults,
*Networks* 24 (1994), pp. 417-427.

A preliminary version appeared as: P. Berman, A. Pelc, Distributed probabilistic fault diagnosis for multiprocessor systems, Proc. 20th Annual International Symposium on Fault-Tolerant Computing, Newcastle upon Tyne, UK, June 1990, pp. 340-346.

Abstract:

We study a probabilistic setting for distributed fault diagnosis in multiprocessor systems. A system is an undirected graph with nodes representing processors and edges representing communication links. Processors are assumed to fail independently with some probability *p*. They test their neighbors, and a fault-free processor has probability *1-q* of discovering a fault of a failed neighbor in an individual test. Subsequently, fault-free processors attempt to diagnose all the processors of the system with communication based on the test results. During communication, the behavior of faulty processors may be arbitrary (so-called malicious).For every * p < 1/2*,
* q < 1*, we construct systems with *O(n logn)* links in which distributed probabilistic diagnosis can be achieved with probability of correctness at least *1 - 1/n*. We also show that for some small fixed
*p *and *q* a similar result holds for the hypercube. On the other hand we prove that for sufficiently small *k*, for a system with
*n* processors and *kn log n* links, the probability of achieving correct diagnosis cannot exceed *n ^{-0.5}*.

K. Diks, E. Kranakis, D. Krizanc, B. Mans, A.Pelc, Optimal coteries and voting schemes,
*Information Processing Letters* 51 (1994), pp. 1-6.

Abstract:

We consider coteries (i.e. intersecting, Sperner systems) on arbitrary networks with given node reliability, which maximize the network availability (i.e., the probability that the set of operating nodes has a connected subgraph which contains a set from the coterie). We show that the singleton coterie is always optimal on any network when the node reliability * p * is at most 1/2. For node reliability * p * at least 1/2, we give optimal coteries for the complete multipartite networks. The resulting optimal coteries when restricted to a special subclass of complete bipartite networks are shown to exceed the availability of any voting scheme, for * p > 1/2 *.

A. Pelc, Almost safe group testing with few tests,
*Combinatorics, Probability & Computing* 3 (1994), pp. 411-419.

Abstract:

In group testing, sets of data undergo tests which reveal if the set contains faulty data. Assuming that data are faulty with given probability, independently of one another, we investigate small families of tests that enable us to locate correctly all faulty data with probability converging to one, as the amount of data grows. Upper and lower bounds on the minimum number of such tests are established for different probability functions, and respective location strategies are constructed.

D. Blough, A. Pelc, Almost certain fault diagnosis through algorithm-based fault tolerance,
*IEEE Transactions on Parallel and Distributed Systems* 5 (1994), pp. 532-539.

Abstract:

Algorithm-based fault tolerance has been proposed as a technique to detect incorrect computations in multiprocessor systems. In algorithm-based fault tolerance, processors produce data elements that are checked by concurrent error detection mechanisms. In this paper, we investigate the efficacy of this approach for diagnosis of processor faults. Because checks are performed on data elements, the problem of location of data errors must first be solved. We propose a probabilistic model for the faults and errors in a multiprocessor system and use it to evaluate the probabilities of correct error location and fault diagnosis. We investigate the number of checks that are necessary to guarantee error location with high probability. We also give specific check assignments that accomplish this goal. We then consider the problem of fault diagnosis when the locations of erroneous data elements are known. Previous work on fault diagnosis required that the data sets produced by different processors be disjoint. We show, for the first time, that fault diagnosis is possible with high probability, even in systems where processors combine to produce individual data elements.

J. Czyzowicz, K. B. Lakshmanan, A. Pelc, Searching with local constraints on error patterns,
*European Journal of Combinatorics* 15 (1994), pp. 217-222.

Abstract:

We give an asymptotically optimal algorithm of search for an unknown element in a finite set under the assumption that at most one error can occur in every sequence of *r* consecutive answers, were *r > 2* is a constant.

B. S. Chlebus, K. Diks, A. Pelc, Sorting on a mesh-connected computer with delaying links,
*SIAM Journal on Discrete Mathematics* 7 (1994), pp. 119-132.

Abstract:

We consider a mesh-connected processor array in which the links are faulty in the following sense: each attempt by two neighboring processors to communicate by exchanging messages may fail with some constant probability. A message sent accross a link and not delivered is said to be delayed by the link. It is assumed that all the links delay with the same fixed delay probability, independently of each other.

We address the problem of sorting in this model.
It is proved that an *n* x *n* mesh can be sorted in the expected
time *O(n)* with large probability.
More precisely, it is shown that there are two constants * c > 0 * and
* r > 1 *, depending on the delay probability, such that the
*n* x *n*
mesh is sorted in time *cn+t* with the probability at least
*1-r ^{-t}*.
We consider one specific algorithm, but the analysis shows that
many known algorithms could sort in the expected time

B. S. Chlebus, K. Diks, A. Pelc, Fast gossiping with short unreliable messages,
*Discrete Applied Mathematics* 53 (1994), pp. 15-24.

Abstract:

Each of *n* nodes of a communication network has a piece of information (gossip) which should be made known to all other nodes. Gossiping is done by sending letters. In a unit of time each node can either send one letter to a neighbor or receive one such letter, containing one gossip currently known to the sender. Letters reach their destinations with constant probability *0 < q < 1*, independently of one another. For a large class of networks, including rings, grids, hypercubes and complete graphs, we construct gossip schemes working in linear time and successfully performing gossiping with probability converging to 1, as the number of nodes grows.

A. Pelc, Searching with permanently faulty tests,
*Ars Combinatoria* 38 (1994), pp. 65-76.

Abstract:

We investigate searching strategies for the set *{1, ..., n}* assuming a fixed bound on the number of erroneous answers and forbidding repetition of questions. This setting models the situation when different processors provide answers to different tests and at most *k* processors are faulty. We show for what values of *k* the search is feasible and provide optimal testing strategies if at most one unit is faulty.

P. Denejko, K. Diks, A. Pelc, M. Piotrow, Reliable minimum finding comparator networks, Proc. 19th Int. Symp. Mathematical Foundations of Computer Science MFCS-94, Kosice, Slovakia, August 1994, LNCS 841, pp. 306-315.

Abstract:

We consider the problem of constructing reliable minimum finding networks built from unreliable comparators. In case of a faulty comparator, inputs are directly output without comparison. Our main result is the first nontrivial lower bound on depths of networks computing minimum among * n > 2 * items in the presence of * k > 0 * faulty comparators. We also describe a network whose depth nearly matches the lower bound.

B. S. Chlebus, K. Diks, A. Pelc, Waking up an anonymous faulty network from a single source, Proc. 27th Annual Hawaii International Conference on System Sciences, January 1994, Vol.2, pp. 187-193.

Abstract:

We consider anonymous complete networks whose links and nodes are subject to random independent failures with fixed probabilities * p < 1 * and
* q < 1 *, respectively. A single fault-free node has to wake up all fault-free nodes by propagating a wakeup message through the network. In a unit of time each node can communicate with at most one other node. Communications with a faulty node or via a faulty link do not succeed. For any * e > 0 *, we present a wakeup algorithm for *n*-node networks, running in expected time *O(log n)*, using an expected number of
*O(n log n)* message bits and working correctly with probability exceeding *1- n ^{-e}*, for sufficiently large