Multiprocessors

Model
Digital Document
Publisher
Florida Atlantic University
Description
In recent years multiprocessor systems are becoming increasingly important in critical applications. In particular, their fault tolerance properties are of great importance for their ability to be used in these type of applications. We have developed a multiprocessor simulator that can be used to test different fault detection algorithms. The processors must have four communication links. This simulator operates by passing messages between processors. An algorithm was developed for routing the messages among the processors. The simulator can also be used to try different reconfiguration strategies. In particular we have tested Malek's comparison algorithm using different multiprocessor configurations. We also developed a program which determines the configuration of an unknown network of transputers.
Model
Digital Document
Publisher
Florida Atlantic University
Description
A comparative analysis of two methods of bus arbitration used by a
multiprocessing system with a single main bus is performed. The major
performance parameters are delay time and the average number of active
processors, called processing power. The bus arbitration methods are
described using state and timing diagrams. Multiprocessor systems using
these methods of arbitration are then modeled using Markov chains to
enable the formulation of processing power. From processing power
other performance parameters can be derived. A comparison is then made
among the two bus arbitration methods based on the analytical results
where processors access the bus at an equal rate. Simulations of
multiprocessor systems using either of the two arbitration methods were
performed to validate the analytical models for equal bus request rates.
Simulations are also performed of a system of processors using the main
bus at unequal rates.
Model
Digital Document
Publisher
Florida Atlantic University
Description
This research investigates memory latency of cluster-based cache-coherent multiprocessor systems with different interconnection topologies. We focus on a cluster-based architecture which is a variation of Stanford DASH architecture. The architecture, also, has some similarities with the STiNG architecture from Sequent Computer System Inc. In this architecture, a small number of processors and a portion of shared-memory are connected through a bus inside each cluster. As the number of processors per cluster is small, snoopy protocol is used inside each cluster. Each processor has two levels of caches and for each cluster a separate directory is maintained. Clusters are connected using directory-based scheme through an interconnection network to make the system scaleable. Trace-driven simulation has been developed to evaluate the overall memory latency of this architecture using three different network topologies, namely ring, mesh, and hypercube. For each network topology, the overall memory latency has been evaluated running a representative set of SPLASH-2 applications. Simulation results show that, the cluster-based multiprocessor system with hypercube topology outperforms those with mesh and ring topologies.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The use of cache memories in multiprocessor systems increases the overall systems performance. Caches reduce the amount of network traffic and provide a solution to the memory contention problem. However, caches introduce memory consistency problems. The existence of multiple cache copies of a memory block will result in an inconsistent view of memory if one processor changes a value in its associated cache. Cache coherence protocols are algorithms designed in software or hardware to maintain memory consistency. With the increased complexity of some of the more recent protocols, testing for the correctness of these protocols becomes an issue that requires more elaborate work. In this thesis, correctness analysis of a selected group of representative cache coherence protocols was performed using Petri nets as a modeling and analysis tool. First, the Petri net graphs for these protocols were designed. These graphs were built by following the logical and coherence actions performed by the protocols in response to the different processors' requests that threatens memory consistency. Correctness analysis was then performed on these graphs.
Model
Digital Document
Publisher
Florida Atlantic University
Description
We study the embedding of binomial trees with variable roots in faulty hypercubes. Based on novel embedding strategies, we propose three embedding algorithms with variable nodes as the root. The first algorithm can tolerate up to n - 1 faulty links, but the execution can be done within log2(n - 1) subcube splits. The second one can tolerate up to [(3(n - 1))\2] faulty links. The last one can tolerate up to [(3(n - 4))\2] faulty nodes.
Model
Digital Document
Publisher
Florida Atlantic University
Description
We propose the enhanced Fibonacci cube (EFC), which is defined based on the sequence Fn = 2(n-2) + 2F(n-4). We study its topological properties, embeddings, applications, routings, VLSI/WSI implementations, and its extensions. Our results show that EFC retains many properties of the hypercube. It contains the Fibonacci cube (FC) and extended Fibonacci cube of the same order as subgraphs and maintains virtually all the desirable properties of FC. EFC is even better in some structural properties, embeddings, applications and VLSI designs than FC or hypercube. With EFC, there are more cubes with various structures and sizes for selection, and more backup cubes into which faulty hypercubes can be reconfigured, which alleviates the size limitation of the hypercube and results in a higher level of fault tolerance.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Caches are used in shared memory multiprocessors to reduce the effective memory latency, and network and memory bandwidth requirements. But the data spreading across the caches leads to the cache coherence problem. In this thesis, a new directory based cache coherence scheme, called the cache-vector protocol, is proposed and evaluated. The said scheme entails a low memory overhead but delivers a performance that is very close to that of the scheme proposed by Censier and Feautrier (3), which offers the best performance of all the directory based schemes. The performance of the cache-vector protocol is evaluated using trace-driven simulation. A figure of merit which takes into account the average memory latency, network traffic and the hardware overhead is introduced and used as the basis of comparison between the two schemes. The simulation results indicate that the cache-vector protocol is a viable solution to the cache coherence problem in large scale multiprocessors.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Multiprocessor systems have demonstrated great potential for meeting the ever increasing demand for higher performance. In this thesis, we develop simulation models with fewer and more realistic assumptions to evaluate the performance of the circuit-switched cluster-based multiprocessor system. We then introduce a packet-switched variation of the cluster-based architecture and develop simulation models to evaluate its performance. The analysis of the cluster-based systems is performed for both uniform and non-uniform memory reference models. We conducted similar analysis for the crossbar and multiple-bus systems. Finally, the results of the cluster-based systems are compared to those obtained for the crossbar and the multiple-bus systems.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The utilization of a multiprocessor system is enhanced when idle time of processors is reduced. Allocation of processes from overloaded processors to idle processors can balance the load on multiprocessor systems and increase system throughput by reducing the process execution time. This thesis presents a study of parameters, issues and existing algorithms related to load balancing. The performance of load balancing on hypercubes using three new algorithms is explored and analyzed. A new algorithm to balance load on hypercubes in the presence of link faults is presented and analyzed here. Another algorithm to balance load on hypercube systems containing faulty processors is proposed and studied. The applicability of load balancing to real life problems is demonstrated by showing that the execution of branch and bound problem on hypercubes speeds up when load balancing is used.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In the last few years, it has become profound to achieve higher performance of computers by solely upgrading logic technology. This required a move to a parallel processing system or a multiprocessor system in order to build faster computer systems. The importance of multiprocessor systems is increasing due to many reasons, one of which is reliability. In a multiprocessor system, a number of tasks may concurrently exist. To operate the system efficiently, one must carefully schedule the tasks. This thesis proposes a set of algorithms to schedule these tasks exploiting the inherent redundancy of processors in a multiprocessor system. Also discussed are some reliability issues and application to different networks with some examples.