Saturday 30 July 2011

Adaptive De-Clustering of Data


Adaptive Overlapped De-Clustering

In a competitive environment, enterprises based on information systems are required to have computer systems that are both highly available and easily scalable. High availability is essential to both customer satisfaction and employee productivity. Scalability is vital in responding to spikes in business and supporting the rapid rollout of new products and services.

The system adopts a strategy for increasing the availability of data in multi-processor, shared-nothing database machines. This technique, termed chained declustering, is demonstrated to provide superior performance in the event of failures while maintaining a very high degree of data availability. Furthermore, unlike earlier replication strategies, the implementation of chained declustering requires no special hardware and only minimal modifications to existing software.
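As a minimal sketch (our own illustration, with hypothetical names, not the project's code), the chained placement rule can be written as: the primary copy of fragment i goes to node i mod N, and its backup copy goes to the next node in the chain.

def chained_placement(num_fragments, num_nodes):
    # Chained declustering: primary of fragment i on node i mod N,
    # backup on the successor node, so the failure of any single node
    # never loses both copies of a fragment.
    placement = {}
    for frag in range(num_fragments):
        primary = frag % num_nodes
        backup = (primary + 1) % num_nodes
        placement[frag] = {"primary": primary, "backup": backup}
    return placement

# Example with 8 fragments on 4 nodes: fragment 3 has its primary on node 3
# and its backup on node 0.
print(chained_placement(num_fragments=8, num_nodes=4))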

The project proposes a new data-placement method named Adaptive Overlapped Declustering, which can be applied to a parallel storage system that uses a value-range-partitioning-based distributed directory and primary-backup data replication, to improve space utilization while balancing access loads. The proposed method reduces the data skew generated by the data migration performed for access-load balancing. Although data-placement methods capable of either balancing access load or reducing data skew have been proposed, few satisfy both requirements simultaneously. The method also improves the reliability and availability of the system because it shortens the recovery time for damaged backups after a disk failure, achieving this acceleration by reducing the amount of network communication and disk I/O involved.
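The sketch below illustrates only the space-balancing idea behind overlapped backups; it is a simplified greedy heuristic of our own, not the published algorithm. Each node's backup data is split across its next two successors in the chain, and the split points are recomputed after primary sizes become skewed by load-balancing migration, so that the total volume (primary plus backup) per node stays roughly even.

def rebalance_backups(primary_sizes, passes=2):
    # primary_sizes[i] = bytes of primary data on node i after load balancing.
    # Returns to_next[i] = bytes of node i's backup kept on node i+1; the
    # remainder (primary_sizes[i] - to_next[i]) overlaps onto node i+2.
    n = len(primary_sizes)
    target = 2 * sum(primary_sizes) / n      # every byte is stored twice
    to_next = [0.0] * n
    spill = 0.0                              # backup of node i-1 pushed to node i+1
    for _ in range(passes):                  # a couple of passes settle the ring
        for i in range(n):
            succ = (i + 1) % n
            room = max(0.0, target - primary_sizes[succ] - spill)
            to_next[i] = min(primary_sizes[i], room)
            spill = primary_sizes[i] - to_next[i]
    return to_next

# Skewed primaries: nodes with large primaries are assigned less backup data,
# so every node ends up holding about 50 units in total (primary + backup).
print(rebalance_backups([40, 10, 30, 20]))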

For any networked information system, a basic problem is handling the amount of data stored across the computer system. Achieving high availability and scalability in large storage systems is one of the most important tasks for today's information-based systems. Many data-placement methods are available to meet these requirements, yet few of them can balance access load and reduce data skew at the same time.

Adaptive overlapped chained de-clustering is a method for managing replicated data, usually implemented on a shared-nothing architecture. The technique provides high availability and is able to fully balance the workload among the operational nodes in the event of a failure.
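A small sketch of the standard load-rebalancing argument for chained de-clustering (the fractions follow the usual analysis; the function name is ours): after one of N nodes fails, each survivor serves part of its own primary data and part of the data for which it holds the backup, so that every survivor carries N/(N-1) of a normal node's load.

def failover_fractions(num_nodes, failed):
    # Walk the chain starting just after the failed node; the node at offset i
    # serves i/m of its own primary data and (m - i + 1)/m of its predecessor's
    # data (held as its backup), where m is the number of survivors.
    m = num_nodes - 1
    shares = {}
    for i in range(1, num_nodes):
        node = (failed + i) % num_nodes
        shares[node] = {"primary": i / m, "backup": (m - i + 1) / m}
    return shares

# With 4 nodes and node 0 down: node 1 serves all of node 0's data plus 1/3
# of its own primaries; every survivor's total load is 4/3 of normal.
print(failover_fractions(num_nodes=4, failed=0))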

We describe and evaluate a strategy for de-clustering the parity encoding in a redundant disk array. This de-clustered parity organization balances cost against data reliability and performance during failure recovery. It is targeted at highly available, parity-based arrays for use in continuous-operation systems. It improves on standard parity organizations by reducing the additional load on surviving disks during the reconstruction of a failed disk's contents. This yields higher user throughput during recovery, shorter recovery time, or both. We first address the generalized parity-layout problem, basing our solution on balanced incomplete and complete block designs. A software implementation of de-clustering is then evaluated using a disk array simulator under a highly concurrent workload composed of small user accesses. We show that de-clustered parity penalizes user response time while a disk is being repaired (before and during its recovery) less than comparable non-de-clustered (RAID5) organizations do, without any penalty to user response time in the fault-free state.
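As an illustrative sketch of the block-design idea (a complete block design, in which every stripe-width-sized subset of the disks hosts exactly one stripe; names and parameters are ours): because no stripe spans every disk, rebuilding a failed disk touches only part of each survivor, which is where the reduced reconstruction load comes from.

from itertools import combinations

def declustered_parity_layout(n_disks, stripe_width):
    # Complete block design: one stripe per stripe_width-sized subset of disks,
    # with the parity unit rotated among the members of each subset.
    layout = []
    for s, disks in enumerate(combinations(range(n_disks), stripe_width)):
        parity = disks[s % stripe_width]
        data = [d for d in disks if d != parity]
        layout.append({"data": data, "parity": parity})
    return layout

# 5 disks, stripe width 4: each disk appears in 4 of the 5 stripes, so
# reconstructing a failed disk reads units from only 3 of the 4 stripes stored
# on each survivor, instead of from every stripe as in RAID5.
for stripe in declustered_parity_layout(n_disks=5, stripe_width=4):
    print(stripe)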

We then show that previously proposed modifications to a simple, single-sweep reconstruction algorithm further decrease user response times during recovery, but, contrary to previous suggestions, the inclusion of these modifications may, for many configurations, also slow the reconstruction process. This result arises from the simple model of disk access performance used in previous work, which did not consider throughput variations due to positioning delays.

The disk array architecture known as a mirrored array, a data-replication approach, can significantly increase the reliability and performance of disk systems. One way of organizing the two data copies in the disk system is the chained de-clustering method. In this work, the performance provided by three policies for chained de-clustering is compared: the Shortest Queue policy, the Shortest Seek-Time policy, and the Delayed Secondary Write policy. Simulation results showed that the Shortest Seek-Time policy provides slightly better performance than the Shortest Queue policy. However, when the number of secondary writes performed immediately decreases, the Delayed Secondary Write policy can provide better performance than the Shortest Seek-Time policy. The Delayed Secondary Write policy can therefore improve the performance of disk arrays using the chained de-clustering technique while maintaining a high level of reliability.
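A minimal sketch (an assumed model of our own, not the simulator used in the cited comparison) of the two read-dispatch policies: Shortest Queue sends a read to whichever copy has fewer pending requests, while Shortest Seek-Time sends it to the copy whose head is closer to the target cylinder; the Delayed Secondary Write policy would additionally defer writes to the secondary copy and is omitted here.

def shortest_queue(primary, backup):
    # Pick the replica with the fewer pending requests.
    return primary if len(primary["queue"]) <= len(backup["queue"]) else backup

def shortest_seek_time(primary, backup, target_cylinder):
    # Pick the replica whose head has the shorter seek to the target cylinder.
    dist_p = abs(primary["head"] - target_cylinder)
    dist_b = abs(backup["head"] - target_cylinder)
    return primary if dist_p <= dist_b else backup

disk_a = {"name": "A", "head": 120, "queue": [300, 310]}
disk_b = {"name": "B", "head": 4000, "queue": [50]}

print(shortest_queue(disk_a, disk_b)["name"])            # B: shorter queue
print(shortest_seek_time(disk_a, disk_b, 150)["name"])   # A: closer head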

How is the proposed system different?

We are going to develop a multi-processor, shared-nothing database machine architecture for increasing the availability of data. The technique applied is chained de-clustering, which is demonstrated to provide superior performance in the event of failures while maintaining a very high degree of availability for large volumes of data. Implementing chained de-clustering requires no special hardware and only minimal modifications to existing software.
