Date on Master's Thesis/Doctoral Dissertation

8-2011

Document Type

Doctoral Dissertation

Degree Name

Ph. D.

Department

Computer Engineering and Computer Science

Committee Chair

Elmaghraby, Adel S.

Author's Keywords

Data center; Networking; Resilience; RAS; Storage

Subject

Data libraries--Security measures; Computer networks--Security measures; Information storage and retrieval systems; Digital preservation

Abstract

Data centers (DC) are the core of the national cyber infrastructure. With the incredible growth of critical data volumes in financial institutions, government organizations, and global companies, data centers are becoming larger and more distributed posing more challenges for operational continuity in the presence of experienced cyber attackers and occasional natural disasters. The main objective of this research work is to present a new methodology for data center resilience assessment, this methodology consists of: • Define Data center resilience requirements. • Devise a high level metric for data center resilience. • Design and develop a tool to validate and the metric. Since computer networks are an important component in the data center architecture, this research work was extended to investigate computer network resilience enhancement opportunities within the area of routing protocols, redundancy, and server load to minimize the network down time and increase the time period of resisting attacks. Data center resilience assessment is a complex process as it involves several aspects such as: policies for emergencies, recovery plans, variation in data center operational roles, hosted/processed data types and data center architectures. However, in this dissertation, storage, networking and security are emphasized. The need for resilience assessment emerged due to the gap in existing reliability, availability, and serviceability (RAS) measures. Resilience as an evaluation metric leads to better proactive perspective in system design and management. The proposed Data center resilience assessment portal (DC-RAP) is designed to easily integrate various operational scenarios. DC-RAP features a user friendly interface to assess the resilience in terms of performance analysis and speed recovery by collecting the following information: time to detect attacks, time to resist, time to fail and recovery time. Several set of experiments were performed, results obtained from investigating the impact of routing protocols, server load balancing algorithms on network resilience, showed that using particular routing protocol or server load balancing algorithm can enhance network resilience level in terms of minimizing the downtime and ensure speed recovery. Also experimental results for investigating the use social network analysis (SNA) for identifying important router in computer network showed that the SNA was successful in identifying important routers. This important router list can be used to redundant those routers to ensure high level of resilience. Finally, experimental results for testing and validating the data center resilience assessment methodology using the DC-RAP showed the ability of the methodology quantify data center resilience in terms of providing steady performance, minimal recovery time and maximum resistance-attacks time. The main contributions of this work can be summarized as follows: • A methodology for evaluation data center resilience has been developed. • Implemented a Data Center Resilience Assessment Portal (D$-RAP) for resilience evaluations. • Investigated the usage of Social Network Analysis to Improve the computer network resilience.

Share

COinS