Date on Master's Thesis/Doctoral Dissertation

5-2005

Document Type

Master's Thesis

Degree Name

M.S.

Department

Computer Engineering and Computer Science

Committee Chair

Kumar, Anup

Subject

Data mining

Abstract

This study defines a new approach for building a Web Services-based infrastructure for distributed data mining applications. The proposed architecture provides a roadmap for "autonomic" functionality: the infrastructure hides the complexity of implementation details and gives the user a new level of usability in the data mining process. The Web Services-based infrastructure delivers all required data mining activities in a utility-like fashion, enabling heterogeneous components to be incorporated in a unified manner. Moreover, this structure allows data mining algorithms to be implemented so that they process data from more than one source in a distributed manner. The purpose of this study is to present a simple but efficient methodology for determining when data distributed across several sites can be centralized and analyzed as data from the same theoretical distribution. This analysis also determines when and how the semantics of the sites are influenced by the distribution of the data. The hierarchical framework, built from advanced and core Web Services, significantly improves current data mining capability in terms of performance, scalability, efficiency, transparency of resources, and incremental extensibility.
