Enterprise Modeling and Data Warehousing in Telecom Italia

Diego Calvanese

Faculty of Computer Science

Free University of Bolzano/Bozen

Piazza Domenicani 3

I-39100 Bolzano-Bozen BZ, Italy

Luigi Dragone, Daniele Nardi, Riccardo Rosati

Dipartimento di Informatica e Sistemistica,

Università di Roma “La Sapienza”,


Via Salaria 113, 00198 Roma, Italy

Stefano M. Trisolini

Telecom Italia and

Data Warehouse and DMDWM Consulting S.A.S. (present affiliation)


We present a methodology for Data Warehouse design and its application within the Telecom Italia information system. The methodology is based on a conceptual representation of the Enterprise, which is exploited both in the integration phase of the Warehouse information sources and during the knowledge discovery activity on the information stored in the Warehouse. The application of the methodology in the Telecom Italia framework has been supported by prototype software tools both for conceptual modeling and for data integration and reconciliation.

Key words: Data Warehousing, Data Integration, Conceptual Modeling, Automated Reasoning

Email addresses: calvanese@inf.unibz.it (Diego Calvanese), dragone@dis.uniroma1.it (Luigi Dragone), nardi@dis.uniroma1.it (Daniele Nardi), rosati@dis.uniroma1.it (Riccardo Rosati), stefano.trisolini@tin.it (Stefano M. Trisolini).

URLs: http://www.inf.unibz.it/~calvanese/ (Diego Calvanese), http://www.dis.uniroma1.it/~dragone/ (Luigi Dragone).

Accepted for publication in Information Systems July 2004

1 Introduction

Information integration (1) is one of the main problems to be addressed when designing a Data Warehouse (2). Possible inconsistencies and redundancies between data residing at the operational data sources and migrating to the Data Warehouse need to be resolved, so that the Warehouse is able to provide an integrated and reconciled view of data within the organization. The basic components of a data integration system are wrappers and mediators (3; 4).

A wrapper is a software module that accesses a data source, extracts the relevant data, and presents such data in a specified format, typically as a set of relational tables. A mediator collects, cleans, and combines data produced by wrappers and/or other mediators, according to a specific information need of the integration system. The specification and the realization of mediators is the core problem in the design of an integration system.
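The division of labor between wrappers and mediators can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two sources, their column names, and the cleaning rule (trimming and capitalizing names) are hypothetical and are not taken from the Telecom Italia system.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def billing_wrapper() -> List[Row]:
    """Hypothetical wrapper: exposes a legacy flat-file source as relational rows."""
    raw_lines = ["C001;rossi;ROMA", "C002;BIANCHI;milano "]
    columns = ("cust_id", "name", "city")
    return [dict(zip(columns, line.split(";"))) for line in raw_lines]


def crm_wrapper() -> List[Row]:
    """Hypothetical wrapper over a second source with a different set of columns."""
    return [
        {"cust_id": "C002", "name": "Bianchi", "segment": "business"},
        {"cust_id": "C003", "name": "Verdi", "segment": "residential"},
    ]


def clean(row: Row) -> Row:
    """Illustrative cleaning rule: trim and capitalize free-text fields."""
    cleaned = dict(row)
    for key in ("name", "city"):
        if key in cleaned:
            cleaned[key] = cleaned[key].strip().title()
    return cleaned


def customer_mediator(wrappers: Iterable[Callable[[], List[Row]]]) -> List[Row]:
    """Collects, cleans, and combines wrapper output, one row per customer."""
    merged: Dict[str, Row] = {}
    for wrapper in wrappers:
        for row in wrapper():
            merged.setdefault(row["cust_id"], {}).update(clean(row))
    return [merged[cust_id] for cust_id in sorted(merged)]


customers = customer_mediator([billing_wrapper, crm_wrapper])
```

The mediator here is written by hand; the methodology described in this paper aims instead at deriving such mediators from declarative specifications.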

In Data Warehouse applications, the data sources are mostly internal to the organization. Moreover, large organizations typically express their informational needs in terms of an integrated conceptual representation of the corporate data that abstracts from the physical and logical structure of data in the sources.

The data stored in the Data Warehouse should reflect such an informational need, and hence should be defined in terms of the corporate data.

Telecom Italia is the main Italian provider of national and international telecommunication services, and is among the largest companies worldwide.

In large companies the need to access company data for business intelligence is both an organizational and a technical challenge, requiring a considerable amount of financial and human resources. Given the development of information technology in the nineties, in Telecom Italia data warehousing (5) has been a natural evolution of enterprise-wide data management and data integration. A Data Warehouse can be defined as a set of materialized views over the operational information sources of an organization, designed to provide support for data analysis and management’s decisions.

In recent years, Telecom Italia has carried out a large integration initiative of enterprise information systems, called IBDA 1, resulting in the construction of an enterprise-wide database integrated at the conceptual level.

Due to the limitations of the available technologies and the costs of replacing and re-engineering legacy applications, this activity has led to a solution based on federated databases and legacy system wrapping, according to the main guidelines of virtual enterprise system integration.

URLs (continued): http://www.dis.uniroma1.it/~nardi/ (Daniele Nardi), http://www.dis.uniroma1.it/~rosati/ (Riccardo Rosati).

1 IBDA stands for “Integrazione di Basi di Dati Aziendali”, i.e., integration of company databases.

Meanwhile, the information systems of Telecom Italia have evolved quickly: new applications have been developed, and existing ones have been upgraded or replaced. This rapid evolution has been due to both internal and external factors: the birth and growth of new markets (such as mobile telephony and Internet services) and new competitors, the privatization of the company, and the subsequent buyouts. Given also the varied information requirements of business intelligence and decision-making activities at different levels (e.g., tactical and strategic marketing), the integrated information system started to prove inadequate for the company’s new informational needs. Consequently, to ensure timely deployment, the development of Online Analytical Processing (OLAP) and Decision Support Systems (DSS) applications has been carried out in an unstructured way, resulting in low modularization and ineffective usage of the Data Warehouse infrastructure.

These issues have highlighted the need for an incremental Data Warehousing methodology, in particular the local-as-view (LAV) approach to Data Warehousing proposed in the context of the European project DWQ 2 (6; 5). In such an approach, each table both in a source and in the Data Warehouse is defined in terms of a view over the global model of the corporate data. This extends the traditional LAV approach to integration, where the information content of each data source is defined in terms of a query over (possibly materialized) global relations constituting the corporate view of data (7; 1; 8; 9; 10). The LAV approach is in contrast to the global-as-view (GAV) approach for data integration (11; 12; 13; 14; 15; 16; 17), typically proposed in Data Warehousing (2; 18). The GAV approach requires specifying, for each information need, the corresponding query in terms of the data at the sources. Notably, the LAV approach decouples information availability from information requirements. Therefore, the introduction of a new information source or the replacement of an existing one does not have any impact on the definition of the Data Warehouse expressed over the global model of corporate data.
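The decoupling that LAV provides can be made concrete with a toy sketch. It assumes a single hypothetical global relation, Customer, and restricts views to equality selections on its attributes; the function relevant_sources is a deliberately naive stand-in for query rewriting and is not the algorithm used in the DWQ methodology.

```python
from typing import Dict, List

# A view is described by the global relation it ranges over and an
# equality selection on that relation's attributes (a strong simplification).
View = Dict[str, object]

# LAV: every source table is described as a view over the global model.
source_descriptions: Dict[str, View] = {
    "north_customers": {"over": "Customer", "where": {"region": "north"}},
    "south_customers": {"over": "Customer", "where": {"region": "south"}},
}

# The warehouse table is also defined over the global model only;
# it never mentions a source by name.
warehouse_views: Dict[str, View] = {
    "dw_customer": {"over": "Customer", "where": {}},
    "dw_north_customer": {"over": "Customer", "where": {"region": "north"}},
}


def compatible(a: Dict[str, str], b: Dict[str, str]) -> bool:
    """Two selections are compatible if they agree on every shared attribute."""
    return all(b[key] == value for key, value in a.items() if key in b)


def relevant_sources(view: View, sources: Dict[str, View]) -> List[str]:
    """Naive stand-in for rewriting: sources that may contribute tuples to the view."""
    return sorted(
        name
        for name, desc in sources.items()
        if desc["over"] == view["over"] and compatible(desc["where"], view["where"])
    )


before = relevant_sources(warehouse_views["dw_customer"], source_descriptions)

# Adding a source only adds one description; warehouse_views is left untouched.
source_descriptions["islands_customers"] = {"over": "Customer", "where": {"region": "islands"}}
after = relevant_sources(warehouse_views["dw_customer"], source_descriptions)
```

Under GAV, by contrast, the definition of dw_customer would itself be a query naming north_customers and south_customers, so it would have to be edited when the third source arrives.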

Nonetheless, there are several important questions that are not addressed by the work on integration. More specifically, integration anticipates semantic problems with data, but does not address efficiency issues, which are critical for data warehousing. Thus, to guarantee a proper performance of the Data Warehouse, a major re-organization of the data store may be required, with additional costs. This motivates a layered architecture of the Data Warehouse (5), where a primary Data Warehouse feeds the data to several layers of aggregation (called secondary Data Warehouses or Data Marts) before they become available to the final user. Moreover, typically there is the need to efficiently take into account legacy systems that are not integrated, and external or extemporaneous data sources that can provide relevant information to the Data Warehouse, possibly for a limited time window.

2 ESPRIT Basic Research Action Project EP 22469 “Foundations of Data Warehouse Quality (DWQ)”, http://www.dbnet.ece.ntua.gr/~dwq/.

In this paper we report on the experience of Telecom Italia in the development of its Enterprise Data Warehouse. Such a Data Warehouse adopts a layered architecture, including various Primary Data Warehouses concerning phone traffic of different types and customer information, and several Secondary Data Warehouses, which are at the basis of the Knowledge Discovery and Data Mining activity carried out in Telecom Italia. For the development of its Data Warehouse Telecom Italia has followed the methodology proposed in the DWQ project (6; 5), which is based on the LAV approach.

Indeed, one of the distinguishing features of the approach is a rich modeling language for the conceptual level that extends the Entity-Relationship data model, and thus is fully compatible with the conceptual modeling tools adopted by Telecom Italia. Moreover, the modeling formalism is equipped with automated reasoning tools, which can support the designer during Data Warehouse construction, maintenance and evolution. At the logical level, the methodology allows the designer to declaratively specify several types of Reconciliation Correspondences between data in different sources and in the Data Warehouse, which allow her to handle differences in how the same conceptual-level entities are represented at the logical level. Such correspondences are then used to automatically derive the specification of the correct mediators for the loading of the materialized views of the Data Warehouse.

This is done by relying on a query rewriting algorithm, whose role is to reformulate the query that defines the view to materialize in terms of both the source relations and the Reconciliation Correspondences. The characteristic feature of the algorithm is that it takes into account the constraints imposed by the Conceptual Model, and uses the Reconciliation Correspondences for cleaning, integrating, and reconciling data coming from different sources.
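As a rough intuition for how declaratively stated correspondences can drive loading, consider the following sketch. It is not the paper's formalism or its rewriting algorithm: the two sources, their phone-number layouts, and the canonical form (digits only, with country code 39) are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def from_source_a(row: Row) -> str:
    """Source A (hypothetical) stores one formatted string, e.g. '+39 06 1234567'."""
    return "".join(ch for ch in row["phone"] if ch.isdigit())


def from_source_b(row: Row) -> str:
    """Source B (hypothetical) stores area prefix and number in separate columns."""
    return "39" + row["prefix"] + row["number"]


# Declarative part: which conversion reconciles each source's representation
# with the canonical one used in the warehouse.
correspondences: Dict[str, Callable[[Row], str]] = {
    "source_a": from_source_a,
    "source_b": from_source_b,
}


def load_phone_lines(rows_by_source: Dict[str, List[Row]]) -> List[str]:
    """Generic loader: applies the declared conversion, then removes duplicates."""
    canonical = set()
    for source, rows in rows_by_source.items():
        convert = correspondences[source]
        canonical.update(convert(row) for row in rows)
    return sorted(canonical)


phone_lines = load_phone_lines(
    {
        "source_a": [{"phone": "+39 06 1234567"}],
        "source_b": [
            {"prefix": "06", "number": "1234567"},  # same line as in source A
            {"prefix": "02", "number": "98765"},
        ],
    }
)
```

The point of the sketch is only that the loader is generic while the per-source knowledge lives in the correspondence table; in the methodology described here that knowledge is additionally checked against the constraints of the Conceptual Model.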

The paper is organized as follows. In Section 2 we describe the enterprise information system in Telecom Italia. In Section 3 we introduce the enterprise modeling framework at the basis of the Data Warehouse design methodology that is discussed in Section 4. In Section 5 we present the development process of a portion of the Data Warehouse of Telecom Italia, concentrating on the Primary Data Warehouse design activity. In Section 7 we briefly discuss the use of the Secondary Data Warehouse for Data Mining and Decision Support applications. Finally, in Section 8 we draw some conclusions.

2 The Enterprise Information System in Telecom Italia

In this section we sketch the main methodological and technological issues that arose in recent years in the development of the enterprise integrated database of Telecom Italia and, subsequently, in the design and implementation of a Data Warehouse for Telecom Italia. The two efforts, although driven by different needs and requirements, can be regarded as a continuous development of an integrated view of the enterprise data. Although we analyze the design and the development of the Data Warehouses in Telecom Italia, we deal with many issues that are common in large enterprises, so our conclusions generalize readily to other scenarios.

2.1 Data Base Integration

In 1993, Telecom Italia launched a strategic project, called IBDA, with the following main goals:

• the definition of an Enterprise Data Model and the migration/evolution of existing data;

• the design and implementation of databases covering the main sections of the Enterprise Data Model (customers, suppliers, network, administration, etc.);

• the design and implementation of a client/server architecture and of the communication middleware;

• the design and implementation of data access services.

The driving motivations of the IBDA strategic project are typical of a common scenario in many world-wide large enterprises, where there is a proliferation of legacy databases with a large overhead in the design and maintenance of software for interfacing applications and providing access to the data. IBDA was based on a staged implementation of services that form a layer separating data from application processes. More specifically, the IBDA service for a database is the exclusive agent that provides access to the data. A database integrated in the IBDA architecture is denoted as BDA and is identified by a unique number. The access is actually accomplished through contracts that enforce the semantic and referential policies for the database.
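The idea of a service acting as the exclusive agent for a database, with access mediated by contracts, can be sketched as follows. The class name, the method names, and the two policies enforced (unique line identifiers and a referential check on customers) are hypothetical; the source does not describe the actual IBDA interfaces.

```python
from typing import Dict, Iterable, List


class BDAService:
    """Hypothetical service layer: the only component allowed to touch the data."""

    def __init__(self, bda_number: int, known_customers: Iterable[str]) -> None:
        self.bda_number = bda_number  # each BDA is identified by a unique number
        self._known_customers = set(known_customers)
        self._lines: Dict[str, str] = {}  # line_id -> cust_id, never exposed directly

    def add_line(self, line_id: str, cust_id: str) -> None:
        """Write contract: enforces uniqueness and a referential policy."""
        if line_id in self._lines:
            raise ValueError(f"line {line_id} already exists")
        if cust_id not in self._known_customers:
            raise ValueError(f"unknown customer {cust_id}")
        self._lines[line_id] = cust_id

    def lines_of(self, cust_id: str) -> List[str]:
        """Read contract: applications receive copies, not the underlying store."""
        return sorted(line for line, owner in self._lines.items() if owner == cust_id)


service = BDAService(bda_number=17, known_customers={"C001", "C002"})
service.add_line("L-100", "C001")
service.add_line("L-101", "C001")
```

Applications that go through such contracts cannot create a line for a customer the database does not know, which is the kind of policy the service layer is meant to guarantee.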

In this case, we must cope with the numerous and extremely differentiated data sources of the Enterprise Information System of Telecom Italia. In particular, one can find different applications, based on different data management technologies (from hierarchical to object-relational DBMSs), that share information, or, in other words, that manage common concepts.

In 1996, the project was formally completed with the integration of 48 operational databases, while in the subsequent years new databases have been continuously added to IBDA. At present, several other databases are included in IBDA as BDAs and there are ongoing projects for adding more. It was subsequently realized that the process of including databases in IBDA is essentially incremental.

2.2 Data Warehousing

In Telecom Italia, data warehousing has been a natural evolution of data integration. Starting from a large integrated enterprise database, and given the size of the data to be stored in the Data Warehouse, the architecture of the Telecom Italia Enterprise Data Warehouse includes a group of Primary Data Warehouses, which are devoted to collecting, integrating and consolidating the data extracted from the Operational Data Stores. The Primary Data Warehouses feed the data to several systems on which the user applications (e.g., Decision Support Systems) rely. These systems are also included in the Enterprise Data Warehouse as Secondary Data Warehouses, also known as Data Marts.

The main difference between the two kinds of Data Warehouses is that the Primary Data Warehouses contain only “atomic” level data, while the Secondary Data Warehouses typically contain information that has been aggregated at different levels of detail. The Enterprise Data Warehouse architecture is stratified: the Secondary Data Warehouses are loaded only with data extracted from the Primary Data Warehouses.
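The relationship between atomic data in a Primary Data Warehouse and aggregated data in a Secondary one can be sketched in a few lines. The record layout (day, operator, seconds) and the sample values are invented for illustration and do not reflect the actual schemas.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Atomic-level records, as a Primary Data Warehouse might hold them (hypothetical layout).
primary_cdrs: List[Dict[str, object]] = [
    {"day": "2004-07-01", "operator": "OpA", "seconds": 120},
    {"day": "2004-07-01", "operator": "OpA", "seconds": 60},
    {"day": "2004-07-01", "operator": "OpB", "seconds": 30},
    {"day": "2004-07-02", "operator": "OpA", "seconds": 45},
]


def build_data_mart(
    cdrs: List[Dict[str, object]], keys: Sequence[str]
) -> Dict[Tuple[object, ...], Dict[str, int]]:
    """Aggregates atomic records at the level of detail given by `keys`."""
    totals: Dict[Tuple[object, ...], Dict[str, int]] = defaultdict(
        lambda: {"calls": 0, "seconds": 0}
    )
    for cdr in cdrs:
        group = tuple(cdr[key] for key in keys)
        totals[group]["calls"] += 1
        totals[group]["seconds"] += int(cdr["seconds"])
    return dict(totals)


# Two Secondary Data Warehouses at different levels of detail,
# both loaded only from the primary layer.
by_day_and_operator = build_data_mart(primary_cdrs, ("day", "operator"))
by_operator = build_data_mart(primary_cdrs, ("operator",))
```

Because both aggregates are computed from the same atomic records, they stay mutually consistent, which is one practical benefit of loading the secondary layer only from the primary one.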

Presently, the Telecom Italia Enterprise Data Warehouse includes the following Primary Data Warehouses:

• IPDW–Interconnection Traffic Primary Data Warehouse, containing call detail records (CDRs), whose main purpose is to analyze network usage patterns between Telecom Italia Network Nodes and other service providers.
