The Smart environment for retrieval system evaluation: advantages and problem areas

Gerard Salton*

The Smart environment provides a test-bed for implementing and evaluating a large number of different automatic search and retrieval processes. In this chapter, the basic parameters underlying the Smart system design are briefly outlined, and a comparison is made with the characteristics of more conventional retrieval systems. The principal lessons learned from the Smart experiments are described, and some of the methodological problems raised by the system design are outlined. Finally, some comments are included about the disadvantages inherent in working in the laboratory, and the insights that can be gained in such a situation.

15.1 Retrieval system environment

Automatic, or semi-automatic information search and retrieval systems have now been in existence for some twenty years. In the early years, only small collections could be searched, and the search requests received from the user population would be accumulated for some period of time, or 'batched' before actually being processed, with the result that several weeks would normally elapse before answers could be obtained to a given query.

At the present time, the role and importance of information retrieval has greatly increased for two main reasons: the coverage of the searchable collections is now extensive and collection sizes may exceed several million documents; furthermore, the search results can now be obtained more or less instantaneously, using online procedures and computer terminal devices that provide interaction and communication between system and users. The large collection sizes make it plausible to the users that relevant information will in fact be retrieved as a result of a search operation, and the probability of obtaining the search output without delay creates a substantial user demand for the retrieval services. It is not surprising in these circumstances that several million search requests are currently submitted each year to a variety of automatic retrieval services.

* This study was supported in part by the National Science Foundation under grant DSI-77

While the operational retrieval environment has thus drastically changed over the last few years, the intellectual design of the retrieval operations has remained reasonably unchanged for some decades. The following principal characteristics may be noted:

(a) documents are normally indexed manually, that is, subject indicators and content descriptions are manually assigned to the bibliographic items by subject experts and professional indexers;

(b) search statements are manually formulated by users or search intermediaries using one or more acceptable search terms and appropriate boolean connectives between the terms; subsequent reformulations and improvements in the query formulations are also carried out manually;

(c) the principal file search device is an auxiliary, so-called inverted directory which contains for each accepted content descriptor a list of the document references to which that term is assigned; the documents to be retrieved are then identified by comparing and merging the document reference lists corresponding to the various query terms;

(d) an 'exact match' retrieval strategy is carried out by retrieving all items whose content description exactly matches the term combination specified in the search request; normally, all retrieved items are considered by the system as being equally relevant to the user's needs, and no special method is provided for ranking the output items in presumed order of goodness for the user.
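The inverted directory in (c) and the exact-match strategy in (d) can be illustrated with a minimal Python sketch. The collection, the descriptor names, and the helper functions below are hypothetical and are not taken from any operational system; the sketch shows only how merging reference lists yields an unranked answer set.

```python
from collections import defaultdict

def build_inverted_directory(index_terms):
    """Map each content descriptor to the sorted list of document references it is assigned to."""
    directory = defaultdict(set)
    for doc_id, terms in index_terms.items():
        for term in terms:
            directory[term].add(doc_id)
    return {term: sorted(refs) for term, refs in directory.items()}

def boolean_and(directory, query_terms):
    """Exact match: keep only documents that appear in the reference list of every query term."""
    lists = [set(directory.get(term, ())) for term in query_terms]
    if not lists:
        return []
    return sorted(set.intersection(*lists))

def boolean_or(directory, query_terms):
    """Merge the reference lists of all query terms."""
    merged = set()
    for term in query_terms:
        merged.update(directory.get(term, ()))
    return sorted(merged)

# Hypothetical manually indexed collection (document -> assigned descriptors).
index_terms = {
    "D1": {"retrieval", "indexing"},
    "D2": {"retrieval", "evaluation"},
    "D3": {"indexing", "thesaurus"},
}
directory = build_inverted_directory(index_terms)
hits_and = boolean_and(directory, ["retrieval", "indexing"])
hits_or = boolean_or(directory, ["evaluation", "thesaurus"])
print(hits_and)
print(hits_or)
```

The returned lists are sorted by document identifier only; nothing in this scheme orders the items by presumed usefulness, which is the limitation noted in (d).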

Enhancements are included in many of the modern search systems in the form of 'free text' manipulations allowing the user to choose arbitrary search terms, that is, natural language terms that are not controlled by any dictionary or authority list, leading to the retrieval of all documents whose stored texts (or text excerpts) contain a particular term combination included in the search request. But even in the free text search mode, inverted directories are created containing all the text words that could lead to the retrieval of a given document in the collection. Additional refinements in the search mode are available in some modern online environments in the form of dictionary and vocabulary displays leading to better query formulation capabilities.

However, the basic manual query formulation and exact match retrieval strategy based on inverted files is maintained in practically all operational retrieval situations.

When the work on the Smart retrieval experiments was initiated in the early 1960s, some attempts had been made at implementing so-called automatic indexing systems [1-4]. These consisted in using the computer to scan document texts, or text excerpts such as document abstracts, and in assigning as content descriptors words that occurred sufficiently frequently in a given text. The early retrieval experiments conducted with such automatic indexing products showed that a large number of the automatically chosen index terms would also have been assigned by manual indexers, and that the automatic indexing products, contrary to expectation, did not prove to be totally inadequate.
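A frequency-based indexing step of the kind described above might be sketched as follows. The stop-word list, the `min_count` cutoff, and the sample abstract are assumptions introduced for illustration; the early systems cited in the text are not reproduced here.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list so that function words are not chosen as descriptors.
STOP_WORDS = {"the", "of", "a", "and", "in", "to", "is", "for", "are", "by"}

def frequency_index(text, min_count=2):
    """Return words occurring at least min_count times, with their frequencies as weights."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {w: c for w, c in counts.items() if c >= min_count}

# Hypothetical abstract used only to exercise the function.
abstract = (
    "Automatic indexing assigns index terms to documents. "
    "The indexing terms are chosen by counting terms in the text of the documents."
)
descriptors = frequency_index(abstract)
print(descriptors)
```

The resulting frequencies can serve directly as term weights, which is the sense of 'single, frequency-weighted terms' used for the simpler baseline systems.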

Moreover, it appeared that the rudimentary early automatic indexing products could be easily improved. Thus linguists led the way by pointing out that a number of linguistic processes were 'essential' for the generation of effective content identifiers characterizing natural language texts. Among the linguistic techniques of interest, the following were considered to be of greatest importance:

(a) The use of hierarchical term arrangements, relating the content terms in a given subject area. With such preconstructed term hierarchies, the standard content descriptions can be 'expanded' by adding hierarchically superior (more general) terms as well as hierarchically inferior (more specific) terms to a given content description.

(b) The use of synonym dictionaries, or thesauri, in which each term is included in a class of synonymous, or related terms. Using a thesaurus, each originally available term can be replaced by a complete class of related terms, thereby broadening the original content description.

(c) The utilization of syntactic analysis systems capable of specifying syntactic roles for each term and of forming complex content descriptions consisting of term phrases and large syntactic units. A syntactic analysis scheme makes it possible to supply specific content identifications and avoids confusion between composite terms such as 'blind Venetian' and 'Venetian blind'.

(d) The use of semantic analysis systems in which the syntactic units are supplemented by semantic roles attached to the entities making up a given content description. Semantic analysis systems utilize various kinds of knowledge extraneous to the documents, often specified by preconstructed 'semantic graphs' and other related constructs.
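The expansion operations in (a) and (b) can be sketched in a few lines of Python. The hierarchy, the thesaurus classes, and the function names are hypothetical examples, and only one level of the hierarchy is followed in each direction; syntactic and semantic analysis, as in (c) and (d), are not attempted here.

```python
# Hypothetical term hierarchy, stored as child -> parent.
PARENT = {"terrier": "dog", "poodle": "dog", "dog": "mammal"}

# Hypothetical thesaurus: each set is one class of synonymous or related terms.
THESAURUS = [{"dog", "hound", "canine"}, {"retrieval", "search"}]

def expand_hierarchy(terms, parent=PARENT):
    """Add the immediately broader and immediately narrower terms of each term."""
    expanded = set(terms)
    for term in terms:
        if term in parent:
            expanded.add(parent[term])  # hierarchically superior (more general) term
        # hierarchically inferior (more specific) terms
        expanded.update(child for child, p in parent.items() if p == term)
    return expanded

def expand_thesaurus(terms, classes=THESAURUS):
    """Replace each term by its complete thesaurus class; unknown terms are kept as they are."""
    expanded = set(terms)
    for term in terms:
        for cls in classes:
            if term in cls:
                expanded |= cls
    return expanded

hier = expand_hierarchy({"dog"})
thes = expand_thesaurus({"dog", "indexing"})
print(sorted(hier))
print(sorted(thes))
```

Both functions broaden a content description, so they would be expected to favour recall; whether precision suffers is exactly the kind of question the evaluation procedures are meant to answer.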

The design of the original Smart system was then based on the premise that effective automatic indexing procedures could be built by incorporating into a content analysis system one or more of the foregoing language processing methods. Most of the required constructs such as the hierarchical term arrangements and the syntactically analysed text excerpts could be represented by trees, and other constructs such as semantic graphs and thesauri are easily represented by graph structures. Well known automatic procedures were also available for traversing and manipulating tree and graph structures [5]. The original Smart system was then designed to process natural language texts using these complex data structures.

To validate the linguistic analysis procedures it was necessary to compare the search results obtained by using term hierarchies and thesauri with other simpler systems based on the use of single, frequency-weighted terms extracted from the document texts. From the beginning, the Smart system thus contained an evaluation package based on the use of sample document and query collections and on the availability of full relevance assessments specifying the presumed relevance of each document with respect to each user query. This made it possible to compute for each processed query the recall and precision values measuring respectively the proportion of relevant items retrieved and the proportion of retrieved items that are relevant.
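The two measures defined above can be written down directly. The following Python sketch assumes that the retrieved set and the set of documents assessed as relevant are both available for a query; the document identifiers are invented for the example.

```python
def recall_precision(retrieved, relevant):
    """Recall = relevant retrieved / all relevant; precision = relevant retrieved / all retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical query: four documents retrieved, three assessed as relevant, two in common.
recall, precision = recall_precision(
    retrieved=["D1", "D2", "D5", "D7"],
    relevant=["D1", "D5", "D9"],
)
print(recall, precision)
```

Computing recall requires knowing every relevant document in the collection, which is why the full relevance assessments mentioned above are needed.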

The early tests in turn led to additional experiments and to the development of a full evaluation system for a large variety of search and retrieval procedures. These developments are described in more detail in the remainder of this study.

15.2 Basic Smart system assumptions and early results

In the Smart system each record, or document, is represented by a vector of terms, that is $D_i = (d_{i1}, d_{i2}, \ldots, d_{it})$, where $d_{ij}$ represents the weight or importance of term $j$ for document $D_i$. By 'term' is meant some form of content identifier such as a word extracted from a document text, a word phrase, a thesaurus class, an entry from a term hierarchy, etc. A query $Q_j$ can be similarly represented as $Q_j = (q_{j1}, q_{j2}, \ldots, q_{jt})$ and retrieval of a stored item can be made to depend on the magnitude of a global similarity coefficient $s(D_i, Q_j)$. Specifically, whenever $s(D_i, Q_j) \geq T$ for some threshold $T$, $D_i$ is retrieved in answer to $Q_j$. It should be noted that an exact match between any particular query and document terms is never required for retrieval of an item. Instead, the similarity measure $s$ may be based on the composite similarities between the full query and document vectors.

Furthermore, since $s(D_i, Q_j)$ represents a measure of closeness between $D_i$ and $Q_j$, the output documents can be presented to the user population in ranked order of presumed relevance to the user, that is, in decreasing order of the corresponding $s$ coefficients.
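Threshold retrieval and ranked output can be sketched as follows. The text leaves the similarity coefficient s unspecified at this point; the cosine of the angle between the two vectors is used here purely as an assumed choice, and the vectors, weights, and threshold are invented for the example.

```python
import math

def cosine(d, q):
    """Cosine similarity between a document vector and a query vector of equal length."""
    dot = sum(x * y for x, y in zip(d, q))
    norm = math.sqrt(sum(x * x for x in d)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0

def rank(documents, query, threshold=0.0):
    """Retrieve documents whose similarity is at least the threshold, in decreasing order of similarity."""
    scored = [(cosine(vec, query), doc_id) for doc_id, vec in documents.items()]
    kept = [(s, doc_id) for s, doc_id in scored if s >= threshold]
    # Sort by decreasing similarity; ties are broken by document identifier.
    return [(doc_id, s) for s, doc_id in sorted(kept, key=lambda p: (-p[0], p[1]))]

# Hypothetical collection over three terms (t = 3); entries are term weights.
documents = {
    "D1": [1.0, 0.0, 2.0],
    "D2": [0.0, 3.0, 0.0],
    "D3": [1.0, 1.0, 1.0],
}
query = [1.0, 0.0, 1.0]
ranking = rank(documents, query, threshold=0.1)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

In this example D3 is retrieved although it also carries a term that is absent from the query, and D1 is ranked above it, which illustrates both the partial matching and the ranked output described above.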

The following assumptions are immediately implied by the vector processing environment:

(a) In principle, each term included in a given vector is as important as any other term (except for the possible distinction implied by a particular term weight assignment); that is, each term represents a particular dimension in the $t$-dimensional vector space defined by the $t$ terms used to index the document collection.

(b) No relationships are defined between distinct terms; that is, the coordinate axes representing the distinct terms are assumed to be orthogonal.

(c) A document is represented by a particular position, and possibly by a given length, in the $t$-dimensional vector space. (In practice, it is often convenient to normalize all vectors to some given standard length.)

In examining the Smart system, it is necessary to consider also another principal characteristic of the experimental environment, namely the use of small sample collections of documents and user queries for test purposes.

Such a test environment makes it possible to carry out many different experiments at reasonable cost. Furthermore, a great many inconveniences inherent in the use of large operational collections are immediately eliminated. Thus full relevance assessments of each document with respect to each query can be obtained from the user population, leading to the generation of accurate recall-precision measures. The alternative would consist in using sampling techniques and obtaining relevance assessments for a portion of the document collection only. The use of sampling methods, however, introduces additional variables and the evaluation results may then be subject to substantial fluctuations.

The small document environment used in the Smart experiments also renders unnecessary the choice of various parameter values which would otherwise be required to control the retrieval process. Because the documents are ranked at the output in decreasing order of query-document similarity, there is thus no need to choose a retrieval threshold to distinguish the retrieved from the non-retrieved items. Instead, recall-precision values can be computed for all possible retrieval thresholds (that is, after retrieving one, two, and eventually n documents in decreasing order of the similarity with the query), and the results can be plotted in a composite recall-precision graph. The experiments can then be carried out using a very small number of variable parameters such as collection size, number of queries, relevance assessments of documents with respect to queries, interpolation procedures for calculating precision values at fixed recall intervals, and methods for averaging the results over a number of different user queries [6]. The Smart experiments have thus come close to achieving the conditions often assumed for ideal retrieval test environments [7, 9].
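The threshold-free evaluation described above can be sketched in Python. The interpolation rule used here (precision at a recall level is the highest precision observed at any cutoff whose recall is at least that level) and the simple per-level mean over queries are assumptions chosen for illustration; the text does not fix either procedure, and the rankings and relevance sets are invented.

```python
def rp_points(ranking, relevant):
    """Recall and precision after retrieving 1, 2, ..., n documents of a ranked list.

    Assumes the set of relevant documents is not empty.
    """
    relevant = set(relevant)
    points, hits = [], 0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / k))
    return points

def interpolated_precision(points, levels):
    """Assumed rule: best precision among all cutoffs reaching at least the given recall level."""
    return [max((p for r, p in points if r >= level), default=0.0) for level in levels]

def average_over_queries(curves):
    """Average the precision values of several queries at each fixed recall level."""
    return [sum(column) / len(column) for column in zip(*curves)]

levels = [0.5, 1.0]
# Two hypothetical queries with their ranked output and relevance assessments.
q1 = rp_points(["D1", "D4", "D2", "D3"], relevant={"D1", "D2"})
q2 = rp_points(["D7", "D5", "D6"], relevant={"D5"})
curve1 = interpolated_precision(q1, levels)
curve2 = interpolated_precision(q2, levels)
averaged = average_over_queries([curve1, curve2])
print(curve1, curve2, averaged)
```

Plotting `averaged` against `levels` gives one composite recall-precision curve per retrieval method, so that two methods can be compared without committing to any particular retrieval threshold.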

The artificial collection environment does, however, have implications for the conclusions derivable from the experiments. Thus it is difficult to obtain really believable efficiency (as opposed to effectiveness) criteria, such as response time, processing cost, and user effort needed to submit queries and to obtain results, because no obvious procedure is available for extrapolating these efficiency measures to large, operational retrieval situations. Furthermore, when a restricted number of user queries is used to evaluate retrieval effectiveness, the implicit assumption is that these queries and the corresponding users are representative of the user population at large.
