Thesaurus


Background

Large literature databases assign keywords (indexing terms) to increase the information provided by the abstract of an article, and thus to enable precise retrievals. Indexing terms must represent the subject matter and the most important points in the content of a document in a way that is reproducible, meaningful, and consistent. At the same time, the index terms must be easily recognised by the user, typically searching on-line, as terms appropriate to their particular questions. Large database producers have developed their own thesauri as a basis for indexing.

However, even with significant efforts to improve retrieval, successful use of the databases often requires special expertise and knowledge of the various indexing terms and indexing strategies applied by the different database producers. The average user does not usually possess this expertise and therefore finds it difficult to retrieve information.

Objective & Application

ECVAM initiated the Thesaurus project to contribute to the harmonisation and organisation of terms and synonyms used in the animal alternatives topic area. It uses a novel technology that differs from previous methods in that it is based on phrases which occur in original documents. The project focuses on identifying the terms most commonly used by scientists active in the field of animal alternatives. Such terms could be used for indexing purposes and thus improve the success of retrieving documents and other related information stored in databases.

The value of developing a thesaurus specifically for the topic area of animal alternatives must be explored and verified with the end-user.

Methodology and Contents

The thesaurus has been generated in a semi-automatic manner, by selecting actual phrases that occur in documents, and should therefore reflect the terminology preferred by the authors of the articles. This was done by analysing original articles published in this field and retaining the terms most commonly used. This novel approach has been reported in the literature as the so-called "bottom-up approach". This means that the index and thesaurus are built out of the text rather than added on afterwards, as is usually done when a thesaurus is created.

In order to investigate the feasibility of this approach in a future broader context, as a first step, a small subset of 2,000 selected articles/documents was processed electronically, and the resulting terms were afterwards ordered manually into a hierarchical thesaurus structure. The initial list consisted of 115,150 unique terms. Eliminating terms of very high or very low frequency left 14,729 terms for potential inclusion. Following further evaluation and discrimination steps, approximately 1,000 terms were retained for the thesaurus. In order to organise them in a hierarchical structure, the following categories have been established: Phenomena and Effects, Methods and Strategies, Materials, Endpoints, Validation Methods, Governments, Organisations, Regulations and Regulatory Bodies, Animal Welfare Issues, Information Systems, Alternatives in Education and Disease Conditions. The first classification scheme contains 1,000 unique terms. The main focus was on in vitro toxicology.
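
The frequency-based selection step can be illustrated with a minimal sketch. It is not the tool used in the project: the text does not state how phrases were extracted or which thresholds were applied, so the use of word n-grams as candidate phrases, the counting of documents rather than occurrences, the function names and the cut-off values below are all assumptions made for illustration.

```python
import re
from collections import Counter


def candidate_phrases(text, max_words=3):
    """Yield lowercase word n-grams (1..max_words words) found in a text.

    Assumption: a "phrase" is approximated by a run of adjacent words.
    """
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def document_frequencies(documents, max_words=3):
    """Count in how many documents each candidate phrase occurs."""
    counts = Counter()
    for doc in documents:
        # set() ensures a phrase is counted once per document
        counts.update(set(candidate_phrases(doc, max_words)))
    return counts


def filter_by_frequency(counts, min_docs, max_docs):
    """Keep phrases whose document frequency lies in [min_docs, max_docs]."""
    return {p: c for p, c in counts.items() if min_docs <= c <= max_docs}


# Toy corpus standing in for the processed articles (hypothetical text).
documents = [
    "In vitro toxicity testing of the skin irritation endpoint.",
    "Skin irritation was assessed by in vitro toxicity testing.",
    "Validation of the in vitro method for eye irritation.",
]

counts = document_frequencies(documents)
# Hypothetical thresholds: drop phrases found in every document or in only one.
candidates = filter_by_frequency(counts, min_docs=2, max_docs=2)
```

On this toy corpus, "in vitro" occurs in all three documents and is dropped as too frequent, "eye irritation" occurs in only one and is dropped as too rare, and "skin irritation" is kept. Uninformative phrases such as "of the" also survive a purely frequency-based filter, which is why further evaluation and manual ordering, as described above, remain necessary.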

Status & Prospects

A first pilot version of the thesaurus has been finalised, and two consultation rounds were subsequently conducted with selected experts in the various fields of in vitro toxicity testing and information sciences. The resulting preliminary version of the thesaurus, currently focused on in vitro toxicity testing, is presented here.

At this stage, the ECVAM Thesaurus has been made available as an open source list to promote end-user discussions. The follow-up will depend on these discussions.

Collaborations

For this project, the DB-ALM (formerly SIS) has collaborated primarily with the Head of the Thesaurus section of the National Library of Medicine (NLM, USA), and also with the Centre for Documentation and Evaluation of Alternative Methods to Animal Experiments (ZEBET, D) and the Fund for the Replacement of Animals in Medical Experiments (FRAME, UK).

Furthermore, the two subsequent consultation rounds involved experts from:

  • Akademie für Tierschutz, D
  • Consejo Superior Investigaciones Cientificas (CSIC), E
  • TNO Nutrition and Food Research Institute, NL
  • Unilever, SEAC-Safety and Environmental Assurance Centre, UK

References

  • Bates M.J.: Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors. J Am Soc Inf Sci 49(13):1185-1205, 1998.
  • Creating a Thesaurus of Alternative Methods to Animal Experiments. A project of the ECVAM Task Force on Alternatives Databases. Poster presentation at the 3rd World Congress on Alternatives and Animal Use in Life Sciences. Bologna, Italy, August 1999.
  • Stuart J. Nelson, Thom Kuhn, Daniel Radzinski, David D. Sherertz, Mark S. Tuttle, Robert Spena: Creating a Thesaurus From Text: A "Bottom-Up" Approach to Organizing Medical Knowledge. J Am Med Informatics Assoc (Symposium Suppl) 1046, 1998.
  • The ECVAM Thesaurus of Advanced Alternative Methods to Animal Experiments. Poster presentation at the 4th World Congress on Alternatives and Animal Use in Life Sciences. New Orleans, USA, August 2002.