Terminology management and knowledge processing in lesser-used languages
Terminology management, that is, the collection, analysis, validation and distribution of terms, is crucial for converting information into comprehensible and applicable knowledge. Specialists of all persuasions are involved in coining terms, modifying existing terminology, rendering terms archaic or re-introducing discarded terms with new meanings. One task of specialists, it appears, is to coin neologisms, introduce retronyms, translate terms, validate terms and, in a rather indirect manner, compile or help to complete the terminology collections of their specialisms. Modern science, contemporary leisure and entertainment, and innovative enterprises all distinguish themselves from their older incarnations not merely through goods, services or artefacts, but also through the terminology they use to describe the sciences, arts and culture, and business and enterprises. Terminology management is by and large a manual task that relies on the existence of well-motivated documentalists, translators and terminologists, the latter performing the function of the former two. Currently available terminology management systems have alleviated some of the storage and retrieval tasks associated with the archival and presentation of specialist terms. However, the tasks of collection, analysis and validation are still undertaken by skilled human beings. In languages used by numerical majorities, the expensive task of terminology management is underwritten by the expectation that there is a potentially large number of people who require, and are willing to invest in creating, terminology databases. For languages used by numerically smaller groups of people this is not the case; terminology management here is often linked with the politically motivated, and often emotionally charged, work of language planning. The dependence on human beings for terminology management is greater in the lesser-used language communities than may be the case for other languages.
The automation of terminology management is not merely a task of writing computer programs, although such an undertaking is onerous in itself. Such automation requires an understanding of how specialist text is written, how human beings deal with the semantics of specialist domains, and how discourse patterns change according to the needs of the authors and the readers of the texts. Writings in and about lesser-used languages are not easy to come by, and it appears to be difficult to persuade people to undertake research and data collection in these areas. Somehow, specialist writing is associated with the term 'technical writing', a discourse pattern which in turn is associated with machines and is thereby not given the same status as the more abstract task of parsing sentences according to a mathematical model of language or, for instance, searching for cultural icons in texts.
But knowledge processing, a term that can be used for elaborating related activities like education, training, teaching and learning, problem solving and so on, is crucially dependent on the availability of specialist terminology collections. This is especially true during the formative years of a child, the early career of a novice or the retraining of a person; facts, principles, theories and rules of thumb related to any human enterprise, collectively known as knowledge, are to be assimilated and then applied for teaching, learning, problem solving, etc. The 'raw material' available in textbooks, journals and unarticulated past experience has to be communicated through the agency of terms whose meanings are well defined and which are used frequently by a specialist enterprise. Knowledge processing is therefore inextricably linked to terminology management which, in turn, is linked with language planning and politics. Over the last ten years we have been building terminology collections in languages used by numerically larger groups of people, like English, German and Spanish, whilst at the same time attempting to adapt such methods for lesser-used languages like Welsh, Norwegian, Flemish and Catalan. This paper will discuss challenges encountered, opportunities identified and solutions suggested for managing the terminology of specialist languages in multilingual environments where at least one language belongs to the lesser-used category on numerical grounds. Our theoretical framework draws from recent work in corpus linguistics and in the philosophy and history of science on the one hand, and in the computing sciences on the other.