From tags to topic maps: using marked-up Hebrew text to discover linguistic patterns

Authors

Kroeze, J.H. (Jan Hendrik)

Publisher

Proceedings of the 2008 International Conference on Information Resources Management

Abstract

The paper discusses a series of related techniques that prepare and transform raw linguistic data for advanced processing in order to unveil hidden grammatical patterns. It identifies XML as a suitable mark-up language for building an exploitable data bank of multi-dimensional data in the Hebrew text of the Old Testament. This concept is illustrated by tagging a transcription of Gen. 1:1-2:3 and manipulating the resulting data bank. Transferring the data into a three-dimensional array allows advanced processing, either to confirm existing knowledge or to mine for new, as yet undiscovered, linguistic features. Visualisation is discussed as a technique that enhances interaction between the human researcher and the computerised technologies supporting this process of knowledge creation. The empirical study is a small experiment that illustrates the viability and usefulness of the proposed expert devices, as well as the benefits of applying information system techniques to linguistic databases.
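
The step from XML mark-up to a three-dimensional array can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the element names (clauses, clause, phrase, text, type, function), the three layers chosen, the simplified transliteration of Gen. 1:1, and the helper names to_cube and slice_layer. The three dimensions here are clause, phrase and analysis layer.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up: element names and layers are illustrative only,
# not the schema used in the paper. The text is a simplified
# transliteration of Gen. 1:1 with a conventional phrase-level analysis.
XML = """
<clauses>
  <clause ref="Gen 1:1">
    <phrase><text>bereshit</text><type>PP</type><function>Adjunct</function></phrase>
    <phrase><text>bara</text><type>VP</type><function>Predicate</function></phrase>
    <phrase><text>elohim</text><type>NP</type><function>Subject</function></phrase>
    <phrase><text>et hashamayim ve'et ha'arets</text><type>NP</type><function>Object</function></phrase>
  </clause>
</clauses>
"""

# Third dimension of the array: one slot per analysis layer (assumed layers).
LAYERS = ("text", "type", "function")


def to_cube(xml_text):
    """Return a nested list indexed as cube[clause][phrase][layer]."""
    root = ET.fromstring(xml_text.strip())
    cube = []
    for clause in root.findall("clause"):
        rows = []
        for phrase in clause.findall("phrase"):
            # Missing layers become empty strings rather than raising.
            rows.append([phrase.findtext(layer, default="") for layer in LAYERS])
        cube.append(rows)
    return cube


def slice_layer(cube, layer):
    """Extract one analysis layer for every phrase of every clause."""
    k = LAYERS.index(layer)
    return [[cell[k] for cell in clause] for clause in cube]


cube = to_cube(XML)
functions = slice_layer(cube, "function")
```

Slicing along the layer dimension, as slice_layer does, is one simple way such an array could be queried for recurring patterns, for example the order of syntactic functions per clause.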

Keywords

Text data mining, Data warehousing, MOLAP, XML, Genesis

Citation

Kroeze, JH, Bothma, TJD & Matthee, MC 2008, 'From tags to topic maps: using marked-up Hebrew text to discover linguistic patterns', Proceedings of the 2008 International Conference on Information Resources Management (Conf-IRM 2008), [http://www.sprott.carleton.ca/conf-irm/CFP2008.pdf]