19th International CODATA Conference
Category: Poster, Infoscience

An Approach for Designing a Restricted Bulgarian Natural Language Database Query System

Principal Assist. Prof. Silyan Arsov (sarsov@ecs.ru.acad.bg), University of Rousse, Department of Computer Systems and Technologies, Bulgaria
Prof. Dr. Boris Rachev (Bob_Ra@acm.org), Technical University - Varna, Department of Computer Sciences and Technologies, Bulgaria


Database information is recorded in terms of files, records and fields, while natural language expressions refer to the same information in terms of entities and relationships in the world. A major problem in designing a natural language interface is determining how to encode and use the information needed to reconcile these two views. Most natural language database query systems translate a question posed in a natural language into a corresponding query in a formal database query language such as Structured Query Language (SQL). Usually, user questions are first translated into a logic language and subsequently into SQL.

Besides the different methods used to design preprocessors that process a query and translate it into a formal database query language, the data models themselves can be explored with the aim of (i) structuring the data in databases more suitably and (ii) storing the basic elements of natural language sentences, such as verbs, in the data structures in advance.

The aim of our research is to propose a methodology that facilitates natural language access to databases. To this end, we investigate the effect of extending the Entity-Relationship-Attribute (ERA) model by naming the relationships between different entities as well as between entities and their own attributes. We propose data storage methods that store both the names of the relationships between entities and the names of the relationships between objects and their own attributes; in most cases these names are verbs. We also propose a suitable restricted set of Bulgarian natural language query constructions intended for direct access to data in databases. Finally, we have developed an experimental database management system to test the quality of the proposed methods using the relationship names stored in the database.
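
The idea of storing relationship names (verbs) alongside the data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the tables lecturer, course and teaches, the RELATIONSHIPS dictionary, the functions build_db, translate and answer, and the single query construction "which <entity> <verb> <value>". English words stand in for Bulgarian ones, and the sketch is not the storage method or the query set proposed in the paper.

```python
import sqlite3

# Hypothetical metadata: verb -> (subject entity, object entity, link table).
# These names are illustrative; the abstract does not specify a schema.
RELATIONSHIPS = {
    "teaches": ("lecturer", "course", "teaches"),
}


def build_db():
    """Create a small in-memory example database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE lecturer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE teaches (lecturer_id INTEGER, course_id INTEGER);
        INSERT INTO lecturer VALUES (1, 'Ivanov'), (2, 'Petrova');
        INSERT INTO course VALUES (1, 'Databases'), (2, 'Compilers');
        INSERT INTO teaches VALUES (1, 1), (2, 2);
    """)
    return conn


def translate(question):
    """Map 'which <entity> <verb> <value>' to an SQL string and parameters."""
    words = question.lower().rstrip("?").split()
    if len(words) != 4 or words[0] != "which":
        raise ValueError("unsupported query construction")
    _, entity, verb, value = words
    if verb not in RELATIONSHIPS:
        raise ValueError("unknown relationship name: " + verb)
    subj, obj, link = RELATIONSHIPS[verb]
    if entity != subj:
        raise ValueError("entity does not match the stored relationship")
    # Table names come only from the trusted metadata above;
    # the user-supplied value is passed as a bound parameter.
    sql = (
        f"SELECT s.name FROM {subj} s "
        f"JOIN {link} l ON l.{subj}_id = s.id "
        f"JOIN {obj} o ON l.{obj}_id = o.id "
        "WHERE lower(o.name) = ?"
    )
    return sql, (value,)


def answer(conn, question):
    """Run the translated query and return the matching names."""
    sql, params = translate(question)
    return [row[0] for row in conn.execute(sql, params)]
```

With this sketch, answer(build_db(), "Which lecturer teaches Databases?") looks up the verb "teaches" in the stored relationship names and uses it to select the tables to join. Keeping the set of accepted constructions small is what makes such a direct mapping feasible without a full parser.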

The paper is organized as follows. Section 1 describes an example database by means of the extended entity-relationship-attribute diagram and shows the database tables and their attribute values. Section 2 introduces a three-dimensional internal representation of the database for the purpose of direct natural language access to data. Section 3 presents a restricted set of Bulgarian natural language queries for access to data. Section 4 discusses the experimental implementation of the proposed Bulgarian natural language query constructions. Section 5 presents the conclusions of the work.