Núcleo Interinstitucional de Lingüística Computacional
An Interinstitutional Center for Research and Development in Computational Linguistics

DIADORIM

A Lexical Database of Brazilian Portuguese

 

Period: 2000-2002

Goals
DIADORIM aims to unify two very different lexical databases used by two very different tools: ReGra and the UNL-Portuguese System. The former is a grammar and style checker for Brazilian Portuguese; the latter is a multilingual, interlingua-based machine translation (MT) system. DIADORIM was further enriched with a large set of synonyms and antonyms drawn from an electronic thesaurus for Brazilian Portuguese (TeP).


Project's Features

DIADORIM was conceived as a general database formed by two very different structures connected by dictionary entries. The first, which we call the gnosiologic structure, is a net-like structure representing the relations between lexical items and their referents in the world and in the culture, as required by the UNL Project. The second, the linguistic structure, is a tree-like structure representing the relations between lexical items and the language system; both the grammar checker and the MT system use it. The lexical entry is at once a node of the gnosiologic structure and a root of the linguistic structure, so it works as a bridge between the two information sets. This design makes the database flexible enough to represent all required features.
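The bridging role of the lexical entry can be illustrated with a minimal Python sketch. It is not the project's implementation: the class names (`LexicalEntry`, `LinguisticNode`), the relation label `"synonym"`, and the feature labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LinguisticNode:
    """A node of the tree-like linguistic structure (hypothetical layout)."""
    label: str
    children: List["LinguisticNode"] = field(default_factory=list)

    def add(self, label: str) -> "LinguisticNode":
        """Attach a child node and return it, so trees can be built top-down."""
        child = LinguisticNode(label)
        self.children.append(child)
        return child


@dataclass
class LexicalEntry:
    """Bridge entity: a node of the net and, at the same time, the root of a tree."""
    lemma: str
    # Net-like (gnosiologic) side: relation label -> lemmas of related entries.
    relations: Dict[str, Set[str]] = field(default_factory=dict)
    # Tree-like (linguistic) side: a tree rooted at this entry.
    linguistic: LinguisticNode = field(init=False)

    def __post_init__(self) -> None:
        self.linguistic = LinguisticNode(self.lemma)

    def relate(self, relation: str, other: "LexicalEntry") -> None:
        """Add a directed, labeled link from this entry to another entry."""
        self.relations.setdefault(relation, set()).add(other.lemma)


# Illustrative data only; these are not records taken from DIADORIM.
casa = LexicalEntry("casa")
moradia = LexicalEntry("moradia")
casa.relate("synonym", moradia)
moradia.relate("synonym", casa)

noun = casa.linguistic.add("category: noun")
noun.add("gender: feminine")
```

In this sketch a tool that needs only grammatical information would walk `casa.linguistic`, while a tool that needs semantic links would follow `casa.relations`; both start from the same entry.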

The database was implemented in Microsoft SQL Server 6.5, following an ER-to-relational mapping.
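As a rough illustration of what an ER-to-relational mapping of such a design might look like, the sketch below uses Python's built-in `sqlite3` module in place of Microsoft SQL Server so that it runs anywhere. The table and column names (`entry`, `entry_relation`, `feature`) and the helper `related` are assumptions made for this example, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (
    id    INTEGER PRIMARY KEY,
    lemma TEXT NOT NULL UNIQUE
);
-- Net-like structure: a many-to-many relationship becomes a link table.
CREATE TABLE entry_relation (
    source_id INTEGER NOT NULL REFERENCES entry(id),
    target_id INTEGER NOT NULL REFERENCES entry(id),
    kind      TEXT NOT NULL,
    PRIMARY KEY (source_id, target_id, kind)
);
-- Tree-like structure: each feature row points to its parent row.
CREATE TABLE feature (
    id        INTEGER PRIMARY KEY,
    entry_id  INTEGER NOT NULL REFERENCES entry(id),
    parent_id INTEGER REFERENCES feature(id),
    label     TEXT NOT NULL
);
""")

# Illustrative rows only; these are not records taken from DIADORIM.
conn.executemany(
    "INSERT INTO entry (id, lemma) VALUES (?, ?)",
    [(1, "bom"), (2, "mau")],
)
conn.execute("INSERT INTO entry_relation VALUES (1, 2, 'antonym')")
conn.execute(
    "INSERT INTO feature (id, entry_id, parent_id, label) "
    "VALUES (1, 1, NULL, 'category: adjective')"
)


def related(conn, lemma, kind):
    """Return the lemmas linked from `lemma` by a relation of the given kind."""
    rows = conn.execute(
        """SELECT t.lemma
           FROM entry s
           JOIN entry_relation r ON r.source_id = s.id
           JOIN entry t ON t.id = r.target_id
           WHERE s.lemma = ? AND r.kind = ?
           ORDER BY t.lemma""",
        (lemma, kind),
    )
    return [row[0] for row in rows]


antonyms = related(conn, "bom", "antonym")
```

The link table stores directed relations, so a symmetric relation such as antonymy would need a row in each direction (or a query that checks both columns); which choice the original database made is not stated here.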

The database is accessed through a web search interface, which also includes a data editor and specialized list-generation tools.


Results
A large lexical database of Brazilian Portuguese, accessible through specialized interfaces, including a web interface.


Team (2001)
Juliana Galvani Greghi

Maria das Graças Volpe Nunes (supervisor)

Ronaldo Teixeira Martins

Bento Carlos Dias da Silva


Financial Support
FAPESP (2000-2002)

PADCT/Finep - Itautec-Philco (2000-2001)


Contact
Maria das Graças Volpe Nunes: gracan@icmc.usp.br


Related Publications

Greghi, J. G.; Martins, R. T.; Nunes, M. G. V. Diadorim: a lexical database for Brazilian Portuguese. In: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas de Gran Canaria. Manuel G. Rodríguez and Carmem P. S. Araujo (Eds.), 2002, v. IV, p. 1346-1350.

Greghi, J. G. Projeto e desenvolvimento de uma base de dados lexicais do português. MSc thesis, March 2002.

Greghi, J. G.; Martins, R. T.; Nunes, M. G. V. O Processo de Desenvolvimento da BDL-NILC. NILC-TR-01-7, October 2001.

Dias da Silva, B. C.; Oliveira, M. F.; Moraes, H. R.; Hasegawa, R.; Amorim, D.; Paschoalino, C.; Nascimento, A. C. A Construção de um Thesaurus Eletrônico para o Português do Brasil. V Encontro para o Processamento Computacional da Língua Portuguesa Escrita e Falada (PROPOR'2000), Atibaia, SP, November 20-22, 2000.

 
