An Interinstitutional Center for Research and Development in Computational Linguistics

ReTraTos

Machine translation based on linguistic resources induced from aligned parallel texts


Starting Time: 2003

Status: Concluded in 2007

Goals

    To automatically induce linguistic knowledge useful for machine translation (transfer rules and bilingual dictionaries) from PoS-tagged and lexically aligned parallel corpora. A second goal of the project is to develop a simple machine translation system that translates source sentences into target sentences using the induced resources.
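
    As a rough illustration of what inducing a bilingual dictionary from a PoS-tagged, lexically aligned corpus can look like, the Python sketch below counts how often each source (word, PoS) entry is linked to each target entry. It is not the ReTraTos induction algorithm: the function name induce_dictionary, the min_count threshold, the data layout and the toy pt-en sentences are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def induce_dictionary(aligned_corpus, min_count=1):
    """Count aligned (word, PoS) pairs and keep the frequent ones.

    aligned_corpus: iterable of (source_tokens, target_tokens, links), where
    tokens are (word, pos) tuples and links are (source_index, target_index)
    pairs produced by a lexical aligner. This layout is hypothetical.
    """
    counts = defaultdict(Counter)
    for source_tokens, target_tokens, links in aligned_corpus:
        for i, j in links:
            counts[source_tokens[i]][target_tokens[j]] += 1

    dictionary = {}
    for source_entry, target_counts in counts.items():
        # most_common() sorts candidate translations by descending frequency.
        kept = [(t, c) for t, c in target_counts.most_common() if c >= min_count]
        if kept:
            dictionary[source_entry] = kept
    return dictionary


# Toy pt-en data, invented for illustration only.
corpus = [
    ([("a", "det"), ("casa", "n")],
     [("the", "det"), ("house", "n")],
     [(0, 0), (1, 1)]),
    ([("casa", "n"), ("grande", "adj")],
     [("big", "adj"), ("house", "n")],
     [(0, 1), (1, 0)]),
    ([("a", "det"), ("casa", "n")],
     [("the", "det"), ("home", "n")],
     [(0, 0), (1, 1)]),
]

entries = induce_dictionary(corpus)
frequent_entries = induce_dictionary(corpus, min_count=2)
```

    Raising min_count discards translation candidates seen only rarely, which is one simple way to filter out alignment noise. A real system would also need to handle multiword units and unaligned words, which this sketch ignores.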

Project's Features

   Machine translation experiments involved three languages, Brazilian Portuguese (pt), Spanish (es) and English (en), combined in two language pairs (pt-es and pt-en), and showed reasonable results.

Results

   Computational resources:

      Induction Systems available at SourceForge

   Linguistic resources:

Team
    Helena de Medeiros Caseli (PhD Student, 2003-2007)
    Maria das Graças Volpe Nunes (supervisor, 2003-2007)
    Mikel L. Forcada (foreign supervisor, 2004-2005)

Financial Support
   FAPESP: 2004-2007
   CAPES (Sandwich): 2004-2005

Contact
   Helena de Medeiros Caseli

Publications

2008

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation. Machine Translation. v. 1, p. 227-245, 2008.

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. From free shallow monolingual resources to machine translation systems: easing the task. In Proceedings of the Workshop on Mixing Approaches to Machine Translation (MATMT08). San Sebastian, Spain: University of the Basque Country, 2008. v. 1. p. 41-48.

2007

Caseli, H.M.; Nunes, M.G.V. Automatic induction of bilingual lexicons for machine translation. International Journal of Translation. v. 19, p. 29-43, 2007.

Caseli, H.M.; Nunes, M.G.V. Automatic induction of translation lexicons from aligned parallel corpus. In Anais do XXVII Congresso da Sociedade Brasileira de Computação - V Workshop em Tecnologia da Informação e da Linguagem Humana (TIL). p. 1669-1678. Rio de Janeiro - RJ, Brazil, 2007.

Caseli, H.M. Indução de léxicos bilíngües e regras para a tradução automática. PhD thesis. ICMC-USP, April 2007. 158 p.

2006

Caseli, H.M.; Nunes, M.G.V. Automatic transfer rule induction from parallel corpora. In Proceedings of the International Joint Conference IBERAMIA/SBIA/SBRN 2006 - 3rd Workshop on MSc dissertations and PhD thesis in Artificial Intelligence (WTDIA'2006). Ribeirão Preto, Brazil, October 23-28, 2006.

Caseli, H.M.; Nunes, M.G.V. Anali: uma ferramenta de análise morfossintática. Série de Relatórios Técnicos do ICMC, 285 (NILC-TR-06-09), October 2006. 44 p.

2005

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. Evaluating the LIHLA lexical aligner on Spanish, Brazilian Portuguese and Basque parallel texts. Procesamiento del Lenguaje Natural, v. 35, Granada, Spain, pp. 237-244, 2005. ISSN 1135-5948. Also in Cadernos de Computação, v. 6, n. 2, ICMC-USP, pp. 149-163, October 2005.

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. O Alinhador Lexical LIHLA: Experimentos com o Português do Brasil. In Caderno de resumos do V Encontro de Corpora, pp. 21-22. São Carlos, SP, Brazil. November 24-25, 2005.

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. LIHLA: A lexical aligner based on language-independent heuristics. In Proceedings of the V Encontro Nacional de Inteligência Artificial (ENIA), pp. 641-650. São Leopoldo, RS, Brazil. July 25-19, 2005.

Caseli, H.M.; Nunes, M.G.V.; Forcada, M.L. LIHLA: Shared task system description. In Proceedings of the ACL Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, pp. 111-114. Ann Arbor, Michigan, USA. June 29-30, 2005.

Caseli, H.M.; Scalco, M.A.G.; Nunes, M.G.V. Manual para marcação de alinhamentos lexicais. Série de Relatórios Técnicos do ICMC, 256 (NILC-TR-05-09), April 2005. 21 p. An English version is also available.

2004

Caseli, H.M. Regras de tradução automática induzidas de textos paralelos envolvendo o português do Brasil. Qualifying monograph. ICMC-USP, August 2004. 67 p.

Related Publications

2000

Oliveira Jr., O. N.; Marchi, A. R.; Martins, M. S.; Martins, R. T. A Critical Analysis of the Performance of English-Portuguese-English MT Systems. V Encontro para o processamento computacional da Língua Portuguesa Escrita e Falada (PROPOR'2000). Atibaia, SP, Brazil, November 20-22, 2000.

Last updated: 07/05/2008