Euromatrix


EuroMatrix was a research project that ran from September 2006 to February 2009. It aimed to develop and improve machine translation (MT) systems between all official languages of the European Union (EU).

EuroMatrix was followed by the EuroMatrixPlus project (March 2009 to February 2012).

Approach to translation

EuroMatrix explored the use of linguistic knowledge in statistical machine translation. Statistical techniques were combined with a rule-based approach, resulting in a hybrid MT architecture. The project experimented with combining methods and resources from statistical MT, rule-based MT, shallow language processing, and computational lexicography and morphology.

Project objectives

EuroMatrix focused on high-quality translation for the publication of technical, social, legal and political documents. It applied advanced MT technologies to all pairs of EU languages; the languages of new and prospective EU member states were also taken into account.

Annual international evaluation

Annual international meetings with a competitive evaluation of machine translation (“MT marathons”) were organized to bring together MT researchers. Participants translated test sets with their systems, and the resulting translations were then evaluated both manually and with automatic metrics.

MT marathons were multi-day events comprising a summer school, lab lessons, research talks, workshops, open source conventions, and research showcases.
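As a rough illustration of what an automatic metric can look like, the sketch below computes a clipped n-gram precision between a system translation and a single reference translation. This is an assumption made for illustration only: the text does not say which metrics were used in the evaluations, and the function name and example sentences are hypothetical.

```python
from collections import Counter


def clipped_ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram's count is clipped to its count in the reference, so
    repeating a matching word does not inflate the score.
    Illustrative only; not a metric named in the text.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate.split())  # whitespace tokenization (simplifying assumption)
    ref = ngrams(reference.split())
    total = sum(cand.values())
    if total == 0:
        return 0.0  # candidate too short to contain any n-gram
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical system output and reference translation
system_output = "the cat sat on the mat"
reference = "the cat is on the mat"
unigram_score = clipped_ngram_precision(system_output, reference, n=1)
bigram_score = clipped_ngram_precision(system_output, reference, n=2)
```

Scores of this kind are cheap to compute over a whole test set, which is why automatic metrics are typically paired with slower manual judgments rather than replacing them.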

List of MT marathons

Name                                  Date                 Place
Machine Translation Marathon 2007[1]  April 16–20, 2007    Edinburgh, United Kingdom
Machine Translation Marathon 2008[2]  May 12–17, 2008      Berlin, Germany
Machine Translation Marathon 2009[3]  January 26–30, 2009  Prague, Czech Republic

Outcome

Several tools and resources were created or supported by the project:[4]

  • Moses, an open source statistical machine translation engine
  • Europarl Corpus, version 3
  • Results from Workshops on Statistical Machine Translation (2007, 2008, 2009)
  • CzEng Corpus, version 0.7

Funding

The EuroMatrix project was sponsored by the EU Information Society Technology program.

The total cost of the project was €2,358,747, of which the European Union contributed €2,066,388.[5]

Project members

Experienced, internationally recognized research groups in machine translation and relevant industrial partners participated in the project. The consortium included the University of Edinburgh (United Kingdom), Charles University (Czech Republic), Saarland University (Germany), the Center for the Evaluation of Language and Communication Technologies (Italy), MorphoLogic (Hungary), and GROUP Technologies AG (Germany).[5]

The project was coordinated by Hans Uszkoreit, a professor of computational linguistics at Saarland University.[5]

References
