Latest revision as of 06:49, 28 February 2024
Computer-assisted translation tool written in Java

OmegaT 3.1.9 translating LibreOffice from English to Basque, "Project Files" window | |
Original author(s) | Keith Godfrey |
---|---|
Developer(s) | Aaron Madlon-Kay, Didier Briel, Alex Buloichik, Zoltan Bartko, Tiago Saboga, etc. |
Initial release | November 28, 2002 |
Stable release | 4.3.3 (March 18, 2022) |
Preview release | 5.7.1 (March 18, 2022) |
Repository | |
Operating system | Microsoft Windows, macOS, Linux, Solaris |
Type | Computer-assisted translation |
License | GPLv3+ |
Website | omegat |
OmegaT is a computer-assisted translation tool written in the Java programming language. It is free software originally developed by Keith Godfrey in 2000, and is currently developed by a team led by Aaron Madlon-Kay.
OmegaT is intended for professional translators. Its features include customisable segmentation using regular expressions, translation memory with fuzzy matching and match propagation, glossary matching, dictionary matching, translation memory and reference material searching, and inline spell-checking using Hunspell spelling dictionaries.
OmegaT runs on Linux, macOS, Microsoft Windows and Solaris, and requires Java 8. It is available in 27 languages. According to a 2010 survey of 458 professional translators, OmegaT was used one third as much as Wordfast, Déjà Vu and MemoQ, and one eighth as much as the market leader, Trados.
History
OmegaT was first developed by Keith Godfrey in 2000. It was originally written in C++.
The first public release, in February 2001, was written in Java. This version used a proprietary translation memory format. It could translate unformatted text files and HTML, and performed only block-level segmentation (i.e. by paragraph rather than by sentence).
Development and software releases
The development of OmegaT is hosted on SourceForge. The development team is led by Aaron Madlon-Kay. As with many open-source projects, new versions of OmegaT are released frequently, usually with two or three bug fixes and feature updates each. There is a "standard" version, which always has a complete user manual, and a "latest" version, which includes features that are not yet documented in the user manual. The updated sources are always available from the SourceForge code repository.
How OmegaT works
OmegaT handles a translation job as a project: a hierarchy of folders with specific names. The user copies the untranslated documents into the folder named /source/ (or subfolders thereof). The Editor pane displays the source documents as individual "segments", which are translated one segment at a time. When directed, OmegaT generates the (partially) translated versions in the /target/ subfolder.
Other named folders include ones for automatic consultation within the program: /tm/ for existing translation pairs in .tmx format, /tm/auto/ for automatic translation of 100% matches, /glossary/ for glossaries, /dictionary/ for StarDict (and .tbx) dictionaries.
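The folder hierarchy described above can be sketched as follows. This is only an illustration that creates the folders named in the text; it is not OmegaT's own project-creation code, and a real project created by OmegaT may contain additional files. The function name and the demo project name are hypothetical.

```python
from pathlib import Path
import tempfile

# Folder names taken from the description above (assumed to be the complete set
# for the purposes of this sketch).
PROJECT_FOLDERS = ["source", "target", "tm", "tm/auto", "glossary", "dictionary"]


def create_project_skeleton(root: Path) -> list[Path]:
    """Create the named folders under root and return their paths."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_skeleton(Path(tmp) / "demo_project"):
            print(folder.relative_to(tmp))
```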
When the user goes to translate a segment in the Editor pane, OmegaT automatically searches the .tmx files in the /tm/ hierarchy for previous translation pairs with similar source sentences and displays them in the Fuzzy Matches pane for insertion into the Editor pane with a keyboard shortcut. The Glossary and Dictionary panes provide similar automatic look-up functions for any glossaries and dictionaries in the corresponding named folders in the project. The optional Machine Translation pane shows machine translations from Google Translate and similar services.
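The idea of looking up similar source sentences can be illustrated with a minimal sketch. The text does not describe OmegaT's actual matching algorithm, so this example substitutes the similarity ratio from Python's standard difflib module; the function name, the threshold value and the sample translation pairs are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_matches(segment, memory, threshold=0.7):
    """Return (score, source, target) tuples for stored pairs whose source
    text is similar to the segment, best match first."""
    scored = []
    for source, target in memory:
        # difflib's ratio stands in for the real similarity measure.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical translation memory: (source, target) pairs.
memory = [
    ("Open the file.", "Ouvrez le fichier."),
    ("Close the file.", "Fermez le fichier."),
    ("Print the report.", "Imprimez le rapport."),
]

if __name__ == "__main__":
    for score, source, target in fuzzy_matches("Open the files.", memory):
        print(f"{score:.0%}  {source}  ->  {target}")
```

Raising the threshold shortens the list to closer matches; lowering it surfaces looser candidates that need more editing.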
When the user leaves a segment, OmegaT normally first adds the source-target pair to its database in memory. It subsequently saves that database to disk in Translation Memory eXchange (.tmx) format for use another day, in other projects, by other translators, and even with other CAT tools. If the segment was not changed, no such update takes place. Version 3.1 added a setting that blocks targets identical to their sources, a common slip, plus a keyboard shortcut for overriding it where identical text is intended (numbers, source code in programming manuals, etc.).
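Saving source-target pairs in TMX form can be sketched as below. This is a simplified illustration rather than OmegaT's own writer: the header values are placeholders, the function name is hypothetical, and the check for identical source and target is modelled here as a filter at save time, whereas the text describes it as an editor setting.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute used on TMX <tuv> elements.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang, allow_identical=False):
    """Serialise (source, target) pairs as a TMX 1.4 document string."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        # Placeholder header values, assumed for this sketch.
        "creationtool": "sketch", "creationtoolversion": "0.1",
        "segtype": "sentence", "o-tmf": "unknown", "adminlang": "en",
        "srclang": src_lang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # Skip untranslated segments and, unless overridden, targets that
        # merely repeat the source.
        if not target or (target == source and not allow_identical):
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(build_tmx([("Hello", "Bonjour"), ("42", "42")], "en", "fr"))
```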
At any point, the user can create partially translated versions of the source files; OmegaT copies source segments verbatim if they have yet to be translated. Before creating them, however, the user is advised to use the Validate menu command to check for tag and other errors. Version 3.1 added a menu command (and keyboard shortcut) that limits this operation to the current file, for example for partial delivery or a quick update.
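The fallback behaviour for untranslated segments can be shown with a short sketch. The function name and the dictionary-based store of translations are assumptions for illustration only; they do not reflect OmegaT's internal data structures.

```python
def render_target(source_segments, translations):
    """Return the target segments, copying the source text verbatim
    wherever no translation exists yet."""
    return [translations.get(segment) or segment for segment in source_segments]


if __name__ == "__main__":
    source = ["Open the file.", "Save your work.", "Version 3.1"]
    done = {"Open the file.": "Ouvrez le fichier."}
    for line in render_target(source, done):
        print(line)
```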
Features of OmegaT
OmegaT shares many features with proprietary CAT tools. These include creating, importing and exporting translation memories, fuzzy matching from translation memories, glossary look-up, and reference and concordance searching.
OmegaT also has additional features that are not always available in other CAT tools. These include:
- OmegaT starts by displaying a short tutorial called "Instant Start".
- OmegaT can translate multiple files in multiple file formats in multiple subfolders simultaneously, and consult multiple translation memories, glossaries and dictionaries (limited only by available computer memory).
- With regard to supported file types, OmegaT allows the user to customise file extensions and file encodings. For a number of document types, the user can choose selectively which elements must be translated (e.g. in OpenOffice.org Writer files, choose whether to include bookmarks; in Microsoft Office 2007/2010 files, choose whether to translate footnotes; or in HTML, choose whether to translate ALT text for images). The user can also choose how non-standard elements in third-party translation memories should be handled.
- OmegaT's segmentation rules are based on regular expressions. Segmentation can be configured based on language or based on file format, and successive segmentation rules inherit values from each other.
- In the edit window, the user can jump directly to the next untranslated segment, or go forward or backwards in history. Users can use undo and redo, copy and paste, and switch between uppercase and lowercase in the same way as one would in an advanced text editor. The user can choose to see the source text of segments that have already been translated. The edit pane also has inline spell-checking using Hunspell dictionaries, and interactive spell-checking is done using the mouse.
- Users can insert fuzzy matches using a keyboard shortcut or using the mouse. OmegaT shows the degree of similarity in fuzzy matches using colours. OmegaT can also display the date, time and the name of the user who translated any given segment. Glossary matches can be inserted using the mouse. The user can choose to have the source text copied into the target text field, or to have the highest fuzzy match automatically inserted.
- In the search window, the user can choose to search the current files' source text, target text, other translation memories, and reference files. Searches can be case sensitive, and regular expressions can also be used. Double-clicking a search result takes the user directly to that segment in the edit window.
- After translation, OmegaT can perform tag validation to ensure that there are no accidental tag errors. OmegaT can calculate statistics for the project files and translation memories before the project starts, or during the translation to show the progress of the translation job.
- OmegaT can get machine translations from Apertium, Belazar, DeepL and Google Translate, and display them in a separate window.
- The various windows in OmegaT's user interface can be moved around, maximised, tiled, tabbed and minimised.
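The regular-expression-based segmentation mentioned in the list above can be illustrated with a minimal Python sketch. It is not OmegaT's implementation (OmegaT is written in Java). The rule layout, a list of (is_break, pattern before, pattern after) tuples in which the first matching rule wins, and the names `RULES` and `segment` are assumptions made for illustration only.

```python
import re

# Hypothetical rule set: (is_break, pattern_before, pattern_after).
# Rules are tried in order at each position; the first match decides.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # exception: no break after abbreviations
    (True, r"[.?!]", r"\s+"),                  # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text into segments at positions where a break rule matches."""
    compiled = [
        (is_break, re.compile(r"(?:" + before + r")\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # 'before' must match text ending at pos; 'after' must match at pos.
            if before.search(text, 0, pos) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, whether break or exception
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]


print(segment("Dr. Smith arrived. He was late! Why?"))
```

Placing exception rules before break rules is one simple way to let a more specific rule override a general one. The sketch rescans the text at every position, which is acceptable for short inputs but not tuned for large files.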
Document formats support
Several file types can be translated directly in OmegaT. OmegaT determines the file type by the file extension. The file extension handling and preferred encoding can be customised to override default settings.
OmegaT handles formatted documents by converting formatting to tags, as commercial CAT tools do.
Directly supported formats
File format | File extension pattern | Format type |
---|---|---|
Plain text (any text format which Java can handle) encoded in a variety of encodings including Unicode | .txt, .txt1, .txt2, .utf8 | Documentation |
HTML/XHTML | .html, .htm, .xhtml, .xht | Documentation |
OpenDocument (ODF), used in LibreOffice, StarOffice, Apache OpenOffice | .sx?, .st?, .od?, .ot? | Documentation |
Microsoft Office Open XML (used in Microsoft Office 2007 and later) | .docx, .xlsx, .pptx | Documentation |
Help & Manual | .xml, .hmxp | Documentation |
HTML Help Compiler | .hhc, .hhk | Documentation |
LaTeX | .tex, .latex | Documentation |
DokuWiki | .txt | Documentation |
QuarkXPress CopyFlow Gold | .tag, .xtg | Documentation |
DocBook | .xml, .dbk | Documentation |
Android resources | .xml | Localization |
Java properties | .properties | Localization |
TYPO3 Localization Manager (l10nmgr) | .xml | Localization |
Mozilla DTD | .dtd | Localization |
Windows resources | .rc | Localization |
WiX localization | .wxl | Localization |
ResX | .resx | Localization |
Key=Value files | .ini, .lng | Localization |
XLIFF | .xlf, .sdlxliff | Multilingual |
Portable Object (PO) | .po, .pot | Multilingual |
SubRip subtitles | .srt | Other |
SVG images | .svg | Other |
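The extension patterns in the table use `?` as a single-character wildcard (for example `.od?` covers `.odt` and `.ods`). As a rough illustration of extension-based detection, the following Python sketch matches file names against a few rows copied from the table. The dictionary layout and the names `FILTERS` and `filters_for` are hypothetical, and the sketch does not show how OmegaT itself chooses between formats that share an extension such as `.txt` or `.xml`.

```python
from fnmatch import fnmatchcase

# A few rows copied from the table above; the dict layout is an assumption.
FILTERS = {
    "Plain text": ["*.txt", "*.txt1", "*.txt2", "*.utf8"],
    "OpenDocument": ["*.sx?", "*.st?", "*.od?", "*.ot?"],
    "Office Open XML": ["*.docx", "*.xlsx", "*.pptx"],
    "DokuWiki": ["*.txt"],
    "Portable Object (PO)": ["*.po", "*.pot"],
}


def filters_for(filename, filters=FILTERS):
    """Return the names of all formats whose extension patterns match."""
    name = filename.lower()  # compare case-insensitively
    return [
        fmt
        for fmt, patterns in filters.items()
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


print(filters_for("report.odt"))
print(filters_for("notes.txt"))
```

Because several formats share an extension, the function returns every candidate rather than a single answer.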
Indirectly supported formats
There are two processes that allow OmegaT to handle unsupported formats:
- register the format file extension into the preferred file filter (typically all plain text based formats)
- convert the format to a directly supported format
Support for XLIFF
The program Rainbow from the Okapi Framework can convert certain file formats to an XLIFF format that OmegaT does support. Rainbow can also create complete OmegaT project folders from such documents, for easier handling in OmegaT.
Support for Gettext PO
A number of file formats can be converted to Gettext Portable Object (PO) files, which can be translated in OmegaT. The Debian program po4a can convert formats such as LaTeX, TeX and POD to Gettext PO. The Translate Toolkit can convert Mozilla .properties and .dtd files, CSV files, certain Qt .ts files, and certain XLIFF files to Gettext PO.
Support for Office Open XML and ODF
Microsoft Word, Excel and PowerPoint documents from version 97 to 2003 can be converted to Office Open XML (Microsoft Office 2007/2010) or ODF (OpenOffice.org) format. Conversion is not entirely lossless and may lead to loss of formatting.
Support for Trados® .ttx files
Trados® .ttx files can be processed using the Okapi TTX Filter.
Supported memory and glossary formats
Translation memories in TMX format
OmegaT's internal translation memory format is not visible to the user, but every time it autosaves the translation project, all new or updated translation units are automatically exported and added to three external TMX memories: a native OmegaT TMX, a level 1 TMX and a level 2 TMX.
- The native TMX file is for use in OmegaT projects.
- The level 1 TMX file preserves textual information and can be used with TMX level 1 and 2 supporting CAT tools.
- The level 2 file preserves textual information as well as inline tag information and can be used with TMX level 2 supporting CAT tools.
Exported level 2 files include OmegaT's internal tags encapsulated in TMX tags which allows such TMX files to generate matches in TMX level 2 supporting CAT tools. Tests have been positive in Trados and SDLX.
OmegaT can import TMX files up to version 1.4b level 1 as well as level 2. Level 2 files imported in OmegaT will generate matches of the same level since OmegaT converts the TMX level 2 tags of the foreign TMX. Here again, tests have been positive with TMX files created by Transit.
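To make the difference between textual content and inline tag information concrete, here is a minimal Python sketch that reads translation units from a TMX document and keeps only the plain text of each segment, roughly what a level 1 consumer sees. It relies on the standard TMX element names (`tu`, `tuv`, `seg`, `bpt`, `ept`) and the `xml:lang` attribute; the function names `plain_text` and `read_tmx` are hypothetical, and the sketch is unrelated to OmegaT's own import code.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def plain_text(seg):
    """Text of a <seg>, skipping the content of inline elements.

    Simplification: text nested inside inline elements is dropped as well.
    """
    parts = [seg.text or ""]
    for child in seg:
        parts.append(child.tail or "")
    return "".join(parts)


def read_tmx(xml_string, source_lang, target_lang):
    """Return (source, target) text pairs for the requested language codes."""
    source_lang, target_lang = source_lang.lower(), target_lang.lower()
    pairs = []
    for tu in ET.fromstring(xml_string).iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                texts[tuv.get(XML_LANG, "").lower()] = plain_text(seg)
        if source_lang in texts and target_lang in texts:
            pairs.append((texts[source_lang], texts[target_lang]))
    return pairs


SAMPLE = (
    '<tmx version="1.4"><header/><body><tu>'
    '<tuv xml:lang="en"><seg>Click <bpt i="1">&lt;b&gt;</bpt>here'
    '<ept i="1">&lt;/b&gt;</ept>.</seg></tuv>'
    '<tuv xml:lang="fr"><seg>Cliquez ici.</seg></tuv>'
    "</tu></body></tmx>"
)
print(read_tmx(SAMPLE, "en", "fr"))
```

A level 2 consumer would instead keep the `bpt`/`ept` elements and map them to its own tag representation, which is what allows tagged segments to produce matches.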
Glossaries
For glossaries, OmegaT mainly uses tab-delimited plain text files in UTF-8 encoding with the .txt extension. The structure of a glossary file is extremely simple: the first column contains the source language word, the second column contains the corresponding target language words, the third column (optional) can contain anything including comments on context etc. Such glossaries can easily be created in a text editor.
Similarly structured files in standard CSV format are also supported, as well as TBX files.
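The tab-delimited layout described above (source term, target term, optional comment) is simple enough to parse in a few lines. The following Python sketch is an illustration only; the function name `read_glossary` and the sample entries are made up, and the sketch says nothing about how OmegaT itself loads glossaries.

```python
import csv
import io


def read_glossary(lines):
    """Parse tab-delimited glossary rows into (source, target, comment) tuples."""
    entries = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed rows
        comment = row[2].strip() if len(row) > 2 else ""
        entries.append((row[0].strip(), row[1].strip(), comment))
    return entries


# Made-up sample content; a real file would be opened with
# open(path, encoding="utf-8", newline="").
sample = io.StringIO("segment\tsegmento\tCAT term\ntag\tetiqueta\n\n")
print(read_glossary(sample))
```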
Involvement by community of users
The OmegaT Project
OmegaT is open-source software and relies on the help of volunteers. Programming is the most important contribution, but the project would benefit from greater volunteer support in almost all areas. Users may also modify OmegaT to suit their own requirements.
Localization of OmegaT
OmegaT's user interface and documentation have been translated into about 30 languages. Volunteer translators can translate either the user interface, the "Instant Start" short tutorial, or the entire user manual (or all three components). All the language files and all translations of the user manual are included in the standard distribution of OmegaT.
User-created programs
A characteristic of the OmegaT user community is that deficiencies in OmegaT often prompt users to create macros, scripts and programs that supply the missing functions, although sometimes those features later become available in OmegaT itself. When OmegaT offered only paragraph segmentation, a user created OpenOffice.org macros for segmenting by sentence. When automatic leveraging of TMs in OmegaT still required TMs to be merged, a user created a TMX merging script. When OmegaT offered no spell-checking support, several users created scripts or found solutions to provide spell-checking as part of an OmegaT-based translation process.
Other software built on OmegaT
OmegaT in DGT
Latest update: 2021-03-21
The Directorate-General for Translation of the European Commission (DGT) uses OmegaT as an alternative CAT tool alongside a mainstream commercial tool. DGT maintains a fork of OmegaT (DGT-OmegaT) with adaptations/improvements/new features that meet DGT-specific requirements as well as a number of helper-applications to integrate OmegaT in its workflow: a Wizard to automate the creation, updating, revision and delivery of projects, Tagwipe to clean useless tags in docx documents and TeamBase to allow the sharing of memories in real-time. Those applications are made available by DGT as free open source software.
Benten
Latest update: 2018-04-07
Benten is an Eclipse-based XLIFF editor. It uses OmegaT code to handle the TM matching process. It is partly funded by the Japanese government.
Autshumato translation suite
Latest update: 2017-02-28
Autshumato consists of a CAT tool, an aligner, a PDF extractor, a TMX editor, and a public TM based on crawled data. The finished version will include a terminology manager and a machine translator. The CAT tool element is built upon OmegaT, and requires OpenOffice.org to run. Development is funded by the South African government's Department of Arts and Culture.
OmegaT+
Latest update: 2012-10-24
OmegaT+ is a CAT tool that was forked from OmegaT version 1.4.5 in 2005. OmegaT+ works in a way similar to OmegaT. It has developed its own features but projects are not compatible with OmegaT.
Boltran
Latest update: 2010-10-12
Boltran is a web-based tool that mimics the workflow of an OmegaT project. It is built upon the source code of OmegaT and can export OmegaT projects.
External links
User support
- omegat-users@lists.sourceforge.net – Multilingual user support mailing list (archives publicly visible)