PangeaMT is Pangeanic’s own independent translation technology division, with a clear focus on customised, domain-specific Machine Translation (MT) services. Pangeanic has developed and used machine translation for many applications since it became a member of TAUS, helped by access to millions of words of training corpus with which it was able to experiment. Machine translation became part of the company culture in 2009, and since then providing machine translation services to corporations, and even to other translation companies, has been part of Pangeanic’s range of services.

HISTORY

As a forward-thinking and technology-savvy LSP, Pangeanic won a contract in 2007 to provide post-editing of machine translation output for the European Commission. It was around this time that we became acquainted with institutional user needs and (re-)evaluated several commercial machine translation services we had been using. Soon afterwards we decided to develop our own machine translation technology.

Pangeanic was cited as the first language service provider to make commercial use of Moses in the EU’s Framework development program euromatrixplus.net (the second, more refined release of Moses). Since then, many presentations, awards and implementations have followed, and Pangeanic has made a name for itself as a leading machine translation implementation company in London, Hong Kong and across the globe. It also markets its machine translation services in areas beyond the translation industry and is heavily involved in several more EU machine translation government programs.

FOCUS

We began as keen followers of the statistically driven paradigm of machine translation. This worked very well between related languages (Romance languages and English, German and the Scandinavian languages). However, our links to Japanese industry soon brought requests to add Japanese and Chinese to our service portfolio. In 2011, Pangeanic developed hybrid machine translation services, which were included as part of the system features.

Pangeanic's Syntax-Based Hybrid Machine Translation

FEATURES

We started with statistical systems and had evolved to neural network-based systems by 2017. We have been able to overcome many of the shortcomings of raw MT in order to fit the needs of the translation industry: our services go beyond text-based machine translation and can take input and produce output in industry standards such as TMX and XLIFF. PangeaMT provides an app for document translation and API access to our translation platform, so you do not need to change your translation environment, and you can benefit from adding your future translations in a virtuous re-training cycle. Using open standards means that you will never have to pay for expensive machine translation services and software again. Our solutions keep you from being locked in by expensive upgrades year after year.
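To make the point about open standards concrete, here is a minimal sketch of how aligned segments can be read out of a TMX file using only the Python standard library. It is an illustration, not PangeaMT code: the sample document, the `read_pairs` function and its parameters are assumptions made for this example, and real TMX files often carry in-line elements and metadata that this sketch flattens or ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-language sample, written inline so the sketch is self-contained.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1"
          o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="es"><seg>Guarde el archivo.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_pairs(tmx_text, src, tgt):
    """Return (source, target) segment pairs for the two language codes."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens any in-line elements to their text.
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        # Keep only translation units that contain both languages.
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


pairs = read_pairs(SAMPLE_TMX, "en", "es")
```

Pairs extracted this way are the kind of bilingual material that a re-training cycle could draw on.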

Another PangeaMT breakthrough is our inline mark-up parser. PangeaMT handles tags extremely efficiently. Statistical machine translation systems (as they come from open-source releases) usually produce plain text output because this is also the format they process. However, we are keen to see PangeaMT solutions in use and adapted to the most demanding language industry requirements. We therefore focused our effort on developing SMT engines capable of handling the in-line coding typical of other content formats used in localisation production environments. Thanks to this parser, PangeaMT can identify in-lines without attempting to translate them, and it places them back in the resulting text, too. An in-line placeholder acts first by copying and transferring all XML and code information to a separate module. The translation engine does its work, and the in-lines are then placed back into the translated segment. At the time of its release, our in-line parser constituted an innovation well above the level of maturity of well-known SMT systems at that time.
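The placeholder idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the PangeaMT parser itself: the function names, the `__TAGn__` placeholder format and the regular expression are invented for the example, and the sketch assumes the engine passes placeholders through unchanged.

```python
import re

# Assumption: an in-line is anything that looks like an XML/HTML tag.
TAG_RE = re.compile(r"<[^>]+>")
PLACEHOLDER_RE = re.compile(r"__TAG(\d+)__")


def mask_inlines(segment):
    """Replace each in-line tag with a numbered placeholder and keep the tags aside."""
    tags = []

    def stash(match):
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_RE.sub(stash, segment), tags


def restore_inlines(text, tags):
    """Put the stored tags back where their placeholders ended up."""
    return PLACEHOLDER_RE.sub(lambda m: tags[int(m.group(1))], text)


def translate_with_inlines(segment, translate):
    """Mask in-lines, call the engine on the masked text, then restore the in-lines."""
    masked, tags = mask_inlines(segment)
    return restore_inlines(translate(masked), tags)


def toy_engine(text):
    """Stand-in for a real MT engine: a fixed word-for-word substitution."""
    for src, tgt in (("Press", "Pulse"), ("Save", "Guardar"), ("now", "ahora")):
        text = text.replace(src, tgt)
    return text


result = translate_with_inlines("Press <b>Save</b> now", toy_engine)
```

Because the engine only ever sees placeholders, the tags cannot be altered or translated by mistake. A production system would also need to cope with placeholders that the engine drops or reorders.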

We keep learning and improving with every development commissioned by an existing or new client and for every language combination. We therefore remain open to applying new hybridisation techniques, even ad-hoc rules, that we research and implement ourselves or co-develop with our clients. We are aware that some language combinations will require linguistically informed techniques as part of the pre- or post-processing phases. Correct word and phrase reordering in the MT output is not an easy goal to achieve, especially when the languages involved are not closely related, or when one of the two languages has a very flexible, and therefore MT-challenging, word order (WO). Some language-specific fixing procedures may come in handy. In other cases, it may be useful to use one language as a pivot to train engines for languages that are closely related. These and other techniques may be used, or taken as a basis, for expanding our PangeaMT solution palette.

Please visit our machine translation division website to learn more about services from PangeaMT.