Velingua terminology tools for end users, language professionals, and system integrators

Velingua is a collection of terminological tools that can either be used in the Velingua Organizer or integrated as a module in your own software.

Velingua offers terminology extraction for German and English as well as terminology checking (term checking for short).

Velingua uses a linguistic engine with lemma recognition (lemmatisation) and morpheme recognition, ensuring a high-quality analysis of the text data.

The products of the UniTerm family or TBX data can be used as terminology databases.

Velingua supports the following file formats: text (ANSI or Unicode), XLIFF, PDF, HTML and XML.

Velingua – overview

Velingua is a collection of terminological tools for language professionals such as terminologists, translators, interpreters, and linguists who need flexibility and advanced features.

Velingua modules can be integrated into existing systems or third-party software.

Modules

  • Terminology extraction (for German and English)
  • Term checking
  • Linking to terminology databases
  • Terminology checking for translations (in planning)

Technology

Velingua uses a linguistic engine with lemma recognition (lemmatisation) and morpheme recognition, ensuring a high-quality analysis of the text data. The linguistic engine supports major Western European languages (German, English, French, Spanish, Dutch, Portuguese, Italian).

The products of the UniTerm family or TBX data can be used as terminology databases.

The following text formats can be used:

  • Text (ANSI or Unicode)
  • XLIFF
  • PDF
  • HTML
  • XML

Licences

  • Separate licences for the modules for system integrators
  • Single-user licence for Velingua Organizer
  • Company licence for Velingua Organizer
  • Single-user licence for Velingua TermCheck
  • Company licence for Velingua TermCheck (personalised or floating)

Velingua – terminology extraction

Terminology extraction in German and English

The terminology extraction process takes existing terminology into account if it is available in UniTerm Pro, UniTerm Enterprise or TBX format.

The following formats are supported as text corpora:

  • Text (ANSI or Unicode)
  • XLIFF
  • PDF
  • HTML
  • XML

The candidate terms are output as a list in a CSV file. If desired, KWIC (keyword in context) information can also be generated for each candidate.
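
As an illustration of what KWIC information is, the following minimal sketch prints each occurrence of a term together with a fixed window of surrounding text. It is not Velingua's implementation; the function name `kwic`, the `width` parameter and the bracket notation are assumptions chosen for this example.

```python
import re

def kwic(text: str, term: str, width: int = 30) -> list[str]:
    """Return one context line per occurrence of `term` (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on either side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left}[{match.group(0)}]{right}")
    return lines

sample = "Tighten the valve. The valve must be closed before the pump starts."
for line in kwic(sample, "valve", width=12):
    print(line)
```

Seeing a candidate in its context usually makes it easier to decide whether it is a genuine term or a chance word combination.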

Velingua terminology extraction can be integrated into other environments as a command line tool. The list of candidate terms can then be cleaned and imported into any terminology database via CSV import.
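
The cleaning step before the import could look roughly like the sketch below. The column names (`term`, `frequency`, `method`), the comma delimiter and the stop-word list are assumptions for illustration; the actual columns depend on how the results file is configured.

```python
import csv
import io

# Hypothetical export: column names and values are assumptions for this example.
raw_csv = """term,frequency,method
Ventil,14,lemma
und,120,ngram
Druckventil,6,morpheme
Ventil,14,lemma
Flansch,2,lemma
"""

STOP_WORDS = {"und", "der", "die", "das"}

def clean_candidates(csv_text, min_frequency=5):
    """Drop stop words, duplicates and rare candidates from a candidate list."""
    seen = set()
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["term"].strip().lower()
        if key in STOP_WORDS or key in seen:
            continue
        if int(row["frequency"]) < min_frequency:
            continue
        seen.add(key)
        kept.append(row)
    return kept

cleaned = clean_candidates(raw_csv)
print([row["term"] for row in cleaned])
```

The cleaned rows can then be written back with `csv.DictWriter` in whatever layout the target terminology database expects for its CSV import.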

The basic methods for extraction are:

  • High relative frequency of lemmas (with lemmatisation)
  • Morpheme analysis (evaluation of frequent morphemes)
  • N-gram analysis (relatively frequent character sequences – purely statistical process)
  • Multi-word recognition using word class patterns (e.g. adjective/noun sequences in German)
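
Two of the methods above can be sketched in a few lines. This is a simplified illustration, not Velingua's algorithm: it assumes that "relative frequency" is measured against a general reference corpus, it uses a toy lemma table in place of a linguistic engine, and it expects tokens that are already tagged with word classes. All names are hypothetical.

```python
from collections import Counter

# Toy lemma table standing in for a real linguistic engine (assumption).
LEMMAS = {"valves": "valve", "pumps": "pump", "is": "be", "are": "be"}

def lemmatise(tokens):
    """Lower-case each token and map it to its lemma where one is known."""
    return [LEMMAS.get(t.lower(), t.lower()) for t in tokens]

def rank_by_relative_frequency(domain_tokens, reference_tokens, min_count=2):
    """Rank lemmas by how much more frequent they are in the domain text."""
    domain = Counter(lemmatise(domain_tokens))
    reference = Counter(lemmatise(reference_tokens))
    d_total = sum(domain.values())
    r_total = sum(reference.values())
    scores = {}
    for lemma, count in domain.items():
        if count < min_count:
            continue
        # Add-one smoothing so lemmas unseen in the reference do not divide by zero.
        scores[lemma] = (count / d_total) / ((reference[lemma] + 1) / (r_total + 1))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def multiword_candidates(tagged_tokens, patterns=(("ADJ", "NOUN"),)):
    """Count word sequences whose word classes match one of the patterns."""
    found = Counter()
    for pattern in patterns:
        n = len(pattern)
        for i in range(len(tagged_tokens) - n + 1):
            window = tagged_tokens[i:i + n]
            if tuple(tag for _, tag in window) == pattern:
                found[" ".join(word for word, _ in window)] += 1
    return found

domain = ["The", "valves", "the", "valve", "the", "pump", "pumps", "is"]
reference = ["the", "the", "the", "is", "are", "house"]
ranked = rank_by_relative_frequency(domain, reference)
print(ranked)

tagged = [("hydraulische", "ADJ"), ("Pumpe", "NOUN"), ("läuft", "VERB"),
          ("hydraulische", "ADJ"), ("Pumpe", "NOUN")]
print(multiword_candidates(tagged))
```

In this toy data, the function word "the" ranks last because it is just as common in the reference text, while the inflected forms "valves" and "pumps" are counted together with their base forms.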

Terminology extraction is highly configurable. The settings include:

  • Thresholds for minimum number of occurrences
  • Control of the result set depending on the size of the text corpus
  • Percentage share of the four basic methods
  • Word types of the candidate terms
  • Word class patterns for multi-word recognition
  • Columns in the results file (absolute frequency, extraction method, word type, KWIC …)
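
To make the interplay of these settings more concrete, the sketch below models them as a small settings object. The class, field names and default values are purely hypothetical and do not reflect Velingua's configuration format; the example only shows how percentage shares for the four methods might be validated and turned into per-method quotas.

```python
from dataclasses import dataclass, field

METHODS = ("lemma_frequency", "morpheme", "ngram", "multiword")

@dataclass
class ExtractionSettings:
    """Hypothetical container for the kinds of settings listed above."""
    min_occurrences: int = 3
    method_shares: dict = field(default_factory=lambda: {m: 25 for m in METHODS})
    word_types: tuple = ("noun", "adjective")
    multiword_patterns: tuple = (("adjective", "noun"),)
    columns: tuple = ("term", "frequency", "method")

    def validate(self):
        if self.min_occurrences < 1:
            raise ValueError("min_occurrences must be at least 1")
        if set(self.method_shares) != set(METHODS):
            raise ValueError("method_shares must name exactly the four methods")
        if sum(self.method_shares.values()) != 100:
            raise ValueError("method shares must add up to 100 percent")

    def quotas(self, result_size):
        """Split a result set of `result_size` candidates across the methods."""
        return {m: result_size * share // 100 for m, share in self.method_shares.items()}

settings = ExtractionSettings(
    method_shares={"lemma_frequency": 40, "morpheme": 20, "ngram": 10, "multiword": 30}
)
settings.validate()
print(settings.quotas(50))
```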

Terminology extraction can be conveniently configured in Velingua Organizer, where terminology candidates can also be interactively transferred to the terminology database. In the process, they can be classified as a preferred designation, permitted designation, forbidden designation or stop word.

Velingua – terminology checking (term checking for short)

Term checking as you write

Velingua TermCheck checks terms as they are written and flags forbidden or unknown terms immediately so that authors can correct them.

This term checking function is already integrated in Acolada’s XML editing software SIMQIN, in Microsoft Word and PowerPoint, and in Adobe InDesign and FrameMaker.

With the Velingua Clipboard Listener, term checking is possible in any Windows application capable of highlighting and copying text. The Clipboard Listener runs as a tool in the background and can be switched on and off as required.

The term checker uses the databases of the UniTerm terminology management systems.

A linguistic engine reduces each word to its lemma so that inflected forms are also recognised.

Multi-word designations are also recognised. This works even if the constituents are inflected (such as a plural instead of a singular noun, or a feminine instead of a masculine adjective form).
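
The idea behind lemma-based recognition of multi-word designations can be sketched as follows. This is a simplified illustration under stated assumptions, not the TermCheck implementation: the lemma table, the termbase contents and the function names are hypothetical, and an English example is used for brevity.

```python
# Toy lemma table (assumption); a real linguistic engine derives these itself.
LEMMAS = {"pumps": "pump", "taps": "tap", "valves": "valve"}

# Hypothetical termbase: lemmatised designation -> status.
TERMBASE = {
    ("hydraulic", "pump"): "preferred",
    ("pump",): "permitted",
    ("tap",): "forbidden",
}

def lemma(token):
    lowered = token.lower()
    return LEMMAS.get(lowered, lowered)

def check_terms(tokens, termbase=TERMBASE):
    """Return (surface form, status) for every designation found in `tokens`."""
    lemmas = [lemma(t) for t in tokens]
    longest = max((len(key) for key in termbase), default=0)
    hits = []
    i = 0
    while i < len(lemmas):
        # Try the longest possible designation first so that
        # "hydraulic pumps" wins over the single word "pumps".
        for n in range(min(longest, len(lemmas) - i), 0, -1):
            key = tuple(lemmas[i:i + n])
            if key in termbase:
                hits.append((" ".join(tokens[i:i + n]), termbase[key]))
                i += n
                break
        else:
            i += 1
    return hits

hits = check_terms("Check the hydraulic pumps and both taps".split())
print(hits)
```

Because the lookup runs on lemmas rather than surface forms, the plural "hydraulic pumps" matches the termbase entry stored in the singular, and the forbidden term "taps" is still flagged.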

Current documentation:

Velingua TermCheck documentation

System requirements: 

Runs on Windows 8 and later.

Velingua – Velingua Organizer

Windows application for the Velingua modules

Velingua Organizer is the graphical user interface for the Velingua modules. This is where you can define the terminology databases that the terminology extraction and term checking should interact with.

The following terminology databases are possible:

  • UniTerm Pro
  • UniTerm Enterprise
  • TBX

For terminology extraction, the settings can be configured and saved for reuse.
Terminology extraction can be performed for individual files or entire directories, and the results are presented in a clear overview for evaluation.
Candidate terms can be selectively transferred to the terminology database. In the process, they can be classified as a preferred designation, permitted designation, forbidden designation or stop word.

In Velingua Organizer it is possible to extract terms from documents with the following text formats:

  • Text (ANSI or Unicode)
  • XLIFF
  • PDF
  • HTML
  • XML

For term checking, documents can be compared against a terminology database. In the process, designations classified as a preferred designation, permitted designation, forbidden designation or stop word can be highlighted in colour in the document being examined.

In Velingua Organizer, term checking can be carried out for documents with the following text formats:

  • Text (ANSI or Unicode)
  • XLIFF
  • PDF
  • HTML
  • XML

Velingua – integrating modules

The Velingua modules can be integrated into existing applications as ActiveX objects.

Terminology extraction can alternatively be integrated as a command line tool.

The linguistic engine must also be installed as a service so that the linguistic functions are available.

The Acolada terminology management systems provide an interface through which the Velingua modules communicate with UniTerm. This interface is available once the terminology management system has been installed.

A REST service is available for term checking based on UniTerm Enterprise. This makes it easy to integrate term checking into your own software components.