Information

Title	WorldServer - how are Repetitions calculated?

URL Name	000010570

Summary	For Repetitions to be calculated, your Workflow must include a Calculate Repetitions Automatic Step after Segment Asset. If the step is not added, repetitions are neither counted nor added to the scoping report. This article explains how Repetitions work and refers to other relevant articles.

Scope/Environment	SDL WorldServer

Question

How are Repetitions calculated in WorldServer?

Answer

For Repetitions to be calculated, your Workflow must include a Calculate Repetitions Automatic Step after Segment Asset. If the step is not added, repetitions are neither counted nor added to the scoping report.

If you scope an asset in Explorer (Ad-hoc scoping), repetitions are calculated automatically.

As explained in the RWS Documentation Center, the repetition calculation process identifies and counts repetitions across segments that cannot be fully leveraged by the TM. This means that neither ICE nor 100% matches are included in the repetition counts. The motivation behind repetition counting is to reduce translation cost for the customer by preventing them from being charged full price for identical segments that must be translated.

Note: If the Segment Asset step applies Machine Translation (through the setting Leverage assigned MT: Yes), segments pretranslated by Machine Translation are also excluded from the Repetitions count.

The first occurrence of a duplicated segment is considered the original or repeated segment, and it is scoped as normal. This means that the word count for the repeated segment is attributed to the fuzzy match bucket (0% to 99%) that represents the best TM fuzzy match for the segment, and thus incurs a translation cost relative to the translation effort. Subsequent occurrences of the segment are referred to as repetition segments. The word count for repetition segments is placed in the scoping bucket for repetitions.

For example, if the five-word segment "Oh what a beautiful morning" occurred five times in an asset or set of assets, the first occurrence would be scoped normally, and the four additional occurrences (collectively containing 20 words) would be placed in the repetition bucket, provided that they cannot be fully leveraged by the TM.
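The bucketing described above can be illustrated with a minimal Python sketch. It is not WorldServer code: the function name scope_repetitions, the bucket labels, and the fully_leveraged input (standing in for segments that the TM returns as ICE or 100% matches) are all hypothetical, and word counting is simplified to whitespace splitting.

```python
from collections import Counter


def scope_repetitions(segments, fully_leveraged=frozenset()):
    """Split word counts into illustrative scoping buckets.

    segments: source segment strings in document order.
    fully_leveraged: segment texts assumed to be ICE/100% TM matches
                     (hypothetical input; a real TM lookup is out of scope).
    """
    seen = set()
    buckets = Counter()
    for text in segments:
        words = len(text.split())  # simplified word count
        if text in fully_leveraged:
            # Fully leveraged segments never count as repetitions.
            buckets["tm_match"] += words
        elif text in seen:
            # Second and later occurrences go to the repetition bucket.
            buckets["repetitions"] += words
        else:
            # First occurrence is scoped normally (fuzzy match bucket).
            seen.add(text)
            buckets["fuzzy"] += words
    return dict(buckets)


segments = ["Oh what a beautiful morning"] * 5
result = scope_repetitions(segments)
print(result)  # {'fuzzy': 5, 'repetitions': 20}
```

If the same segment were passed in fully_leveraged, all 25 words would land in the tm_match bucket and the repetition count would stay at zero, which mirrors the exclusion of ICE and 100% matches.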

Repetitions and markup tags

Recognizing segment repetitions is important if you want to estimate the effort needed for translating a document. In WorldServer, certain types of markup tags affect whether or not a segment is a genuine repetition of another.

WorldServer treats markup tags of the same type (standalone, opening, or closing) and appearing at the same location in the segment as interchangeable and considers these tags equivalent in the repetition analysis. It determines equivalency based on tag types and locations, but not on the content of the placeholder.

For example, the following sentences are likely to be repetitions in WorldServer:

This is a simple sentence<\b> with tags.
This is a simple sentence<\i> with tags.

On the other hand, the following sentences are not likely to be repetitions in WorldServer:

This is a simple sentence<\b> with tags.
This is a simple<\b> sentence with tags.

This is a simple sentence<\b> with tags.
This is a  simple sentence  with tags.
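One way to picture this equivalence rule is to replace every tag with a marker for its type before comparing segments, so that only the tag type and its position matter. The sketch below is an assumption-laden illustration, not WorldServer's actual algorithm: the regular expression, the marker strings, and the function name repetition_key are hypothetical, and the pattern accepts the backslash form used in the examples above as a closing tag.

```python
import re

# Matches simple tags such as <b>, </b>, <\b> (as written in the examples), or <br/>.
# Group 1: closing marker, group 2: tag name, group 3: self-closing slash.
TAG = re.compile(r"<([/\\]?)\s*([A-Za-z][\w-]*)[^<>]*?(/?)>")


def repetition_key(segment):
    """Replace each tag with a marker for its type, ignoring the tag name."""
    def tag_type(match):
        if match.group(1):
            return "{close}"
        if match.group(3):
            return "{standalone}"
        return "{open}"
    return TAG.sub(tag_type, segment)


bold = r"This is a simple sentence<\b> with tags."
italic = r"This is a simple sentence<\i> with tags."
moved = r"This is a simple<\b> sentence with tags."

# Same tag type at the same location: the keys match.
same_position = repetition_key(bold) == repetition_key(italic)
# Same tag type at a different location: the keys differ.
different_position = repetition_key(bold) == repetition_key(moved)
print(same_position, different_position)  # True False
```

Two segments with equal keys would be candidates for the same repetition group under this simplified model; the real analysis may apply further rules that this sketch does not capture.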

Reference

These articles might be relevant:

How does the 'Calculate Repetitions' Automatic Action work in WorldServer?
WorldServer - Is there a way to see what part of the repetitions are cross-file repetitions in the scoping report?
WorldServer - why are there tasks in 'Calculate repetitions' step in status 'Waiting for project scope' and not moving?
