For Repetitions to be calculated, your Workflow must include a Calculate Repetitions Automatic Step after Segment Asset. If the step is not added, repetitions are neither counted nor added to the scoping report.
If you scope an asset in Explorer (ad-hoc scoping), repetitions are calculated automatically.
As explained in this section of our RWS Documentation Center, the repetition calculation process identifies and counts repetitions across segments that cannot be fully leveraged by the TM. This means that both ICE and 100% matches are not included in the repetition counts. The motivation behind repetition counting is to reduce translation cost for the customer by preventing them from being charged full price for identical segments that must be translated.
Note: if the Segment Asset step applies Machine Translation (through the setting Leverage assigned MT: Yes), segments pretranslated by Machine Translation are also excluded from the Repetitions count.
The first occurrence of a duplicated segment is considered the original or repeated segment, and it is scoped as normal. This means that the word count for the repeated segment is attributed to the fuzzy match bucket (from 0% to 99%) that represents the best TM fuzzy match for the segment, and thus incurs a translation cost relative to the translation effort. Subsequent occurrences of the segment are referred to as repetition segments. The word count for repetition segments is placed in the scoping bucket for repetitions.
For example, if the segment "Oh what a beautiful morning", which contains five words, appeared five times in an asset or set of assets, then the first occurrence would be scoped normally, and the four additional occurrences (collectively containing 20 words) would be placed in the repetition bucket, provided that they cannot be fully leveraged by the TM.
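The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not WorldServer's implementation: the function name `scope_words`, the bucket names, the `best_tm_match` callback, and the whitespace-based word count are all assumptions made for the sketch.

```python
from collections import Counter


def scope_words(segments, best_tm_match=lambda seg: 0):
    """Return word counts per scoping bucket for a list of segment strings.

    best_tm_match is a hypothetical callback that returns the best TM match
    percentage (0-100) for a segment. It is not a WorldServer API.
    """
    buckets = Counter()
    seen = set()
    for seg in segments:
        words = len(seg.split())  # naive whitespace word count (assumption)
        if best_tm_match(seg) >= 100:
            # Fully leveraged by the TM (ICE or 100%): never a repetition.
            buckets["tm_100_or_ice"] += words
        elif seg in seen:
            # Second and later occurrences go to the repetition bucket.
            buckets["repetitions"] += words
        else:
            # First occurrence is scoped normally as a 0%-99% match.
            seen.add(seg)
            buckets["fuzzy_0_99"] += words
    return buckets


counts = scope_words(["Oh what a beautiful morning"] * 5)
print(counts["fuzzy_0_99"], counts["repetitions"])  # 5 20
```

If the hypothetical callback reports a 100% match for the segment, all 25 words land in the TM bucket and the repetition count stays at zero, which mirrors the rule that ICE and 100% matches are not counted as repetitions.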
Repetitions and markup tags
Recognizing segment repetitions is important if you want to estimate the effort needed for translating a document. In WorldServer, certain types of markup tags affect whether or not a segment is a genuine repetition of another.
WorldServer treats markup tags of the same type (standalone, opening, or closing) and appearing at the same location in the segment as interchangeable and considers these tags equivalent in the repetition analysis. It determines equivalency based on tag types and locations, but not on the content of the placeholder.
For example, the following sentences are likely to be repetitions in WorldServer:
- This is a <b>simple sentence</b> with tags.
- This is a <i>simple sentence</i> with tags.
On the other hand, the following sentences are not likely to be repetitions in WorldServer:
- This is a <b>simple sentence</b> with tags.
- This is a <b>simple</b> sentence with tags.
or
- This is a <b>simple sentence</b> with tags.
- This is a <br>simple sentence<br> with tags.
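One way to picture the tag equivalence rule is to replace every tag with a placeholder that records only its type, leaving it at its original position, and then compare the resulting strings. The Python sketch below does this under stated assumptions: `repetition_key`, the regular expression, and the small `VOID_TAGS` set used to recognize standalone tags are all hypothetical and are not taken from WorldServer.

```python
import re

# Matches an opening, closing, or self-closing tag such as <b>, </b>, <br/>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>")

# Assumption: tags treated as standalone even without a trailing slash.
VOID_TAGS = {"br", "hr", "img"}


def repetition_key(segment):
    """Replace each tag with a placeholder for its type, keeping its position."""
    def placeholder(match):
        closing, name, self_closing = match.group(1), match.group(2), match.group(3)
        if closing:
            return "{close}"
        if self_closing or name.lower() in VOID_TAGS:
            return "{standalone}"
        return "{open}"
    return TAG.sub(placeholder, segment)


bold = "This is a <b>simple sentence</b> with tags."
italic = "This is a <i>simple sentence</i> with tags."
moved = "This is a <b>simple</b> sentence with tags."
breaks = "This is a <br>simple sentence<br> with tags."

print(repetition_key(bold) == repetition_key(italic))  # True
print(repetition_key(bold) == repetition_key(moved))   # False
print(repetition_key(bold) == repetition_key(breaks))  # False
```

Two segments with the same key differ at most in the content of their tags, so the bold and italic sentences compare as equal, while a tag at a different location or of a different type produces a different key.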