Information

Title	WorldServer - how to control the behavior of ICE Matches and set Penalties where required

URL Name	000010391

Summary	WorldServer offers configurable controls over what is considered an ICE match. These controls are available as property settings in the tm.properties file.

Scope/Environment	WorldServer

Question

How to control the behavior of ICE Matches and set TM Penalties where required?

Answer

The RWS Documentation Center addresses this topic in detail. Here is a summary:

WorldServer offers configurable controls over what is considered an ICE match. These controls are available as property settings in the tm.properties file. The following table describes each control.

TM ICE Control Properties

TM Property	Description	Default Value
require_asset_match_for_ice	Determines whether the ICE condition will require that the Entry Origin for the TM entry match the TM AIS context of the asset being translated. If this option is enabled, then no segments for new assets will be ICE matched. Note that path normalization will impact how the TM identifies assets, and therefore can impact ICE results when this option is enabled. This option does not apply to SPICE matches.	False
require_metadata_match_for_ice	Determines whether the ICE condition will require the TM entry to have the same attribute values as the corresponding mapped AIS properties. If enabled, all mapped attributes must match their AIS property counterparts. This option does not impact SPICE matches.	False
require_full_usage_context_for_ice	Determines whether a full usage context is required for all ICE matches. For a full usage context, the TM match must have been translated between segments having the same content as those around the asset segment being translated. When this option is disabled, only partial context is required (that is, only the segment before or the segment after has to match). When it is enabled, both the segment before and the segment after must match. This option does not impact SPICE matches.	False
require_reviewed_status_for_ice	Determines whether TM entries must have the Reviewed translation status to qualify as ICE matches. By default, unreviewed translation memory entries do not satisfy the ICE criteria. In non-live mode, all translation memory entries are reviewed, so ICE matches are unaffected by this configuration. This option affects both SPICE and ICE matches.	True
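
The way these four properties combine can be sketched as a small Python model. This is an illustration under stated assumptions, not WorldServer code: the CandidateMatch class, its field names, and the qualifies_for_ice function are hypothetical, and the sketch assumes the candidate is already an exact match, with partial context as the baseline ICE condition. Only the property names and default values come from the table above.

```python
from dataclasses import dataclass

# Property names and defaults are taken from the table above.
# Holding them in a dictionary is only a convenience for this sketch.
DEFAULT_ICE_SETTINGS = {
    "require_asset_match_for_ice": False,
    "require_metadata_match_for_ice": False,
    "require_full_usage_context_for_ice": False,
    "require_reviewed_status_for_ice": True,
}


@dataclass
class CandidateMatch:
    """Hypothetical summary of one exact TM candidate (not a WorldServer type)."""
    asset_matches: bool             # Entry Origin matches the asset's TM AIS context
    metadata_matches: bool          # all mapped attributes match the AIS properties
    previous_segment_matches: bool  # segment before has the same content
    next_segment_matches: bool      # segment after has the same content
    reviewed: bool                  # entry has Reviewed translation status


def qualifies_for_ice(match, settings):
    """Return True if the candidate passes every enabled ICE requirement."""
    before = match.previous_segment_matches
    after = match.next_segment_matches
    if settings["require_full_usage_context_for_ice"]:
        context_ok = before and after   # full usage context
    else:
        context_ok = before or after    # partial context is enough
    if not context_ok:
        return False
    if settings["require_asset_match_for_ice"] and not match.asset_matches:
        return False
    if settings["require_metadata_match_for_ice"] and not match.metadata_matches:
        return False
    if settings["require_reviewed_status_for_ice"] and not match.reviewed:
        return False
    return True
```

With the default settings, a reviewed candidate whose preceding segment matches would qualify in this model even if its asset and metadata differ. Enabling require_full_usage_context_for_ice would reject the same candidate unless the following segment matches as well.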

WorldServer also provides configurable TM penalties that are assignable at the system level, and thus apply to all TMs within the system. There are two levels of penalties: content level penalties and leverage level penalties.

Content level penalties – Content level penalties are designed to negatively affect scoring due to certain undesirable differences between the content and the TM entry being leveraged. These penalties are applied against specific types of segment elements—words, numbers and placeholders—within the match text. Alternatively, they can be thought of as element level penalties.

For example, the capitalization penalty, tm_score_capitalization_penalty, is applied to each word in the segment that differs in capitalization from the lookup text. This means that the weight of the word within the match is reduced by the penalty. The more elements a segment contains, the less impact a penalty on a single element has on the final match score. In effect, the impact of content-based penalties is weighted across all of the elements within the segment. If the assigned value for a content-based penalty is 0.01 (or 1%), then the final score is reduced by the full 1% only if the penalty is applied to every element within that segment.
Users are most often able to identify scenarios that result in content-based penalties. This is because the user can generally see the differences that led to the penalties being applied simply by comparing the text differences between the source and match text.
Because content level penalties are based on content differences, the impact of these penalties can be prevented or reduced through content based repair algorithms. It is possible that an initially penalized match can be scored up as high as a 100% match due to the repairs performed.
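
The weighting described above can be illustrated with a short calculation. This is a simplified sketch, not the actual WorldServer scoring algorithm: it assumes every element carries equal weight and that the match would otherwise score 100%. The function name and parameters are hypothetical.

```python
def content_penalized_score(total_elements, penalized_elements, penalty):
    """Score (0.0 to 1.0) after spreading a content level penalty across a segment.

    Assumption for this sketch: all elements (words, numbers, placeholders)
    are weighted equally, and the unpenalized score is 1.0 (100%).
    """
    if total_elements <= 0:
        raise ValueError("a segment needs at least one element")
    if not 0 <= penalized_elements <= total_elements:
        raise ValueError("penalized_elements must be between 0 and total_elements")
    # Each penalized element loses `penalty` of its own share of the score.
    return 1.0 - penalty * (penalized_elements / total_elements)


# One word out of ten differs in capitalization, penalty 0.01:
one_of_ten = content_penalized_score(10, 1, 0.01)
# Every word differs, so the full 1% is deducted:
all_of_ten = content_penalized_score(10, 10, 0.01)
```

Under these assumptions, a single penalized word in a ten-word segment lowers the score by about 0.1%, and the full 1% deduction is reached only when all ten words are penalized.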

Leverage level penalties – Leverage level penalties are penalties that are applied to the final match score based on criteria that extend beyond the segment content being translated. Leverage level penalties tend to focus on context and other data characteristics that have implications toward the overall quality or usefulness of the match.
For instance, the maximum target length penalty, maximum_target_length_penalty, penalizes matches if the translation text of the match is too long to be useful in the target segment. The match may have normally produced an ICE match were there no length constraints. However, the penalty effectively negates further consideration of the match as an ICE match.
Unlike content level penalties, leverage level penalties are deducted from the final match score. That is, if a leverage level penalty is applied having a value of 0.01, the final TM score will be reduced by 1%.

Leverage level penalties are also final. Repair technology cannot negate their impact, because repairs focus only on content differences.
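
By contrast with content level penalties, the deduction for a leverage level penalty can be sketched as a plain subtraction from the final score. The function below is hypothetical. It also assumes that several leverage level penalties simply add up and that the score cannot drop below 0, neither of which is stated in the documentation summarized here.

```python
def apply_leverage_penalties(final_score, penalties):
    """Deduct leverage level penalties from a final match score (0.0 to 1.0).

    Assumptions for this sketch: multiple penalties are summed, and the
    result is clamped so it never goes below 0.0.
    """
    total_penalty = sum(penalties)
    return max(0.0, final_score - total_penalty)


# A 100% match with one leverage level penalty of 0.01 ends up at about 99%.
penalized = apply_leverage_penalties(1.0, [0.01])
```

Because the deduction happens after content scoring, no content repair step appears in this sketch: nothing done to the segment text changes the amount subtracted.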

The following table describes the leverage level penalties:

TM Property	Description	Default Value
tm_score_asset_mismatch_penalty	Defines the penalty to be applied when the Entry Origin for the TM entry does not match the TM AIS context of the asset being translated. This is a leverage level penalty. It is applied to the final score produced by the match, and is only applied during leverage based processes. Value should be between 0 and 1. It does not apply to SPICE matches.	
tm_score_metadata_mismatch_penalty	Defines the penalty to be applied when the mapped metadata of the TM match does not match the mapped metadata of the asset being translated. This is a leverage level penalty. It is applied to the final score produced by the match, and is only applied during leverage based processes. Value should be between 0 and 1. It does not apply to SPICE matches. Note: Either all of the pieces of mapped metadata match and no penalty is applied, or one or more pieces of metadata do not match and the full penalty is applied once. Mapped metadata refers to TM attributes that are mapped to AIS properties.	
tm_score_multiple_exact_match_penalty	Defines the penalty to be applied when there are multiple distinct 100% match translations for the source text contained in the TM match. This is a leverage level penalty. It is applied to the final score produced by the match, and is only applied during leverage based processes. Value should be between 0 and 1. It does not apply to SPICE or ICE matches. Note: All of the 100% matches get this penalty. For example, if you have three 100% matches and this penalty is set to 0.02, then all three are reduced to 98%. Repaired 100% matches are not considered.	
tm_score_whitespace_difference_penalty	Defines the penalty to be applied when there are whitespace differences between the source text of the segment and the source text of the TM match. This penalty is only applied when whitespace is to be considered significant. This is a content based penalty, not a leverage level penalty. A value greater than 0 means whitespace is significant.	
tm_score_unreviewed_match_penalty	Defines the penalty for TM matches with an unreviewed translation status. In non-live mode, all translation memory entries are reviewed, so ICE matches are unaffected by this configuration. This is a leverage level penalty. It is applied to the final score produced by the match, and is only applied during leverage based processes. Value should be between 0 and 1. This penalty applies to all TM matches.	
tm_score_reverse_leverage_penalty	Defines the penalty for reverse TM matches. This penalty applies to all leverage matches that result from reverse TM lookups. A TM must be configured to support reverse leveraging. Value should be between 0 and 1.	
maximum_target_length_penalty	Penalizes matches if the translation text of the match is too long to be useful in the target segment. The match may have normally produced an ICE match were there no length constraints. However, the penalty effectively negates further consideration of the match as an ICE match.	
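
The two notes in the table (the all-or-nothing metadata penalty and the multiple exact match penalty) can be illustrated as follows. Both functions are hypothetical sketches, not WorldServer APIs. The tuple layout for matches is an assumption made for this example, and "Repaired 100% matches are not considered" is interpreted here as meaning that repaired matches are neither counted nor penalized.

```python
def metadata_mismatch_penalty(tm_attributes, mapped_ais_properties, penalty):
    """Return the full penalty once if any mapped attribute differs, else 0.0.

    Both arguments are dictionaries keyed by the mapped attribute name
    (an assumed representation for this sketch).
    """
    mismatch = any(
        tm_attributes.get(name) != value
        for name, value in mapped_ais_properties.items()
    )
    return penalty if mismatch else 0.0


def penalize_multiple_exact_matches(matches, penalty):
    """Apply the penalty to every unrepaired 100% match when their
    translations are not all the same.

    `matches` is a list of (score, translation, repaired) tuples, with
    scores on a 0.0 to 1.0 scale.
    """
    def is_candidate(score, repaired):
        return score == 1.0 and not repaired

    distinct = {t for s, t, r in matches if is_candidate(s, r)}
    if len(distinct) < 2:
        return list(matches)  # zero or one distinct translation: no penalty
    return [
        (s - penalty, t, r) if is_candidate(s, r) else (s, t, r)
        for s, t, r in matches
    ]


# Three distinct 100% matches with a penalty of 0.02 all drop to about 98%.
results = penalize_multiple_exact_matches(
    [(1.0, "Abbrechen", False), (1.0, "Stornieren", False), (1.0, "Beenden", False)],
    0.02,
)
```

In the metadata function, two mismatching attributes produce the same deduction as one, which mirrors the note that the full penalty is applied once.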

There is also a TM group TM usage penalty, supported and configured at the TM group level. This penalty is configured in the TM Group definition page of the user interface. There is no TM properties file entry for it.

Reference

These articles might be relevant:

WorldServer: How to find the context of a match in order to determine why the match is not an ICE match
Where can I find detailed information about how Translation Memory works in WorldServer?
WorldServer - after segmentation, a segment has the status 'Pending Review' although the corresponding TM entry has the status 'Reviewed'
WorldServer - Is it possible to set specific TM penalties for terms that may have multiple meanings and result in multiple exact match?
