This article explains some of the options and settings for Concordance Search that are available in Trados Studio.
Users often ask why the Concordance Search does not find an expected match in the translation memory. Put another way: "Why is the concordance search not searching through the complete translation memory?"
The most important point about concordance searching is that the Concordance Search is NOT a full-text search through your complete translation memory. Rather, it is a fuzzy search designed to return meaningful results from large data sets as quickly as possible. If the Concordance Search were a traditional full-text search, performance would suffer, because the entire translation memory would have to be searched sequentially and completely before the list of matches could be displayed.
To avoid this performance decrease, the Concordance Search is based on a search algorithm that works in the same way as a fuzzy search and is split into three phases as follows:
- Candidate search
- Scoring of found candidates
- Sorting and display
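The three phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Trados Studio's actual implementation: the function names, the word-overlap candidate test, and the use of `difflib.SequenceMatcher` as the scoring measure are all hypothetical stand-ins.

```python
from difflib import SequenceMatcher


def best_window_score(query, segment):
    """Best similarity between the query and any word window of the same length.

    Assumption: SequenceMatcher stands in for the real (unknown) scoring measure.
    """
    q_words = query.lower().split()
    s_words = segment.lower().split()
    if not q_words or not s_words:
        return 0.0
    n = len(q_words)
    windows = [" ".join(s_words[i:i + n])
               for i in range(max(1, len(s_words) - n + 1))]
    q = " ".join(q_words)
    return max(SequenceMatcher(None, q, w).ratio() for w in windows)


def concordance_search(query, segments, max_hits=30, min_match=0.70):
    """Hypothetical three-phase search: candidates, scoring, sorting."""
    q_words = set(query.lower().split())

    # Phase 1: candidate search. A cheap test that stops as soon as
    # max_hits candidates have been collected, so the rest is never read.
    candidates = []
    for seg in segments:
        if q_words & set(seg.lower().split()):
            candidates.append(seg)
            if len(candidates) >= max_hits:
                break

    # Phase 2: score only the candidates that phase 1 returned.
    scored = [(best_window_score(query, seg), seg) for seg in candidates]

    # Phase 3: drop hits below the minimum match value, sort best first.
    hits = [(score, seg) for score, seg in scored if score >= min_match]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits
```

Because phase 1 stops early, a segment that lies beyond the first `max_hits` candidates is never scored, however well it would have matched.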
Although you can use Concordance Search for words and characters, the algorithm is optimized for searching phrases. This means that phrase searching will typically return the best results.
If you want to run a full-text search for specific words or phrases through the entire translation memory, we recommend that you use the translation memory view in Trados Studio, which supports full-text searches.
The Minimum match value is the degree of match that must exist between the search text you have selected and the matching text fragment in a translation memory segment for the translation to be offered as a match. The default is 70%, but you can raise or lower this limit.
If you increase the value of the Maximum number of hits option for the Concordance Search, more matching candidates are included in the scoring set. The candidate search also assigns scores, but these are independent of the match scores. This means that a “good” candidate score can lead to a “not as good” match score.
For example, suppose you search for a word with Maximum number of hits set to 10 and receive 10 candidates that all have the same (good) candidate score. The lowest-ranked hit in the results list might receive a match score of 90%. If you repeat the search with Maximum number of hits set to 11 and candidate #11 receives a match score of 95%, that hit is listed higher in the results list than hit #10 with its score of 90%.
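The effect described in this example can be reproduced with a small sketch. The match scores below are invented purely for illustration, and `rank_hits` is a hypothetical helper, not part of any Trados Studio API.

```python
def rank_hits(candidates, max_hits):
    """Keep the first max_hits candidates, then sort them by match score."""
    considered = candidates[:max_hits]  # candidates beyond the cap are never scored
    return sorted(considered, key=lambda c: c[1], reverse=True)


# (candidate number, match score) in the order the candidate search returns them.
# Hypothetical values: candidate #10 scores 90, candidate #11 scores 95.
match_scores = [100, 99, 98, 97, 96, 94, 93, 92, 91, 90, 95]
candidates = [(number, score) for number, score in enumerate(match_scores, start=1)]

top10 = rank_hits(candidates, 10)  # candidate #11 is not considered at all
top11 = rank_hits(candidates, 11)  # candidate #11 now ranks above #10
```

With a cap of 10, candidate #11 never reaches the scoring phase. With a cap of 11, it is scored and overtakes several of the earlier candidates.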
In more detail, the search itself runs in two stages:
- Search the fuzzy index: the fuzzy index contains tokens that represent one or more words. For example, the fuzzy index may store the token verdipapir for the following words: verdipapirlån, verdipapirfond, verdipapirene and verdipapirtap. With a Maximum number of hits value set to 30, stage 1 of the concordance search will return the first thirty segments it finds that contain the token verdipapir.
- Evaluate the full content of the segments found in stage 1. If the real word from the search is found, you will receive a 100% match; if the real word is not included, you will receive no matches or fuzzy matches. When the Maximum number of hits value is increased, the number of segments found in stage 1 containing the token verdipapir will increase, and this in turn will increase the probability that stage 2 will find one or more of the segments that contain the word you are looking for.
In a situation where you have a translation memory containing 1000 segments that include the word verdipapirfond, you would rarely find the word verdipapirlån with a concordance search unless you include more words in your search.
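The verdipapir scenario can be simulated with a short sketch. How Trados Studio actually derives index tokens is not described here, so the `to_token` function below simply truncates each word to its first ten characters, an assumption chosen only because it reproduces the example. Stage 2 is also simplified to exact word matches and ignores fuzzy matches.

```python
def to_token(word, prefix_len=10):
    """Hypothetical tokenizer: reduce a word to its first prefix_len characters."""
    return word.lower()[:prefix_len]


def search_word(word, segments, max_hits):
    token = to_token(word)

    # Stage 1: collect the first max_hits segments that contain the token.
    stage1 = []
    for seg in segments:
        if any(to_token(w) == token for w in seg.split()):
            stage1.append(seg)
            if len(stage1) >= max_hits:
                break

    # Stage 2: keep only segments that contain the real word.
    return [seg for seg in stage1 if word.lower() in seg.lower().split()]


# 1000 segments with verdipapirfond, followed by one with verdipapirlån.
tm = [f"Segment {i} om verdipapirfond" for i in range(1000)]
tm.append("Avtale om verdipapirlån")

few = search_word("verdipapirlån", tm, max_hits=30)     # stage 1 fills up with verdipapirfond
many = search_word("verdipapirlån", tm, max_hits=2000)  # stage 1 reaches the last segment
```

In this sketch the small cap returns nothing, because all thirty stage 1 slots are taken by verdipapirfond segments that share the same token, while the large cap lets the one relevant segment through to stage 2.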
Trados Studio 2009 (re-)introduced the character-based concordance search feature in addition to the existing word-based concordance search. To use the character-based concordance search, you need to select the Enable character-based concordance search option when you create a translation memory. The option is only available at creation time and cannot be changed afterwards. There are two main aspects to be aware of when enabling this feature:
- If you have a large translation memory, enabling this option can increase the size of your translation memory and also increase the response time for concordance searching. For this reason, you may want to leave this option disabled.
- If you have a small to medium-sized translation memory, enabling this option can help you find truncated or misspelled words in your translation memory without a large increase in the response time for concordance searching. The option is therefore recommended only for small to medium-sized translation memories.
Enabling character-based concordance search for a translation memory allows Trados Studio to index groupings of characters within a word. This results in more partial (fuzzy) matches and bigger indexes. For example, if you enable Character-based Concordance Search and you search a translation memory for the word Resource, your search results may include Resource, Resources and Sources. The search results include all three words because they each contain groupings of characters that are the same. Word-based concordance search would not find Sources.
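One way to picture "groupings of characters" is as overlapping character n-grams. The sketch below uses three-character groups and a 50% overlap threshold; both values are assumptions made for illustration, and the function names are hypothetical rather than taken from Trados Studio.

```python
def char_ngrams(word, n=3):
    """Set of overlapping n-character groups in a word, lowercased."""
    w = word.lower()
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def overlap(query, word, n=3):
    """Fraction of the query's character groups that also occur in the word."""
    q = char_ngrams(query, n)
    if not q:
        return 0.0
    return len(q & char_ngrams(word, n)) / len(q)


def character_based_matches(query, words, threshold=0.5):
    """Words that share at least `threshold` of the query's character groups."""
    return [w for w in words if overlap(query, w) >= threshold]


words = ["Resource", "Resources", "Sources", "Translation"]
matches = character_based_matches("Resource", words)
```

Here Sources qualifies because it shares the groups sou, our, urc and rce with Resource, even though the two are different words. A comparison of whole words would treat them as unrelated.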