SDL WorldServer: Embedded content File Type configuration for XLSX files in WorldServer
000004015|5/3/2017 9:10 PM
SDL WorldServer (all versions)
SDL Trados Studio 2017 / 2019
How do I process an Excel 2007-2016 (XLSX) file that contains embedded content, such as HTML, and convert the HTML control elements to tags?
For instance, my XLSX file contains these strings:
<li><b>TextText: </b>TextTexTextTexTextTexTextText.</li> <p><b>TextText:</b> TextTexTextTexTextTexTextText.</b> and <b>Textagain</b> TextTexTextTexTextTexTextText.</p>
Starting with WorldServer 11.1.1 and SDL Trados Studio 2017 or 2019, a new Excel filter is available: the Microsoft Excel 2007-2016 Studio File Type.
If you work in these versions of WorldServer, defect CRQ-6594 (which will be fixed in WorldServer 11.4) prevents you from configuring the embedded content section directly in WorldServer. Instead, configure it in SDL Trados Studio 2017 or 2019, export the *.sdlftsettings file, and then import it into WorldServer by following the steps described in this article:
To configure your Microsoft Excel 2007-2016 File Type in SDL Trados Studio, make sure to enable the processing of embedded content by going to the Embedded content section of the File Type and selecting Enable embedded content processing as displayed below:
Once this option is selected, a pre-set Tag definition rule is applied. This will address most of the HTML syntax.
If you work in an earlier version of WorldServer or SDL Trados Studio, you will use the Microsoft Excel 2007-2013 Studio File Type. To process your content with this filter, you also need to enable embedded content processing, in the same way as described above. Note: for this file type, you can do this directly in WorldServer:
In the Default Filter Configuration of the Microsoft Excel 2007-2013 Studio File Type, there is no pre-set Tag Definition Rule, so you need to add one manually.
1- You can use the same Regular Expression as in the latest Microsoft Excel 2007-2016 Studio File Type. The Tag Type is Placeholder and the Regular Expression is:
Note that you should first add "sdl" as Document structure information.
Your Embedded Content configuration should look like this:
2- Alternatively, you can use these Regular Expressions as a Tag Pair:
Start Tag: <[a-z][a-z0-9]*[^<>]*>
End Tag: <\/[a-z][a-z0-9]*[^<>]*>
3- Another possible configuration of the embedded content could be of Tag Type Placeholder with Start Tag:
This is how it should look:
4- Once you have adapted your Filter/File Type and your Filter Configuration, save your changes, re-segment your XLSX source file with these settings in WorldServer or in SDL Trados Studio 2017 or 2019, and review the results.
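Before re-segmenting, it can help to preview roughly what the Tag Pair expressions from option 2 would pick up in a sample string. The following Python sketch is only an approximation: it uses Python's `re` module rather than the regular expression engine in Studio or WorldServer, it assumes the End Tag expression begins with an escaped slash (`<\/`), and the helper name `split_tags` is hypothetical.

```python
import re

# Tag Pair expressions from option 2.
# Assumption: the End Tag expression starts with "<\/" (an escaped slash).
START_TAG = re.compile(r"<[a-z][a-z0-9]*[^<>]*>")
END_TAG = re.compile(r"<\/[a-z][a-z0-9]*[^<>]*>")

# One combined pattern so the string can be scanned in a single pass.
TOKEN = re.compile(f"(?P<end>{END_TAG.pattern})|(?P<start>{START_TAG.pattern})")


def split_tags(text):
    """Split text into (kind, value) pieces; kind is 'start', 'end' or 'text'."""
    pieces = []
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:
            # Plain text between two tags stays translatable.
            pieces.append(("text", text[pos:match.start()]))
        pieces.append((match.lastgroup, match.group()))
        pos = match.end()
    if pos < len(text):
        pieces.append(("text", text[pos:]))
    return pieces


sample = "<li><b>TextText: </b>TextTexTextTexTextTexTextText.</li>"
for kind, value in split_tags(sample):
    print(f"{kind:5} {value}")
```

Pieces reported as `start` or `end` correspond to what the filter would turn into tags; pieces reported as `text` would remain as translatable content. The actual segmentation in WorldServer or Studio may differ, so treat this only as a quick sanity check of the expressions.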
Note: you might need to adapt your embedded content filter settings for perfect results. The suggested configurations might not address all of your embedded content.
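One way to spot content that the suggested expressions do not cover is to strip everything they match and look for leftover markup. The sketch below is a minimal illustration in Python, not a description of how the filter works internally. The helper name `uncovered_markup` and the entity pattern are assumptions for illustration, and Python matches case-sensitively by default, which may or may not mirror the behaviour of Studio or WorldServer.

```python
import re

# Start or end tag, equivalent to the two Tag Pair expressions from option 2
# (assuming the End Tag expression begins with an escaped slash).
TAG = re.compile(r"<\/?[a-z][a-z0-9]*[^<>]*>")

# Leftover angle brackets or character entities such as &amp; or &#160;
# (hypothetical check, not part of the suggested configuration).
SUSPECT = re.compile(r"[<>]|&[A-Za-z#][A-Za-z0-9]*;")


def uncovered_markup(text):
    """Return markup-like fragments that remain after removing matched tags."""
    remainder = TAG.sub("", text)
    return SUSPECT.findall(remainder)


print(uncovered_markup("<b>Text</b>"))
print(uncovered_markup("<B>Text</B> &amp; more"))
```

An empty list suggests the string is fully covered. Any reported fragments, such as upper-case tags or entities in this sketch, indicate places where an additional or adapted rule might be needed.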