OpenText Output Transformation Server: tips and tricks
This article explains how to solve some common challenges that you may face during document remediation with OpenText Output Transformation Server.
OpenText Output Transformation Server (OTS) automatically transforms and presents archived content in a highly usable, navigable and accessible format, such as PDF/UA, for people who rely on assistive technologies.
<P> tag is used in a header when Auto Detect is enabled
Follow these steps to set the tag used to auto-tag your document title:
- Select the PDF Parser component
- Click on one character of your title
- Double click on the character on the left-hand side
- Copy or note the font Name and Point Size used
- Go back to the XFT Field Technology component
- Edit the field that contains your title
- Click on the Auto Detect tab
- Click on […] next to Auto Detect Options
- Edit the Default auto detect options
- Right click on TextToTag[0]
- Click on + Add
- Set the following values:
  - Name: H1
  - TagType: H1
  - FontCompare:
    - Fontname: CIDFont\+F1-Asc
    - PtSizes: 15-16
NOTE: The original font name is “CIDFont+F1-Asc”. You must escape the + character by adding a \ before it. PtSizes accepts exact values and ranges. We are using a range because the exact value is a decimal number.
- Click the OK button
- Click OK at the Auto Detect Options popup
- Click OK at the XFT Container properties popup
Now the title uses an H1 tag in the DOM tree.
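The escaping and the size range from the note above can be checked outside the tool with a small Python sketch. It assumes that the Fontname value is interpreted like a regular expression (which would explain why + needs a backslash) and that a PtSizes range includes both ends; neither detail is confirmed here. The function names and the 15.5 point size are hypothetical and used only for illustration.

```python
import re


def escape_font_name(font_name: str) -> str:
    """Put a backslash before every '+' in the font name."""
    return font_name.replace("+", r"\+")


def parse_pt_sizes(spec: str) -> tuple[float, float]:
    """Turn '15' (exact value) or '15-16' (range) into a (low, high) pair."""
    low, _, high = spec.partition("-")
    return float(low), float(high or low)


def would_tag_as_h1(font_name: str, pt_size: float,
                    fontname_pattern: str, pt_sizes: str) -> bool:
    # Assumption: the Fontname value is matched like a regular expression.
    if re.fullmatch(fontname_pattern, font_name) is None:
        return False
    # Assumption: both ends of the range are included.
    low, high = parse_pt_sizes(pt_sizes)
    return low <= pt_size <= high


pattern = escape_font_name("CIDFont+F1-Asc")
print(pattern)  # CIDFont\+F1-Asc

# Hypothetical decimal size between 15 and 16: matched by the range.
print(would_tag_as_h1("CIDFont+F1-Asc", 15.5, pattern, "15-16"))  # True

# Without the backslash, '+' acts as a quantifier and the name no longer matches.
print(would_tag_as_h1("CIDFont+F1-Asc", 15.5, "CIDFont+F1-Asc", "15-16"))  # False
```

The last call shows why an unescaped name silently fails: the pattern then looks for one or more "t" characters instead of a literal plus sign.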
A paragraph is split into multiple <p> tags when Auto Detect is enabled
When Auto Detect is enabled, a paragraph may be split into multiple <p> tags, while you want a single <p> tag for the full paragraph.
To merge them, follow these steps:
- Edit the field that contains your paragraph
- Click on the Auto Detect tab
- Click on […] next to Auto Detect Options
- Edit the Default auto detect options
- Click on the TextJoining parameter
- Edit LineJoinUnderRatio
  - Set value to 1.5
- Click OK at the Auto Detect Options popup
- Click OK at the XFT Container properties popup
Now the paragraph is merged, using only one <p> tag for the full paragraph.
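The exact rule behind the line-join ratio is not described here. The Python sketch below shows one plausible reading, stated purely as an assumption: two consecutive lines belong to the same paragraph when the distance between their baselines is under the ratio multiplied by the font size. The Line class, the join_lines function and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    baseline_y: float  # distance from the top of the page, in points
    font_size: float   # in points


def join_lines(lines: list[Line], ratio: float = 1.5) -> list[str]:
    """Group consecutive lines into paragraphs.

    Assumed rule (not confirmed for OTS): a line continues the current
    paragraph when its baseline is less than ratio * font size below
    the previous baseline.
    """
    paragraphs: list[list[str]] = []
    previous = None
    for line in lines:
        same_paragraph = (
            previous is not None
            and line.baseline_y - previous.baseline_y < ratio * previous.font_size
        )
        if same_paragraph:
            paragraphs[-1].append(line.text)
        else:
            paragraphs.append([line.text])
        previous = line
    return [" ".join(parts) for parts in paragraphs]


# Hypothetical 10 pt text: three lines 13 pt apart, then a 34 pt jump.
sample = [
    Line("first line", 100.0, 10.0),
    Line("second line", 113.0, 10.0),
    Line("third line", 126.0, 10.0),
    Line("new paragraph", 160.0, 10.0),
]

print(join_lines(sample, ratio=1.5))  # two paragraphs
print(join_lines(sample, ratio=1.2))  # four paragraphs, every line on its own
```

Under this reading, a ratio that is too low leaves each line in its own paragraph, which matches the symptom described in this section.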
<TD> used in table header row instead of <TH>
When you autodetect the columns in a table that contains a header row, the <TD> tag is used by default.
Ideally, header cells should use <TH>.
Follow these steps to improve it:
- Select your table
- Right click on the table
- Click on Edit
- Click on the Cells tab
- Right click on RowHeaderColumns[0]
- Click on + Add
- Click on the Value cell at the RowHeaderColumns[0] > [0] row
- Set value to 1
- Click OK
If you check the DOM tree, you can see that <TH> is now used in the header row.
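If you have many tables to review, the same check can be sketched in Python. The example assumes the tag structure of a table is available as XML-like markup; the Table and TR element names and the sample content are hypothetical, and only TD and TH come from this article.

```python
import xml.etree.ElementTree as ET


def header_row_uses_th(table_markup: str) -> bool:
    """Return True when every cell in the first row is a TH element."""
    table = ET.fromstring(table_markup)
    first_row = table.find("TR")  # first direct TR child, if any
    if first_row is None:
        return False
    cells = list(first_row)
    return bool(cells) and all(cell.tag == "TH" for cell in cells)


# Hypothetical tag structure before and after the change.
before = (
    "<Table>"
    "<TR><TD>Name</TD><TD>Amount</TD></TR>"
    "<TR><TD>Rent</TD><TD>900</TD></TR>"
    "</Table>"
)
after = (
    "<Table>"
    "<TR><TH>Name</TH><TH>Amount</TH></TR>"
    "<TR><TD>Rent</TD><TD>900</TD></TR>"
    "</Table>"
)

print(header_row_uses_th(before))  # False
print(header_row_uses_th(after))   # True
```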
The language of my document is not English
Sometimes you need to remediate documents written in a language other than English.
NOTE: You can check the language of your document in Acrobat Reader at Menu > Document properties. The Advanced tab of the Document properties pop-up shows the language. The screen reader uses this language to choose the right voice.
Follow these steps to set the default language of a document in Output Transformation Server:
- Right click on your Document at the XFT Field Technology menu
- Click on Edit
- Click on the DOM properties tab
- Select the document language at the Language drop-down list
- Click OK
NOTE: If the language of your document is not included in the drop-down list, you can use the ISO language code, e.g. ca_ES for Catalan.
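Before typing a code by hand, a quick format check can catch slips such as a wrong separator or wrong letter case. The Python sketch below assumes, based only on the ca_ES example, that the expected form is a lowercase language code, an underscore, and an uppercase two-letter country code; the function names are hypothetical and the check does not confirm that the code actually exists.

```python
import re

# Assumed shape, inferred from the ca_ES example: language_COUNTRY.
LOCALE_CODE = re.compile(r"([a-z]{2,3})_([A-Z]{2})")


def is_locale_code(code: str) -> bool:
    """Check the shape of the code only, not whether it is a real language."""
    return LOCALE_CODE.fullmatch(code) is not None


def split_locale_code(code: str) -> tuple[str, str]:
    """Split 'ca_ES' into its language and country parts."""
    match = LOCALE_CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"unexpected language code format: {code!r}")
    return match.group(1), match.group(2)


print(is_locale_code("ca_ES"))     # True
print(is_locale_code("ca-es"))     # False: wrong separator and case
print(split_locale_code("ca_ES"))  # ('ca', 'ES')
```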