Out of love, perhaps? Because when faced with a PDF format, almost everyone swoons. It is safe, strong, unflappable. You can compress it, send it, twist it, tag it, slice it, combine it, and it can withstand anything, without losing its composition, without altering the layout of the page, as fresh as the moment it was produced. It is solid and reliable, always available, the perfect friend.
Well, no, translators are not trembling out of love, but out of dread… at the editing problems they are about to encounter. Years and years of dealing with this format in their day-to-day work have biased them against it, but why is it that this innocent format is so alien to the translation community?
Because the PDF was not invented to be modified! It was invented to be read.
In 1991, Dr. John Warnock launched the Camelot project to make it possible for anyone to take documents from any application and send an electronic version that could be read and printed on any machine, device or system. In 1992 the Camelot project was transformed into PDF (Portable Document Format) and since then these files have become regular fixtures on our screens.
However, with mass sharing, this noble purpose became a liability. The PDF document was a publication format, a printed electronic sheet, after all. It could be annotated, it could be written on; the receiver could leave marks, cut pages and intermix them, but the original text could not be modified. Huh? You can’t! And what do we do with a PDF translation assignment, when we have to modify the whooole text? Well, that’s what they pay us for.
Well, no, the PDF wouldn’t let us. It was a stubborn partner. Users wanted to modify documents to supplement them or to translate them. They asked for the source document (Word, InDesign, PowerPoint) in order to edit it and generate a PDF again. But as soon as one of these files started to circulate, nobody, absolutely nobody, knew who had the source file. And so here we are in the same situation today, confronted with a format that will not go away.
Over time, partial solutions have been developed. Adobe itself has a utility that converts PDF to RTF. Most computer-assisted translation tools, such as SDL Trados, Déjà Vu or memoQ, have filters that transform the PDF file into an almost fully editable file. But this is no mean feat, because PDF is a mixed format that combines vector images (not editable), bitmaps (not editable) and text (editable). The problem remains that the conversion sometimes generates hellishly formatted files, with a tangle of styles, section breaks and column breaks that can drive anyone crazy trying to maintain the original formatting. This is especially true in PDF translation, because the translated text inevitably grows longer than the source. Text boxes become misaligned, paragraphs don’t fit, text jumps to new pages, styles are distorted – what a gem!
At LexiaPark, we have adopted the Infix PDF Editor as a solution. The advantage of this software is that it allows you to work directly on the PDF format (passing first through XML). This editing is sometimes complex, but it allows us to avoid the filters that transform texts into equivalent Word files (the Adobe tool and the import filters of SDL Trados, memoQ or Déjà Vu).
And please don’t forget that if you want translators to despair when you ask them to translate a PDF, all you have to do is:
- ask for the translated file to be identical to the original PDF, right down to the smallest layout details,
- provide them with a Word text converted from PDF, without first checking what happens when the text is enlarged by a mere 5%,
- ignore their repeated requests for the original editable file (Word, InDesign, ODT, PowerPoint), and, above all,
- put pressure on them regarding the deadline without making any concessions on the final format.
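To see why even a “mere 5%” matters, here is a minimal sketch of the arithmetic in Python. The function names and the idea of a fixed character capacity per text box are assumptions made for illustration; real layouts depend on fonts, hyphenation and line breaks, so treat this as a rough estimate only.

```python
def expanded_length(source_chars: int, expansion_pct: int = 5) -> int:
    """Estimate the translated length in characters, rounded up.

    Integer arithmetic is used so the result is exact.
    """
    return (source_chars * (100 + expansion_pct) + 99) // 100


def overflows(source_chars: int, box_capacity: int, expansion_pct: int = 5) -> bool:
    """Return True if the expanded text no longer fits the (hypothetical) box capacity."""
    return expanded_length(source_chars, expansion_pct) > box_capacity


# A text box that holds 500 characters and already contains 480:
print(expanded_length(480))   # 504
print(overflows(480, 500))    # True: the translation spills out of the box
print(overflows(400, 500))    # False: 420 characters still fit
```

A box that was comfortably full in the source can overflow with very little growth, which is exactly when text starts jumping to new pages.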
Unfortunately, we will have to live with these files for a long time, but we should at least be well aware of their limitations.