Corinne McKay, CT ATA-Certified French to English Translator / Traductrice Agréée Français-Anglais corinne@translatewrite.com (+1) 303-499-9622
Open Source Update Issue 15 - August 2006

Open Source Update is an e-newsletter for language professionals who are interested in free and open source software. The newsletter is geared toward translators who are not (yet!) heavy-duty users of free and open source software, but of course anyone is welcome to subscribe. All articles may be freely reproduced or redistributed for non-commercial use with attribution to the authors. To subscribe (free): send an e-mail with "Subscribe" as the subject line to opensource@translatewrite.com
In This Issue
News From the World of Open Source

Readers, we know you've forgiven us before, and we're hoping you'll do it again... After promising in April to go back to a monthly publication schedule, we've left you to spend the long hot days of summer with no news from the world of open source (or at least none that came from us!). Like an unreliable romantic partner, we could offer excuses, promise to mend our ways, and send flowers, but we think that an actual issue of the newsletter might be a better apology. And, since we're now up to 350 subscribers (from 42 in 2005), worldwide floral delivery is a little outside our budget. Our one concrete excuse for the delay is the publication of Corinne's book How to Succeed as a Freelance Translator in June, notable in part because it was written and designed using free and open source software exclusively. In the spirit of "where are they now," we've contacted some of our regular contributors to find out what they've been up to this summer... read on!

Okapi Framework Update

Yves Savourel sent the following update on the Okapi Framework, the set of components and applications for "building interoperable tools for the different steps of the translation and localization process," which we've written about a few times before. Yves reports that:

Okapi has continued to grow slowly during the spring and summer. The last important release was Release 9, at the beginning of August. One of the important changes was the port of all the tools and libraries to .NET version 2.0. Various enhancements and bug fixes were also made to several utilities, based on user feedback, such as the Term Extraction, the RTF Conversion, and the Pseudo-Translation functions. The support for ITS (Internationalization Tag Set) was further developed for the XML Filter. You can find the information on the latest release here (http://okapi.sourceforge.net/Release/Rainbow/ReadMe.htm).
Currently, filters are available for the following types of formats: Windows RC, Java Properties, PO, .NET ResX (or compiled ResX), INX files, Illustrator files, Table/CSV-type files, Wordfast TM, Trados-Text TM, as well as any format that can be parsed with regular expressions (using the Script Filter). The XML Filter is also available, but as a beta. The list is here (http://okapi.sourceforge.net/Release/Filters/Help/index.html). The next release will introduce the beta version of an HTML Filter, as well as Album, a tool that provides some Clipboard-related functions. Album existed in the freeware suite the Okapi tools are coming from, but this is a complete rewrite and, while ultimately Okapi's Album will have all the features provided by the old Album, the current version has only functions that are not in the old version. For example, there is a method to get a text automatically translated using MT in any application by just selecting the source text and doing a Copy followed by a Paste command. The translation is obviously only MT-quality, but it could be useful in many situations (getting the gist of an email, for example, or getting a word translated).

Translate.org.za Update

Dwayne Bailey still qualifies as one of our most inspirational interviews ever; we're so impressed with him, we even let him spell localization with an "s." Dwayne heads up Translate.org.za, a project that focuses on localizing Free and Open Source Software into the 11 official languages of South Africa, and he has been busy as well (with 11 languages, what else could we expect?). Dwayne sent updates on two projects: Translate.org.za and WordForge. Open Source Update's advice to Dwayne: keep up the great work, and get some sleep!

Translate.org.za: We have just wrapped up a Government-funded project which saw us release OpenOffice.org in 11 languages, and Mozilla Firefox and Mozilla Thunderbird, also in 11 languages. Both of these are firsts in South Africa and perhaps even Africa.
We are seeing proprietary vendors rushing to catch up, which is good for us as it helps create more awareness of localisation and of course demand for localisation. I still firmly believe that if anyone wants to see a localisation industry in their language, they should translate FOSS to stimulate the growth and of course to hone their own skills. We are also putting the finishing touches on a South African keyboard, which allows, for instance, Venda to be typed on a computer for the first time. Our online translation site hosted at pootle.translate.org.za is about to go live, allowing much broader community participation in the localisation process.

WordForge Update

The WordForge Foundation has been established to assist digitally endangered languages. Through WordForge we offer assistance to minority languages that might not be able to amass the technical skills required to build a localisation. WordForge brings together a number of groups from the FOSS localisation field, including KhmerOS, translating into Khmer in Cambodia; IT46, who assisted in the Swahili localisation of OpenOffice.org; and of course Translate.org.za. A key project of the WordForge Foundation at the moment is the creation of tools to assist localisers. These include:

If any readers would like to further the aims of WordForge or have a localisation project for which they need technical assistance, then feel free to contact us and we can discuss methods of collaboration.

OmegaT Update

OmegaT, the well-known free (in both senses!) translation memory application written in Java, has some new features as well. Visiting their website, we noticed that the site itself is now fully translated into Portuguese, and OmegaT team member Jean-Christophe Helary sent this update:

Version 1.6.0 is now a release candidate nb 12a. We are working hard on the manual to get it ready for stable release. Now we have a new docking interface, a docbook filter, a much better xhtml filter and an experimental PO filter.
We have also greatly improved the OOo filter to support a number of subflows in the documents (like notes, index items, etc.; I haven't kept a definitive track, but that should be in the changes.txt).

Jean-Christophe promises to be back in touch with a feature article once the next stable release of OmegaT is out.

Open Source Self-Publishing Tools: Designing a Book with LyX
by Corinne McKay and Daniel J. Urist

Language professionals should be interested in self-publishing tools for a variety of reasons. Not least, the mainstream publishing houses in the United States aren't very receptive to publishing translations; according to the wholesale book distributor Bowker, of the roughly 195,000 new titles printed in English in 2004, only 891, or an unimpressive 0.46 percent, were works of adult literature in translation. So, one option for book and literary translators (once you've obtained the translation rights, of course) is to take the matter into your own hands and self-publish. Another excellent use of self-publishing tools is the production of books with a very targeted niche market, which are usually unappealing to mainstream publishers but can be profitable for the author. Specialized dictionaries, translation software manuals and translation business guides are some examples of possible candidates in this department. In this article, we'll focus on LyX, a very handy pre-press tool that is cost-free and available for Linux, Mac and Windows.

Introduction

Self-publishing is becoming easier and cheaper, thanks in part to improved printing technologies and desktop publishing tools. If you've ever considered writing a book, you may have looked at the layout capabilities of OpenOffice.org Writer, AbiWord, KWrite or other word processing programs.
While these tools can produce adequate results for many types of documents, it's also worth considering LyX, an open source (GPL) desktop publishing application that, with a bit of work, can create a really professional-looking book that is indistinguishable from a book produced by a mainstream publishing house.

When we began looking for an application to design Corinne's book How to Succeed as a Freelance Translator (Lulu Press, 2006), we road-tested a few pieces of software, including the word processors referenced above. The problem? Although features such as Writer's PDF export are easy to use and produce clean, easy-to-read documents, there was something subtly missing from the design; something that didn't quite look "like a book." It's not that Writer can't be made to produce a high-quality book layout, but the conventions of book design are extremely subtle and tricky to master, at least for those of us who are not professional book designers or typographers.

So, we turned to LyX. Our system is Debian Linux, but LyX is packaged and included in the repositories for most of the popular Linux distributions, so installation is a breeze. LyX is distributed under the GPL, and runs on Unix/Linux, Mac and Windows operating systems.

LyX is "the first WYSIWYM (What You See Is What You Mean) document processor." When you start up LyX, the user interface has a very similar look and feel to that of a word processor with a stripped-down set of toolbars. It includes all the basic word processing functions, such as cut and paste, search and replace, undo and spell checking. The idea behind LyX is that the user selects a document class, such as Article, Report, Book, Letter or Slides, and LyX provides the structure.
For example, a Book document allows the user to identify any given piece of text as a Chapter, Section, Subsection, Subsubsection, etc., and formats the document automatically according to the styles defined for the category that the user selects, which is referred to as the paragraph environment. In effect, LyX enforces consistency throughout the document, without requiring (but while allowing) the user to define the style to be used for each paragraph environment. This is especially useful in a book-length document, where consistency can be the difference between a book that looks like a book, and a book that looks like a bound word processing document. Here's an example from LyX's own website: "Let's say you tell LyX that a certain line is a Section title. LaTeX adds the Section to your table of contents, places the Section name into your page header, gives it a special 'bold' appearance on the page, assigns it a number or label, and tells other parts of your document what page it's on, for references and citations." LyX also enforces typographic rules by its behavior; for example, hitting the space bar over and over doesn't result in a series of blank spaces in the text, since LyX automatically inserts the correct number of spaces between different pieces of text.

LyX works its real magic when the user selects a PDF preview of the print-ready document from the View menu: after a small delay, the on-screen document is transformed into a PDF that looks, well... exactly like a book! The workhorse behind the magic of LyX is Donald Knuth's legendary typesetting system, TeX, and the LaTeX document preparation system built on top of TeX.
LaTeX provides the structured document classes and functions such as indexing, page numbering, cross-references and bibliographies; the underlying TeX typesetting engine incorporates intelligent algorithms for tasks such as hyphenation, paragraph breaking and line breaking, which are largely responsible for the "like a book" look of documents produced by LyX.

Creating a book in LyX

There are several document classes available for LyX that are suitable for typesetting a book, including the default Book class, the KOMA-Script Book class (scrbook), and the Memoir class. The default Book class is fairly minimal in terms of features; most notably, it lacks paragraph environments for standard front matter and end matter sections, such as Acknowledgments, Publishers, Dedication, etc., and has no built-in fine controls for these features. However, the LyX user can insert LaTeX code directly into a LyX document to create these sections; such inserted code is referred to as ERT (for Evil Red Text!).

KOMA-Script, a mature (in active development since 1994) set of replacement classes for the standard LaTeX classes, provides the scrbook class, which is an excellent option for LyX users, and is the one we chose for our project. KOMA-Script's original purpose was to provide alternative LaTeX classes for German documents, but it is now used by authors in a variety of languages. KOMA-Script's scrbook class offers a comprehensive set of paragraph environments. For example, the Uppertitleback and Lowertitleback environments allow the author to format the upper and lower sections of text that appear on the back side of the book's title page, normally the copyright and cataloging-in-publication data (the book's ISBN number and Library of Congress cataloging information) and any disclaimers, sales information, author contact information, etc. If there is no specific paragraph environment for a section that the author wants to include, scrbook provides useful generic sections.
For example, we wanted How to Succeed as a Freelance Translator to include a Colophon, emphasizing the fact that the book was written and designed using free and open source software exclusively. To accomplish this, we used the Addsec* paragraph environment to add a generic section (the * indicates that the section will not appear in the table of contents).

A notable feature of scrbook is that it produces books laid out using European typographic conventions: the lower margin is twice the height of the upper margin, and the inner margin on each page is half the width of the outer margin. While this produces a classic and elegant finished layout, our publisher (Lulu Press, Global Distribution option) required equal top and bottom margins and equal side margins, so we had to change scrbook's default settings. It is possible to do this within the scrbook package by using LaTeX commands, but LyX makes it very easy to bypass scrbook's layout engine and change the margins using the Document→Settings→Page Margins menu.

The KOMA-Script manual, weighing in at 227 pages and available in English and German, is not only a comprehensive guide to the KOMA-Script package, but an excellent overview of typographical conventions and principles. Its dedication "To All Friends of Typography!" speaks for itself; this manual is a must-read for the novice book designer. The sections on typography are worth reading even if you do not intend to use KOMA-Script.

The Memoir class offers another book layout option, with 307 pages of beautifully typeset manual to go with it. The Memoir User Guide, written by the Memoir class' developer Peter Wilson, contains a thorough introduction to typography (another must-read for the novice book designer), including a section on designing electronic books. The Memoir class is now production-grade, and is in active development.
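
Stepping back to scrbook for a moment, the features described above can be sketched in plain LaTeX. This is a minimal illustration, not the actual source of How to Succeed as a Freelance Translator: the title, author, chapter and body text are made-up placeholders, and it assumes that LyX's Uppertitleback, Lowertitleback and Addsec* paragraph environments correspond to the KOMA-Script commands \uppertitleback, \lowertitleback and \addsec*.

```latex
% Minimal sketch only; all text below is placeholder material.
\documentclass{scrbook}

\title{A Sample Book}          % hypothetical title
\author{A. N. Author}          % hypothetical author

% Text printed on the back of the title page (upper and lower parts).
\uppertitleback{Copyright notice, ISBN and cataloging data go here.}
\lowertitleback{Disclaimers and author contact information go here.}

\begin{document}
\maketitle
\tableofcontents

\chapter{A Sample Chapter}     % hypothetical chapter
Body text goes here.

% Starred form: an unnumbered section that stays out of the table of contents.
\addsec*{Colophon}
This book was written and designed using free and open source software.

\end{document}
```

Processed with pdflatex on a system where KOMA-Script is installed, a file like this should yield a title page with text on its back side, a table of contents, and a chapter followed by a colophon that is not listed in the contents.
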
In terms of functionality, Memoir appears to be comparable with scrbook, although it relies more on external LaTeX packages; for example, scrbook includes its own header package, whereas Memoir must use a separate header package such as fancyhdr.

In addition to its primary function of typesetting, LyX provides a number of very helpful features for the self-publisher. Conveniently, a Table of Contents is generated automatically. In researching the market for nonfiction books, we learned that having an index greatly adds to the saleability of a nonfiction book; in fact, many libraries are reluctant to purchase nonfiction books that do not have indexes. Since hiring a professional indexer was out of our budget, we turned to LyX's indexing feature, which proved itself more than equal to the task. When you position the cursor at the end of the word that you want to index and select Insert→Index Entry, a dialog box will pop up, with the word automatically inserted in the "Keyword" field. Then, you can either keep the pre-inserted word, or edit it. The edit could be a minor change, such as case, or you can change the word to an entirely different one; for example, if you want to associate the word "interview" with the index entry "job search." You can also create a nested index entry by typing an exclamation point after the parent term, then typing a sub-term directly after it; for example, "travel!Asia" would result in the parent index entry "travel" and the sub-term "Asia." You can also insert as many index tags after a word as you want. True to its typographic genius, LyX (actually LaTeX) is also smart enough to notice when you insert index tags for the same term on consecutive pages, and will create an appropriate index entry, for example "travel, 47–50."

LyX also has excellent support for tables. In fact, using tables in LyX is similar to using tables in HTML.
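
Returning briefly to index entries: in LaTeX itself, entries of this kind are ordinarily written with the \index command. The sketch below is an assumption about roughly what such a document looks like, not output copied from LyX. The chapter title and sentences are invented, and only the keywords "job search" and "travel!Asia" come from the examples above.

```latex
% Minimal sketch only; the sample sentences are placeholders.
\documentclass{scrbook}
\usepackage{makeidx}   % standard LaTeX indexing support
\makeindex             % write index entries to an .idx file

\begin{document}

\chapter{Finding Work}  % hypothetical chapter

% The word in the text is "interview", but it is indexed under "job search".
Prepare carefully for every interview\index{job search}.

% The exclamation point nests "Asia" under the parent entry "travel".
Some clients are based in Asia\index{travel!Asia}.

\printindex            % typeset the index at this point

\end{document}
```

Outside LyX, the usual sequence is to run pdflatex, then the makeindex program on the resulting .idx file, then pdflatex again. It is makeindex that merges entries for the same term on consecutive pages into a single page range.
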
To insert a table, select Insert→Table, and the efficiently designed Insert Table dialog box will pop up. This allows you to select the number of rows and columns that you would like in the table. Then, by right-clicking in the table, you can bring up the Table Settings dialog box, where you can adjust the table's size, orientation, borders, etc. To change the font of the text in your table, highlight the portion of the table containing the text to be changed, and select Edit→Text Style. We used tables for a Glossary (there is also a LaTeX package for compiling a glossary) and various worksheets such as Billable Hours and Hourly Rate charts.

LyX's features could be the basis of an article in themselves, but it's worth noting that the software includes: a math editor; support for most European languages and right-to-left languages, including documents that contain more than one language; support for footnotes and margin notes; comprehensive bibliography support; and figure support with rotation, scaling and captions. There are many more features that are relevant to other document classes, such as academic journal articles.

Customization in LyX

One of the differences between LyX and a traditional word processor is LyX's approach to formatting. Most word processing programs allow regions of text to be tagged with a style. In LyX, every piece of text belongs to a paragraph environment which has a predefined style. For example, someone using OpenOffice.org Writer can choose to use the style options offered by the software, such as Heading 1, Text body, etc. However, most word processor users do not employ styles, but instead do most of their formatting manually, by using different fonts and effects. In LyX, it is not impossible to do manual formatting, but the program is set up to discourage this behavior by limiting manual formatting options. Some basic customization can be done from the Document→Settings dialog box.
For example, the user can change the default font for the document, the page size and margins, the bullet styles, etc. For finer control, the user needs to edit the LaTeX Preamble, also found in the Document→Settings dialog box. In the Preamble, the user can add LaTeX commands to change the default behavior of the document class, including the settings for different paragraph environments, as well as including and configuring additional LaTeX packages to add functionality. While LaTeX code can be daunting, LaTeX's maturity, excellent documentation, widespread use and great support through mailing lists and newsgroups mean that a solution to most customization questions can be found with a quick web search.

It is certainly possible to use a book class such as KOMA-Script or Memoir without altering any of the paragraph environments, which is what we did. One thing we did want to change was the default header formatting. This involved loading and configuring the popular fancyhdr package in the Preamble. Fancyhdr gives very fine control of the header and footer formatting, including font and page numbering style. Basically anything you've ever seen in a book header (small caps, a line under the header text, etc.), fancyhdr enables you to do. Another extremely useful package is microtype. Microtype provides subtle expansion of fonts in order to make lines of type fit better into the allotted space. This gives TeX more options for breaking lines, resulting in a more eye-pleasing end product and less manual fine-tuning of the finished text. Microtype works only with the View→PDF(pdflatex) command option. Both of these packages were included with the default Debian LaTeX installation, and there are many more useful packages available at CTAN.

Finishing Touches

When you're done composing all of your text in LyX, it's time to do
a little cleanup on your document before sending it to press. Because
TeX is a typesetting engine, it may occasionally fail to find a good
place to break a line, resulting in a warning message and a line that
extends into the margin. When this happens, your options are to
rewrite the line so that TeX can find a good place to break it;
manually force a break by using Insert→Special
Formatting→Line Break; or allow TeX to insert extra spacing
between words by using the command sloppypar. This is
done by inserting an ERT code box with Insert→TeX code
and typing the sloppypar command.

Some publishers, including Lulu Press' Global Distribution option,
require electronic files submitted in PDF format to be created with
Adobe Distiller. This wasn't an option for us, but fortunately we
could also upload the book in PostScript format. To do this, we
created a PDF of the book using View→PDF(pdflatex), which brings up a
PDF version of the book in Adobe Reader. We then saved a copy of the
PDF from Adobe Reader and converted this file to PostScript format
using this command:

Conclusion

LyX is an excellent tool for authors at all levels of typographic knowledge and interest. For the first-time author who simply wants a great-looking book, LyX's default settings allow you to enter text as easily as with a word processor, and out pops a book that truly does not look self-published. For the typographically motivated (those who are self-publishing because they lament the decline in editorial and design standards at major publishing houses), LyX offers an almost infinite array of options to satisfy even the pickiest book designer's requirements. Above all, the LaTeX system encourages consistency, which, in a book-length document, is critical to a professional-looking end product. This is precisely where traditional word processors fall short, and is in itself a good reason to investigate LyX.

Corinne McKay is a freelance French to English translator and translation industry writer; Daniel J. Urist is a recovering Unix systems administrator. They, their daughter and one-eyed cat are based in Boulder, Colorado.