We've been involved with a translation company (no names) for about a week now, and, frankly, this has been a shocking experience. I decided to share some lessons I learned from this saga to save y'all a headache should you embark on a localization campaign.
If you develop an ASP.NET web application and you want to build in support for multiple foreign languages, you should plan accordingly right from the start! Doing localization as an afterthought will prove difficult, if not impossible! Localization is a commitment: it takes discipline to store all text in resource files, which are, in essence, XML files. In ASP.NET, resource files have the .resx extension, and you store each chunk of text as a name-value pair.
For example, suppose you have a login screen which encourages a visitor to register. You might want to add the following entry to your resource file:
<data name="MyWebApp.RegisterPrompt">
<value>Not registered yet? Become a member today!</value>
</data>
A resource file is compiled into a satellite assembly, and with a minimum of effort you can have your web app speak a foreign language! The whole discussion of localization in ASP.NET is well beyond the scope of this post. Please see Walkthrough: Localizing Web Forms Pages on MSDN.
The translation services company in question seems to be well-established, and it was recommended to us by a large client of ours. We sent them our resource files for translation and what we received back was shocking.
Lesson #1: Know Thy Unicode
Make sure translators know what Unicode is. This is an obvious tip, but... Properly encoded files should have a proper Unicode Byte Order Mark (BOM) signature. For the sake of this discussion, let's assume we go with utf-8, which should be sufficient for 99% of your needs. Any contemporary text processor should be able to prepend the three-byte signature (EF BB BF) at the very beginning of a text file.
This is only half of the story, though. Our resource files were being translated into Spanish, and those folks saved translations with an ISO encoding which, quite naturally, maimed all the accented characters. When I opened them up in XMLSpy I saw Japanese kanji characters instead, while Visual Studio.NET couldn't even make heads or tails of the content.
ASP.NET is all about Unicode under the hood which is great. Visual Studio.NET works great with Unicode too, as long as it understands that a file is utf-8 encoded. I've already published some research on this subject in Unicode in Visual Studio.NET 2003.
Rationale: make sure they use software that can save files in utf-8 encoding! Ask them if they understand how to do it. Properly encoded files should be a deliverable.
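If you'd rather not take their word for it, a check like this can run on every file that comes back. It's a minimal sketch in Python, not something we actually used; the function name check_utf8_with_bom is made up for illustration, and it assumes you want both the BOM and cleanly decodable utf-8.

```python
import codecs


def check_utf8_with_bom(data):
    """Return a list of encoding problems found in the raw bytes of a file."""
    problems = []
    # The utf-8 signature is the three bytes EF BB BF at the very start.
    if not data.startswith(codecs.BOM_UTF8):
        problems.append("missing utf-8 byte order mark (EF BB BF)")
    # A file saved in an ISO/ANSI code page usually fails strict decoding
    # as soon as it hits an accented character.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append("not valid utf-8 at byte offset %d" % exc.start)
    return problems
```

You would feed it the raw bytes of each deliverable, e.g. open(path, "rb").read(), and bounce the file back to the translators if the list is not empty.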
Lesson #2: Apples Go In, Apple Juice Comes Out
Next, make sure they don't change the format of your files. You give them a clean, nicely indented XML file, and should expect a clean, nicely indented translated XML file back.
In our case, a large resource file was "flattened" into one line. What enraged me is that one file had HTML comments (!) injected by their software right in the middle of our text.
I've worked for a translation company before, and messing up the client's format was out of the question. Why they took the liberty of reformatting our files I don't know. If you take your car to a shop to have the tires rotated, you don't expect them to swap your dashboard gauges for racing-style ones because yours look dull.
We happened to have a couple of encoded HTML tags, e.g. <b> and </b> stored as &lt;b&gt; and &lt;/b&gt; respectively. The translators dared to decode them back to <b> and </b>, which started breaking the visual representation and, in the case of an unclosed <br> tag, broke validation of the entire XML file!
Rationale: demand the same formatting of translated XML files. Don't take "no" for an answer. If you have encoded HTML, don't let them decode it back. They should leave it alone. Convey it to them.
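Decoded markup is easy to catch mechanically: if the entities were left alone, a value contains only text; once somebody decodes them, the value suddenly grows child elements. Here is a rough Python sketch of that idea. The function name values_with_decoded_markup is hypothetical, and it assumes the usual .resx layout of data elements with a name attribute and a value child.

```python
import xml.etree.ElementTree as ET


def values_with_decoded_markup(resx_text):
    """Return names of <data> entries whose <value> contains real child tags."""
    root = ET.fromstring(resx_text)
    offenders = []
    for data in root.iter("data"):
        value = data.find("value")
        # Encoded tags (&lt;b&gt;) parse as plain text, so a clean value
        # has no child elements; decoded tags show up as children.
        if value is not None and len(value) > 0:
            offenders.append(data.get("name"))
    return offenders
```

Note that an unclosed tag such as <br> makes the file unparseable altogether, in which case ET.fromstring raises ParseError before this check gets a chance to run.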
Lesson #3: Know Thy XML
Make sure the translation company you approach has had experience with developers before. They must be familiar with XML. They need to have a firm understanding of what validation and well-formed mean. Ask if they use a tool capable of reading, parsing and validating XML. At the very least, tell them to load XML files in a web browser and make sure they display correctly.
Rationale: You give them perfectly valid XML, expect the same back.
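The browser test can be automated on your side, too. A minimal sketch, assuming Python's standard library parser is good enough for the job; well_formedness_error is a name I made up. It checks well-formedness only and does not validate against the .resx schema.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_bytes):
    """Return None if the document parses, else a short error description."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple.
        line, column = exc.position
        return "line %d, column %d: %s" % (line, column, exc)
    return None
```

Anything other than None means the file goes straight back to the translators.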
Lesson #4: Check Data Integrity
This is a stupid one. Once I pulled a translated resource file into VS.NET and had it create a satellite assembly, I noticed that our product was 50% English/50% Spanish. I fished through the main resx and discovered that 200 lines of text were missing from the translation! Didn't see this one coming. They simply lost them. Lost in translation, so to say. :) After a follow-up email, the lost content was found on their end. Sheesh, do I really have to enforce it?
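Apparently yes, so here is one way to enforce it: compare the entry names in the original and the translated file before building anything. This is only a sketch with hypothetical helper names (resource_names, missing_and_extra), and it assumes every string lives in a data element with a unique name attribute.

```python
import xml.etree.ElementTree as ET


def resource_names(resx_text):
    """Collect the name attribute of every <data> entry in a .resx document."""
    root = ET.fromstring(resx_text)
    return {data.get("name") for data in root.iter("data")}


def missing_and_extra(source_text, translated_text):
    """Return (names missing from the translation, names that were added)."""
    source = resource_names(source_text)
    translated = resource_names(translated_text)
    return sorted(source - translated), sorted(translated - source)
```

An empty pair of lists means nothing got lost along the way; anything in the first list is content the translators still owe you.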
Lesson #5: Know Thy Developers
This goes with #3. Make sure the translation services company has worked with developers. It's a whole different crowd. Translating business letters ain't the same as translating XML. It would be ideal if they had a techno geek on staff.
Lesson #6: Know Thy Real Estate
Translators need to understand they translate for the web where the length of navigation elements is very limited. "Newspaper headline" style works best.
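A crude way to spot translations that will blow up your navigation is to compare lengths. The sketch below assumes you have already loaded both files into dictionaries of name to text; the function name overlong_translations and the 1.5 ratio are arbitrary choices of mine, so tune them to your layout.

```python
def overlong_translations(source, translated, max_ratio=1.5):
    """Flag entries whose translation is much longer than the original text."""
    flagged = []
    for name, original in source.items():
        text = translated.get(name)
        # Skip entries that are missing or empty; other checks cover those.
        if text is None or not original:
            continue
        if len(text) > max_ratio * len(original):
            flagged.append((name, len(original), len(text)))
    return flagged
```

The flagged entries are only candidates for a second look, since some languages simply run longer than English.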
Lesson #7: Maintenance
Make sure translators maintain a history of previous translations. When you update your resx files and email them, translators should pick up and translate what's new and re-translate what has changed. Obvious, but all of the previous tips seem to be obvious at first, too.
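You can also do the diffing yourself and send only the delta. Here is a small sketch that compares two versions of the same source resx; the names load_entries and entries_to_translate are hypothetical, and it assumes entries are matched purely by their name attribute.

```python
import xml.etree.ElementTree as ET


def load_entries(resx_text):
    """Map each <data> name to the text of its <value> element."""
    root = ET.fromstring(resx_text)
    return {d.get("name"): (d.findtext("value") or "") for d in root.iter("data")}


def entries_to_translate(old_text, new_text):
    """Compare two versions of a source .resx and list what needs attention."""
    old = load_entries(old_text)
    new = load_entries(new_text)
    return {
        "added": sorted(n for n in new if n not in old),
        "changed": sorted(n for n in new if n in old and new[n] != old[n]),
        "removed": sorted(n for n in old if n not in new),
    }
```

The "added" and "changed" lists are what the translators need to work on; "removed" tells you which old translations can be dropped.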
Conclusion
Localization doesn't come easy, but it's painless if you stick to the game plan and exercise discipline. .NET takes care of the rest and makes it a breeze. Yes, really. I hope this post helps you cross that bridge and saves you some hair pulling.
Any other tips from the field?