
An Introduction Part II: Enabling Internationalization.

Date post: 26-Dec-2015
Upload: nancy-webster
Transcript
Page 1: An Introduction Part II: Enabling Internationalization.

An Introduction

Part II: Enabling Internationalization

Page 2: An Introduction Part II: Enabling Internationalization.

License

This presentation and its associated materials are licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 License. You may use these materials without obtaining permission from the author. Any materials used or redistributed must contain this notice. [Derivative works may be permitted with permission of the author.] This work is copyright © 2008-2011 by Addison P. Phillips.

Page 3: An Introduction Part II: Enabling Internationalization.

Who is this guy?

• Globalization Architect, Lab126 (we make the technology behind the Kindle)

• Chair, W3C Internationalization WG

Page 4: An Introduction Part II: Enabling Internationalization.

Internationalization is:

• the design and development of a product that is enabled for target audiences that vary in culture, region, or language. [W3C]

• a fundamental architectural approach to software development

Page 5: An Introduction Part II: Enabling Internationalization.

Related Concepts

• Localization: creation of a product tailored to a particular target market

• Translation: process of converting text from one language to another

• Globalization: unified approach to creating global products, especially those that support multiple geographies simultaneously

Page 6: An Introduction Part II: Enabling Internationalization.

Mystic Numbering (M4C N7G)

Opinions differ on capitalization (C12N); choose from: i18N, I18n, I18N

Very geeky; not very internationalized (I19G?)

I N T E R N A T I O N A L I Z A T I O N

I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 N

I18N

Localization = L10N
Globalization = G11N
Canonicalization = C14N
Accessibility = A11Y

Page 7: An Introduction Part II: Enabling Internationalization.

The Internationalization Approach

• Gather requirements globally
• Enable
• Externalize
• Customize
• Test and support globally
• Localize

Page 8: An Introduction Part II: Enabling Internationalization.

The Internationalization Approach

• Enabling—the same code supports multiple regions or cultures. Sometimes called a “global binary”.

• Externalization—plan for localizability by separating “content” from code. This makes localization for specific languages, regions, or cultures easy, fast, and cheap.

• Customization—add culturally specific functionality, presentation, or content to an application.

Page 9: An Introduction Part II: Enabling Internationalization.

A Global Approach

• Internationalization turns technical problems into business decisions

• Balance priorities based on real user distribution/requirements
  – Consider global user population as a whole
  – Consider specific market requirements on an equal footing
  – Potential markets for the product

Page 10: An Introduction Part II: Enabling Internationalization.

Internationalization Myths

• We (wrote it in Java/C#, used Unicode, etc.), so it is internationalized.
• We made the assumption that the product would only ever have English screens: all our users understand it anyway.
• A localized product is internationalized.
• An internationalized product is slow/slower.
• It takes longer to write internationalized code.
• We can’t read the screens/it is too hard to test.
• We have no intention of localizing, so no need to internationalize.
• We don’t have any customers there.
• The users in (some country) never complained, so it must work.
• This product is 100% fully internationalized.

We need special experts.
We need an extra development cycle.
We need six more months to build it.
We need people who speak (language).

Page 11: An Introduction Part II: Enabling Internationalization.

Internationalization Truths:“Well, it depends…”

• Generalize designs
  – Locale independent data structures
  – Locale sensitive display
• Externalize cultural or linguistic variations
• Customize as a last resort

Page 12: An Introduction Part II: Enabling Internationalization.

Buy In: The Key to Success

• For internationalization to be a success over time, there must be commitment:
  – Management
  – Product Team
  – Development Team
    • All developers, not a splinter group

Page 13: An Introduction Part II: Enabling Internationalization.

Addressable Market:

Why Do Internationalization?

Page 14: An Introduction Part II: Enabling Internationalization.

Globalized Product Development

Internationalization turns technical problems into business decisions.
  – Localization: Choose which markets to translate the user interface or documentation for, with no engineering.
  – Deployment: Choose whether to serve applications from a single site, a cluster of sites, or in each target market.
  – Development: Add content and features to products as necessary in each target market.
  – Integration and Interoperability: Servers and products can work together around the world, so customers can truly create “Enterprise” solutions.

Page 15: An Introduction Part II: Enabling Internationalization.

Development Methodologies

Independent of development methodology: Agile? Waterfall? You make the choice.

Encompasses the full development cycle: Design, Development, QC, Release, Support.

[Diagram: stages of the development cycle]
Develop Roadmap (global deployment)
Develop Requirements & Architecture
Design (internationalized)
Code (enable, externalize, customizable)
Test (non-English/non-ASCII)
RTM/GA (by market)
Develop Requirements (all customers)

Page 16: An Introduction Part II: Enabling Internationalization.

The Customization Approach

• Let’s do it in a separate release.
• Let’s make a branch for the international customers.
• Let’s get a special team of people to work on the international release.

Page 17: An Introduction Part II: Enabling Internationalization.

How That Model Really Looks

[Diagram: timeline comparing the main line with an international branch]
Main Line: 1.0, 1.0a, 2.0 (sexy new features, bug fixes)
International Branch: 1.0i (merges and fixes; lots more people and cost)
International Release 1.0: functionality gaps; international users are waiting for 2.0i now
Lost $ and opportunity; lots of cost to get there

Page 18: An Introduction Part II: Enabling Internationalization.

The Problem with Customization

Code forks (double, triple coding).
Lag time for international releases.
Non-adoption of localized release.
Full regression of every language.
Quality or commitment perception.
Lack of data exchange between language versions.
Difficult to repeat (every version is a repeat).
Proliferation of bugs and of support problems.
International features are cancelled.
Core product still doesn’t work/can’t address similar markets.
Loss of market share.

Page 19: An Introduction Part II: Enabling Internationalization.

ANALYZING AND DEVELOPING A DESIGN

Large Animal Pictures

Page 20: An Introduction Part II: Enabling Internationalization.

dates

numbers

images

colors

addresses

local rules

strings

Your Application

local rules, regulatory requirements, postal addresses, default bookmark lists, your company’s customer service phone numbers

The Problem

Page 21: An Introduction Part II: Enabling Internationalization.

The Solution

Locale-independent global binary

Locale-dependent resources

(includes code)

Page 22: An Introduction Part II: Enabling Internationalization.

Large Animal Pictures

[Diagram: a software component made of Global Code plus Resources, with Input, Output, and I/O at its edges]

Page 23: An Introduction Part II: Enabling Internationalization.

Enterprise Animal Pictures

Business Logic

Data Store

Front End

Operating Env.

Your System

API

Unicode

Legacy Encoding

Detect / Convert

Capture Encoding

Detect / Convert

Unicode Cloud

Unicode Interface

Convert to Legacy

Partner/Content Provider

Page 24: An Introduction Part II: Enabling Internationalization.

Internationalization Issues

• Text Processing
  – Character encodings, including Unicode, spelling, word breaks, collation, and so on
• Language
  – Of the software (localization)
  – Of solutions built using the software (localizability, data)
• Locale-affected formats
  – dates, numbers and the like
• Regionally-affected formats
  – names, addresses, currency, and the like
• Time-related issues
  – time zone, calendar, holidays, work rules and the like
• Cultural adaptation
  – presentation, style, position, color use, and the like
• Legal requirements
  – accessibility, SOX, DRM, moderation, security, content, and the like

Page 25: An Introduction Part II: Enabling Internationalization.

Levels of Enablement

• Not Enabled
• Single-Language-at-a-Time (SLAAT): all components run in the same language and encoding environment correctly.
• Multi-Locale: Unicode support; components run in different locales, languages, encodings, and time zones.

Page 26: An Introduction Part II: Enabling Internationalization.

Test Your Assumptions

Gender: Male× Female

Page 27: An Introduction Part II: Enabling Internationalization.

Choose Your Language

Page 28: An Introduction Part II: Enabling Internationalization.

How is this company doing?

Page 29: An Introduction Part II: Enabling Internationalization.

ENABLING

Making Code Aware of Culture

Page 30: An Introduction Part II: Enabling Internationalization.

What is “enabling”?

• Enabled software: adapts the display, processing, validation, storage, and transmission of data according to the cultural, linguistic, and regional needs of the users
  – Text, Characters, and Encodings
  – Locale Awareness
  – Times and Time Zones

A “global binary” is a single object-code version that is used in all markets, regardless of localization.

Page 31: An Introduction Part II: Enabling Internationalization.

Don’t Code What You Think You Know

5/2/7: sometime in February? sometime in May? sometime in 2005?

1.234: more than 1000? less than 2?

4.32.MD: number, time, currency? morning or afternoon?

Page 32: An Introduction Part II: Enabling Internationalization.

Date Formats

Culture Format Example

U. S. A. mdy, / 2/16/05

France dmy, . 16.2.05

France dmy, - 16-2-05

CJKT ymd, / 2005/2/16

CJKT ymd, 年月日 2005 年 2 月 16日

Japan e¥md, 平成 17 年 2 月16 日

Japan ¥md, / 17/2/16

Page 33: An Introduction Part II: Enabling Internationalization.

Time Formats

• U.S.A.: 4:00 p.m.
• France: 16.00
• Japan: 1600
• Japan: ごご4:00
• Korea: 오후 4:32
• Thai: 16:32 น.
• Albanian: 4.32.MD
• Arabic: م 04:32

Page 34: An Introduction Part II: Enabling Internationalization.

More Examples

Assumptions about date tokens:
USA: Sun, Mon, Tue (3 positions, titlecase)
French: lun. mar. mer. (four positions, lowercase)
Russian: Пн Вт Ср (two positions, Cyrillic)
USA: Jan, Feb, Mar (3 positions, titlecase)
French: janv. févr. mars avr. (variable, 4 or 5 positions, lowercase)
Spanish (Spain): ene, feb, mar (not titlecase)
Spanish (Americas): Ene, Feb, Mar (titlecase)

Page 35: An Introduction Part II: Enabling Internationalization.

Calendars: What Year Is It?

• Legal, ceremonial, or popular requirement

Gregorian: 2012
Japan Emperor: 24 Heisei (平成 24 年)
Thailand (Buddhist): 2555 (Gregorian + 543)
Chinese (traditional): 4704 (lunar)
Hebrew (lunar): 5767 (תשסו)
Hijri (Islamic): 1428 (lunar)
Armenian: 1461 (ԹՎ ՌՆԾԶ)

etc. etc. etc.

Page 36: An Introduction Part II: Enabling Internationalization.

Weekends and Holidays

• When is the weekend?
  – Friday is part of the weekend in some countries.
• Both official and unofficial holidays vary widely in number. Here are a few to watch for:
  – USA: July 4, MLK, President’s Day, Veterans Day, Flag Day, Columbus Day, Thanksgiving…
  – Japan: Golden Week
  – China: New Year’s
  – Britain: Guy Fawkes Day, Boxing Day
  – France: Bastille Day
  – Spain: Reyes Magos

Page 37: An Introduction Part II: Enabling Internationalization.

Calendar Display

Page 38: An Introduction Part II: Enabling Internationalization.

Numbers

Grouping and decimal separators:
England: 12,345.67
Germany: 12.345,67
Switzerland: 12’345,67
Swiss money: 12’345.67
France: 12 345,67
India: 12,34,567.89

France uses a non-breaking space!
India: number of digits in groupings changes!
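
The separators listed on this slide do not have to be hard-coded. The following is a minimal sketch, assuming Java's standard `java.text.NumberFormat`; the class name `GroupingDemo` and the helper `format` are illustrative only. The exact output for some locales (for example the kind of space used for French, or the grouping used for India) depends on the locale data shipped with the runtime, so no output is claimed for them here.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class GroupingDemo {
    /** Formats a value using the grouping and decimal separators of the given locale. */
    static String format(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }

    public static void main(String[] args) {
        double value = 12345.67;
        Locale[] locales = {
            Locale.US,
            Locale.GERMANY,
            Locale.FRANCE,
            Locale.forLanguageTag("de-CH"),
            Locale.forLanguageTag("en-IN")
        };
        // The same code path serves every locale; only the Locale argument changes.
        for (Locale locale : locales) {
            System.out.println(locale.toLanguageTag() + ": " + format(value, locale));
        }
    }
}
```

The point of the sketch is that the calling code never mentions a comma, period, or space: the formatter owns those details.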

Page 39: An Introduction Part II: Enabling Internationalization.

Lists

List delimiters & separators can conflict

French example:

2 345,67, 1 012,34, 45,67 (hard to read)

2 345,67 ; 1 012,34 ; 45,67 (easier to read)

import java.text.NumberFormat;
import java.util.Iterator;
import java.util.List;

List myNumberList = getList();
NumberFormat nf = NumberFormat.getInstance();
StringBuffer buf = new StringBuffer();
Iterator iter = myNumberList.listIterator();
while (iter.hasNext()) {
    buf.append(nf.format(((Number) iter.next()).doubleValue()));
    // Hard-coded list separator: collides with the decimal comma in French
    buf.append(", ");
}
System.out.println(buf.toString());

Page 40: An Introduction Part II: Enabling Internationalization.

Collation (a fancy word for “sorting”)

English: ABC...RSTUVWXYZ
German: AÄB...NOÖ...SßTUÜV…YZ
Swedish/Finnish: AB...STUVWXYZÅÄÖ
Norwegian: AB...VWXYÜZÆØÅ
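
As a rough illustration of these differing alphabets, here is a small sketch using Java's standard `java.text.Collator`. The class `CollationDemo`, the helper `sortFor`, and the sample words are made up for this example, and the result relies on the collation data bundled with the Java runtime.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollationDemo {
    /** Returns a copy of the words sorted by the collation rules of the given locale. */
    static List<String> sortFor(Locale locale, List<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00c4rlig" starts with A-umlaut; escaped to keep the source ASCII-only.
        List<String> words = Arrays.asList("Zebra", "\u00c4rlig", "Apple");
        // German treats A-umlaut as a variant of A; Swedish sorts it after Z.
        System.out.println("de: " + sortFor(Locale.GERMANY, words));
        System.out.println("sv: " + sortFor(Locale.forLanguageTag("sv-SE"), words));
    }
}
```

A plain `String.compareTo` sort would order these words by code point, which matches neither language's expectations.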

Page 41: An Introduction Part II: Enabling Internationalization.

Organizing Information

• “Alphabet” differences
• Additional information
  – for example: yomi
• ASCII vs. the world
• Mixed information sets

Page 42: An Introduction Part II: Enabling Internationalization.

“Should I be writing all of this down…”

• Wide range of variation

• Obscure formats

• Difficult to obtain reliable information on formats

• Lots of work to implement and maintain

Enabling means not having to know (m)any of the details

Page 43: An Introduction Part II: Enabling Internationalization.

Supporting International Formats

• Use neutral data structures
  – Makes code independent of locale
  – Most data types are locale-neutral:
    • Boolean
    • String, char
    • Number classes
    • Date, Calendar
• Encapsulate formatting/validation in a function
  – Format style chosen dynamically at runtime
  – Format details don’t have to be specified or researched
  – APIs know the gory details

Page 44: An Introduction Part II: Enabling Internationalization.

Essence of Enabling

• Object to Presentation, Presentation to Object
  – Integers
  – Floats
  – Percents
  – Currencies
  – Dates
  – Times
  – Durations
  – Collation (lists)
  – Weights/measures/sizes
  – Resources (user interface strings)

[Diagram labels: Locale, user, presentation]

Page 45: An Introduction Part II: Enabling Internationalization.

Locale

• an identifier or data structure that allows programmers to access culturally and linguistically affected functionality in a system.

• Many systems now based on IETF BCP 47; for example JavaScript, Java 7, and CLDR

Page 46: An Introduction Part II: Enabling Internationalization.

Complex Types

• Data structures, APIs, or classes built from basic types must include similar capabilities.
  – Store data in a locale-neutral or independent format.
  – Display in a language/regional/culturally sensitive manner.
  – Convert from locale format to locale-neutral or locale-independent storage format.

Page 47: An Introduction Part II: Enabling Internationalization.

Design Time and Data Structures

• Identify your own “locale bias”
  – Field names matter!
    • “Postal Code”, not “ZIP code”.
    • Family Name/Given Name, not First Name/Last Name
  – Avoid problematic fields
    • Postal address parsing? Area code? Etc.

Page 48: An Introduction Part II: Enabling Internationalization.

Currency

• Currency formatting is usually similar to number formatting. But things can vary widely here, too:
  – $1,100.00 [USA]
  – €1 100,00 [France-Euro]
  – ¥1,100 [Japan]
  – 1.100$00 Esc. [Portugal, obsolete]
  – SFr. 1’000.00 [Switzerland]
• Currency associated with the locale doesn’t always apply. Store the currency type with value.
  – Use ISO 4217 std. codes (USD, JPY, EUR, RUR)
    • Not always one symbol.
    • Not always two decimal places.
    • $100 + ¥100 = $101
• Consider neutral displays!
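
One way to read “store the currency type with value” is as a small value type. The sketch below assumes Java's standard `java.util.Currency` and `java.math.BigDecimal`; the `MoneyDemo` and `Money` names and the `toNeutralString` method are hypothetical and not taken from the presentation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;

public class MoneyDemo {
    /** Hypothetical value type: an amount always travels with its ISO 4217 code. */
    static final class Money {
        final BigDecimal amount;
        final Currency currency;

        Money(String amount, String isoCode) {
            this.currency = Currency.getInstance(isoCode);
            // Not every currency has two decimal places (JPY has none),
            // so take the scale from the currency instead of assuming 2.
            int digits = Math.max(0, currency.getDefaultFractionDigits());
            this.amount = new BigDecimal(amount).setScale(digits, RoundingMode.HALF_EVEN);
        }

        /** Locale-neutral rendering: plain digits followed by the currency code. */
        String toNeutralString() {
            return amount.toPlainString() + " " + currency.getCurrencyCode();
        }
    }

    public static void main(String[] args) {
        System.out.println(new Money("1100", "USD").toNeutralString());
        System.out.println(new Money("1100", "JPY").toNeutralString());
    }
}
```

Because the currency is part of the value, adding a USD amount to a JPY amount can be rejected or converted explicitly rather than silently producing something like “$101”.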

Page 49: An Introduction Part II: Enabling Internationalization.

Being Locale Neutral

• Avoid or reduce locale-affected display to increase portability
  – Use unambiguous formats, such as ISO 8601-like dates, especially in log files and the like
    • 2005-04-01 14:17:00 UTC
  – Use consistent formats (‘user locale’), especially in columns or collections of data

Consistent formats (one locale for the whole column):
Amount        Currency
351,234.56    USD
102,556.78    EUR
65,336.00     JPY
212,345.00    INR

Mixed formats (each row in its own locale):
Amount        Currency
351,234.56    USD
102 556,78    EUR
65336         JPY
2,12,345.00   INR

Page 50: An Introduction Part II: Enabling Internationalization.

“The String is the Thing”

• Text doesn’t get translated on the fly.
• Don’t use text as an identifier or foreign key.
  – Use ID numbers or not-human-readable values instead of requiring text fields to match.
  – “Intrinsic” data value versus “display” data value.
    • Enumerated values displayed as strings.
    • Use display strings.

Enumerated: ACCOUNTS_PAYABLE
Displayed: “Accounts Payable”, “pagável de clientes”

Page 51: An Introduction Part II: Enabling Internationalization.

English-like Construction

• Concatenation– string1 + string2

• Pluralization– Dog + “s” = “dogs”

This topic will be covered in greater depth in the section on localization.

Page 52: An Introduction Part II: Enabling Internationalization.

Databases

• Most databases can only handle one collation sequence per instance or one collation per index.
  – Remove reliance on alphalists.
  – Self-collate short lists.
  – Pre-collate long lists?
• Example: NLS_SORT controls the way Oracle returns data (collation sequence).
  – Global environment variable.
  – Not necessarily under your control.
  – Indices are built on a predetermined or binary sort.

Page 53: An Introduction Part II: Enabling Internationalization.

Enabling Summary

• Understand Encodings and Unicode
  – All text has an encoding!
• Be Locale-Aware
  – Create locale-neutral data structures
  – Separate display from storage

Page 54: An Introduction Part II: Enabling Internationalization.

IT’S ABOUT TIME

Dates, Times, Durations, Calendars a little aside…

Page 55: An Introduction Part II: Enabling Internationalization.

Observed Time

Page 56: An Introduction Part II: Enabling Internationalization.

Incremental Time

• Computed time based on “clock ticks” in an “epoch”
  – The epochal date is arbitrary. The UNIX epoch is midnight, January 1, 1970, UTC.

Page 57: An Introduction Part II: Enabling Internationalization.

Field Based Time

• Time based on calendric fields (day, month, year, hour, minute, second)

• Some systems have data types for “field based” time also.

Page 58: An Introduction Part II: Enabling Internationalization.

What is a Time Zone

• A time zone is a geographical region or area that has common rules for determining the local observed time as it relates to monotonic (computer) time.

• Distinctions include:
  – Offset from UTC
  – Daylight Savings (Summer Time) behavior
  – Historic changes in offset or DST behavior
  – Political control

Page 59: An Introduction Part II: Enabling Internationalization.

Durations and Repeating Events

Wall-time: this meeting is at 2 PM Pacific time every Tuesday
  – interval between meetings may vary in number of seconds
    • Daylight time transitions
    • Changes in DST rules

Fixed-duration: run the virus scanner every 57 minutes
  – interval is always 3,420,000 milliseconds
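
The difference between the two kinds of recurrence can be sketched with the `java.time` classes. This is only an illustration: the `RecurrenceDemo` class and its helpers are invented for the example, and the dates are chosen to straddle the start of U.S. daylight time on March 13, 2011.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class RecurrenceDemo {
    static final ZoneId PACIFIC = ZoneId.of("America/Los_Angeles");

    /** Wall-time recurrence: the same local clock time one week later. */
    static ZonedDateTime nextWeekly(ZonedDateTime meeting) {
        return meeting.plusWeeks(1);
    }

    /** Fixed-duration recurrence: an exact amount of elapsed time later. */
    static ZonedDateTime nextFixed(ZonedDateTime start, Duration interval) {
        return start.plus(interval);
    }

    public static void main(String[] args) {
        // Tuesday, March 8, 2011, 2 PM Pacific (the week before DST begins).
        ZonedDateTime meeting = ZonedDateTime.of(2011, 3, 8, 14, 0, 0, 0, PACIFIC);

        ZonedDateTime weekly = nextWeekly(meeting);
        ZonedDateTime fixed = nextFixed(meeting, Duration.ofDays(7));

        // The wall-time meeting is still at 14:00, but one hour fewer has elapsed.
        System.out.println(weekly + " elapsed hours: "
                + Duration.between(meeting, weekly).toHours());
        // The fixed-duration event has drifted to 15:00 local time.
        System.out.println(fixed + " elapsed hours: "
                + Duration.between(meeting, fixed).toHours());
    }
}
```

Which behavior is right depends on the event: a status meeting wants wall time, while a virus scanner wants elapsed time.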

Page 60: An Introduction Part II: Enabling Internationalization.

Time Zone Affected Scenarios

• Zone independent
  – only “incremental” times are necessary
• Local time, past only
  – future changes to time zone rules not applicable
  – example: logging system
• Local time, both past and future
  – time zone rule changes may affect some time values
  – example: calendar program
• Floating times
  – events not tied to a specific time zone
  – example: birthdate, start date, definition of “night” for phone usage
• Recurring events
  – events that recur, sometimes during and sometimes not during daylight savings
  – example: weekly status meeting

Page 61: An Introduction Part II: Enabling Internationalization.

Time Zone Scenarios

• Zone Independent: generally timestamps that don’t refer to a specific time zone.
  – Record local offset or (better) use UTC
  – May want wall time for analysis

Page 62: An Introduction Part II: Enabling Internationalization.

Time Zone Scenarios

• Local Time (Past Only): times that cannot change their relationship to DST
  – Store zone ID and time value [may store offset instead of zone ID]
• Local Time (Past+Future): time values may need to change if DST rules change
  – Store original offset along with zone ID and time value
  – May require a database crawl if DST rules change

Page 63: An Introduction Part II: Enabling Internationalization.

Time Zone Scenarios

• Floating Times: times that don’t change regardless of where you are in the world.
  – Publication dates
  – Birth dates (or any anniversary date)
  – Etc.
• Handle using UTC and avoiding zone casting

Page 64: An Introduction Part II: Enabling Internationalization.

Time Zone Scenarios

• Recurring Events: time values that occur in both DST and non-DST time
  – Store time, recurrence period, zone ID, original offset, and whether to tie recurrence to DST

Page 65: An Introduction Part II: Enabling Internationalization.

Time Zone Identifiers

• Often based on the IANA time zone database (tzinfo) [formerly “Olson IDs”]

Offset: Etc/UTC, Etc/GMT+1

Continent/City: America/Los_Angeles, Europe/Paris, Asia/Tokyo, Antarctica/DumontDUrville

Ocean/Island(City): Atlantic/Canary, Pacific/Auckland, Pacific/Pago_Pago

Continent/Region/City: America/Indiana/Indianapolis

Page 66: An Introduction Part II: Enabling Internationalization.

Time Zone Hints

• Only 21 countries have more than one time zone (if you know the country, you often know the time zone)

• Argentina, Australia, Brazil, Canada, Chile, Democratic Republic of the Congo, Ecuador, France, Greenland, Indonesia, Kazakhstan, Kiribati, Mexico, Micronesia, Mongolia, New Zealand, Portugal, Russia, Spain, and the United States.

  – Of these, most have maritime or overseas regions. Examples:
    • Ecuador: Galapagos Islands
    • Chile: Easter Island
    • Portugal: Azores

Page 67: An Introduction Part II: Enabling Internationalization.

Locale-Neutral Formats

• Use locale-neutral formats for interchange:
  – ISO 8601
  – Incremental time values (e.g. time_t)
  – Distinguish time zone if necessary for interpretation
    • Offset is not the same as time zone

At any given time, in UTC, it is the same time everywhere that time is measured.

SQL data types and XML formats are often field-based, while programming languages are usually incremental.

Page 68: An Introduction Part II: Enabling Internationalization.

Formatting Dates and Times

Requires more than just a locale!

datetime: value being formatted (1034197545321L)
zone: defines relation to “wall time” (Asia/Tokyo)
calendar: defines rules for calculating field values (Japanese Imperial)

Result: October 10, 14H 6:05:45 AM JST
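
The three inputs on this slide can be combined in a few lines using `java.time`. This is a sketch rather than the presenter's code: `FormatInputsDemo` and its helper names are assumptions, and only the field values are derived here, not a fully localized string.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.chrono.JapaneseDate;
import java.time.temporal.ChronoField;

public class FormatInputsDemo {
    /** Datetime + zone: turns an incremental value into local "wall time". */
    static ZonedDateTime wallTime(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId));
    }

    /** Calendar: re-expresses the local date in the Japanese Imperial calendar. */
    static JapaneseDate imperialDate(ZonedDateTime zdt) {
        return JapaneseDate.from(zdt.toLocalDate());
    }

    public static void main(String[] args) {
        ZonedDateTime tokyo = wallTime(1034197545321L, "Asia/Tokyo");
        JapaneseDate imperial = imperialDate(tokyo);

        // Gregorian fields in Tokyo wall time: 2002-10-10 06:05:45
        System.out.println(tokyo.toLocalDateTime());
        // Imperial calendar fields: era Heisei, year 14
        System.out.println(imperial.getEra() + " " + imperial.get(ChronoField.YEAR_OF_ERA));
    }
}
```

Changing any one of the three inputs changes the displayed result, which is why a locale alone is not enough to format a date.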

Page 69: An Introduction Part II: Enabling Internationalization.

Externalization

Making software localizable

Page 70: An Introduction Part II: Enabling Internationalization.

What is localization?

“What is localization?” Zula asked. Peter sighed, letting her know it was a stupid question. “Translating foreign software into Hungarian, making things work correctly in the special environment of Hungary,” Csongor explained, and Zula thought that she could glimpse, here, in the way that he contentedly explained things, Csongor’s father the school-teacher.

Reamde by Neal Stephenson

Page 71: An Introduction Part II: Enabling Internationalization.

What is Localization?

• The process of tailoring a product to a specific target market.
  – Translation of messages
  – Adaptation to local preferences
  – Addition (or subtraction) of content or features

Page 72: An Introduction Part II: Enabling Internationalization.

Localization is Obvious

… but it isn’t “internationalization”

• Localizability is internationalization.
  – Externalize text
  – Externalize presentation
  – Dynamic composition
  – Distribution of language content
  – “Plug-in” features

Page 73: An Introduction Part II: Enabling Internationalization.

What is a ‘Resource’?

any application component loaded dynamically at runtime, rather than compiled into the application

In localization: source code files containing language, region, or culturally-affected materials

  – Text
  – Error messages
  – Icons
  – Pictures
  – Fonts
  – Colors
  – Graphics
  – Sizes
  – Positions
  – Magic Numbers
  – Mnemonics (“Alt+G”, “F4”, etc.)
  – File Locations
  – Dictionaries
  – Glossaries
  – Grammar Rules
  – Code

Page 74: An Introduction Part II: Enabling Internationalization.

Why Resources?

Text
Error messages
Icons
Pictures
Fonts
Colors
Graphics
Sizes
Positions
Magic Numbers
Mnemonics
Dictionaries
Glossaries
Grammar Rules
Culturally specific code

Before After

Page 75: An Introduction Part II: Enabling Internationalization.

Avoiding Forks

Global Binary

Resources

English Version

Resources

Resources

Resources

Language +1 Version

Page 76: An Introduction Part II: Enabling Internationalization.

Forked Code Woes

• Hard to fix and maintain
• Different versions in the field
• Delays in releasing localized product
• Different functionality by region
• Confusing for customers/users
• Versions are not interoperable and might not be able to exchange data!

Page 77: An Introduction Part II: Enabling Internationalization.

More Benefits

• Rename or re-brand product
• Fix spelling or grammar mistakes
• Fix usability
• Make terminology consistent
• Test drive new customer experiences, try new designs, etc.

… all without a rebuild!

Page 78: An Introduction Part II: Enabling Internationalization.

"Project-Id-Version: blanket 1.0\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2011-03-23 15:43-0700\n"
"PO-Revision-Date: 2011-03-23 15:43-0700\n"
"Last-Translator: Richard Gillam <gillam (a] lab126.com>\n"
"Language-Team: en <kindle-i18n-team (a] lab126.com>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

# font
msgid "my.font.name"
msgstr "dialog"

#: progress bar in point based
msgid "progress_bar.rect"
msgstr "43.11,64.67,172.45,12.93"

msgid "progress_bar.border"
msgstr "2"

# bounding box: x_pos,y_pos,width,height
msgid "shutdown.cust_service.header.rect"
msgstr "0,14.65,258.68,12.07"

msgid "shutdown.cust_service.header"
msgstr "Repair Needed"

Page 79: An Introduction Part II: Enabling Internationalization.

What’s wrong here?

String1 = There are
String2 = no
String3 = tables in
String4 = files
String5 = .

Messages I Could Build:
There are files.
There are no files.
There are 50 files.
There are tables in files.
There are no tables in files.
There are 50 tables in no files.
There are tables in.

Page 80: An Introduction Part II: Enabling Internationalization.

Let’s Google Translate That!

Messages I Could Build:
There are files.
There are no files.
There are 50 files.
There are tables in files.
There are no tables in files.
There are 50 tables in no files.
There are tables in.

Machine-translated output:
Il ya des fichiers.
Il n'y a pas les fichiers.
Il ya 50 fichiers.
Il ya des tables dans des fichiers.
Il n'ya pas de tables dans des fichiers.
Il ya 50 tableaux dans aucun fichier.
Il ya des tables po.

Page 81: An Introduction Part II: Enabling Internationalization.

Don’t Build Text From Fragments

• Text fragments are hard to translate
  – Fragments may not follow grammar rules
  – Cannot know which parts go together
  – Parts can be reused in incompatible ways
• Internationalization APIs offer “patterns” to fix:

[] files out of [] were deleted.

An error occurred at [] on [].

Page [] of []

Processing: []% complete.

Page 82: An Introduction Part II: Enabling Internationalization.

Example: MessageFormat (Java)

There were {0} tables on {1}.

There were {0,number,integer} tables on {1,date,short}.

{1,date} に {0,number,integer} のテーブルがあった。

• Number replacement variables.
• Provide typing and formatting information where possible.
• Externalize as a single unitary string.
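
A short sketch of how such a pattern might be applied with `java.text.MessageFormat` follows. The `PatternDemo` class and its `format` helper are illustrative, and the second, reordered pattern is a made-up stand-in for a translation, not a real one. Date output depends on the runtime's locale data, so it is printed but not relied on.

```java
import java.text.MessageFormat;
import java.util.Date;
import java.util.Locale;

public class PatternDemo {
    /** Formats one externalized pattern with the given arguments for one locale. */
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // In a real application these strings would come from a resource file.
        String original = "There were {0,number,integer} tables on {1,date,short}.";
        // Hypothetical translation that moves the date in front of the count.
        String reordered = "On {1,date,short}: {0,number,integer} tables.";

        Object[] values = {1234, new Date()};
        System.out.println(format(original, Locale.US, values));
        System.out.println(format(reordered, Locale.US, values));
        System.out.println(format("Page {0} of {1}", Locale.US, 3, 10));
    }
}
```

Because arguments are numbered, the translator can reorder them freely without any change to the calling code.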

Page 83: An Introduction Part II: Enabling Internationalization.

What’s My Gender

“Documenti del Chris”
“Documenti della Chris”

“Documenti - Chris”

Page 84: An Introduction Part II: Enabling Internationalization.

More Issues With Text Composition

– There were one errors found.
– You have earned your 22th set of bonus points.

Page 85: An Introduction Part II: Enabling Internationalization.

Sentence Parts Must Agree

• Endings, Gender, Plurality, Case
  – e.g. Japanese counting uses different words for different kinds of objects
  – e.g. Slavic languages use different endings for singular, few, many…

Page 86: An Introduction Part II: Enabling Internationalization.

Complex Message Formatting

There were no errors.
There was 1 error.
There were 2 errors.

“Choice format” APIs allow for different resources to be used based on runtime values.

0:There were no errors.
1:There was {0} error.
2:There were {0} errors.

0:не было ошибок
1:была {0} ошибка
2:были {0} ошибки
5:были {0} ошибок

The number of resources may need to vary by locale or language.

Examples:
  – ordinal numbers (1st, 2nd, 3rd, 4th, etc.)
  – complex messages, such as “27 seconds ago” vs. “10 minutes ago”
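
The English resource above maps fairly directly onto the `choice` element type of Java's `java.text.MessageFormat`. The sketch below is one possible rendering: the `ChoiceDemo` class and the `ERRORS_EN` constant are invented for illustration, and in practice the pattern would live in a resource file.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class ChoiceDemo {
    // One externalized string; each "limit#text" alternative applies from its limit upward.
    static final String ERRORS_EN =
        "{0,choice,0#There were no errors.|1#There was {0} error.|2#There were {0} errors.}";

    /** Picks the alternative for the count and substitutes the count into it. */
    static String errors(int count) {
        return new MessageFormat(ERRORS_EN, Locale.ENGLISH).format(new Object[] {count});
    }

    public static void main(String[] args) {
        for (int count : new int[] {0, 1, 2, 5}) {
            System.out.println(errors(count));
        }
    }
}
```

Note that this mechanism selects by numeric range only. Rules such as the Russian forms listed above depend on more than a simple range, so a range-based choice is an approximation there; dedicated plural-rule support (for example ICU's plural formatting) is aimed at that case.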

Page 87: An Introduction Part II: Enabling Internationalization.

Images and Icons

• Avoid metaphors
• Avoid cultural sensitivities
• Avoid body parts
• Replace as necessary
• Avoid putting text into graphics

Graphic: $20
Text: $0.06

Page 88: An Introduction Part II: Enabling Internationalization.

Images and Culture

• Beware your biases—even “good” ones.

Meet your friends on our new social website

for India

Page 89: An Introduction Part II: Enabling Internationalization.

Isn’t it Swell?

English is very succinct.
  – Words in other languages are longer
  – Sentences are longer
  – Characters may be larger

Page 90: An Introduction Part II: Enabling Internationalization.

More Swollen Text

• 30% in length (alphabetics, abjads, etc.)
• 30% in height (ideographics)
• But… a rule of thumb, not a “fact”
  – Measure your results with care.

Page 91: An Introduction Part II: Enabling Internationalization.

A Cautionary Tale

Page 92: An Introduction Part II: Enabling Internationalization.

GUI Layout

Page 93: An Introduction Part II: Enabling Internationalization.

Dereferencing

• Minimize sentence building
• Minimize arguments per string
• Use subject:predicate wherever possible

When you can do this:
Balance: $100.00

Don’t do this:
Your balance is $100.00.

Page 94: An Introduction Part II: Enabling Internationalization.

Dynamic vs. Static Layout

• Magic numbers
• Externalized layouts
• Mnemonics
• Colors

Page 95: An Introduction Part II: Enabling Internationalization.

Localizing Styles

• Bolding is not universal for emphasis
  – Italicization, capitalization, etc. are also not universal (some scripts don’t have these attributes)
• Use logical, not presentational, names
  – Describe the function, not the appearance. For example, use “emphasis” instead of “italics”.

中国 Amikake Wakiten

Page 96: An Introduction Part II: Enabling Internationalization.

Use of Color

“Going Down”

“Going Up”

Page 97: An Introduction Part II: Enabling Internationalization.

Non-Translatable Resources

• Some content should be externalized but not translated– Sometimes referred to as “DNT” for “do not translate”

• Externalize? Yes…
  – Segregate DNT material from translated material if possible (by using separate resource files or separate resource blocks within a file).
  – Developers can’t always tell when something should or should not be DNT… and neither can translators (context is missing)

Page 98: An Introduction Part II: Enabling Internationalization.

The “Locale” in “Localization”

• Resources “fall back” to find the best match

[Diagram: Global Binary on top of a stack of Resources, falling back from most specific to least specific]

zh-Hans-SG (Chinese, Simplified script, Singapore)

zh-Hans (Chinese, Simplified script)

zh (Chinese)

(root)

Page 99: An Introduction Part II: Enabling Internationalization.

Sparse Population

• A given language resource may not contain a complete set of resources.
– Lookups fall back separately for each sub-resource (such as a particular value)

“appName” “Demo”

“maxRows” 57

“dialogTitle” “Hello World”

“appName” “Démo”

“dialogTitle” “Bonjour monde”
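A small Java sketch of per-key fallback using the values from this slide. It assumes the first table is the root resource and the second is a French one; the class `SparseResources` and its methods are hypothetical, and plain maps stand in for whatever resource format is really used.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: the French table is sparse, the root is complete.
class SparseResources {
    static final Map<String, Object> ROOT = new HashMap<>();
    static final Map<String, Object> FR = new HashMap<>();
    static {
        ROOT.put("appName", "Demo");
        ROOT.put("maxRows", 57);
        ROOT.put("dialogTitle", "Hello World");
        FR.put("appName", "D\u00e9mo");          // "Demo" with e-acute
        FR.put("dialogTitle", "Bonjour monde");
        // FR deliberately has no "maxRows": the value is not language-dependent.
    }

    // Each key is resolved independently, most specific table first.
    static Object lookup(String key, List<Map<String, Object>> chain) {
        for (Map<String, Object> table : chain) {
            if (table.containsKey(key)) {
                return table.get(key);
            }
        }
        return null; // not found anywhere in the chain
    }

    static Object lookupFrench(String key) {
        return lookup(key, Arrays.asList(FR, ROOT));
    }
}
```

Here `lookupFrench("dialogTitle")` comes from the French table, while `lookupFrench("maxRows")` falls through to the root value 57.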

Page 100: An Introduction Part II: Enabling Internationalization.

Getting the Right Locale

Business Logic

Data Store

Front End

Operating Env.

client

Client Locale

Server Locale

API Request Locale

System Mgmt Locale

One request might serve multiple purposes or be seen in multiple contexts

Page 101: An Introduction Part II: Enabling Internationalization.

Resources and Translation

“key”, “display string”

“dialogTitle”, “Dialog Title”

“aMessage”, “This is a message.”

Pseudo-Translation

“key”, “ðìsplàÿ stríñg”

“dialogTitle”, “Ðîálòg Tïtlè”

“aMessage”, “Thìß ís â Mésßãgê.”
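A rough Java sketch of a pseudo-translator in the spirit of the strings above. The accent mapping, the bracket markers, and the roughly 30% padding are illustrative choices, and the class `PseudoTranslator` is a hypothetical name. A real tool would also need to leave format placeholders and markup untouched.

```java
// Hypothetical pseudo-translator: accents vowels, pads, and brackets a string.
class PseudoTranslator {
    private static final String FROM = "aeiou";
    // a-grave, e-acute, i-acute, o-grave, u-diaeresis (escaped to keep the source ASCII)
    private static final String TO = "\u00e0\u00e9\u00ed\u00f2\u00fc";

    static String pseudo(String source) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            int idx = FROM.indexOf(c);
            // Swap mapped vowels for accented forms; keep everything else.
            out.append(idx >= 0 ? TO.charAt(idx) : c);
        }
        // Pad by about 30% (rounded up) to simulate translation growth.
        int pad = (source.length() * 3 + 9) / 10;
        for (int i = 0; i < pad; i++) {
            out.append('~');
        }
        return out.append(']').toString();
    }
}
```

The brackets make truncation visible, the accents expose non-ASCII handling bugs, and any string that shows up unchanged in the UI was probably hard-coded rather than externalized.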

Page 102: An Introduction Part II: Enabling Internationalization.

Pseudotranslation

Page 103: An Introduction Part II: Enabling Internationalization.

Keyboards

Page 104: An Introduction Part II: Enabling Internationalization.

Input Method Editors

Some languages require software to assemble keystrokes into characters

Asian languages with very large character sets

Complex scripts with vowel-killers and other contextual editing requirements

Applications that interact directly with key-pressed events can disable or disrupt IME input.

On- and over-the-spot editing

Page 105: An Introduction Part II: Enabling Internationalization.

Customization

Page 106: An Introduction Part II: Enabling Internationalization.

When is it okay?

• Content should be highly localized or have locale-specific requirements:
– customization lets you address this requirement in the most localized possible manner

Page 107: An Introduction Part II: Enabling Internationalization.

dates

numbers

images

colors

addresses

local rules

etc.

Externalization again

Your Application

local rules, regulatory requirements, postal addresses, default bookmark lists, your company’s customer service phone numbers

Page 108: An Introduction Part II: Enabling Internationalization.

Externalization again

Locale-independent global binary

Locale-dependent resources

(includes code)

Page 109: An Introduction Part II: Enabling Internationalization.

Large Animal Pictures

Software Component

Output

Global Code

Resources

I/O

Input

Code can be a resource!

Page 110: An Introduction Part II: Enabling Internationalization.

Customization Examples

Postal address validation

Postal code validation

Telephone number formatter

“Personality” questions
– blood type vs. sun sign

Personal name formatter
– first/last position, space, highlighting, formality, etc.

Tax codes and shipping schedules

Generic API

Generic Implementation

US Implementation

DE Implementation

?? Implementation

Page 111: An Introduction Part II: Enabling Internationalization.

Example: Postal Addresses

address1 varchar(32)

address2 varchar(32)

city varchar(16)

state char(2)

zip char(5)

country char(2)

address1 varchar(64)

address2 varchar(64)

city varchar(64)

province varchar(64)

postcode varchar(64)

i18n

country=US, postcode=‘WC2 1GH’ // error

country=UK, postcode=‘95111’ // error

country=DE, postcode=‘1A4 喪’ // okay?

public interface Address { … }

public class GenericAddress implements Address { … }

public class USAddress extends GenericAddress { … }

public class UKAddress extends GenericAddress { … }
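One way the hierarchy named on this slide might be filled in, as a hedged Java sketch. Class names are capitalized per Java convention. The method names, the `Addresses.of` factory, and both regular expressions are assumptions for illustration; the postcode patterns are deliberately simplified and do not capture the full national rules.

```java
import java.util.regex.Pattern;

interface Address {
    String country();
    String postcode();
    boolean isPostcodeValid();
}

// Generic fallback: accepts any non-blank postcode, since no local rule is known.
class GenericAddress implements Address {
    private final String country;
    private final String postcode;
    GenericAddress(String country, String postcode) {
        this.country = country;
        this.postcode = postcode;
    }
    public String country() { return country; }
    public String postcode() { return postcode; }
    public boolean isPostcodeValid() {
        return postcode != null && !postcode.trim().isEmpty();
    }
}

class USAddress extends GenericAddress {
    // Simplified: five digits, optionally followed by a ZIP+4 suffix.
    private static final Pattern ZIP = Pattern.compile("[0-9]{5}(-[0-9]{4})?");
    USAddress(String postcode) { super("US", postcode); }
    @Override public boolean isPostcodeValid() {
        return postcode() != null && ZIP.matcher(postcode()).matches();
    }
}

class UKAddress extends GenericAddress {
    // Simplified shape check only; the ISO 3166 country code for the UK is "GB".
    private static final Pattern POSTCODE =
            Pattern.compile("[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}");
    UKAddress(String postcode) { super("GB", postcode); }
    @Override public boolean isPostcodeValid() {
        return postcode() != null && POSTCODE.matcher(postcode()).matches();
    }
}

// Hypothetical factory: picks the most specific implementation available.
class Addresses {
    static Address of(String country, String postcode) {
        switch (country) {
            case "US": return new USAddress(postcode);
            case "GB":
            case "UK": return new UKAddress(postcode);
            default:   return new GenericAddress(country, postcode);
        }
    }
}
```

With this sketch, a US address with postcode `WC2 1GH` and a UK address with `95111` are both rejected, while the DE example passes only because no German rule has been written yet. That is the “okay?” case on the slide: the generic implementation is a safety net, not validation.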

Page 112: An Introduction Part II: Enabling Internationalization.

Building Global Software

Beyond Just Coding: Localization, QA, and all that

Page 113: An Introduction Part II: Enabling Internationalization.

The Internationalization Cycle

• Encompasses the full development cycle:
– Requirements
– Design
– Development
– QC
– Release
– Support

Develop Roadmap (where is the product going?)

Develop Requirements & Architecture

Design (internationalized)

Code (enable, externalize, modularize)

Test (non-English/non-ASCII)

RTM/GA (by market)

Support Issues and Requests (all customers)

Page 114: An Introduction Part II: Enabling Internationalization.

What is “internationalization QA”?

• Does the enabled product work correctly?
– Non-English configurations
– Non-ASCII data and encoding support
– Cross time zone support
– Market specific features or customizations

• Does localization appear correctly?
– Is the product localizable?

What makes this different from “regular” QA?

Page 115: An Introduction Part II: Enabling Internationalization.

Growing (and Pruning) the Matrix

Include non-English configurations in your test matrix; include non-ASCII data in your tests.

Be prepared to prune the test matrix.

Page 116: An Introduction Part II: Enabling Internationalization.

What to Test With

– Test Non-English configurations
• Non-English locales (lying to your machine)
• Native configurations (when does it make sense?)

– Test Non-ASCII data
• Encodings, encodings, everywhere
• Non-ASCII character values

– Test Across Time Zones
• Two or more time zones; consider the International Date Line (“it’s tomorrow in Japan”) and DST issues
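The “it’s tomorrow in Japan” case is easy to reproduce with `java.time`. The sketch below is only an illustration; the class `TomorrowInJapan`, the method `dateIn`, and the chosen instant are arbitrary assumptions.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper: the calendar date of one instant, as seen in a given zone.
class TomorrowInJapan {
    static LocalDate dateIn(Instant instant, String zoneId) {
        return instant.atZone(ZoneId.of(zoneId)).toLocalDate();
    }
}
```

For the instant `2011-03-01T02:00:00Z`, `dateIn(..., "America/Los_Angeles")` is 2011-02-28 while `dateIn(..., "Asia/Tokyo")` is 2011-03-01. A test that compares “today” computed on the server with “today” shown to the user should cover at least one such pair of zones.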

Page 117: An Introduction Part II: Enabling Internationalization.

Planning Testing

Initially

• Get tools that are enabled!
– Automation allows greater coverage, but only if it works.

• Plan encodings and locales as part of the test matrix.

• Acquire third-party products as necessary.

Increasing Maturity

• Use test driven development practices.

• Get developers to write unit tests that are internationalized.

• Put the ‘i18n’ bugs into the regression suite.
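As one example of an internationalized unit test, the Java sketch below checks whether text survives an encode/decode round trip. The class `EncodingRoundTrip` and method `survives` are hypothetical names; the point is simply that test data should include non-ASCII values.

```java
import java.nio.charset.Charset;

// Hypothetical check: does text survive being written and read back in a charset?
class EncodingRoundTrip {
    static boolean survives(String text, Charset charset) {
        // getBytes replaces unmappable characters, so lossy charsets change the text.
        String decoded = new String(text.getBytes(charset), charset);
        return decoded.equals(text);
    }
}
```

A test using only `"Demo"` passes with every charset and proves nothing. Using `"D\u00e9mo"` (Démo) or `"\u4e2d\u56fd"` (中国) fails with US-ASCII and passes with UTF-8, which is the kind of bug that belongs in the regression suite.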

Page 118: An Introduction Part II: Enabling Internationalization.

Configuring Machines

Create both native and simulated environments:

– Native operating systems may have minor but sometimes critical differences (folder names, keywords, localized registry entries)

– Most features don’t run into native differences (easier to work with English-localized machines)

– Don’t buy physical keyboards (use software keyboards) unless your application relies on scan codes from keys

Page 119: An Introduction Part II: Enabling Internationalization.

Localization

Page 120: An Introduction Part II: Enabling Internationalization.

Incorporate

Localization is part of the release process too.

– Changes to the user interface cost the localization team time and money.

– (Changes to the product cost the documentation and QA folks too)

• May need to institute change control or a UI freeze

Page 121: An Introduction Part II: Enabling Internationalization.

Simultaneous Shipment (Simship)

Ideally, to maximize opportunity, ship the target languages the same day as the source language.

– It might not make sense for your product.

– But it might not be as difficult as you think it is. It might even be good for you.

Page 122: An Introduction Part II: Enabling Internationalization.

Distribution of Content

• How does the localized text get into the running product?
– Satellite assemblies, DLLs, shared libraries
– Message catalogs
– Special directory
– Database
– Etc.

Page 123: An Introduction Part II: Enabling Internationalization.

More Distribution

• “Specific Language” (per-language)

• “Language Included” (one or more languages)

• “Language Pack” (product plus something)

English

German

French

English

German

French

English

German

French

Global Binary+

Page 124: An Introduction Part II: Enabling Internationalization.

Completing the Product

• Static content is often under source control and can be localized “normally”

• Dynamic content may include the initial set of data or other items which need to be localized beyond software.
– Demos and Demo Data
– Dictionary, Language add-ons
– Local offers, links to Web store, etc.
– Packaging
– Regulatory

Page 125: An Introduction Part II: Enabling Internationalization.

Quality Checking and Development Methodologies

• Translation is a human-oriented task.
– Translation timelines are linear with volume.

• Localized product should be tested for functionality
– translation can break things
– usually the first language finds most of the bugs

• Translations should be checked for quality

• Development cycle has to include time for translators and quality assurance to catch up.
– This does not mean “no agile” or “no changes”
– Do pilot language(s) or moving-target translation; do better UI design and usability reviews; etc.

Page 126: An Introduction Part II: Enabling Internationalization.

Summary

Page 127: An Introduction Part II: Enabling Internationalization.

Internationalization

… is a fundamental architectural approach: it is how software is built.
– Design
– Enabling
– Externalization
– Customization
– Testing and Support
– Lifecycle

Page 128: An Introduction Part II: Enabling Internationalization.

Q&A

Would you write the code for I18N on the whiteboard before you go?

#define UNICODE
#import I18N.h

