Cupertino, CA, USA / September, 2000 First ICU Developer Workshop
ICU Low-level Utilities and Resource Management
Vladimir Weinstein
Globalization Center of Competency
IBM Emerging Technology Center

Agenda
• What is the locale model in ICU?
• What is the string-related interface in ICU, and how does one use it?
• A quick summary of the other low-level utility interfaces that provide overall support to ICU components.
• A quick summary of resource bundle support: what kind of data can be stored and retrieved natively?
• What is the system locale format? What is the future direction of this support?
• Migration from the 1.4 model.
• What else can one use resource bundles for?

Introduction
• Applications have their own resources.
• To globalize an application, we need a way to switch easily between different languages and customs.
• Translators should be able to customize an application independently for different markets.
• A way to uniquely identify the place of execution is also needed.

Locale Model in ICU
• A locale is synonymous with a user community.
• In ICU, a locale is specified by a Locale object. A Locale object is just an identifier, as opposed to the POSIX locale concept.
• The default locale is the locale that users set for their machine. Applications should not change the system default locale.
• A locale can also be specified at run time, and users should be able to switch locales dynamically.
• Many ICU functions are locale dependent.

Locale Naming Conventions
• "Language_Country_Variant".
• Language is an ISO-639 identifier – en, es, fr.
• Country is an ISO-3166 identifier – ES, MX, US, FR.
• Variant is user defined – EURO, NY.
• Example: en, en_IE, en_IE_EURO.
• The more parts a locale name has, the more specific it is.

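To make the naming convention concrete, here is a minimal sketch that splits a locale name into its three parts. It uses only the C++ standard library; `LocaleParts` and `splitLocaleName` are hypothetical names chosen for illustration and are not ICU APIs. ICU's own Locale class already exposes these parts (for example through getLanguage(), getCountry() and getVariant()), so this is only meant to show how the underscores delimit the fields.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration; not part of ICU.
struct LocaleParts {
    std::string language;  // ISO-639 code, e.g. "en"
    std::string country;   // ISO-3166 code, e.g. "IE"
    std::string variant;   // user defined, e.g. "EURO"
};

// Split "Language_Country_Variant" at the first two underscores.
// Parts that are missing from the name are left empty.
LocaleParts splitLocaleName(const std::string& name) {
    LocaleParts parts;
    const std::size_t first = name.find('_');
    parts.language = name.substr(0, first);
    if (first == std::string::npos) {
        return parts;  // language only, e.g. "en"
    }
    const std::size_t second = name.find('_', first + 1);
    if (second == std::string::npos) {
        parts.country = name.substr(first + 1);  // e.g. "en_IE"
        return parts;
    }
    parts.country = name.substr(first + 1, second - first - 1);
    parts.variant = name.substr(second + 1);     // e.g. "en_IE_EURO"
    return parts;
}
```

Under these assumptions, "en_IE_EURO" yields language "en", country "IE" and variant "EURO", while "en" leaves country and variant empty.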
Example

                   root                    Root locale
                    |
     +---------+----+----+-------+
     |         |         |       |
     en        de        ja      ru        Language
     |         |         |       |
   +-+-+     +-+-+       |       |
   |   |     |   |       |       |
   US  IE    DE  AT      JP      RU        Country
       |     |   |
      EURO EURO EURO                       Variant

Resource Bundle Concept
• A resource bundle is a repository of data that an application uses.
• Resource bundles contain application data specific to different locales.
• All the items in a resource bundle can be accessed by the application.
• Resource bundle support is implemented in C (through the UResourceBundle structure), while the C++ class (ResourceBundle) is only a thin wrapper.

Fallback
• A resource bundle should contain only the data specific to the locale it is used for.
• The root resource bundle contains all the data. It can contain data that does not need to be localized.
• All locales descend from root and override information in root.
• Two types of fallback:
  – Locale level
  – Resource level

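The two kinds of fallback can be sketched without ICU. The code below is an illustration under stated assumptions, not ICU's implementation: `fallbackChain`, `lookup` and the `Bundles` map are hypothetical names, compiled bundles are replaced by an in-memory map, and ICU's actual lookup may include steps not shown here. Locale-level fallback is read as "this locale has no bundle at all, try its parent"; resource-level fallback as "the bundle exists but lacks this item, try the parent".

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Locale-level chain: "en_IE_EURO" -> "en_IE" -> "en" -> "root".
std::vector<std::string> fallbackChain(std::string locale) {
    std::vector<std::string> chain;
    while (!locale.empty() && locale != "root") {
        chain.push_back(locale);
        const std::size_t cut = locale.rfind('_');
        if (cut == std::string::npos) {
            break;            // no more parts to strip
        }
        locale.erase(cut);    // drop the most specific part
    }
    chain.push_back("root");  // every locale descends from root
    return chain;
}

// Hypothetical stand-in for compiled bundles: locale -> key -> value.
using Bundles = std::map<std::string, std::map<std::string, std::string>>;

// Return the value from the most specific bundle in the chain that
// defines the key, or notFound if no bundle in the chain defines it.
std::string lookup(const Bundles& bundles, const std::string& locale,
                   const std::string& key, const std::string& notFound = "") {
    for (const std::string& name : fallbackChain(locale)) {
        const auto bundle = bundles.find(name);
        if (bundle == bundles.end()) {
            continue;  // locale-level fallback: no bundle for this locale
        }
        const auto item = bundle->second.find(key);
        if (item != bundle->second.end()) {
            return item->second;
        }
        // resource-level fallback: bundle exists but lacks the item
    }
    return notFound;
}
```

This is why a locale bundle only needs to hold what differs from its parent: anything it omits is found further up the chain.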
Resource Bundles Support
• Several native data types:
  – Complex types:
    • Tables
    • Arrays
  – Scalar types:
    • Strings
    • Integers
    • Binaries (imported files and hex strings)

Resource Bundle Format
• Resource bundles use their own format at the moment.
• At the top level, there is a table:
    locale_name {
        … // data
    }
• An XML format is planned for the future.

Resource Bundle Format
• Tables:
    menu {
        file {
            name { "&File" }
            items {
                open { "&Open" }
                save { "&save" }
                exit { "&exit" }
            }
        }
    }

Resource Bundle Format
• Arrays:
    errors {
        "Invalid Command",
        "Bad Value",
        ...
        "Read the Manual"
    }

Resource Bundle Format
• The most frequent type of data in a resource bundle is a string, that is, an array of Unicode characters (UChar).
    simplestring { "A string" }
    escapedstring { "This string has some unicode characters: \u0408\u0443\u043d\u0438\u043a\u043e\u0434" }
    encodedstring { "This string has some encoded characters Јуникод" }

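The escaped and the encoded form above describe the same seven characters. As a rough illustration of how \uXXXX escapes map to characters, the following sketch replaces each escape with its UTF-8 encoding. `appendUtf8` and `unescape` are hypothetical helpers, not ICU functions, and the sketch handles only code points in the Basic Multilingual Plane; surrogate pairs are not combined.

```cpp
#include <cstddef>
#include <string>

// Append one code point in the range U+0000..U+FFFF to out as UTF-8.
void appendUtf8(std::string& out, unsigned cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replace every well-formed \uXXXX escape (exactly four hex digits)
// with its UTF-8 encoding; all other text is copied through unchanged.
std::string unescape(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "\\u") == 0 && i + 6 <= in.size()) {
            const std::string hex = in.substr(i + 2, 4);
            if (hex.find_first_not_of("0123456789abcdefABCDEF") == std::string::npos) {
                appendUtf8(out, static_cast<unsigned>(std::stoul(hex, nullptr, 16)));
                i += 6;
                continue;
            }
        }
        out += in[i];
        ++i;
    }
    return out;
}
```

Applied to the escape sequence in escapedstring, this produces the UTF-8 bytes of the Cyrillic word shown in encodedstring.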
Resource Bundle Format
• Integers:
    versionInfo { // a table of integers
        major:int { 1 }
        minor:int { 4 }
        patch:int { 7 }
    }

Resource Bundle Format
• Binaries:
  – Imported files
      splash:import { "splash_root.gif" }
  – Typed values
      pgpkey:bin { a1b2c3d4e5f67890 }

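A `:bin` value is written as a run of hex digits, two per byte. The sketch below shows one plausible way to turn such a run into bytes. It is an illustration only: `hexDigitValue` and `hexToBytes` are hypothetical names, and how genrb itself treats odd-length input or embedded whitespace is not stated on the slide, so this version simply rejects both.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Value of one hex digit, or -1 if c is not a hex digit.
int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Convert "a1b2..." into the bytes 0xa1, 0xb2, ...
// Throws std::invalid_argument on odd length or non-hex characters.
std::vector<std::uint8_t> hexToBytes(const std::string& hex) {
    if (hex.size() % 2 != 0) {
        throw std::invalid_argument("odd number of hex digits");
    }
    std::vector<std::uint8_t> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = hexDigitValue(hex[i]);
        const int lo = hexDigitValue(hex[i + 1]);
        if (hi < 0 || lo < 0) {
            throw std::invalid_argument("not a hex digit");
        }
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    return bytes;
}
```

With this reading, the pgpkey example above holds eight bytes, starting with 0xa1 and ending with 0x90.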
Using Resource Bundles
• Initialization and disposal:
C++
    UErrorCode status = U_ZERO_ERROR;
    ResourceBundle rb1((char *)0, Locale("root"), status);  // open the root bundle
    ResourceBundle rb2 = rb1.get("Countries", status);      // sub-resource by key
    ResourceBundle rb3 = rb1.get(5, status);                // sub-resource by index
    ResourceBundle rb4 = rb1.getNext(status);               // next sub-resource in iteration
C
    UErrorCode status = U_ZERO_ERROR;
    UResourceBundle *rb1 = ures_open(NULL, "root", &status);
    UResourceBundle *rb2 = ures_getByKey(rb1, "Countries", NULL, &status);
    ures_getByIndex(rb1, 5, rb2, &status);    /* reuses rb2 as the fill-in */
    ures_getNextResource(rb1, rb2, &status);
    /* ... */
    ures_close(rb1);
    ures_close(rb2);

Using Resource Bundles
• Accessing individual strings:
C++
    UnicodeString s;
    UErrorCode status = U_ZERO_ERROR;
    s = rb.getStringEx("LocaleString", status);
    s = rb.getStringEx(5, status);
    s = rb.getString(status);      // works only for string resources
    s = rb.getNextString(status);  // iteration
C
    const UChar *s;
    UErrorCode status = U_ZERO_ERROR;
    s = ures_getStringByKey(rb, "LocaleString", &status);
    s = ures_getStringByIndex(rb, 5, &status);
    s = ures_getString(rb, &status);      /* works only for string resources */
    s = ures_getNextString(rb, &status);  /* iteration */

Using Resource Bundles
• Accessing resources within complex resources by key (can be used only on tables):
C++
    ResourceBundle rb2 = rb1.get("Countries", status);
    UnicodeString s = rb1.getStringEx("LocaleString", status);
C
    rb2 = ures_getByKey(rb1, "Countries", NULL, &status);  /* keep the result so it can be closed */
    s = ures_getStringByKey(rb1, "LocaleString", &status);

Using Resource Bundles
• Accessing resources within complex resources by index:
C++
    int32_t size = rb1.getSize();
    for (int32_t i = 0; i < size; i++) {
        ResourceBundle rb3 = rb1.get(i, status);
        UnicodeString s = rb1.getStringEx(i, status);  // if the resource is a string
    }
C
    int32_t size = ures_getSize(rb1);
    int32_t i = 0;
    for (i = 0; i < size; i++) {
        ures_getByIndex(rb1, i, rb2, &status);
        s = ures_getStringByIndex(rb1, i, &status);  /* if the resource is a string */
    }

Using Resource Bundles
• Iterating over complex resources:
C++
    rb1.resetIterator();
    while (rb1.hasNext()) {
        rb2 = rb1.getNext(status);  // or UnicodeString us = rb1.getNextString(status);
    }
C
    ures_resetIterator(rb1);
    while (ures_hasNext(rb1)) {
        ures_getNextResource(rb1, rb2, &status);  /* or uc = ures_getNextString(rb1, &status); */
    }

Using Resource Bundles
• Accessing other scalar types:
C++
    int32_t len = 0;
    rb2 = rb1.get("A_binary_resource", status);
    const uint8_t *binaryData = rb2.getBinary(len, status);
C
    int32_t len = 0;
    const uint8_t *binaryData = NULL;
    int32_t number;
    ures_getByKey(rb1, "A_binary_resource", rb2, &status);
    binaryData = ures_getBinary(rb2, &len, &status);
    ures_getByKey(rb1, "An_integer_resource", rb2, &status);
    number = ures_getInt(rb2, &status);

Preparing Resource Bundles
• During the build process, source resource bundles have to be compiled from the source format into binary format.
• The tool to use is genrb.
• Syntax:
    genrb [-s source_directory] [-d destination_directory] [-e encoding] [-v] [-V] [-h] genrb_source_file
• Example:
    genrb -s /home/weiv/dev/icu/data -d /home/weiv/icuinstall/data root.txt

Preparing Resource Bundles
• Encoding of resource bundle files:
  – invariant characters plus Unicode values (ICU data files are stored like this),
  – UTF-16LE, UTF-16BE, or UTF-8, provided that the BOM is written at the very beginning of the resource bundle file,
  – after ICU is built, resource bundles can use any encoding that ICU supports; the encoding must be specified during the build process.
• genrb compiles a resource bundle from .txt format to .res format, which is directly usable by ICU.
• Furthermore, .res files can be packed with other data files into memory-mapped files or DLLs.

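The BOM requirement exists so that a reader can tell the three Unicode encodings apart from the first bytes of the file. The sketch below illustrates that check; it is not genrb's code, and `BomEncoding` and `detectBom` are hypothetical names. It covers only the three encodings listed above, so a UTF-32LE file, whose BOM also begins with FF FE, would be reported as UTF-16LE.

```cpp
#include <cstddef>
#include <string>

enum class BomEncoding { None, Utf8, Utf16LE, Utf16BE };

// Inspect the first bytes of a file's contents and report which
// byte order mark, if any, they form.
BomEncoding detectBom(const std::string& data) {
    const auto byteAt = [&data](std::size_t i) {
        return static_cast<unsigned char>(data[i]);
    };
    if (data.size() >= 3 &&
        byteAt(0) == 0xEF && byteAt(1) == 0xBB && byteAt(2) == 0xBF) {
        return BomEncoding::Utf8;
    }
    if (data.size() >= 2 && byteAt(0) == 0xFF && byteAt(1) == 0xFE) {
        return BomEncoding::Utf16LE;
    }
    if (data.size() >= 2 && byteAt(0) == 0xFE && byteAt(1) == 0xFF) {
        return BomEncoding::Utf16BE;
    }
    return BomEncoding::None;  // no BOM: treat as invariant characters or use the specified encoding
}
```

A file without a BOM gives the reader nothing to go on, which is why the other encodings have to be named explicitly at build time.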
Migration from 1.4 model
• const UnicodeString* get2dArrayItem(const char *resourceTag, int32_t rowIndex, int32_t columnIndex, UErrorCode& err) const;
Equivalent code (error checking intentionally omitted):
    int32_t row = 3, col = 4;
    ResourceBundle zonestrings = rb1.get("zoneStrings", status);
    ResourceBundle zone = zonestrings.get(row, status);
    UnicodeString data = zone.getStringEx(col, status);

Migration from 1.4 model
• const UnicodeString** get2dArray(const char *resourceTag, int32_t& rowCount, int32_t& columnCount, UErrorCode& err) const;
Equivalent code (for rectangular array):
    int32_t zonesize = 0, zoneis = 0, j = 0, i = 0;
    UnicodeString** zones = NULL;
    ResourceBundle zonestrings = rb1.get("zoneStrings", status);
    zonesize = zonestrings.getSize();
    zoneis = 0;
    zones = new UnicodeString*[zonesize];
    for (i = 0; i < zonesize; i++) {
        ResourceBundle zone = zonestrings.get(i, status);
        zoneis = zone.getSize();
        zones[i] = new UnicodeString[zoneis];
        for (j = 0; j < zoneis; j++) {
            zones[i][j] = zone.getStringEx(j, status);
        }
    }

Migration from 1.4 model
• const UnicodeString* getTaggedArrayItem(const char *resourceTag, const UnicodeString& itemTag, UErrorCode& err) const;
Equivalent code:
    const char *item = "US";
    ResourceBundle countries = rb1.get("Countries", status);
    UnicodeString countryUS = countries.getStringEx(item, status);
• void getTaggedArray(const char *resourceTag, UnicodeString*& itemTags, UnicodeString*& items, int32_t& numItems, UErrorCode& err) const;
Equivalent code:
    ResourceBundle tagarray = rb1.get("Countries", status);
    int32_t tagsize = tagarray.getSize();
    UnicodeString *items = new UnicodeString[tagsize];
    const char **itemTags = new const char*[tagsize];
    const char *key = 0;
    int32_t i = 0;
    for (i = 0; i < tagsize; i++) {
        ResourceBundle tagitem = tagarray.get(i, status);
        items[i] = tagitem.getString(status);
        itemTags[i] = tagitem.getKey();
    }

Resource Bundle Usage
• Rewrite array getters to use iteration
• Short complete example

Low-level Utility Interface
• ICU applications can use different kinds of data through the udata API.
• Its purpose is portable and fast data access.
• Several ways of organizing data:
  – memory-mapped files
  – DLLs
• Data should be written out according to portability rules.

Low-level Utility Interface
• Data is accessed through the following APIs:
  – UDataMemory* udata_open(const char *path, const char *type, const char *name, UErrorCode *pErrorCode)
  – UDataMemory* udata_openChoice(const char *path, const char *type, const char *name, UDataMemoryIsAcceptable *isAcceptable, void *context, UErrorCode *pErrorCode)
  – void udata_close(UDataMemory *pData)
  – const void* udata_getMemory(UDataMemory *pData)
  – void udata_getInfo(UDataMemory *pData, UDataInfo *pInfo)
  – void udata_setCommonData(const void *data, UErrorCode *err)

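The isAcceptable parameter of udata_openChoice lets the caller decide whether a data item found on disk is usable on the current platform. The sketch below shows the kind of check such a callback might perform. It is an assumption-laden illustration, not the udata API: `DataInfo` is a hypothetical, trimmed-down stand-in for the header information (the real UDataInfo layout may differ), and `hostIsBigEndian` and `isAcceptable` are illustrative names with a simplified signature.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the header information attached to a data
// item; field names are modeled loosely on what such a header records.
struct DataInfo {
    bool isBigEndian;               // byte order the data was written in
    std::uint8_t sizeofUChar;       // size of one UChar in the data
    char dataFormat[4];             // four-character format tag
    std::uint8_t formatVersion[4];  // major, minor, milli, micro
};

// Detect the byte order of the machine running this code.
bool hostIsBigEndian() {
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 0;
}

// Accept the data only if it matches the host's byte order, uses
// 2-byte UChars, carries the expected format tag, and has the expected
// major format version. expectedFormat must point to at least 4 chars.
bool isAcceptable(const DataInfo& info, const char* expectedFormat,
                  std::uint8_t expectedMajor) {
    return info.isBigEndian == hostIsBigEndian()
        && info.sizeofUChar == 2
        && std::memcmp(info.dataFormat, expectedFormat, 4) == 0
        && info.formatVersion[0] == expectedMajor;
}
```

Recording properties like byte order in the data itself is one way to satisfy the portability rules: a mismatched file is rejected up front instead of being misread.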
Udata API usage
• Short complete example, writing out data and retrieving it.