From 76d23e639004517db8f9469d64ac1789f8449365 Mon Sep 17 00:00:00 2001
From: Boris Kolpackov
Date: Thu, 7 Jan 2010 13:50:11 +0200
Subject: Add support for ISO-8859-1 as application encoding

New runtime configuration parameter, XSDE_ENCODING. New option, --char-encoding. New test, tests/cxx/hybrid/iso8859-1.
---
 documentation/cxx/hybrid/guide/index.xhtml     | 37 +++++++++++++++++++-------
 documentation/cxx/parser/guide/index.xhtml     | 23 ++++++++++++++--
 documentation/cxx/serializer/guide/index.xhtml | 11 +++++---
 documentation/xsde.1                           | 13 +++++++++
 documentation/xsde.xhtml                       | 11 ++++++++
 5 files changed, 80 insertions(+), 15 deletions(-)

(limited to 'documentation')

diff --git a/documentation/cxx/hybrid/guide/index.xhtml b/documentation/cxx/hybrid/guide/index.xhtml
index 8c46932..dead3dc 100644
--- a/documentation/cxx/hybrid/guide/index.xhtml
+++ b/documentation/cxx/hybrid/guide/index.xhtml
@@ -1265,15 +1265,34 @@ main (int argc, char* argv[])
Compiler Command Line Manual.

- While the XML documents can use various encodings, the Embedded
- C++/Hybrid mapping always delivers character data to the application
- in the UTF-8 encoding. The underlying XML parser used by the mapping
- includes built-in support for XML documents encoded in UTF-8, UTF-16,
- ISO-8859-1, and US-ASCII. Other encodings can be supported by providing
- application-specific decoder functions. C++/Hybrid also expects character
- data supplied by the application to be in the UTF-8 encoding. The
- underlying XML serializer used by the mapping produces the resulting
- XML in the UTF-8 encoding as well.

+ While the XML documents can use various encodings, the C++/Hybrid
+ object model always stores character data in the same encoding,
+ called application encoding. The application encoding can either be
+ UTF-8 (default) or ISO-8859-1. To select a particular encoding, configure
+ the XSD/e runtime library accordingly and pass the --char-encoding
+ option to the XSD/e compiler when translating your schemas.

+
+ When using ISO-8859-1 as the application encoding, XML documents
+ being parsed may contain characters with Unicode values greater
+ than 0xFF which are unrepresentable in the ISO-8859-1 encoding.
+ By default, in such situations parsing will terminate with
+ an error. However, you can suppress the error by providing a
+ replacement character that should be used instead of
+ unrepresentable characters, for example:

+
+
+xml_schema::iso8859_1::unrep_char ('?');
+
+
+ To revert to the default behavior, set the replacement character
+ to '\0'.

+
+ The underlying XML parser used by the mapping includes built-in
+ support for XML documents encoded in UTF-8, UTF-16, ISO-8859-1,
+ and US-ASCII. Other encodings can be supported by providing
+ application-specific decoder functions. The underlying XML
+ serializer used by C++/Hybrid produces the resulting
+ XML documents in the UTF-8 encoding.

3.1 Standard Template Library

diff --git a/documentation/cxx/parser/guide/index.xhtml b/documentation/cxx/parser/guide/index.xhtml
index 305c420..6a019c5 100644
--- a/documentation/cxx/parser/guide/index.xhtml
+++ b/documentation/cxx/parser/guide/index.xhtml
@@ -1991,8 +1991,27 @@ age: 28

While the XML documents can use various encodings, the Embedded C++/Parser mapping always delivers character data to the application
- in the UTF-8 encoding. The underlying XML parser used by the
- Embedded C++/Parser mapping includes built-in support for XML
+ in the same encoding. The application encoding can either be UTF-8
+ (default) or ISO-8859-1. To select a particular encoding, configure
+ the XSD/e runtime library accordingly and pass the --char-encoding
+ option to the XSD/e compiler when translating your schemas.

+
+ When using ISO-8859-1 as the application encoding, XML documents
+ being parsed may contain characters with Unicode values greater
+ than 0xFF which are unrepresentable in the ISO-8859-1 encoding.
+ By default, in such situations parsing will terminate with
+ an error. However, you can suppress the error by providing a
+ replacement character that should be used instead of
+ unrepresentable characters, for example:

+
+
+xml_schema::iso8859_1::unrep_char ('?');
+
+
+ To revert to the default behavior, set the replacement character
+ to '\0'.

+
+ The Embedded C++/Parser mapping includes built-in support for XML
documents encoded in UTF-8, UTF-16, ISO-8859-1, and US-ASCII. Other encodings can be supported by providing application-specific decoder functions.

diff --git a/documentation/cxx/serializer/guide/index.xhtml b/documentation/cxx/serializer/guide/index.xhtml
index 5abb31f..34ef798 100644
--- a/documentation/cxx/serializer/guide/index.xhtml
+++ b/documentation/cxx/serializer/guide/index.xhtml
@@ -2609,10 +2609,13 @@ private:

The Embedded C++/Serializer mapping always expects character data
- supplied by the application to be in the UTF-8 encoding. The
- underlying XML serializer used by the Embedded C++/Serializer
- mapping produces the resulting XML in the UTF-8 encoding as
- well.
+ supplied by the application to be in the same encoding. The
+ application encoding can either be UTF-8 (default) or ISO-8859-1.
+ To select a particular encoding, configure the XSD/e runtime library
+ accordingly and pass the --char-encoding option to the
+ XSD/e compiler when translating your schemas. The underlying XML
+ serializer used by the Embedded C++/Serializer mapping produces
+ the resulting XML documents in the UTF-8 encoding.

6.1 Standard Template Library

diff --git a/documentation/xsde.1 b/documentation/xsde.1
index 0c0dc72..afafa8c 100644
--- a/documentation/xsde.1
+++ b/documentation/xsde.1
@@ -175,6 +175,19 @@ Write generated files to
.I dir
instead of the current directory.
+.IP "\fB\--char-encoding \fIenc\fR"
+Specify the application character encoding. Valid values are
+.B utf8
+(default) and
+.BR iso8859-1 .
+Note that this encoding is not the same as the XML document encoding
+that is being parsed or serialized. Rather, it is the encoding that
+is used inside the application. When an XML document is parsed, the
+character data is automatically converted to the application encoding.
+Similarly, when an XML document is serialized, the data in the
+application encoding is automatically converted to the resulting
+document encoding.
+
.IP "\fB\--no-stl\fR"
Generate code that does not use the Standard Template Library (STL).
diff --git a/documentation/xsde.xhtml b/documentation/xsde.xhtml
index 9328e99..31c2f03 100644
--- a/documentation/xsde.xhtml
+++ b/documentation/xsde.xhtml
@@ -166,6 +166,17 @@
Write generated files to dir instead of the current directory.
+ --char-encoding enc
+ Specify the application character encoding. Valid values are
+ utf8 (default) and iso8859-1.
+ Note that this encoding is not the same as the XML document encoding
+ that is being parsed or serialized. Rather, it is the encoding that
+ is used inside the application. When an XML document is parsed, the
+ character data is automatically converted to the application encoding.
+ Similarly, when an XML document is serialized, the data in the
+ application encoding is automatically converted to the resulting
+ document encoding.
+
--no-stl
Generate code that does not use the Standard Template Library (STL).
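
Note (not part of the patch): the replacement-character rule documented in the guide changes above can be illustrated with a small standalone C++ sketch. It does not use or reproduce the XSD/e runtime. The function name utf8_to_iso8859_1, its signature, and the exception types are hypothetical choices made for illustration. The sketch only mirrors the documented behavior: a character with a Unicode value greater than 0xFF is an error unless a replacement character other than '\0' is supplied.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, NOT part of XSD/e: convert UTF-8 text to ISO-8859-1.
// unrep == '\0' means "no replacement configured", so an unrepresentable
// character (code point > 0xFF) raises an error, as the guide describes.
// Overlong UTF-8 forms are not rejected in this simplified sketch.
std::string
utf8_to_iso8859_1 (const std::string& in, char unrep = '\0')
{
  std::string out;

  for (std::size_t i = 0; i < in.size ();)
  {
    unsigned char c = static_cast<unsigned char> (in[i]);
    unsigned long cp; // decoded code point
    std::size_t n;    // length of the UTF-8 sequence in bytes

    if (c < 0x80)                { cp = c;        n = 1; }
    else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; n = 2; }
    else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; n = 3; }
    else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; n = 4; }
    else
      throw std::invalid_argument ("invalid UTF-8 lead byte");

    if (i + n > in.size ())
      throw std::invalid_argument ("truncated UTF-8 sequence");

    // Fold in the continuation bytes (10xxxxxx), six bits each.
    for (std::size_t j = 1; j < n; ++j)
    {
      unsigned char t = static_cast<unsigned char> (in[i + j]);

      if ((t & 0xC0) != 0x80)
        throw std::invalid_argument ("invalid UTF-8 continuation byte");

      cp = (cp << 6) | (t & 0x3F);
    }

    i += n;

    if (cp <= 0xFF)
      out += static_cast<char> (static_cast<unsigned char> (cp));
    else if (unrep != '\0')
      out += unrep; // substitute the configured replacement character
    else
      throw std::range_error ("character unrepresentable in ISO-8859-1");
  }

  return out;
}
```

With this sketch, passing '?' as the second argument plays the role of the unrep_char ('?') call shown in the guide, and omitting it (or passing '\0') corresponds to the default error behavior.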