author Boris Kolpackov <boris@codesynthesis.com> 2009-12-08 16:18:01 +0200
committer Boris Kolpackov <boris@codesynthesis.com> 2009-12-08 16:18:01 +0200
commit 1ca6396a3dd284241de11bcaa210ad5836e8e5a8
tree 465c19f0d668a91bb556d748911847acfb80cb09 /documentation/cxx/tree
parent d71611d5fb575078bdf573c35257bb86bb7054e0
Multiple object model character encodings support
Also add support for ISO-8859-1.
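
A minimal C++ sketch of the wchar_t rule stated in the documentation text touched by this commit: the wide-character encoding follows from the size of wchar_t. The function names encoding_for_wchar_size and native_wide_encoding are hypothetical, chosen for illustration only; they are not part of XSD or Xerces-C++, and the "unknown" fallback is an assumption for sizes the documentation does not mention.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an XSD API): map a wchar_t size in bytes to the
// encoding the documentation describes for that size.
inline std::string encoding_for_wchar_size(std::size_t size_in_bytes)
{
  switch (size_in_bytes)
  {
  case 2:  return "UTF-16";        // e.g. Visual C++ on Windows, IBM XL C++ on AIX
  case 4:  return "UTF-32/UCS-4";  // other platforms, per the documentation
  default: return "unknown";       // assumption: other sizes are not covered by the text
  }
}

// Encoding implied for the platform this code is compiled on.
inline std::string native_wide_encoding()
{
  return encoding_for_wchar_size(sizeof(wchar_t));
}
```

On a platform where wchar_t is 2 bytes long, native_wide_encoding() returns "UTF-16"; where it is 4 bytes long, it returns "UTF-32/UCS-4". This concerns the wchar_t case only; for char the encoding is chosen with the --char-encoding option instead.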
Diffstat (limited to 'documentation/cxx/tree')
-rw-r--r-- documentation/cxx/tree/guide/index.xhtml | 25
-rw-r--r-- documentation/cxx/tree/manual/index.xhtml | 18
2 files changed, 34 insertions, 9 deletions
diff --git a/documentation/cxx/tree/guide/index.xhtml b/documentation/cxx/tree/guide/index.xhtml
index 787610a..f96b09b 100644
--- a/documentation/cxx/tree/guide/index.xhtml
+++ b/documentation/cxx/tree/guide/index.xhtml
@@ -226,7 +226,7 @@
<tr>
<th>3</th><td><a href="#3">Overall Mapping Configuration</a>
<table class="toc">
- <tr><th>3.1</th><td><a href="#3.1">Character Type</a></td></tr>
+ <tr><th>3.1</th><td><a href="#3.1">Character Type and Encoding</a></td></tr>
<tr><th>3.2</th><td><a href="#3.2">Support for Polymorphism </a></td></tr>
<tr><th>3.3</th><td><a href="#3.3">Namespace Mapping</a></td></tr>
<tr><th>3.4</th><td><a href="#3.4">Thread Safety</a></td></tr>
@@ -1148,7 +1148,7 @@ $ doxygen hello.doxygen
Compiler Command Line Manual</a>.
</p>
- <h2><a name="3.1">3.1 Character Type</a></h2>
+ <h2><a name="3.1">3.1 Character Type and Encoding</a></h2>
<p>The C++/Tree mapping has built-in support for two character types:
<code>char</code> and <code>wchar_t</code>. You can select the
@@ -1160,14 +1160,25 @@ $ doxygen hello.doxygen
<p>Another aspect of the mapping that depends on the character type
is character encoding. For the <code>char</code> character type
- the encoding is UTF-8. For the <code>wchar_t</code> character type
- the encoding is automatically selected between UTF-16 and
- UTF-32/UCS-4 depending on the size of the <code>wchar_t</code> type.
- On some platforms (for example, Windows with Visual C++ and AIX with IBM XL
- C++) <code>wchar_t</code> is 2 bytes long. For these platforms the
+ the default encoding is UTF-8. Other supported encodings are
+ ISO-8859-1, Xerces-C++ Local Code Page (LCP), as well as
+ custom encodings. You can select which encoding should be used
+ in the object model with the <code>--char-encoding</code> command
+ line option.</p>
+
+ <p>For the <code>wchar_t</code> character type the encoding is
+ automatically selected between UTF-16 and UTF-32/UCS-4 depending
+ on the size of the <code>wchar_t</code> type. On some platforms
+ (for example, Windows with Visual C++ and AIX with IBM XL C++)
+ <code>wchar_t</code> is 2 bytes long. For these platforms the
encoding is UTF-16. On other platforms <code>wchar_t</code> is 4 bytes
long and UTF-32/UCS-4 is used.</p>
+ <p>Note also that the character encoding that is used in the object model
+ is independent of the encodings used in input and output XML. In fact,
+ all three (object model, input XML, and output XML) can have different
+ encodings.</p>
+
<h2><a name="3.2">3.2 Support for Polymorphism</a></h2>
<p>By default XSD generates non-polymorphic code. If your vocabulary
diff --git a/documentation/cxx/tree/manual/index.xhtml b/documentation/cxx/tree/manual/index.xhtml
index d468fe3..91c6154 100644
--- a/documentation/cxx/tree/manual/index.xhtml
+++ b/documentation/cxx/tree/manual/index.xhtml
@@ -226,7 +226,7 @@
<th>2.1</th><td><a href="#2.1">Preliminary Information</a>
<table class="toc">
<tr><th>2.1.1</th><td><a href="#2.1.1">Identifiers</a></td></tr>
- <tr><th>2.1.2</th><td><a href="#2.1.2">Character Type</a></td></tr>
+ <tr><th>2.1.2</th><td><a href="#2.1.2">Character Type and Encoding</a></td></tr>
<tr><th>2.1.3</th><td><a href="#2.1.3">XML Schema Namespace</a></td></tr>
<tr><th>2.1.4</th><td><a href="#2.1.4">Anonymous Types</a></td></tr>
</table>
@@ -567,7 +567,7 @@
CONVENTION section in the <a href="http://www.codesynthesis.com/projects/xsd/documentation/xsd.xhtml">XSD
Compiler Command Line Manual</a>.</p>
- <h3><a name="2.1.2">2.1.2 Character Type</a></h3>
+ <h3><a name="2.1.2">2.1.2 Character Type and Encoding</a></h3>
<p>The code that implements the mapping, depending on the
<code>--char-type</code> option, is generated using either
@@ -577,6 +577,20 @@
your schemas, for example <code>std::basic_string&lt;C></code>.
</p>
+ <p>Another aspect of the mapping that depends on the character type
+ is character encoding. For the <code>char</code> character type
+ the default encoding is UTF-8. Other supported encodings are
+ ISO-8859-1, Xerces-C++ Local Code Page (LCP), as well as
+ custom encodings. You can select the encoding with the
+ <code>--char-encoding</code> command line option.</p>
+
+ <p>For the <code>wchar_t</code> character type the encoding is
+ automatically selected between UTF-16 and UTF-32/UCS-4 depending
+ on the size of the <code>wchar_t</code> type. On some platforms
+ (for example, Windows with Visual C++ and AIX with IBM XL C++)
+ <code>wchar_t</code> is 2 bytes long. For these platforms the
+ encoding is UTF-16. On other platforms <code>wchar_t</code> is 4 bytes
+ long and UTF-32/UCS-4 is used.</p>
<h3><a name="2.1.3">2.1.3 XML Schema Namespace</a></h3>