Multiple object model character encodings support

Also add support for ISO-8859-1.
author: Boris Kolpackov <boris@codesynthesis.com> 2009-12-08 16:18:01 +0200
committer: Boris Kolpackov <boris@codesynthesis.com> 2009-12-08 16:18:01 +0200
commit: 1ca6396a3dd284241de11bcaa210ad5836e8e5a8 (patch)
tree: 465c19f0d668a91bb556d748911847acfb80cb09 /documentation/cxx/tree
parent: d71611d5fb575078bdf573c35257bb86bb7054e0 (diff)
2 files changed, 34 insertions, 9 deletions
diff --git a/documentation/cxx/tree/guide/index.xhtml b/documentation/cxx/tree/guide/index.xhtml
index 787610a..f96b09b 100644
--- a/documentation/cxx/tree/guide/index.xhtml
+++ b/documentation/cxx/tree/guide/index.xhtml
@@ -226,7 +226,7 @@
     <tr>
       <th>3</th><td><a href="#3">Overall Mapping Configuration</a>
         <table class="toc">
-          <tr><th>3.1</th><td><a href="#3.1">Character Type</a></td></tr>
+          <tr><th>3.1</th><td><a href="#3.1">Character Type and Encoding</a></td></tr>
           <tr><th>3.2</th><td><a href="#3.2">Support for Polymorphism </a></td></tr>
           <tr><th>3.3</th><td><a href="#3.3">Namespace Mapping</a></td></tr>
           <tr><th>3.4</th><td><a href="#3.4">Thread Safety</a></td></tr>
@@ -1148,7 +1148,7 @@ $ doxygen hello.doxygen
      Compiler Command Line Manual</a>.
   </p>
 
-  <h2><a name="3.1">3.1 Character Type</a></h2>
+  <h2><a name="3.1">3.1 Character Type and Encoding</a></h2>
 
   <p>The C++/Tree mapping has built-in support for two character types:
     <code>char</code> and <code>wchar_t</code>. You can select the
@@ -1160,14 +1160,25 @@ $ doxygen hello.doxygen
 
   <p>Another aspect of the mapping that depends on the character type
      is character encoding. For the <code>char</code> character type
-     the encoding is UTF-8. For the <code>wchar_t</code> character type
-     the encoding is automatically selected between UTF-16 and
-     UTF-32/UCS-4 depending on the size of the <code>wchar_t</code> type.
-     On some platforms (for example, Windows with Visual C++ and AIX with IBM XL
-     C++) <code>wchar_t</code> is 2 bytes long. For these platforms the
+     the default encoding is UTF-8. Other supported encodings are
+     ISO-8859-1, Xerces-C++ Local Code Page (LCP), as well as
+     custom encodings. You can select which encoding should be used
+     in the object model with the <code>--char-encoding</code> command
+     line option.</p>
+
+  <p>For the <code>wchar_t</code> character type the encoding is
+     automatically selected between UTF-16 and UTF-32/UCS-4 depending
+     on the size of the <code>wchar_t</code> type. On some platforms
+     (for example, Windows with Visual C++ and AIX with IBM XL C++)
+     <code>wchar_t</code> is 2 bytes long. For these platforms the
      encoding is UTF-16. On other platforms <code>wchar_t</code> is 4 bytes
      long and UTF-32/UCS-4 is used.</p>
 
+  <p>Note also that the character encoding that is used in the object model
+     is independent of the encodings used in input and output XML. In fact,
+     all three (object model, input XML, and output XML) can have different
+     encodings.</p>
+
   <h2><a name="3.2">3.2 Support for Polymorphism</a></h2>
 
   <p>By default XSD generates non-polymorphic code. If your vocabulary
diff --git a/documentation/cxx/tree/manual/index.xhtml b/documentation/cxx/tree/manual/index.xhtml
index d468fe3..91c6154 100644
--- a/documentation/cxx/tree/manual/index.xhtml
+++ b/documentation/cxx/tree/manual/index.xhtml
@@ -226,7 +226,7 @@
             <th>2.1</th><td><a href="#2.1">Preliminary Information</a>
               <table class="toc">
                 <tr><th>2.1.1</th><td><a href="#2.1.1">Identifiers</a></td></tr>
-                <tr><th>2.1.2</th><td><a href="#2.1.2">Character Type</a></td></tr>
+                <tr><th>2.1.2</th><td><a href="#2.1.2">Character Type and Encoding</a></td></tr>
                 <tr><th>2.1.3</th><td><a href="#2.1.3">XML Schema Namespace</a></td></tr>
 		<tr><th>2.1.4</th><td><a href="#2.1.4">Anonymous Types</a></td></tr>
               </table>
@@ -567,7 +567,7 @@
      CONVENTION section in the <a href="http://www.codesynthesis.com/projects/xsd/documentation/xsd.xhtml">XSD
      Compiler Command Line Manual</a>.</p>
 
-  <h3><a name="2.1.2">2.1.2 Character Type</a></h3>
+  <h3><a name="2.1.2">2.1.2 Character Type and Encoding</a></h3>
 
   <p>The code that implements the mapping, depending on the
      <code>--char-type</code>  option, is generated using either
@@ -577,6 +577,20 @@
      your schemas, for example <code>std::basic_string&lt;C></code>.
   </p>
 
+  <p>Another aspect of the mapping that depends on the character type
+     is character encoding. For the <code>char</code> character type
+     the default encoding is UTF-8. Other supported encodings are
+     ISO-8859-1, Xerces-C++ Local Code Page (LCP), as well as
+     custom encodings. You can select the encoding with the
+     <code>--char-encoding</code> command line option.</p>
+
+  <p>For the <code>wchar_t</code> character type the encoding is
+     automatically selected between UTF-16 and UTF-32/UCS-4 depending
+     on the size of the <code>wchar_t</code> type. On some platforms
+     (for example, Windows with Visual C++ and AIX with IBM XL C++)
+     <code>wchar_t</code> is 2 bytes long. For these platforms the
+     encoding is UTF-16. On other platforms <code>wchar_t</code> is 4 bytes
+     long and UTF-32/UCS-4 is used.</p>
 
   <h3><a name="2.1.3">2.1.3 XML Schema Namespace</a></h3>
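
Note on the wchar_t paragraphs added by this patch: the rule they describe (UTF-16 when wchar_t is 2 bytes long, UTF-32/UCS-4 otherwise) can be illustrated with a minimal sketch. The helper name wide_encoding_name is hypothetical and is not part of XSD or the generated code; it only restates the documented rule.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of XSD): returns the name of the wchar_t
// encoding that the documentation describes for a given wchar_t size,
// expressed in bytes.
inline std::string wide_encoding_name (std::size_t wchar_size)
{
  // 2-byte wchar_t (e.g., Visual C++ on Windows) means UTF-16;
  // otherwise UTF-32/UCS-4 is assumed, as stated in the text.
  return wchar_size == 2 ? "UTF-16" : "UTF-32/UCS-4";
}

// Same rule applied to the wchar_t type of the platform being compiled for.
inline std::string wide_encoding_name ()
{
  return wide_encoding_name (sizeof (wchar_t));
}
```

Calling wide_encoding_name () with no arguments reports the encoding for the current platform; the one-argument overload makes the rule easy to check for both sizes without cross-compiling.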