summaryrefslogtreecommitdiff
path: root/xsd/doc/xsd-epilogue.1
diff options
context:
space:
mode:
Diffstat (limited to 'xsd/doc/xsd-epilogue.1')
-rw-r--r--xsd/doc/xsd-epilogue.1566
1 files changed, 566 insertions, 0 deletions
diff --git a/xsd/doc/xsd-epilogue.1 b/xsd/doc/xsd-epilogue.1
new file mode 100644
index 0000000..192880c
--- /dev/null
+++ b/xsd/doc/xsd-epilogue.1
@@ -0,0 +1,566 @@
+\"
+\" NAMING CONVENTION
+\"
+
+.SH NAMING CONVENTION
+The compiler can be instructed to use a particular naming convention in
+the generated code. A number of widely-used conventions can be selected
+using the
+.B --type-naming
+and
+.B --function-naming
+options. A custom naming convention can be achieved using the
+.BR --type-regex ,
+.BR --accessor-regex ,
+.BR --one-accessor-regex ,
+.BR --opt-accessor-regex ,
+.BR --seq-accessor-regex ,
+.BR --modifier-regex ,
+.BR --one-modifier-regex ,
+.BR --opt-modifier-regex ,
+.BR --seq-modifier-regex ,
+.BR --parser-regex ,
+.BR --serializer-regex ,
+.BR --const-regex ,
+.BR --enumerator-regex ,
+and
+.B --element-type-regex
+options.
+
+The
+.B --type-naming
+option specifies the convention that should be used for naming C++ types.
+Possible values for this option are
+.B knr
+(default),
+.BR ucc ,
+and
+.BR java .
+The
+.B knr
+value (stands for K&R) signifies the standard, lower-case naming convention
+with the underscore used as a word delimiter, for example: foo, foo_bar.
+The
+.B ucc
+(stands for upper-camel-case) and
+.B java
+values a synonyms for the same naming convention where the first letter
+of each word in the name is capitalized, for example: Foo, FooBar.
+
+Similarly, the
+.B --function-naming
+option specifies the convention that should be used for naming C++ functions.
+Possible values for this option are
+.B knr
+(default),
+.BR lcc ,
+and
+.BR java .
+The
+.B knr
+value (stands for K&R) signifies the standard, lower-case naming convention
+with the underscore used as a word delimiter, for example: foo(), foo_bar().
+The
+.B lcc
+value (stands for lower-camel-case) signifies a naming convention where the
+first letter of each word except the first is capitalized, for example: foo(),
+fooBar(). The
+.B java
+naming convention is similar to the lower-camel-case one except that accessor
+functions are prefixed with get, modifier functions are prefixed with set,
+parsing functions are prefixed with parse, and serialization functions are
+prefixed with serialize, for example: getFoo(), setFooBar(), parseRoot(),
+serializeRoot().
+
+Note that the naming conventions specified with the
+.B --type-naming
+and
+.B --function-naming
+options perform only limited transformations on the
+names that come from the schema in the form of type, attribute, and element
+names. In other words, to get consistent results, your schemas should follow
+a similar naming convention as the one you would like to have in the generated
+code. Alternatively, you can use the
+.B --*-regex
+options (discussed below) to perform further transformations on the names
+that come from the schema.
+
+The
+.BR --type-regex ,
+.BR --accessor-regex ,
+.BR --one-accessor-regex ,
+.BR --opt-accessor-regex ,
+.BR --seq-accessor-regex ,
+.BR --modifier-regex ,
+.BR --one-modifier-regex ,
+.BR --opt-modifier-regex ,
+.BR --seq-modifier-regex ,
+.BR --parser-regex ,
+.BR --serializer-regex ,
+.BR --const-regex ,
+.BR --enumerator-regex ,
+and
+.B --element-type-regex
+options allow you to specify extra regular expressions for each name
+category in addition to the predefined set that is added depending on
+the
+.B --type-naming
+and
+.B --function-naming
+options. Expressions that are provided with the
+.B --*-regex
+options are evaluated prior to any predefined expressions. This allows
+you to selectively override some or all of the predefined transformations.
+When debugging your own expressions, it is often useful to see which
+expressions match which names. The
+.B --name-regex-trace
+option allows you to trace the process of applying
+regular expressions to names.
+
+The value for the
+.B --*-regex
+options should be a perl-like regular expression in the form
+.BI / pattern / replacement /\fR.
+Any character can be used as a delimiter instead of
+.BR / .
+Escaping of the delimiter character in
+.I pattern
+or
+.I replacement
+is not supported. All the regular expressions for each category are pushed
+into a category-specific stack with the last specified expression
+considered first. The first match that succeeds is used. For the
+.B --one-accessor-regex
+(accessors with cardinality one),
+.B --opt-accessor-regex
+(accessors with cardinality optional), and
+.B --seq-accessor-regex
+(accessors with cardinality sequence) categories the
+.B --accessor-regex
+expressions are used as a fallback. For the
+.BR --one-modifier-regex ,
+.BR --opt-modifier-regex ,
+and
+.B --seq-modifier-regex
+categories the
+.B --modifier-regex
+expressions are used as a fallback. For the
+.B --element-type-regex
+category the
+.B --type-regex
+expressions are used as a fallback.
+
+The type name expressions
+.RB ( --type-regex )
+are evaluated on the name string that has the following format:
+
+[\fInamespace \fR]\fIname\fR[\fB,\fIname\fR][\fB,\fIname\fR][\fB,\fIname\fR]
+
+The element type name expressions
+.RB ( --element-type-regex ),
+effective only when the
+.B --generate-element-type
+option is specified, are evaluated on the name string that has the following
+format:
+
+.I namespace name
+
+In the type name format the
+.I namespace
+part followed by a space is only present for global type names. For global
+types and elements defined in schemas without a target namespace, the
+.I namespace
+part is empty but the space is still present. In the type name format after
+the initial
+.I name
+component, up to three additional
+.I name
+components can be present, separated by commas. For example:
+
+.B http://example.com/hello type
+
+.B foo
+
+.B foo,iterator
+
+.B foo,const,iterator
+
+The following set of predefined regular expressions is used to transform
+type names when the upper-camel-case naming convention is selected:
+
+.B /(?:[^ ]* )?([^,]+)/\\\\u$1/
+
+.B /(?:[^ ]* )?([^,]+),([^,]+)/\\\\u$1\\\\u$2/
+
+.B /(?:[^ ]* )?([^,]+),([^,]+),([^,]+)/\\\\u$1\\\\u$2\\\\u$3/
+
+.B /(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)/\\\\u$1\\\\u$2\\\\u$3\\\\u$4/
+
+The accessor and modifier expressions
+.RB ( --*accessor-regex
+and
+.BR --*modifier-regex )
+are evaluated on the name string that has the following format:
+
+\fIname\fR[\fB,\fIname\fR][\fB,\fIname\fR]
+
+After the initial
+.I name
+component, up to two additional
+.I name
+components can be present, separated by commas. For example:
+
+.B foo
+
+.B dom,document
+
+.B foo,default,value
+
+The following set of predefined regular expressions is used to transform
+accessor names when the
+.B java
+naming convention is selected:
+
+.B /([^,]+)/get\\\\u$1/
+
+.B /([^,]+),([^,]+)/get\\\\u$1\\\\u$2/
+
+.B /([^,]+),([^,]+),([^,]+)/get\\\\u$1\\\\u$2\\\\u$3/
+
+For the parser, serializer, and enumerator categories, the corresponding
+regular expressions are evaluated on local names of elements and on
+enumeration values, respectively. For example, the following predefined
+regular expression is used to transform parsing function names when the
+.B java
+naming convention is selected:
+
+.B /(.+)/parse\\\\u$1/
+
+The const category is used to create C++ constant names for the
+element/wildcard/text content ids in ordered types.
+
+See also the REGEX AND SHELL QUOTING section below.
+
+\"
+\" TYPE MAP
+\"
+.SH TYPE MAP
+Type map files are used in C++/Parser to define a mapping between XML
+Schema and C++ types. The compiler uses this information to determine
+the return types of
+.B post_*
+functions in parser skeletons corresponding to XML Schema types
+as well as argument types for callbacks corresponding to elements
+and attributes of these types.
+
+The compiler has a set of predefined mapping rules that map built-in
+XML Schema types to suitable C++ types (discussed below) and all
+other types to
+.BR void .
+By providing your own type maps you can override these predefined rules.
+The format of the type map file is presented below:
+
+.RS
+.B namespace
+.I schema-namespace
+[
+.I cxx-namespace
+]
+.br
+.B {
+.br
+ (
+.B include
+.IB file-name ;
+)*
+.br
+ ([
+.B type
+]
+.I schema-type cxx-ret-type
+[
+.I cxx-arg-type
+.RB ] ;
+)*
+.br
+.B }
+.br
+.RE
+
+Both
+.I schema-namespace
+and
+.I schema-type
+are regex patterns while
+.IR cxx-namespace ,
+.IR cxx-ret-type ,
+and
+.I cxx-arg-type
+are regex pattern substitutions. All names can be optionally enclosed
+in \fR" "\fR, for example, to include white-spaces.
+
+.I schema-namespace
+determines XML Schema namespace. Optional
+.I cxx-namespace
+is prefixed to every C++ type name in this namespace declaration.
+.I cxx-ret-type
+is a C++ type name that is used as a return type for the
+.B post_*
+functions. Optional
+.I cxx-arg-type
+is an argument type for callback functions corresponding to elements and
+attributes of this type. If
+.I cxx-arg-type
+is not specified, it defaults to
+.I cxx-ret-type
+if
+.I cxx-ret-type
+ends with
+.B *
+or
+.B &
+(that is, it is a pointer or a reference) and
+.B const
+\fIcxx-ret-type\fB&\fR otherwise.
+.I file-name
+is a file name either in the \fR" "\fR or < > format and is added with the
+.B #include
+directive to the generated code.
+
+The \fB#\fR character starts a comment that ends with a new line or end of
+file. To specify a name that contains \fB#\fR enclose it in \fR" "\fR. For
+example:
+
+.RS
+namespace http://www.example.com/xmlns/my my
+.br
+{
+.br
+ include "my.hxx";
+.br
+
+ # Pass apples by value.
+ #
+ apple apple;
+.br
+
+ # Pass oranges as pointers.
+ #
+ orange orange_t*;
+.br
+}
+.br
+.RE
+
+In the example above, for the
+.B http://www.example.com/xmlns/my#orange
+XML Schema type, the
+.B my::orange_t*
+C++ type will be used as both return and argument types.
+
+Several namespace declarations can be specified in a single file.
+The namespace declaration can also be completely omitted to map
+types in a schema without a namespace. For instance:
+
+.RS
+include "my.hxx";
+.br
+apple apple;
+.br
+
+namespace http://www.example.com/xmlns/my
+.br
+{
+.br
+ orange "const orange_t*";
+.br
+}
+.br
+.RE
+
+
+The compiler has a number of predefined mapping rules that can be
+presented as the following map files. The string-based XML Schema
+built-in types are mapped to either
+.B std::string
+or
+.B std::wstring
+depending on the character type selected with the
+.B --char-type
+option
+.RB ( char
+by default).
+
+.RS
+namespace http://www.w3.org/2001/XMLSchema
+.br
+{
+.br
+ boolean bool bool;
+.br
+
+ byte "signed char" "signed char";
+.br
+ unsignedByte "unsigned char" "unsigned char";
+.br
+
+ short short short;
+.br
+ unsignedShort "unsigned short" "unsigned short";
+.br
+
+ int int int;
+.br
+ unsignedInt "unsigned int" "unsigned int";
+.br
+
+ long "long long" "long long";
+.br
+ unsignedLong "unsigned long long" "unsigned long long";
+.br
+
+ integer "long long" "long long";
+.br
+
+ negativeInteger "long long" "long long";
+.br
+ nonPositiveInteger "long long" "long long";
+.br
+
+ positiveInteger "unsigned long long" "unsigned long long";
+.br
+ nonNegativeInteger "unsigned long long" "unsigned long long";
+.br
+
+ float float float;
+.br
+ double double double;
+.br
+ decimal double double;
+.br
+
+ string std::string;
+.br
+ normalizedString std::string;
+.br
+ token std::string;
+.br
+ Name std::string;
+.br
+ NMTOKEN std::string;
+.br
+ NCName std::string;
+.br
+ ID std::string;
+.br
+ IDREF std::string;
+.br
+ language std::string;
+.br
+ anyURI std::string;
+.br
+
+ NMTOKENS xml_schema::string_sequence;
+.br
+ IDREFS xml_schema::string_sequence;
+.br
+
+ QName xml_schema::qname;
+.br
+
+ base64Binary std::auto_ptr<xml_schema::buffer>
+.br
+ std::auto_ptr<xml_schema::buffer>;
+.br
+ hexBinary std::auto_ptr<xml_schema::buffer>
+.br
+ std::auto_ptr<xml_schema::buffer>;
+.br
+
+ date xml_schema::date;
+.br
+ dateTime xml_schema::date_time;
+.br
+ duration xml_schema::duration;
+.br
+ gDay xml_schema::gday;
+.br
+ gMonth xml_schema::gmonth;
+.br
+ gMonthDay xml_schema::gmonth_day;
+.br
+ gYear xml_schema::gyear;
+.br
+ gYearMonth xml_schema::gyear_month;
+.br
+ time xml_schema::time;
+.br
+}
+.br
+.RE
+
+
+The last predefined rule maps anything that wasn't mapped by previous
+rules to
+.BR void :
+
+.RS
+namespace .*
+.br
+{
+.br
+ .* void void;
+.br
+}
+.br
+.RE
+
+When you provide your own type maps with the
+.B --type-map
+option, they are evaluated first. This allows you to selectively override
+predefined rules.
+
+.\"
+.\" REGEX AND SHELL QUOTING
+.\"
+.SH REGEX AND SHELL QUOTING
+When entering a regular expression argument in the shell command line
+it is often necessary to use quoting (enclosing the argument in " "
+or ' ') in order to prevent the shell from interpreting certain
+characters, for example, spaces as argument separators and $ as
+variable expansions.
+
+Unfortunately it is hard to achieve this in a manner that is portable
+across POSIX shells, such as those found on GNU/Linux and UNIX, and
+Windows shell. For example, if you use " " for quoting you will get
+a wrong result with POSIX shells if your expression contains $. The
+standard way of dealing with this on POSIX systems is to use ' '
+instead. Unfortunately, Windows shell does not remove ' ' from
+arguments when they are passed to applications. As a result you may
+have to use ' ' for POSIX and " " for Windows ($ is not treated as
+a special character on Windows).
+
+Alternatively, you can save regular expression options into a file,
+one option per line, and use this file with the
+.B --options-file
+option. With this approach you don't need to worry about shell quoting.
+
+.\"
+.\" DIAGNOSTICS
+.\"
+.SH DIAGNOSTICS
+If the input file is not a valid W3C XML Schema definition,
+.B xsd
+will issue diagnostic messages to
+.B STDERR
+and exit with non-zero exit code.
+.SH BUGS
+Send bug reports to the xsd-users@codesynthesis.com mailing list.
+.SH COPYRIGHT
+Copyright (c) $copyright$.
+
+Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License,
+version 1.2; with no Invariant Sections, no Front-Cover Texts and
+no Back-Cover Texts. Copy of the license can be obtained from
+https://www.codesynthesis.com/licenses/fdl-1.2.txt