From fd32ca77e7beed31d601ef836be39cb6aa828a5f Mon Sep 17 00:00:00 2001 From: Boris Kolpackov Date: Fri, 15 Nov 2013 08:59:47 +0200 Subject: Rename documentation/ to doc/ --- documentation/xsd-epilogue.xhtml | 417 --------------------------------------- 1 file changed, 417 deletions(-) delete mode 100644 documentation/xsd-epilogue.xhtml (limited to 'documentation/xsd-epilogue.xhtml') diff --git a/documentation/xsd-epilogue.xhtml b/documentation/xsd-epilogue.xhtml deleted file mode 100644 index f9b7c47..0000000 --- a/documentation/xsd-epilogue.xhtml +++ /dev/null @@ -1,417 +0,0 @@ -

NAMING CONVENTION

- -

The compiler can be instructed to use a particular naming convention in the generated code. A number of widely used conventions can be selected using the --type-naming and --function-naming options. A custom naming convention can be achieved using the --type-regex, --accessor-regex, --one-accessor-regex, --opt-accessor-regex, --seq-accessor-regex, --modifier-regex, --one-modifier-regex, --opt-modifier-regex, --seq-modifier-regex, --parser-regex, --serializer-regex, --enumerator-regex, and --element-type-regex options.

- -

The --type-naming option specifies the convention that should be used for naming C++ types. Possible values for this option are knr (default), ucc, and java. The knr value (stands for K&R) signifies the standard, lower-case naming convention with the underscore used as a word delimiter, for example: foo, foo_bar. The ucc (stands for upper-camel-case) and java values are synonyms for the same naming convention where the first letter of each word in the name is capitalized, for example: Foo, FooBar.

- -

Similarly, the --function-naming option specifies the convention that should be used for naming C++ functions. Possible values for this option are knr (default), lcc, and java. The knr value (stands for K&R) signifies the standard, lower-case naming convention with the underscore used as a word delimiter, for example: foo(), foo_bar(). The lcc value (stands for lower-camel-case) signifies a naming convention where the first letter of each word except the first is capitalized, for example: foo(), fooBar(). The java naming convention is similar to the lower-camel-case one except that accessor functions are prefixed with get, modifier functions are prefixed with set, parsing functions are prefixed with parse, and serialization functions are prefixed with serialize, for example: getFoo(), setFooBar(), parseRoot(), serializeRoot().
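The conventions described above can be sketched in a few lines of Python. This is only an illustration of the resulting name shapes, not the compiler's implementation: the helper names are hypothetical, and the input is assumed to be already split into words, since how the compiler finds word boundaries is not described here.

```python
def knr(words):
    """knr: lower-case words joined by underscores, e.g. foo_bar."""
    return "_".join(w.lower() for w in words)


def upper_camel(words):
    """ucc / java type naming: first letter of each word capitalized."""
    return "".join(w[:1].upper() + w[1:] for w in words)


def lower_camel(words):
    """lcc function naming: like upper camel case, first letter lowered."""
    name = upper_camel(words)
    return name[:1].lower() + name[1:]


def java_function(words, kind):
    """java function naming: a role prefix followed by the camel-cased name."""
    prefix = {
        "accessor": "get",
        "modifier": "set",
        "parser": "parse",
        "serializer": "serialize",
    }[kind]
    return prefix + upper_camel(words)


words = ["foo", "bar"]
print(knr(words))                        # foo_bar
print(upper_camel(words))                # FooBar
print(lower_camel(words))                # fooBar
print(java_function(words, "modifier"))  # setFooBar
```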

- -

Note that the naming conventions specified with the --type-naming and --function-naming options perform only limited transformations on the names that come from the schema in the form of type, attribute, and element names. In other words, to get consistent results, your schemas should follow a naming convention similar to the one you would like to have in the generated code. Alternatively, you can use the --*-regex options (discussed below) to perform further transformations on the names that come from the schema.

- -

The --type-regex, --accessor-regex, --one-accessor-regex, --opt-accessor-regex, --seq-accessor-regex, --modifier-regex, --one-modifier-regex, --opt-modifier-regex, --seq-modifier-regex, --parser-regex, --serializer-regex, --enumerator-regex, and --element-type-regex options allow you to specify extra regular expressions for each name category in addition to the predefined set that is added depending on the --type-naming and --function-naming options. Expressions that are provided with the --*-regex options are evaluated prior to any predefined expressions. This allows you to selectively override some or all of the predefined transformations. When debugging your own expressions, it is often useful to see which expressions match which names. The --name-regex-trace option allows you to trace the process of applying regular expressions to names.

- -

The value for the --*-regex options should be a Perl-like regular expression in the form /pattern/replacement/. Any character can be used as a delimiter instead of /. Escaping of the delimiter character in pattern or replacement is not supported. All the regular expressions for each category are pushed into a category-specific stack with the last specified expression considered first. The first match that succeeds is used. For the --one-accessor-regex (accessors with cardinality one), --opt-accessor-regex (accessors with cardinality optional), and --seq-accessor-regex (accessors with cardinality sequence) categories the --accessor-regex expressions are used as a fallback. For the --one-modifier-regex, --opt-modifier-regex, and --seq-modifier-regex categories the --modifier-regex expressions are used as a fallback. For the --element-type-regex category the --type-regex expressions are used as a fallback.
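The stack and fallback behavior described above can be modeled roughly as follows. This is a sketch under stated assumptions rather than the compiler's code: the RegexStack class, parse_expr, and expand are hypothetical names, an expression is assumed to have to match the whole name, only $1..$9 are expanded in the replacement, and Python's re module stands in for the compiler's regex engine.

```python
import re


def parse_expr(expr):
    """Split 'XpatternXreplacementX', where X is the first character."""
    delim = expr[0]
    parts = expr.split(delim)
    # Delimiter escaping is not supported, so exactly three delimiters
    # must be present: ['', pattern, replacement, ''].
    if len(parts) != 4 or parts[0] != "" or parts[3] != "":
        raise ValueError("malformed expression: " + expr)
    return re.compile(parts[1]), parts[2]


def expand(repl, match):
    """Substitute $1..$9 with captured groups; copy everything else."""
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))) or "", repl)


class RegexStack:
    """One name category, with an optional fallback category."""

    def __init__(self, fallback=None):
        self.exprs = []
        self.fallback = fallback

    def push(self, expr):
        self.exprs.append(parse_expr(expr))

    def apply(self, name):
        # The last specified expression is considered first.
        for pattern, repl in reversed(self.exprs):
            m = pattern.fullmatch(name)  # assumption: whole-name match
            if m:
                return expand(repl, m)   # the first successful match is used
        if self.fallback is not None:
            return self.fallback.apply(name)
        return None


accessor = RegexStack()                     # models --accessor-regex
accessor.push("/(.+)/get_$1/")

seq_accessor = RegexStack(fallback=accessor)  # models --seq-accessor-regex
seq_accessor.push("#item#items#")           # '#' used as the delimiter
seq_accessor.push("#item#all_items#")       # specified later, tried first

print(seq_accessor.apply("item"))  # all_items
print(seq_accessor.apply("foo"))   # get_foo (via the fallback)
```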

- -

The type name expressions (--type-regex) are evaluated on the name string that has the following format:

- -

[namespace ]name[,name][,name][,name]

- -

The element type name expressions (--element-type-regex), effective only when the --generate-element-type option is specified, are evaluated on the name string that has the following format:

- -

namespace name

- -

In the type name format the namespace part followed by a space is only present for global type names. For global types and elements defined in schemas without a target namespace, the namespace part is empty but the space is still present. In the type name format after the initial name component, up to three additional name components can be present, separated by commas. For example:

- -

http://example.com/hello type

-

foo

-

foo,iterator

-

foo,const,iterator

- -

The following set of predefined regular expressions is used to transform type names when the upper-camel-case naming convention is selected:

- -

/(?:[^ ]* )?([^,]+)/\u$1/

-

/(?:[^ ]* )?([^,]+),([^,]+)/\u$1\u$2/

-

/(?:[^ ]* )?([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3/

-

/(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3\u$4/
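To see what these four expressions do to the sample names shown earlier, they can be emulated in Python. This is a sketch, not the compiler's regex engine: Python's re module has no \u case modifier, so the hypothetical expand helper handles \u$N by upper-casing the first character of group N, and an expression is assumed to have to match the entire name string.

```python
import re

# The predefined upper-camel-case type expressions, as (pattern, replacement).
UCC_TYPE_EXPRS = [
    (r"(?:[^ ]* )?([^,]+)", r"\u$1"),
    (r"(?:[^ ]* )?([^,]+),([^,]+)", r"\u$1\u$2"),
    (r"(?:[^ ]* )?([^,]+),([^,]+),([^,]+)", r"\u$1\u$2\u$3"),
    (r"(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)", r"\u$1\u$2\u$3\u$4"),
]


def expand(repl, match):
    """Expand $N and \\u$N (upper-case the first character of group N)."""
    def sub(m):
        text = match.group(int(m.group(2))) or ""
        if m.group(1):  # a \u modifier precedes the group reference
            return text[:1].upper() + text[1:]
        return text
    return re.sub(r"(\\u)?\$(\d)", sub, repl)


def transform(name, exprs=UCC_TYPE_EXPRS):
    """Return the transformed name, or None if no expression matches."""
    for pattern, repl in reversed(exprs):  # last expression first
        m = re.fullmatch(pattern, name)    # assumption: whole-name match
        if m:
            return expand(repl, m)
    return None


for name in ["http://example.com/hello type", "foo",
             "foo,iterator", "foo,const,iterator"]:
    print(name, "->", transform(name))
# http://example.com/hello type -> Type
# foo -> Foo
# foo,iterator -> FooIterator
# foo,const,iterator -> FooConstIterator
```

The optional (?:[^ ]* )? prefix consumes the namespace and the space that follows it, so the namespace does not appear in the resulting C++ name.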

- -

The accessor and modifier expressions (--*accessor-regex and --*modifier-regex) are evaluated on the name string that has the following format:

- -

name[,name][,name]

- -

After the initial name component, up to two additional name components can be present, separated by commas. For example:

- -

foo

-

dom,document

-

foo,default,value

- -

The following set of predefined regular expressions is used to transform accessor names when the java naming convention is selected:

- -

/([^,]+)/get\u$1/

-

/([^,]+),([^,]+)/get\u$1\u$2/

-

/([^,]+),([^,]+),([^,]+)/get\u$1\u$2\u$3/

- -

For the parser, serializer, and enumerator categories, the corresponding regular expressions are evaluated on local names of elements and on enumeration values, respectively. For example, the following predefined regular expression is used to transform parsing function names when the java naming convention is selected:

- -

/(.+)/parse\u$1/

- -

See also the REGEX AND SHELL QUOTING section below.

- -

TYPE MAP

- -

Type map files are used in C++/Parser to define a mapping between XML Schema and C++ types. The compiler uses this information to determine the return types of post_* functions in parser skeletons corresponding to XML Schema types as well as argument types for callbacks corresponding to elements and attributes of these types.

- -

The compiler has a set of predefined mapping rules that map built-in XML Schema types to suitable C++ types (discussed below) and all other types to void. By providing your own type maps you can override these predefined rules. The format of the type map file is presented below:

- -
-namespace <schema-namespace> [<cxx-namespace>]
-{
-  (include <file-name>;)*
-  ([type] <schema-type> <cxx-ret-type> [<cxx-arg-type>];)*
-}
-  
- -

Both <schema-namespace> and <schema-type> are regex patterns while <cxx-namespace>, <cxx-ret-type>, and <cxx-arg-type> are regex pattern substitutions. All names can be optionally enclosed in " ", for example, to include whitespace.

- -

<schema-namespace> determines the XML Schema namespace. Optional <cxx-namespace> is prefixed to every C++ type name in this namespace declaration. <cxx-ret-type> is a C++ type name that is used as a return type for the post_* functions. Optional <cxx-arg-type> is an argument type for callback functions corresponding to elements and attributes of this type. If <cxx-arg-type> is not specified, it defaults to <cxx-ret-type> if <cxx-ret-type> ends with * or & (that is, it is a pointer or a reference) and const <cxx-ret-type>& otherwise. <file-name> is a file name either in the " " or < > format and is added with the #include directive to the generated code.

- -

The # character starts a comment that ends with a new line or end of file. To specify a name that contains #, enclose it in " ". For example:

- -
-namespace http://www.example.com/xmlns/my my
-{
-  include "my.hxx";
-
-  # Pass apples by value.
-  #
-  apple apple;
-
-  # Pass oranges as pointers.
-  #
-  orange orange_t*;
-}
-  
- -

In the example above, for the http://www.example.com/xmlns/my#orange XML Schema type, the my::orange_t* C++ type will be used as both return and argument types.
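The defaulting rule for an omitted <cxx-arg-type> can be written down as a small function. This is a sketch with a hypothetical name, default_arg_type, and it takes an already namespace-qualified return type as input:

```python
def default_arg_type(cxx_ret_type):
    """Argument type used when a rule gives no <cxx-arg-type>."""
    ret = cxx_ret_type.strip()
    # Pointers and references are passed as is.
    if ret.endswith(("*", "&")):
        return ret
    # Everything else is passed by const reference.
    return "const " + ret + "&"


print(default_arg_type("my::orange_t*"))  # my::orange_t*
print(default_arg_type("my::apple"))      # const my::apple&
```

Under this rule, the apple rule in the example above yields const my::apple& as the argument type, which is why a rule that should pass values by copy needs to spell out the argument type explicitly.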

- -

Several namespace declarations can be specified in a single file. The namespace declaration can also be completely omitted to map types in a schema without a namespace. For instance:

- -
-include "my.hxx";
-apple apple;
-
-namespace http://www.example.com/xmlns/my
-{
-  orange "const orange_t*";
-}
-  
- -

The compiler has a number of predefined mapping rules that can be presented as the following map files. The string-based XML Schema built-in types are mapped to either std::string or std::wstring depending on the character type selected with the --char-type option (char by default).

- -
-namespace http://www.w3.org/2001/XMLSchema
-{
-  boolean bool bool;
-
-  byte "signed char" "signed char";
-  unsignedByte "unsigned char" "unsigned char";
-
-  short short short;
-  unsignedShort "unsigned short" "unsigned short";
-
-  int int int;
-  unsignedInt "unsigned int" "unsigned int";
-
-  long "long long" "long long";
-  unsignedLong "unsigned long long" "unsigned long long";
-
-  integer "long long" "long long";
-
-  negativeInteger "long long" "long long";
-  nonPositiveInteger "long long" "long long";
-
-  positiveInteger "unsigned long long" "unsigned long long";
-  nonNegativeInteger "unsigned long long" "unsigned long long";
-
-  float float float;
-  double double double;
-  decimal double double;
-
-  string std::string;
-  normalizedString std::string;
-  token std::string;
-  Name std::string;
-  NMTOKEN std::string;
-  NCName std::string;
-  ID std::string;
-  IDREF std::string;
-  language std::string;
-  anyURI std::string;
-
-  NMTOKENS xml_schema::string_sequence;
-  IDREFS xml_schema::string_sequence;
-
-  QName xml_schema::qname;
-
-  base64Binary std::auto_ptr<xml_schema::buffer>
-               std::auto_ptr<xml_schema::buffer>;
-  hexBinary std::auto_ptr<xml_schema::buffer>
-            std::auto_ptr<xml_schema::buffer>;
-
-  date xml_schema::date;
-  dateTime xml_schema::date_time;
-  duration xml_schema::duration;
-  gDay xml_schema::gday;
-  gMonth xml_schema::gmonth;
-  gMonthDay xml_schema::gmonth_day;
-  gYear xml_schema::gyear;
-  gYearMonth xml_schema::gyear_month;
-  time xml_schema::time;
-}
-  
- -

The last predefined rule maps anything that wasn't mapped by previous rules to void:

- -
-namespace .*
-{
-  .* void void;
-}
-  
- - -

When you provide your own type maps with the --type-map option, they are evaluated first. This allows you to selectively override predefined rules.
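The evaluation order can be pictured as a single ordered rule list: user rules, then the predefined rules, then the catch-all. The following Python model is only a sketch; the lookup function and the tuple layout are hypothetical, only a few predefined rules are reproduced, and both patterns are assumed to have to match the whole namespace and type name.

```python
import re

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# (schema namespace pattern, schema type pattern, C++ return type)
PREDEFINED_RULES = [
    (XSD_NS, "int", "int"),
    (XSD_NS, "string", "std::string"),
    (".*", ".*", "void"),  # the last predefined rule: everything else
]


def lookup(namespace, type_name, user_rules=()):
    """Return the C++ return type of the first rule that matches."""
    # User-provided rules are evaluated before the predefined ones.
    for ns_pat, type_pat, cxx_ret in list(user_rules) + PREDEFINED_RULES:
        if re.fullmatch(ns_pat, namespace) and re.fullmatch(type_pat, type_name):
            return cxx_ret
    return None


MY_NS = "http://www.example.com/xmlns/my"
user_rules = [(MY_NS, "orange", "my::orange_t*")]

print(lookup(MY_NS, "orange", user_rules))  # my::orange_t*
print(lookup(MY_NS, "orange"))              # void
print(lookup(XSD_NS, "int", user_rules))    # int
```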

- -

REGEX AND SHELL QUOTING

- -

When entering a regular expression argument in the shell command line it is often necessary to use quoting (enclosing the argument in " " or ' ') in order to prevent the shell from interpreting certain characters, for example, spaces as argument separators and $ as variable expansions.

- -

Unfortunately, it is hard to achieve this in a manner that is portable across POSIX shells, such as those found on GNU/Linux and UNIX, and the Windows shell. For example, if you use " " for quoting, you will get a wrong result with POSIX shells if your expression contains $. The standard way of dealing with this on POSIX systems is to use ' ' instead. Unfortunately, the Windows shell does not remove ' ' from arguments when they are passed to applications. As a result, you may have to use ' ' for POSIX and " " for Windows ($ is not treated as a special character on Windows).

- -

Alternatively, you can save regular expression options into a file, one option per line, and use this file with the --options-file option. With this approach you don't need to worry about shell quoting.
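One way to produce such a file from a build script is sketched below in Python. The write_options_file helper and the naming.options file name are hypothetical, and the sketch assumes that each line holds the option name followed by a space and its value. The two expressions are the predefined java accessor and parser expressions quoted earlier.

```python
import os
import tempfile


def write_options_file(path, options):
    """Write one option per line as '<name> <value>' (assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for name, value in options:
            f.write(name + " " + value + "\n")


options = [
    ("--accessor-regex", r"/([^,]+)/get\u$1/"),
    ("--parser-regex", r"/(.+)/parse\u$1/"),
]

path = os.path.join(tempfile.mkdtemp(), "naming.options")
write_options_file(path, options)

with open(path, encoding="utf-8") as f:
    print(f.read(), end="")
```

Because the expressions never pass through a shell, the $ and \ characters reach the file unchanged. The resulting file would then be passed to the compiler with --options-file.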

- -

DIAGNOSTICS

- -

If the input file is not a valid W3C XML Schema definition, xsd will issue diagnostic messages to STDERR and exit with a non-zero exit code.

- -

BUGS

- -

Send bug reports to the xsd-users@codesynthesis.com mailing list.

- - - - - - -- cgit v1.1