From b754d9e58997c4ad09894793d08b2d144cd598cc Mon Sep 17 00:00:00 2001
From: Boris Kolpackov
Date: Mon, 27 Oct 2014 11:44:14 +0200
Subject: Add support containers in queries feature

---
 feature/query/container | 126 ++++++++++++++++++++++++++++++++++++++++++++++++
 feature/query/list      |   2 +
 2 files changed, 128 insertions(+)
 create mode 100644 feature/query/container

diff --git a/feature/query/container b/feature/query/container
new file mode 100644
index 0000000..76c005e
--- /dev/null
+++ b/feature/query/container
@@ -0,0 +1,126 @@
+- Parts of a container query:
+
+  * selector   -- selects which elements are examined
+  * predicate  -- test applied on the selected elements
+  * quantifier -- counts how many selected elements satisfy the predicate
+
+  Selector and predicate use the same query syntax, e.g., (query::index < 10).
+
+  The most general quantifier is 'count' which simply returns the number
+  of elements that satisfied the predicate. We will also have "shortcut"
+  quantifiers for convenience (and optimization, in the case of 'all'):
+
+  any  == (count != 0)
+  all  == (count == size)
+  one  == (count == 1)
+  none == (count == 0)
+
+  Note that while it may seem that selector and predicate are the same
+  thing, they really are not (see IMP operator).
+
+- The most promising syntax so far:
+
+  typedef odb::query query;
+  typedef query::employees_query emp_query; // employees_value,
+                                            // employees_element
+                                            // employees_type
+
+  query::employees[emp_query::index < 10].count (
+    emp_query::value.first == "John" &&
+    emp_query::value.last == "Doe") > 1;
+
+  The selector ([]) is optional. If not present, then defaults to 'all'.
+  Instead of 'count' one can write 'any', 'all', 'one', or 'none'.
+
+  query::employees_query type is essentially a container element (or
+  value) type. For vector it would be:
+
+  struct
+  {
+    index;
+    value;
+  };
+
+  For a map it would be:
+
+  struct
+  {
+    key;
+    value;
+  };
+
+  The weakest part in this syntax is the emp_query typedef. We kind of
+  need it in order not to have to repeat it all the time. We need to
+  come up with a clean naming schema for these things (both for the
+  typedef inside query and the alias that the user gives it). For
+  simple queries it can be omitted, for example:
+
+  query::employees.any (
+    query::employees_element::value == name ("John", "Doe"));
+
+  It is conceptually correct that we don't say query::employees::value
+  because 'employees' is a whole container while what we refer to is
+  an element of a container.
+
+  This is also related to the mass UPDATE feature in the sense that
+  the whole "_query" naming schema will have to be changed since we
+  will want to write something like:
+
+  update ((?::age += 1), (?::name == "John"));
+
+  Keeping the "query" name and ending up with something like this
+  is most definitely a bad idea:
+
+  update ((query::ceo = true), (query::name == "John Doe"));
+
+  So we need some neutral name, something like "members":
+
+  typedef odb::members members;
+
+  update ((members::ceo = true), (members::name == "John Doe"));
+  query (members::name == "John Doe" && members::ceo);
+
+- empty(), size() -- these are properties of the container itself, not
+  its elements. Syntax:
+
+  query::employees.empty ()
+
+  This will probably be easiest to implement with an aggregate sub-query,
+  which is ok.
+
+  There is a way to implement empty() without a subquery using left
+  join.
+
+- count predicate; e.g., more than five employees are female.
+
+- For object queries will need DISTINCT. Container table is only used in
+  the where clause.
+
+- Joining containers in views. Here might need DISTINCT ON (not supported
+  in SQLite) but will probably have to be user controllable. Also in this
+  case the container table can be used in both select list and where clause.
+
+- From: http://www.codesynthesis.com/pipermail/odb-users/2014-January/001696.html
+
+  When people are using a container in a query condition, we need to know
+  which elements to consider. This can be some specific element (e.g., the
+  first element), any element, all elements, a range of elements, etc.
+
+  I think the "any element" will be the most widely used case and is the one
+  we definitely have to support. Others, I am not sure it will even be
+  possible to implement in SQL in any sane way (e.g., all elements, a range of
+  elements). Maybe what we should do is expose the index column (or the key
+  column for maps) to the user so that they can create whatever conditions
+  they want. Something along these lines:
+
+  query::authTokens.index == 0 && query::authTokens->hash == 123
+
+  [Note the problem with this syntax: a container element may also have a data
+  member named index.]
+
+  It is also not clear how to implement the "all elements" case with this
+  approach, or in SQL in a sane/portable way in general.
+
+- User examples:
+
+  http://www.codesynthesis.com/pipermail/odb-users/2011-September/000300.html
diff --git a/feature/query/list b/feature/query/list
index 4d09312..85be57d 100644
--- a/feature/query/list
+++ b/feature/query/list
@@ -1,3 +1,5 @@
+- Support containers in queries: container
+
 - Shortcut query() call for queries that always return one element
 
   Can be useful for aggregate queries, etc.
-- 
cgit v1.1
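
Reader's note on the selector/predicate/quantifier split described in the
notes: the relationship between 'count' and the shortcut quantifiers can be
modelled in memory roughly as below. This is only a sketch and not ODB code;
the names element, condition, tally, and evaluate are hypothetical, and it
assumes that 'size' in "all == (count == size)" means the number of elements
picked by the selector rather than the size of the whole container.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a vector container element: an index
// plus a value (here flattened into first/last name).
struct element
{
  std::size_t index;
  std::string first;
  std::string last;
};

// Selector and predicate share one form, mirroring the notes' remark that
// both use the same query syntax.
using condition = std::function<bool (const element&)>;

// Outcome of applying a selector and a predicate to a container.
struct tally
{
  std::size_t size;  // Elements picked by the selector (assumed meaning).
  std::size_t count; // Selected elements that satisfy the predicate.

  // Shortcut quantifiers, defined in terms of count as in the notes.
  bool any () const {return count != 0;}
  bool all () const {return count == size;}
  bool one () const {return count == 1;}
  bool none () const {return count == 0;}
};

// Examine only the elements accepted by the selector and count how many of
// them satisfy the predicate.
tally
evaluate (const std::vector<element>& c,
          const condition& selector,
          const condition& predicate)
{
  tally r {0, 0};

  for (const element& e: c)
  {
    if (!selector (e))
      continue;

    ++r.size;

    if (predicate (e))
      ++r.count;
  }

  return r;
}
```

With a selector of index < 10 and a predicate matching "John" "Doe", the
comparison tally.count > 1 plays the role of the .count (...) > 1 expression
in the proposed syntax. Under this reading 'all' is true for an empty
selection, which is a property of the sketch and not something the notes
specify.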