feature/query/container



! Support containers in queries

- Parts of a container query:

  * selector -- selects which elements are examined
  * predicate -- test applied on the selected elements
  * quantifier -- counts how many selected elements satisfy the predicate

  Selector and predicate use the same query syntax, e.g., (query::index < 10).

  The most general quantifier is 'count', which simply returns the number
  of elements that satisfy the predicate. We will also have "shortcut"
  quantifiers for convenience (and optimization, in the case of 'all'):

  any == (count != 0)
  all == (count == size)
  one == (count == 1)
  none == (count == 0)

  Note that while it may seem that selector and predicate are the same
  thing, they really are not (see IMP operator).
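
  To make the three parts concrete, here is a minimal in-memory sketch (not
  ODB code). The element, quantified, evaluate, and example names are
  hypothetical, and 'size' is assumed to mean the number of elements chosen
  by the selector rather than the size of the whole container.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct name
{
  std::string first;
  std::string last;
};

// Hypothetical stand-in for a vector element: an (index, value) pair.
struct element
{
  std::size_t index;
  name value;
};

// Result of applying a selector and a predicate to a container.
struct quantified
{
  std::size_t size;  // Elements chosen by the selector (assumption).
  std::size_t count; // Selected elements that satisfy the predicate.

  bool any () const {return count != 0;}
  bool all () const {return count == size;}
  bool one () const {return count == 1;}
  bool none () const {return count == 0;}
};

template <typename S, typename P>
quantified
evaluate (const std::vector<name>& c, S selector, P predicate)
{
  quantified r {0, 0};

  for (std::size_t i (0); i != c.size (); ++i)
  {
    element e {i, c[i]};

    if (!selector (e))
      continue; // Not examined at all.

    ++r.size;

    if (predicate (e))
      ++r.count;
  }

  return r;
}

// Selector: index < 2. Predicate: the value is John Doe.
inline quantified
example ()
{
  std::vector<name> employees {
    {"John", "Doe"}, {"Jane", "Doe"}, {"John", "Doe"}};

  return evaluate (
    employees,
    [] (const element& e) {return e.index < 2;},
    [] (const element& e)
    {
      return e.value.first == "John" && e.value.last == "Doe";
    });
}
```

  Here the selector keeps two elements and one of them matches, so 'any'
  and 'one' hold while 'all' and 'none' do not. Folding the selector
  condition into the predicate would change what 'all' is compared against
  (three elements instead of two), which is one way to see that the two
  are not interchangeable.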

- The most promising syntax so far:

  typedef odb::query<employer> query;
  typedef query::employees_query emp_query; // employees_value,
                                            // employees_element
                                            // employees_type

  query::employees[emp_query::index < 10].count (
    emp_query::value.first == "John" &&
    emp_query::value.last == "Doe") > 1;

  The selector ([]) is optional. If not present, all elements are selected.
  Instead of 'count' one can write 'any', 'all', 'one', or 'none'.

  query::employees_query type is essentially a container element (or
  value) type. For vector it would be:

  struct
  {
    index;
    value;
  };

  For a map it would be:

  struct
  {
    key;
    value;
  };

  The weakest part of this syntax is the emp_query typedef. We more or less
  need it in order not to repeat query::employees_query all the time. We
  need to come up with a clean naming scheme for these things (both for
  the typedef inside query and the alias that the user gives it). For
  simple queries it can be omitted, for example:

  query::employees.any (
    query::employees_element::value == name ("John", "Doe"));

  It is conceptually correct that we don't say query::employees::value
  because 'employees' is a whole container while what we refer to is
  an element of a container.

  This is also related to the mass UPDATE feature in the sense that
  the whole "_query" naming scheme will have to be changed since we
  will want to write something like:

  update ((?::age += 1), (?::name == "John"));

  Keeping the "query" name and ending up with something like this
  is most definitely a bad idea:

  update ((query::ceo = true), (query::name == "John Doe"));

  So we need some neutral name, something like "members":

  typedef odb::members<employee> members;

  update ((members::ceo = true), (members::name == "John Doe"));
  query (members::name == "John Doe" && members::ceo);

- empty(), size() -- these are properties of the container itself, not
  its elements. Syntax:

  query::employees.empty ()

  This will probably be easiest to implement with an aggregate sub-query,
  which is ok.

  There is also a way to implement empty() without a sub-query, using a
  left join.
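
  The two formulations can be sketched as SQL text held in C++ strings.
  This is only an illustration: the employer and employer_employees table
  names and the id and object_id column names are assumptions, not taken
  from a generated schema, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Assumed schema (hypothetical names): object table "employer" with
// primary key "id" and container table "employer_employees" whose
// "object_id" column refers to the owning object.

// empty() via an aggregate sub-query: count the container rows that
// belong to each object and compare the result to zero.
inline std::string
empty_subquery ()
{
  return
    "SELECT e.id FROM employer e "
    "WHERE (SELECT COUNT(*) FROM employer_employees c "
    "WHERE c.object_id = e.id) = 0";
}

// empty() via a left join: an object without container rows produces
// exactly one joined row in which the container columns are NULL.
inline std::string
empty_left_join ()
{
  return
    "SELECT e.id FROM employer e "
    "LEFT JOIN employer_employees c ON c.object_id = e.id "
    "WHERE c.object_id IS NULL";
}
```

  In the left join form only unmatched objects survive the IS NULL test,
  so each object appears at most once and no DISTINCT should be needed for
  this particular condition.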

- count predicate; e.g., more than five employees are female.

- Object queries will need DISTINCT since joining the container table can
  return the same object once per matching element. The container table
  is only used in the WHERE clause.
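
  The need for DISTINCT can be seen in a small in-memory emulation of the
  join (not ODB code; the employer struct and the join_ids and distinct
  functions are hypothetical names used only for this sketch).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct employer
{
  unsigned long id;
  std::vector<std::string> employees; // Container member.
};

// Emulates joining the object table with its container table and
// filtering on the element value: one result row per matching element.
inline std::vector<unsigned long>
join_ids (const std::vector<employer>& es, const std::string& v)
{
  std::vector<unsigned long> r;

  for (const employer& e: es)
    for (const std::string& x: e.employees)
      if (x == v)
        r.push_back (e.id);

  return r;
}

// Emulates SELECT DISTINCT: sort the ids and drop the duplicates.
inline std::vector<unsigned long>
distinct (std::vector<unsigned long> ids)
{
  std::sort (ids.begin (), ids.end ());
  ids.erase (std::unique (ids.begin (), ids.end ()), ids.end ());
  return ids;
}
```

  For an employer whose container holds two elements equal to "John",
  join_ids() yields that employer's id twice while distinct() reduces the
  result to a single id.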

- Joining containers in views. Here we might need DISTINCT ON (not supported
  in SQLite), but this will probably have to be user controllable. Also, in
  this case the container table can be used in both the select list and the
  WHERE clause.

- From: http://www.codesynthesis.com/pipermail/odb-users/2014-January/001696.html

  When people are using a container in a query condition, we need to know
  which elements to consider. This can be some specific element (e.g., the
  first element), any element, all elements, a range of elements, etc.

  I think the "any element" case will be the most widely used and is the one
  we definitely have to support. For the others (e.g., all elements, a range
  of elements), I am not sure it will even be possible to implement them in
  SQL in any sane way. Maybe what we should do is expose the index column
  (or the key column for maps) to the user so that they can create whatever
  conditions they want. Something along these lines:

  query::authTokens.index == 0 && query::authTokens->hash == 123

  [Note the problem with this syntax: a container element may also have a data
   member named index.]

  It is also not clear how to implement the "all elements" case with this
  approach, or in SQL in a sane/portable way in general.
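
  One possible direction for the "all elements" case, offered only as a
  sketch and not as a settled design: "all elements satisfy P" is the same
  as "no element fails P", which in SQL is usually spelled as NOT EXISTS
  over the container rows with the predicate negated. The in-memory
  emulation below checks that equivalence; the count_matching,
  all_by_size, and all_by_negation names are hypothetical. Note that with
  NULL values SQL's three-valued logic may make the two forms differ,
  which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements that satisfy the predicate.
template <typename P>
std::size_t
count_matching (const std::vector<int>& c, P p)
{
  std::size_t n (0);

  for (int x: c)
    if (p (x))
      ++n;

  return n;
}

// all == (count == size), as in the definition of the quantifier.
template <typename P>
bool
all_by_size (const std::vector<int>& c, P p)
{
  return count_matching (c, p) == c.size ();
}

// all == no element fails the predicate (the NOT EXISTS formulation).
template <typename P>
bool
all_by_negation (const std::vector<int>& c, P p)
{
  return count_matching (c, [&p] (int x) {return !p (x);}) == 0;
}
```

  Both functions agree on non-empty containers and both report true for
  an empty one.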

- User examples:

  http://www.codesynthesis.com/pipermail/odb-users/2011-September/000300.html