PostgreSQL 9.2.24 Documentation | ||||
---|---|---|---|---|
Prev | Up | Chapter 54. SP-GiST Indexes | Next |
This section covers implementation details and other tricks that are useful for implementers of SP-GiST operator classes to know.
Individual leaf tuples and inner tuples must fit on a single index page (8KB by default). Therefore, when indexing values of variable-length data types, long values can only be supported by methods such as suffix trees, in which each level of the tree includes a prefix that is short enough to fit on a page, and the final leaf level includes a suffix also short enough to fit on a page. The operator class should set longValuesOK to TRUE only if it is prepared to arrange for this to happen. Otherwise, the SP-GiST core will reject any request to index a value that is too large to fit on an index page.
Likewise, it is the operator class's responsibility that inner tuples do not grow too large to fit on an index page; this limits the number of child nodes that can be used in one inner tuple, as well as the maximum size of a prefix value.
Another limitation is that when an inner tuple's node points
to a set of leaf tuples, those tuples must all be in the same
index page. (This is a design decision to reduce seeking and
save space in the links that chain such tuples together.) If
the set of leaf tuples grows too large for a page, a split is
performed and an intermediate inner tuple is inserted. For this
to fix the problem, the new inner tuple must divide the set of leaf
values into more than one node group. If the operator class's
picksplit
function fails to do
that, the SP-GiST core
resorts to extraordinary measures described in Section
54.3.3.
Some tree algorithms use a fixed set of nodes for each inner
tuple; for example, in a quad-tree there are always exactly
four nodes corresponding to the four quadrants around the inner
tuple's centroid point. In such a case the code typically works
with the nodes by number, and there is no need for explicit
node labels. To suppress node labels (and thereby save some
space), the picksplit
function
can return NULL for the nodeLabels
array. This will in turn result in nodeLabels being NULL during subsequent
calls to choose
and inner_consistent
. In principle, node labels
could be used for some inner tuples and omitted for others in
the same index.
When working with an inner tuple having unlabeled nodes, it
is an error for choose
to return
spgAddNode, since the set of nodes is
supposed to be fixed in such cases. Also, there is no provision
for generating an unlabeled node in spgSplitTuple actions, since it is expected that
an spgAddNode action will be needed as
well.
The SP-GiST core can
override the results of the operator class's picksplit
function when picksplit
fails to divide the supplied leaf
values into at least two node categories. When this happens,
the new inner tuple is created with multiple nodes that each
have the same label (if any) that picksplit
gave to the one node it did use,
and the leaf values are divided at random among these
equivalent nodes. The allTheSame flag
is set on the inner tuple to warn the choose
and inner_consistent
functions that the tuple
does not have the node set that they might otherwise
expect.
When dealing with an allTheSame
tuple, a choose
result of
spgMatchNode is interpreted to mean
that the new value can be assigned to any of the equivalent
nodes; the core code will ignore the supplied nodeN value and descend into one of the
nodes at random (so as to keep the tree balanced). It is an
error for choose
to return
spgAddNode, since that would make the
nodes not all equivalent; the spgSplitTuple action must be used if the value
to be inserted doesn't match the existing nodes.
When dealing with an allTheSame
tuple, the inner_consistent
function should return either all or none of the nodes as
targets for continuing the index search, since they are all
equivalent. This may or may not require any special-case code,
depending on how much the inner_consistent
function normally assumes
about the meaning of the nodes.