MVA (multi-valued attributes)
MVAs, or multi-valued attributes, are an important special type of per-document attribute in Sphinx. MVAs make it possible to attach lists of values to every document. They are useful for article tags, product categories, etc. Filtering and group-by (but not sorting) on MVA attributes are supported.
Currently, MVA list entries are limited to unsigned 32-bit integers. The list length is not limited: you can have an arbitrary number of values attached to each document as long as RAM permits (the .spm file that contains the MVA values will be precached in RAM by searchd). The source data can be taken either from a separate query or from a document field; see the source type in sql_attr_multi. In the first case, the query must return pairs of document ID and MVA value; in the second, the field will be parsed for integer values. There are no requirements on the order of incoming data; the values will be automatically grouped by document ID (and internally sorted within the same ID) during indexing.
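The grouping and sorting described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the indexer's actual code; the function name group_mva_pairs and the sample rows are hypothetical.

```python
from collections import defaultdict

UINT32_MAX = 2**32 - 1  # MVA entries are limited to unsigned 32-bit integers


def group_mva_pairs(pairs):
    """Group (doc_id, value) pairs by document ID and sort values within each ID."""
    grouped = defaultdict(list)
    for doc_id, value in pairs:
        if not 0 <= value <= UINT32_MAX:
            raise ValueError(f"MVA value {value} does not fit in an unsigned 32-bit integer")
        grouped[doc_id].append(value)
    return {doc_id: sorted(values) for doc_id, values in grouped.items()}


# Rows arrive in arbitrary order, as a 'query' source might return them.
rows = [(2, 11), (1, 7), (2, 5), (1, 3), (2, 7)]
mva = group_mva_pairs(rows)
print(mva[1])  # [3, 7]
print(mva[2])  # [5, 7, 11]
```

Because the input order does not matter, the SQL query that feeds the attribute needs no ORDER BY clause.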
When filtering, a document will match the filter on an MVA attribute if any of the values satisfy the filtering condition. (Therefore, documents that pass through exclude filters will not contain any of the forbidden values.)
When grouping by an MVA attribute, a document will contribute to as many groups as there are different MVA values associated with that document. For instance, if the collection contains exactly 1 document having a ‘tag’ MVA with values 5, 7, and 11, grouping on ‘tag’ will produce 3 groups with ‘@count’ equal to 1 and ‘@groupby’ key values of 5, 7, and 11 respectively.
Also note that grouping by MVA might lead to duplicate documents in the result set: because each document can participate in many groups, it can be chosen as the best one in more than one group, leading to duplicate IDs. The PHP API historically uses an ordered hash keyed on the document ID for the resulting rows, so you will also need to use SetArrayResult() in order to employ group-by on MVA with the PHP API.
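The filtering and grouping rules above can be modeled as follows. This is a minimal Python sketch of the semantics only; the helper names are hypothetical and it does not reflect how searchd implements them.

```python
from collections import defaultdict


def matches_include(values, wanted):
    """Include filter: the document matches if ANY of its MVA values is wanted."""
    return any(v in wanted for v in values)


def matches_exclude(values, forbidden):
    """Exclude filter: the document passes only if NONE of its values is forbidden."""
    return not any(v in forbidden for v in values)


def group_by_mva(docs):
    """Return {group_key: [doc_ids]}; a document joins one group per distinct value."""
    groups = defaultdict(list)
    for doc_id, values in docs.items():
        for value in set(values):
            groups[value].append(doc_id)
    return dict(groups)


docs = {1: [5, 7, 11]}  # the single-document 'tag' example from the text
groups = group_by_mva(docs)
counts = {key: len(ids) for key, ids in sorted(groups.items())}
print(counts)  # {5: 1, 7: 1, 11: 1}

# One representative per group: the same document ID appears three times.
best_per_group = [ids[0] for _, ids in sorted(groups.items())]
print(best_per_group)  # [1, 1, 1]
```

The repeated ID in best_per_group is the duplicate-document effect that makes a hash keyed on document ID unsuitable for these result sets.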
sql_attr_multi
Multi-valued attribute (MVA) declaration. Multi-value (i.e. there may be more than one such attribute declared), optional. Applies to SQL source types.
Plain attributes allow only one value to be attached to each document. However, there are cases (such as tags or categories) where it is desirable to attach multiple values of the same attribute and to apply filtering or grouping to the value lists.
The declaration format is as follows (backslashes are for clarity only; everything can be declared in a single line as well):
sql_attr_multi = ATTR-TYPE ATTR-NAME 'from' SOURCE-TYPE \
    [;QUERY] \
    [;RANGE-QUERY]
- ATTR-TYPE is ‘uint’ or ‘timestamp’
- SOURCE-TYPE is ‘field’, ‘query’, or ‘ranged-query’
- QUERY is SQL query used to fetch all ( docid, attrvalue ) pairs
- RANGE-QUERY is SQL query used to fetch min and max ID values, similar to ‘sql_query_range’
sql_attr_multi = uint tag from query; SELECT id, tag FROM tags
sql_attr_multi = uint tag from ranged-query; \
    SELECT id, tag FROM tags WHERE id>=$start AND id<=$end; \
    SELECT MIN(id), MAX(id) FROM tags
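To make the ‘ranged-query’ form more concrete, the Python sketch below shows one way the $start and $end macros could be expanded into consecutive ID ranges. The exact stepping used by the indexer is an assumption here: the function ranged_queries, the step parameter, and the sample ID bounds are all hypothetical.

```python
def ranged_queries(template, min_id, max_id, step):
    """Yield the template with $start/$end replaced for consecutive ID ranges."""
    if step < 1:
        raise ValueError("step must be at least 1")
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        yield template.replace("$start", str(start)).replace("$end", str(end))
        start = end + 1


template = "SELECT id, tag FROM tags WHERE id>=$start AND id<=$end"
# min_id and max_id stand in for the result of the RANGE-QUERY;
# step=1000 is an illustrative value, not a documented default.
queries = list(ranged_queries(template, min_id=1, max_id=2500, step=1000))
for query in queries:
    print(query)
```

With these sample bounds the sketch issues three queries, covering IDs 1 to 1000, 1001 to 2000, and 2001 to 2500, so no single query has to return the whole table.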
SetArrayResult
Prototype: function SetArrayResult ( $arrayresult )
PHP specific. Controls the match format in the search result set (whether matches should be returned as an array or a hash).
The $arrayresult argument must be boolean. If $arrayresult is false (the default mode), matches will be returned in PHP hash format with document IDs as keys and other information (weight, attributes) as values. If $arrayresult is true, matches will be returned as a plain array with complete per-match information, including document ID.
Introduced along with GROUP BY support on MVA attributes. Group-by-MVA result sets may contain duplicate document IDs, so they need to be returned as plain arrays, because hashes keep only one entry per document ID.
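The difference between the two result formats can be illustrated with a small model. The sketch below uses Python rather than PHP, and the match records and field names are made up for illustration; it only shows why a structure keyed on document ID loses rows when IDs repeat.

```python
# Hypothetical matches from a group-by-MVA query: document 1 is the best
# match in three different groups (tag values 5, 7, and 11).
matches = [
    {"id": 1, "weight": 1, "attrs": {"@groupby": 5, "@count": 1}},
    {"id": 1, "weight": 1, "attrs": {"@groupby": 7, "@count": 1}},
    {"id": 1, "weight": 1, "attrs": {"@groupby": 11, "@count": 1}},
]

# Hash format: keyed by document ID, so later rows overwrite earlier ones.
as_hash = {m["id"]: {"weight": m["weight"], "attrs": m["attrs"]} for m in matches}

# Array format: a plain list that keeps every row, document ID included.
as_array = list(matches)

print(len(as_hash))   # 1
print(len(as_array))  # 3
```

Only the last group survives in the hash form, which is why the array format is needed when grouping by an MVA.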