MVA (multi-valued attributes)


MVAs, or multi-valued attributes, are an important special type of per-document attribute in Sphinx. MVAs make it possible to attach lists of values to every document. They are useful for article tags, product categories, etc. Filtering and group-by on MVA attributes are supported; sorting is not.

Currently, MVA list entries are limited to unsigned 32-bit integers. The list length is not limited: you can have an arbitrary number of values attached to each document as long as RAM permits (the .spm file that contains the MVA values will be precached in RAM by searchd). The source data can be taken either from a separate query or from a document field; see the source type in sql_attr_multi. In the first case, the query has to return pairs of document ID and MVA value; in the second, the field will be parsed for integer values. There are no requirements on the order of incoming data; the values will be automatically grouped by document ID (and internally sorted within the same ID) during indexing anyway.
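The grouping step described above can be pictured with a minimal Python sketch. It is not Sphinx's indexer code; the function name group_mva_pairs and the sample rows are made up for illustration, and the rows stand in for whatever an MVA query would return.

```python
from collections import defaultdict

def group_mva_pairs(pairs):
    """Group (doc_id, value) pairs by document ID and sort the values within each ID.

    Hypothetical helper: it only mimics the behaviour the text describes.
    """
    grouped = defaultdict(list)
    for doc_id, value in pairs:
        grouped[doc_id].append(value)
    # Order documents by ID and sort each document's value list.
    return {doc_id: sorted(values) for doc_id, values in sorted(grouped.items())}

# Rows arrive in arbitrary order, as an MVA query might return them.
rows = [(2, 7), (1, 11), (2, 3), (1, 5), (1, 7)]
mva_by_doc = group_mva_pairs(rows)
print(mva_by_doc)  # {1: [5, 7, 11], 2: [3, 7]}
```

The input order does not affect the result, which is why the source query needs no ORDER BY clause.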

When filtering, a document will match the filter on an MVA attribute if any of the values satisfies the filtering condition. (Therefore, documents that pass through exclude filters will not contain any of the forbidden values.)

When grouping by an MVA attribute, a document will contribute to as many groups as there are different MVA values associated with that document. For instance, if the collection contains exactly 1 document having a ‘tag’ MVA with values 5, 7, and 11, grouping on ‘tag’ will produce 3 groups with ‘@count’ equal to 1 and ‘@groupby’ key values of 5, 7, and 11 respectively.

Also note that grouping by MVA might lead to duplicate documents in the result set: because each document can participate in many groups, it can be chosen as the best one in more than one group, leading to duplicate IDs. The PHP API historically uses an ordered hash on the document ID for the resulting rows, so you’ll also need to use SetArrayResult() in order to employ group-by on MVA with the PHP API.
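The filtering and grouping rules above can be illustrated with a small Python sketch. It models only the semantics described here, not searchd itself; the helper names matches_include, matches_exclude, and group_by_mva are hypothetical.

```python
from collections import Counter

def matches_include(values, allowed):
    """An include filter matches if ANY of the document's MVA values is allowed."""
    return any(v in allowed for v in values)

def matches_exclude(values, forbidden):
    """An exclude filter passes only documents with NONE of the forbidden values."""
    return not any(v in forbidden for v in values)

def group_by_mva(docs):
    """Count, per distinct MVA value, how many documents carry it."""
    counts = Counter()
    for values in docs.values():
        for v in set(values):  # each distinct value is one group for this document
            counts[v] += 1
    return dict(sorted(counts.items()))

# The single-document example from the text: one document tagged 5, 7 and 11.
docs = {1: [5, 7, 11]}
groups = group_by_mva(docs)
print(groups)  # {5: 1, 7: 1, 11: 1}
print(matches_include(docs[1], {7}))  # True
print(matches_exclude(docs[1], {7}))  # False
```

Each key of groups plays the role of a ‘@groupby’ value and each count the role of ‘@count’.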

sql_attr_multi

Multi-valued attribute (MVA) declaration. Multi-value (i.e. there may be more than one such attribute declared), optional. Applies to SQL source types (mysql, pgsql, mssql) only.

Plain attributes only allow attaching one value per document. However, there are cases (such as tags or categories) when it is desirable to attach multiple values of the same attribute and to be able to apply filtering or grouping to value lists.

The declaration format is as follows (backslashes are for clarity only; everything can be declared in a single line as well):

sql_attr_multi = ATTR-TYPE ATTR-NAME 'from' SOURCE-TYPE \
	[;QUERY] \
	[;RANGE-QUERY]

where

  • ATTR-TYPE is ‘uint’ or ‘timestamp’
  • SOURCE-TYPE is ‘field’, ‘query’, or ‘ranged-query’
  • QUERY is the SQL query used to fetch all ( docid, attrvalue ) pairs
  • RANGE-QUERY is the SQL query used to fetch min and max ID values, similar to ‘sql_query_range’

 

Example:
sql_attr_multi = uint tag from query; SELECT id, tag FROM tags
sql_attr_multi = uint tag from ranged-query; \
	SELECT id, tag FROM tags WHERE id>=$start AND id<=$end; \
	SELECT MIN(id), MAX(id) FROM tags



SetArrayResult

Prototype: function SetArrayResult ( $arrayresult )

PHP specific. Controls the format of matches in the search result set (whether matches are returned as an array or as a hash). The $arrayresult argument must be boolean. If $arrayresult is false (the default mode), matches will be returned in PHP hash format, with document IDs as keys and other information (weight, attributes) as values. If $arrayresult is true, matches will be returned as a plain array with complete per-match information, including the document ID.

Introduced along with GROUP BY support on MVA attributes. Group-by-MVA result sets may contain duplicate document IDs, so they need to be returned as plain arrays, because hashes keep only one entry per document ID.
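Why a hash keyed by document ID loses rows can be seen in a short Python sketch. This is an analogy only, not the PHP API: the match dictionaries and their keys ("id", "groupby", "count") are invented for illustration.

```python
# Three group-by-MVA rows that all point at the same document (ID 1),
# as in the tag example with values 5, 7 and 11.
matches = [
    {"id": 1, "groupby": 5, "count": 1},
    {"id": 1, "groupby": 7, "count": 1},
    {"id": 1, "groupby": 11, "count": 1},
]

# Hash format: keyed by document ID, so later rows overwrite earlier ones.
as_hash = {m["id"]: m for m in matches}

# Array format: every row is kept, and the document ID travels inside the row.
as_array = list(matches)

print(len(as_hash), len(as_array))  # 1 3
```

Only one of the three groups survives in the hash form, which is the situation that array results avoid.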
