Solr 6 can use any of these four different methods for routing documents and ACLs to shards.
- ACL (ACL_ID): This sharding method is most appropriate when there are lots of sites
or nodes that have been assigned ACLs. For example, if you have many Share sites, nodes and
ACLs are assigned to shards randomly based on the ACL and the documents to which it applies.
The node distribution may be uneven as it depends how many nodes share ACLs. This method replaces the previous ACL based sharding method used in Solr 4 and distributes ACLs over the shards. Each shard contains only the access control information for the nodes it contains.
To use this method, when creating a shard add a new configuration property:shard.method=PROPERTY
- DBID (DB_ID): This is the default sharding option in Solr 6. Nodes are evenly
distributed over the shards at random. Each shard contains copies of all the ACL information,
so this information is replicated in each shard.To use this method, when creating a shard add a new configuration property:
shard.method=DB_ID
- Date/Datetime (DATE): The date-based sharding assigns dates sequentially through
shards based on the month.
For example: If there are 12 shards, each month would be assigned sequentially to each shard, wrapping round and starting again for each year. The non-random assignment facilitates easier shard management - dropping shards or scaling out replication for some date range. Typical ageing strategies could be based on the created date or destruction date.
Each shard contains copies of all the ACL information, so this information is replicated in each shard. However, if the property is not present on a node, sharding falls back to the DBID murmur hash to randomly distribute these nodes.
To use this method, when creating a shard add the new configuration properties:shard.key=exif:dateTimeOriginal shard.method=DATE
Months can be grouped together, for example, by quarter. Each quarter of data would be assigned sequentially through the available shards.shard.date.grouping=3
-
Property (PROPERTY): This sharding method uses any d:date,
d:datetime, or d:text property, for example, the recipient
of an email, the creator of a node, some custom field set by a rule, or by the domain of an
email recipient. The keys are randomly distributed over the shards using murmur hash.
Each shard contains copies of all the ACL information, so this information is replicated in each shard. However, if the property is not present on a node, sharding falls back to the DBID murmur hash to randomly distribute these nodes.
To use this method, when creating a shard add the new configuration properties:shard.key=cm:creator shard.method=PROPERTY
It is possible to extract a part of the property value to use for sharding using a regular expression, for example, a year at the start of a string:
shard.regex=^\d{4}