Original article: https://dev.mysql.com/doc/refman/5.1/en/partitioning-hash.html
Partitioning by HASH is used primarily to ensure an even distribution of data among a predetermined number of partitions. With range or list partitioning, you must specify explicitly into which partition a given column value or set of column values is to be stored; with hash partitioning, MySQL takes care of this for you, and you need only specify a column value or expression based on a column value to be hashed and the number of partitions into which the partitioned table is to be divided.
To partition a table using HASH partitioning, it is necessary to append to the CREATE TABLE statement a PARTITION BY HASH (expr) clause, where expr is an expression that returns an integer. This can simply be the name of a column whose type is one of MySQL's integer types. In addition, you most likely want to follow this with PARTITIONS num, where num is a positive integer representing the number of partitions into which the table is to be divided.
For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See the manual section on partitioning keys, primary keys, and unique keys for more information.
The following statement creates a table that uses hashing on the store_id column and is divided into 4 partitions:
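The statement itself did not survive in this copy. A minimal sketch of such a statement, modeled on the employees table used in the MySQL manual's partitioning examples, might look like this; the table name and every column other than store_id are illustrative assumptions.

```sql
-- Sketch only: the column list is illustrative. What matters is that
-- store_id is an integer column and the table defines no keys.
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(store_id)   -- hash on the integer column store_id
PARTITIONS 4;                 -- divide the table into 4 partitions
```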
If you do not include a PARTITIONS clause, the number of partitions defaults to 1.
Using the PARTITIONS keyword without a number following it results in a syntax error.
You can also use an SQL expression that returns an integer for expr. For instance, you might want to partition based on the year in which an employee was hired. This can be done as shown here:
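The listing is missing from this copy. As a hedged sketch, assuming a hypothetical employees table with a DATE column named hired, the partitioning expression could wrap that column in YEAR():

```sql
-- Sketch only: an alternative definition of a hypothetical employees table.
-- The table and column names are assumptions made for illustration.
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH( YEAR(hired) )   -- YEAR() returns an integer
PARTITIONS 4;
```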
expr must return a nonconstant, nonrandom integer value (in other words, it should be varying but deterministic), and must not contain any prohibited constructs as described in the manual section on restrictions and limitations on partitioning. You should also keep in mind that this expression is evaluated each time a row is inserted or updated (or possibly deleted); this means that very complex expressions may give rise to performance issues, particularly when performing operations (such as batch inserts) that affect a great many rows at one time.
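The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for "pruning" on ranges of partitions. That is, the more closely the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.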
For example, where date_col is a column of type DATE, then the expression TO_DAYS(date_col) is said to vary directly with the value of date_col, because for every change in the value of date_col, the value of the expression changes in a consistent manner. The variance of the expression YEAR(date_col) with respect to date_col is not quite as direct as that of TO_DAYS(date_col), because not every possible change in date_col produces an equivalent change in YEAR(date_col). Even so, YEAR(date_col) is a good candidate for a hashing function, because it varies directly with a portion of date_col and there is no possible change in date_col that produces a disproportionate change in YEAR(date_col).
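To make the comparison concrete, the following sketch applies TO_DAYS() and YEAR() to two pairs of consecutive dates. The dates are arbitrary values chosen for illustration.

```sql
-- One day apart within the same year:
-- TO_DAYS changes by 1, YEAR does not change.
SELECT TO_DAYS('2005-09-15') - TO_DAYS('2005-09-14') AS to_days_delta,  -- 1
       YEAR('2005-09-15')    - YEAR('2005-09-14')    AS year_delta;     -- 0

-- One day apart across a year boundary:
-- TO_DAYS still changes by 1, and YEAR changes by 1.
SELECT TO_DAYS('2006-01-01') - TO_DAYS('2005-12-31') AS to_days_delta,  -- 1
       YEAR('2006-01-01')    - YEAR('2005-12-31')    AS year_delta;     -- 1
```

In both cases the change in YEAR() is never larger than the change in the date itself, which is the property the paragraph above describes.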
By way of contrast, suppose that you have a column named int_col whose type is INT. Now consider the expression POW(5-int_col,3) + 6. This would be a poor choice for a hashing function because a change in the value of int_col is not guaranteed to produce a proportional change in the value of the expression. Changing the value of int_col by a given amount can produce widely different changes in the value of the expression. For example, changing int_col from 5 to 6 produces a change of -1 in the value of the expression, but changing the value of int_col from 6 to 7 produces a change of -7 in the expression value.
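The arithmetic can be checked directly. This sketch assumes the expression POW(5-int_col,3) + 6 and evaluates it for the column values 5, 6, and 7:

```sql
-- Evaluate the expression at three consecutive column values.
SELECT POW(5 - 5, 3) + 6 AS at_5,   -- 0 + 6  = 6
       POW(5 - 6, 3) + 6 AS at_6,   -- -1 + 6 = 5   (change of -1 from at_5)
       POW(5 - 7, 3) + 6 AS at_7;   -- -8 + 6 = -2  (change of -7 from at_6)
```

Equal steps in the column value produce unequal steps in the expression value, so the expression is nonlinear in the column.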
In other words, the more closely the graph of the column value versus the value of the expression follows a straight line as traced by the equation y = cx, where c is some nonzero constant, the better the expression is suited to hashing. This has to do with the fact that the more nonlinear an expression is, the more uneven the distribution of data among the partitions it tends to produce.
In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.
When PARTITION BY HASH is used, MySQL determines which partition of num partitions to use based on the modulus of the result of the user function. In other words, for an expression expr, the partition in which the record is stored is partition number N, where N = MOD(expr, num). Suppose that table t1 is defined as follows, so that it has 4 partitions:
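The definition is missing from this copy. A minimal sketch of a four-partition table hashed on the year of a DATE column could be written as follows; the table name t1 and the column names col1, col2, and col3 are assumptions.

```sql
-- Sketch only: t1 and its column names are illustrative assumptions.
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
    PARTITION BY HASH( YEAR(col3) )   -- expr = YEAR(col3)
    PARTITIONS 4;                     -- num  = 4
```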
If you insert a record into t1 whose col3 value is '2005-09-15', then the partition in which it is stored is determined as follows:
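The worked calculation is missing from this copy. As a sketch, assuming a DATE value of '2005-09-15', a partitioning expression of YEAR() applied to that value, and 4 partitions, the computation N = MOD(expr, num) proceeds like this:

```sql
-- MOD(YEAR('2005-09-15'), 4)
--   = MOD(2005, 4)
--   = 1
SELECT MOD(YEAR('2005-09-15'), 4) AS partition_number;   -- 1
```

Under these assumptions the record is stored in partition number 1.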
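MySQL 5.1 also supports a variant of HASH partitioning known as linear hashing, which employs a more complex algorithm for determining the placement of new rows inserted into the partitioned table. See the manual section on LINEAR HASH partitioning for a description of this algorithm.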
The user function is evaluated each time a record is inserted or updated. It may also—depending on the circumstances—be evaluated when records are deleted.
If a table to be partitioned has a UNIQUE key, then any columns supplied as arguments to the HASH user function or to the KEY's column_list must be part of that key.