Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: MDEV-28724: Add stats sampling based on % of total pages #2173

Open
wants to merge 1 commit into
base: 11.0
Choose a base branch
from

Conversation

haidong
Copy link
Contributor

@haidong haidong commented Jun 27, 2022

  • The Jira issue number for this PR is: MDEV-28724

Description

Currently when running ANALYZE TABLE against innodb tables, index
stats are calculated by sampling STATS_SAMPLE_PAGES number of pages.
This new feature allows a user to define a percentage of total index
pages to sample, between 0 and 100.

Here is how 3 competing database products implement this feature:

  • PostgreSQL
    As far as I can tell, PostgreSQL doesn't have a similar feature. I
    checked both the CREATE STATISTICS and ANALYZE documentation
  • Microsoft SQL Server
    SQL Server has the setting of PERSIST_SAMPLE_PERCENT This name maybe the
    most relevant when considering the naming of our setting variables;
  • Oracle
    Like SQL Server, Oracle has something similar.

Here is the proposed solution implemented in this PR:

  • 2 new innodb settings

-- innodb_stats_persistent_percent_sampling
This is a boolean setting, true means the sampling is based on percentage.
Default value should be false, which is consistent with the current state
of not having this feature and won't break things;

-- innodb_stats_persistent_sample_percentage
This is an int, between 0 and 100.

  • During index stats sampling (ANALYZE TABLE, for example)
  1. Check the table's stas_percent_sampling setting. If true,
    N_SAMPLE_PAGES will be calculated using its
    stats_sample_percentage. Otherwise:
  2. Check the table's stats_sample_pages value. If greater than 0, then
    set N_SAMPLE_PAGES to it. Otherwise:
  3. Check the instance's innodb_stats_persistent_percent_sampling. If
    true, N_SAMPLE_PAGES will be calculated using
    innodb_stats_persistent_sample_percentage. Otherwise:
  4. Set N_SAMPLE_PAGES to innodb_stats_persistent_sample_pages

New feature usage examples

  1. New system settings:
    MariaDB [(none)]> show variables like '%stats%persistent%';
    +-------------------------------------------+-------+
    | Variable_name                             | Value |
    +-------------------------------------------+-------+
    | innodb_stats_persistent                   | ON    |
    | innodb_stats_persistent_percent_sampling  | OFF   |   <<<
    | innodb_stats_persistent_sample_pages      | 20    |
    | innodb_stats_persistent_sample_percentage | 100   |   <<<
    +-------------------------------------------+-------+
    4 rows in set (0.001 sec)
    
    MariaDB [(none)]> set global innodb_stats_persistent_percent_sampling=1;
    Query OK, 0 rows affected (0.000 sec)
    
    MariaDB [(none)]> show variables like '%stats%persistent%';
    +-------------------------------------------+-------+
    | Variable_name                             | Value |
    +-------------------------------------------+-------+
    | innodb_stats_persistent                   | ON    |
    | innodb_stats_persistent_percent_sampling  | ON    |   <<<
    | innodb_stats_persistent_sample_pages      | 20    |
    | innodb_stats_persistent_sample_percentage | 100   |   <<<
    +-------------------------------------------+-------+
    4 rows in set (0.001 sec)
    
    MariaDB [(none)]> set global innodb_stats_persistent_sample_percentage=25;
    Query OK, 0 rows affected (0.000 sec)
    
    MariaDB [(none)]> show variables like '%stats%persistent%';
    +-------------------------------------------+-------+
    | Variable_name                             | Value |
    +-------------------------------------------+-------+
    | innodb_stats_persistent                   | ON    |
    | innodb_stats_persistent_percent_sampling  | ON    |   <<<
    | innodb_stats_persistent_sample_pages      | 20    |
    | innodb_stats_persistent_sample_percentage | 25    |   <<<
    +-------------------------------------------+-------+
    4 rows in set (0.001 sec)
  1. New settings for table creation. I think this is an important feature, because table size could vary widely. Sampling 20% pages of a 10MB table has a much lower cost than sampling 20% of 100GB table. So defining these properties at the table level is going to be very helpful and useful. This is also how Microsoft SQL Server and Oracle does the sampling.
    MariaDB [haidong]> CREATE TABLE t(a INT UNSIGNED PRIMARY KEY)
        -> ENGINE=InnoDB STATS_PERSISTENT=1 STATS_PERCENT_SAMPLING=1 STATS_SAMPLE_PERCENTAGE=20;
    Query OK, 0 rows affected (0.015 sec)
    
    MariaDB [haidong]> show create table t;
    +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table
                                                                   |
    +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | t     | CREATE TABLE `t` (
      `a` int(10) unsigned NOT NULL,
      PRIMARY KEY (`a`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 STATS_PERCENT_SAMPLING=1 STATS_PERSISTENT=1 |
    +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.001 sec)

How can this PR be tested?

New test cases added and verified. Suggestions on how to best organize them welcome!

Basing the PR against the correct MariaDB version

  • This is a new feature and the PR is based against version 10.7

Backward compatibility

N/A

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

@CLAassistant
Copy link

CLAassistant commented Jun 27, 2022

CLA assistant check
All committers have signed the CLA.

@haidong
Copy link
Contributor Author

haidong commented Jun 27, 2022

I'm looking into all these and will have something shortly.

@dr-m
Copy link
Contributor

dr-m commented Jun 28, 2022

Thank you for the contribution.

I only had a quick look at the proposed interface. I think that it would be better to use the MariaDB HA_TOPTION_SYSVAR interface for specifying the parameter. The STATS_PERSISTENT and STATS_AUTO_RECALC were inherited from MySQL and therefore use a different interface.

Could we have a single parameter, say, STATS_PERCENTAGE, with the default value 0 implying that the new logic will be disabled?

I would avoid putting any ticket numbers in test names, and instead try to use subsystem or feature names in tests. For example, the file name subselect_sj2_stats_percentage.test is descriptive on its own.

@haidong
Copy link
Contributor Author

haidong commented Jun 28, 2022

Thanks a lot, @dr-m !

I think that it would be better to use the MariaDB HA_TOPTION_SYSVAR interface for specifying the parameter. The STATS_PERSISTENT and STATS_AUTO_RECALC were inherited from MySQL and therefore use a different interface.

Really good point and I agree. I'm thinking of adding the parameter to storage/innobase/handler/ha_innodb.cc's ha_create_table_option_innodb_table_option_list, if that works.

Could we have a single parameter, say, STATS_PERCENTAGE, with the default value 0 implying that the new logic will be disabled?

Yes! That was my thought initially. I then changed my mind because I wanted the setting to represent one thing and one thing only. However, after typing 2 parameters repeatedly during my testing, I think a single parameter is a wonderful idea!

I would avoid putting any ticket numbers in test names, and instead try to use subsystem or feature names in tests. For example, the file name subselect_sj2_stats_percentage.test is descriptive on its own.

Agreed.

I'm working on revamping this based on your input. I will also target 10.10.

@dr-m
Copy link
Contributor

dr-m commented Jun 29, 2022

Sounds good. I would use this existing parameter for reference:

static MYSQL_THDVAR_UINT(default_encryption_key_id, PLUGIN_VAR_RQCMDARG,
			 "Default encryption key id used for table encryption.",
			 NULL, innodb_default_encryption_key_id_update,
			 FIL_DEFAULT_ENCRYPTION_KEY, 1, UINT_MAX32, 0);
…
HA_TOPTION_SYSVAR("ENCRYPTION_KEY_ID", encryption_key_id, default_encryption_key_id),

Unlike this, no update callback function will be needed, and the default as well as the minimum value should be 0. I think that it should be something like this:

static MYSQL_THDVAR_UINT(stats_percentage, PLUGIN_VAR_RQCMDARG,
			 "Insert description here",
			 nullptr, nullptr, 0, 0, 100, 0);
…
HA_TOPTION_SYSVAR("STATS_PERCENTAGE", stats_percentage, stats_percentage),

If we want to avoid a duplicated meaning of the value 0 (does it make sense to use percentage-based sampling with that value?), we might repurpose a previously impossible bit pattern to enable percentage-based sampling:

diff --git a/include/my_base.h b/include/my_base.h
index f96e569cae4..f45a4b92d78 100644
--- a/include/my_base.h
+++ b/include/my_base.h
@@ -341,7 +341,7 @@ enum ha_base_keytype {
    storage engine and used by the optimizer for query optimization will be
    stored on disk and will not change after a server restart.
 */
-#define HA_OPTION_STATS_PERSISTENT	4096U
+#define HA_OPTION_STATS_PERSISTENT	(1U << 12)
 /*
   STATS_PERSISTENT=0 has been specified in CREATE/ALTER TABLE. Statistics
   for the table will be wiped away on server shutdown and new ones recalculated
@@ -350,7 +350,11 @@ enum ha_base_keytype {
   explicitly set at table level and the corresponding table will use whatever
   is the global server default.
 */
-#define HA_OPTION_NO_STATS_PERSISTENT	8192U
+#define HA_OPTION_NO_STATS_PERSISTENT	(2U << 12)
+/*
+  STATS_PERSISTENT=2 has been specified...
+*/
+#define HA_OPTION_STATS_PERCENTAGE	(3U << 12)
 
 /* .frm has extra create options in linked-list format */
 #define HA_OPTION_TEXT_CREATE_OPTIONS_legacy (1U << 14) /* 5.2 to 5.5, unused since 10.0 */
diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
index 172ea82b1a6..ba3966bb54d 100644
--- a/sql/sql_yacc.yy
+++ b/sql/sql_yacc.yy
@@ -5621,6 +5621,9 @@ create_table_option:
             case 1:
                 Lex->create_info.table_options|= HA_OPTION_STATS_PERSISTENT;
                 break;
+            case 2:
+                Lex->create_info.table_options|= HA_OPTION_STATS_PERCENTAGE;
+                break;
             default:
                 thd->parse_error();
                 MYSQL_YYABORT;

I will be on vacation between July 4 and July 24.

@haidong
Copy link
Contributor Author

haidong commented Jun 29, 2022

I think one setting STATS_PERCENTAGE will be enough. The bit pattern technique is something I'm interested in learning in the future. But I feel there is no point in using it in this case: the payoff is not worth the effort and complication.

I'm working on this now. Hopefully it will be in a good state before your vacation.

@haidong haidong force-pushed the MDEV-28724-innodb_stats_persistent-sampling-with-percentage branch from d61fd4a to e3aeb84 Compare June 30, 2022 03:58
@haidong haidong requested a review from ottok as a code owner June 30, 2022 03:58
@haidong haidong changed the base branch from 10.7 to 10.10 June 30, 2022 03:59
@haidong
Copy link
Contributor Author

haidong commented Jun 30, 2022

@dr-m There are still work to be done, but I wanted to push in some changes for your review before your vacation, if you have time.

For example, the SHOW CREATE TABLE output surrounds STATS_PERCENTAGE with backtick, which is a bit annoying. I'm also thinking of better test cases.

Thanks!

@haidong
Copy link
Contributor Author

haidong commented Jun 30, 2022

Weird that github says haidong requested a review from ottk as a code owner 11 hours ago. Not that I mind it, but I don't think I did that.

@haidong
Copy link
Contributor Author

haidong commented Jun 30, 2022

I made upstream contributions previously as an independent developer, as a result I signed the CLA. Now I'm contributing as an AWS employee and using the BSD license, can I unsign the CLA I previously signed? I suppose I could create a different Github account, but don't want to do that if not absolutely necessary.

@haidong
Copy link
Contributor Author

haidong commented Jun 30, 2022

All of the build failed on the same test: mysql-test/main/subselect_sj2_stats_percentage.test.

It is identical to subselect_sj2.test EXCEPT that I added the new feature stats_percentage to ALL create innodb table statements. The result files in my local environment are the same EXCEPT for that stats_percentage clause. It runs successfully in my local environment.

I need some help to get the reject file so I can do a comparison. Suggestions/comments welcome.

By the way, the SHOW TABLE CREATE output backtick issue is not new. Option like encryption_key_id has the same problem. So I won't address it in this PR.

Thanks!

@ottok ottok requested review from dr-m and removed request for ottok July 19, 2022 15:37
@ottok ottok changed the title MDEV-28724: Add stats sampling based on % of total pages WIP: MDEV-28724: Add stats sampling based on % of total pages Sep 9, 2022
@ottok
Copy link
Contributor

ottok commented Sep 9, 2022

@LinuxJedi Can you give some advice here what you want Haidong to take into account additionally while polishing the submission?

@LinuxJedi
Copy link
Contributor

@LinuxJedi Can you give some advice here what you want Haidong to take into account additionally while polishing the submission?

In general this looks good to me, only minor thing I saw was a code comment closed without a space before the */ and I don't think that is a blocker to getting it in. It would need rebasing against 10.11 and Marko would need to give the final review on it. But I can poke internally when it is ready.

Comment on lines 128 to 135
#define N_SAMPLE_PAGES(index) \
static_cast<ib_uint64_t>( \
(index)->table->stats_percentage != 0 \
? (index)->stat_index_size * \
(index)->table->stats_percentage / 100 + 1 \
: (index)->table->stats_sample_pages != 0 \
? (index)->table->stats_sample_pages \
: srv_stats_persistent_sample_pages)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be defined as an inline function with some if and else instead of multiple ternary operators?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. It was a define macro, but with a little bit of added logic, it's getting unwieldy. I'll change in a new commit that'll be pushed in today.

Comment on lines +690 to 668
/* With this option the user defines the key identifier used for the encryption */
HA_TOPTION_SYSVAR("ENCRYPTION_KEY_ID", encryption_key_id, default_encryption_key_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "This option identifies the key for encrypting the data file"? This would be best changed in a comment on its own, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. This change is unrelated. I'll revert in a new push.

/** The percentage of pages to sample for this table during persistent
stats estimation. If this is 0, then the value of the stats_sample_pages
will be used instead. */
ulint stats_percentage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need 32 or 64 bits for this?

The sizeof(dict_index_t) is already large enough. Could we combine all fields into a single word:

unsigned stat_persistent:2;
unsigned stats_auto_recalc:2;
unsigned stats_percentage:7;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where will ha_table_option_struct::stats_percentage be assigned to dict_table_t::stats_percentage? I did not find any code to do that. For the pre-existing statistics attributes that is being done in innobase_copy_frm_flags_from_table_share().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding combining all fields into a single word, I tried it, but received compiler warning because of type inconsistency with definitions inside dict0stats.inl. I ignored the warning, but that produced failures for test case innodb_stats_create_on_corrupted. In the end, I opted to use ib_uint32_t, which made all tests pass.

Where will ha_table_option_struct::stats_percentage be assigned to dict_table_t::stats_percentage?

Addressed in a new push.

@haidong haidong force-pushed the MDEV-28724-innodb_stats_persistent-sampling-with-percentage branch from 5adacc6 to 7636f8d Compare January 8, 2023 23:24
@haidong haidong changed the base branch from 10.10 to 10.11 January 8, 2023 23:25
@haidong
Copy link
Contributor Author

haidong commented Jan 8, 2023

It would need rebasing against 10.11 and Marko would need to give the final review on it. But I can poke internally when it is ready.

I did rebase against 10.11, and also addressed Marko's comments. The commit message is rewritten in the latest commit. Please review it. Thanks!

@dr-m
Copy link
Contributor

dr-m commented Jan 9, 2023

There seems to be an extra cherry-picked commit before the main commit.

Furthermore, I believe that meanwhile the target for new features has moved to the 11.0 release or its not-yet-existing successor branch.

I’d still like to see the use of bit fields in dict_table_t. The structure is already quite large: sizeof(dict_table_t) is 664 bytes in a debug build, and 608 bytes with cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo, on a 64-bit GNU/Linux system. For partitioned tables, such a structure will be allocated for each individual partition or subpartition.

@ottok ottok requested a review from dr-m January 9, 2023 16:30
When running `ANALYZE TABLE` against `innodb` tables, index stats are
calculated by sampling `STATS_SAMPLE_PAGES` number of pages. This new
feature allows a user to define a percentage of total indexe pages to
sample, between 0 and 100.

Here is how 3 competing database products implement this feature:
- PostgreSQL
As far as I can tell, PostgreSQL doesn't have a similar feature. I
checked both the `CREATE STATISTICS`[^1] and `ANALYZE`[^2] documentation
- SQL Server
SQL Server has the setting of `PERSIST_SAMPLE_PERCENT`[^3]. This name maybe the
 most relevant when considering the naming of our setting variables;
- Oracle
Like SQL Server, Oracle has something similar[^4].

Here is the proposed solution implemented in this PR:
- 1 new `innodb` setting `STATS_PERCENTAGE` for `CREATE` or `ALTER
  TABLE`. Its value ranges between 0 and 100, inclusive;
- To minimize memory footprint, the new `stats_percentage` uses bit
  field. The `stat_persistent` and `stats_auto_recalc` are also
  refactored to use bit fields.
- During index stats sampling (ANALYZE TABLE, for example)
1. Check the table's `STATS_PERCENTAGE` setting. If `> 0`,
   `N_SAMPLE_PAGES` will be calculated using its
   `STATS_PERCENTAGE`. Otherwise:
2. Check the table's `STATS_SAMPLE_PAGES` value. If greater than 0, then
   set `N_SAMPLE_PAGES` to it. Otherwise:
3. Set `N_SAMPLE_PAGES` to `INNODB_STATS_PERSISTENT_SAMPLE_PAGES`

New feature usage examples
1. `CREATE TABLE` using `STATS_PERCENTAGE`
```
MariaDB [test]> CREATE TABLE t(a INT UNSIGNED PRIMARY KEY) ENGINE=InnODB STATS_PERSISTENT=1 STATS_PERCENTAGE=20;
Query OK, 0 rows affected (0.007 sec)

MariaDB [test]> SHOW CREATE TABLE t;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                      |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `a` int(10) unsigned NOT NULL,
  PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci STATS_PERSISTENT=1 `STATS_PERCENTAGE`=20 | [^5]
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
```
2. `ALTER TABLE` using `STATS_PERCENTAGE`
```
MariaDB [test]> ALTER TABLE t STATS_PERCENTAGE=35;
Query OK, 0 rows affected (0.007 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [test]> SHOW CREATE TABLE t;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                      |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `a` int(10) unsigned NOT NULL,
  PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci STATS_PERSISTENT=1 `STATS_PERCENTAGE`=35 | [^5]
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)

```

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.

[^1]: https://www.postgresql.org/docs/current/sql-createstatistics.html
[^2]: https://www.postgresql.org/docs/current/sql-analyze.html
[^3]: https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql?view=sql-server-ver16
[^4]: https://docs.oracle.com/en/database/oracle/oracle-database/21/tgsql/gathering-optimizer-statistics.html
[^5]: I'm aware of the issue with unwanted tilde and possibly pipe
characters. I wonder if it is related to MDEV-29001 and should be
addressed there.
@haidong haidong force-pushed the MDEV-28724-innodb_stats_persistent-sampling-with-percentage branch from 7636f8d to e44b580 Compare March 19, 2023 22:22
@haidong haidong changed the base branch from 10.11 to 11.0 March 19, 2023 22:23
@haidong
Copy link
Contributor Author

haidong commented Mar 19, 2023

I have implemented the suggested change:

  • The new stats_percentage uses bit fields in this push
  • The existing stat_persistent and stats_auto_recalc have been refactored to use bit fields also.

Stats is such an important part of the query processing engine so please review carefully. I had to rewrite a handful of tests, and re-record some result files.

Additionally, innodb_bug84958 test fails. The MTR output bears some resemblance to the MTR output here.

This PR is now rebased on 11.0.

@dr-m Your input welcome and much appreciated. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants