MySQL 8.0 Data Dictionary: Status in the 8.0.0 DMR

In a previous post, I explained the architecture and design for the transactional data dictionary in MySQL 8.0. In this post I intend to summarize the status of this work in the 8.0.0 Milestone Release.

Storing all dictionary information in InnoDB tables

The MySQL 8.0.0 now stores dictionary data in InnoDB tables.  We have removed the .FRM files. We have moved away from .TRN/.TRG used for triggers, MyISAM tables for users and privileges, events, stored programs, plugins and servers, timezones and help tables, and UDFs . We have removed the DB.OPT file and folded the information into the data dictionary tables.

 

traditional-mysql-data-dictionary
The traditional MySQL Data Dictionary, and a subset of the multiple sources for dictionary data
dd-in-innodb-tables
The Transactional Data Dictionary in 8.0 has a simplified and uniform handling of dictionary data
Creating new APIs to dictionary objects

We are now storing the dictionary information in a uniform way, and provide APIs for access to the data dictionary objects. We have made new APIs for triggers, stored programs, events, schemas, views and both non-partitioned and partitioned tables. We have added metadata information about for tablespaces.

File system induced problems

Since we now have stored the dictionary information in InnoDB tables, the file system induced problems are now minimized. When storing table meta data in .FRM files, the naming of these files are dependent on the characteristics of the file system which stores these files. So issues with file system case sensitivity versus non-sensitivity had to be handled in the MySQL server. Now that these names are stored in InnoDB tables, there are no more issues like this to handle for table names.  A folder is still created per schema.

New and more general dictionary cache

APIs are made cache-friendly. A new cache has been implemented. Currently most old caches are still there, to reduce the impact on the rest of the server code. We can now start removing these caches, so caching will eventually be hidden behind an  API. When old caches are removed, we aim at using and enhancing our new more general dictionary cache.

Preparing for atomic and crash-safe DDL

We have working in progress to pave the way for transactional DDL. This is key for improved reliability, and especially important for MySQL replication. The current ongoing work is both in the MySQL server  and InnoDB codebase.

Faster INFORMATION_SCHEMA

INFORMATION_SCHEMA is reimplemented as views on dictionary tables, allowing optimization of INFORMATION_SCHEMA queries.  The INFORMATION_SCHEMA implementation in 8.0.0 may typically get a 30x performance boost compared to 5.7. More details on this will be a topic for another blog.

information-schema-performance

We will continue to work on optimizing the INFORMATION_SCHEMA queries. See also Scaling and Performance blog from Gopal.

The new implementation fixes a number of longstanding bugs, like:

  • Bug#34921 comparison with information schema tables don’t honor collation
  • Bug#48445 Inconsistency with SHOW and SELECT FROM I_S
  • Bug#61846 SHOW FULL COLUMNS displays incorrect privileges for table
  • Bug#65121 Inconsistent result for select on INFORMATION_SCHEMA.STATISTICS
  • Bug#75532 Join between I_S schema tables is case insensitive/returns wrong value
  • Bug#81347 unnecessary scanned all databases for information schema

INFORMATION_SCHEMA views will be created for TRIGGERS and EVENTS too.

SDI – Serialized Dictionary Information

The dictionary information is stored persistently in InnoDB tables. In addition a serialized form of these data is created and can have several uses:

  • Data migration. The serialized representation of dictionary objects are appended to  InnoDB tablespaces. This makes the InnoDB tablespaces selfcontained wrt to meta data and data and simplifies IMPORT of  transportable tablespaces
  • For redundancy and disaster recovery. In cases where only parts of the server are salvaged, and no backup exists, the metadata and the data are bundled. Since the SDI is in JSON format, a corrupt SDI can even be edited.

We have work in progress for moving larger data sets around. The serialized dictionary information is an enabler for this and an improved IMPORT statement will provide ease of use and control for the user.

For the 8.0.0 DMR, SDIs are not stored on InnoDB tablespaces.  An SDI file is created for schema, and for MyISAM tables.

Prepared for simpler and robust upgrade

The upgrade from MySQL 5.7 with the “old” dictionary to MySQL 8.0 with the transactional data dictionary is just as simple as upgrading from MySQL 5.6 to MySQL 5.7

The design is ready for adding versioning of meta data for assisting upgrades, so upgrades from 8.0 and above will be simple and robust. The MySQL 8.0 server and beyond will have code in the mysqld binary for meta data upgrades.

Work in progress for MySQL 8.0

We now have work in progress for to complete the integration of InnoDB data dictionary and the transactional data dictionary. This will also include work for atomic DDL.

We continue to improve the INFORMATION_SCHEMA implementation by adding more views, replacing more legacy implementions.

We are working to complete the SDI, adding them to the InnoDB tablespaces. And as an extension to this we will complete the work on IMPORT from SDI.

 

About Ståle Deraas

Ståle Deraas has been working with MySQL Database team since 2008. He is currently employed by Oracle, based in Norway. He is Senior Software Development Manager for the runtime team of the MySQL Database. He has a background in the database industry, working for Sun Microsystems on Java DB, and prior to that on Software Development Tools. He has a B.Sc Hons degree in Computing Science from the University of Glasgow.

Leave a Reply

Your email address will not be published. Required fields are marked *

Please enter * Time limit is exhausted. Please reload CAPTCHA.