Using SYS.SESSION as an alternative to SHOW PROCESSLIST

A modern MySQL server contains a lot of useful meta-data in information_schema and performance_schema, which can help bring visibility into what is happening inside of your database server. However, sometimes this data is quite fine grained and needs finessing in order to get to that point πŸ™‚

MySQL 5.7 includes a new SYS schema installed by default. SYS is a set of views, functions and procedures that aggregate the data in a way that is useful to DBAs trying to perform common tasks.

For Linux users I like to compare performance_schema to /proc, and SYS to vmstat.

SHOW PROCESSLIST

Having introduced SYS as a DBA-task-oriented-set-of-views, is there a more common DBA task than logging into a system and seeing what is running? Here is the SYS way to do it:

Allow me to point out a few things here that are not present in SHOW PROCESSLIST:

  • In row#1 you can see that I am running an ALTER TABLE command to add an index on a table. SYS has included the current stage of that execution (it is currently “read PK and internal sort”), and is 19% complete with that stage. It also includes the current execution time (5.78s) which is more precise.
  • In row #2 you can see that I am running an INSERT .. SELECT command. It is an autocommit transaction that has been running for 410.06 ms. You can also see that is requires a temporary table to execute this statement.
  • In row #3 you can see that I actually have a transaction that has been open for 4.94 minutes. It is my current connection where I am running the command SELECT * from sys.session from. Long running transactions can sometimes cause performance problems as they prevent some house-keeping operations from running.
  • All three connections are created by mysql command line clients (program_name: mysql, pid: os pid), so this is a simulation, but your client library will also populate these connection attributes, and it could be useful in tracking down the source of a problem.

Conclusion

Hopefully this serves as a good demonstration that SYS is both easy to use, and contains a lot of the useful meta data that is hidden away in performance_schema. If this inspires you to look for other examples, I suggest running SHOW TABLES in SYS on your MySQL 5.7 installation. There is a lot of good stuff in there πŸ™‚

8 thoughts on “Using SYS.SESSION as an alternative to SHOW PROCESSLIST

  1. A very unscientific check on an unloaded server shows that querying sys takes 70ms vs running SHOW PROCESSLIST taking 0s (the mysql client shows query run times to 1/100th of a second). So clearly (and that’s not unexpected) there is some overhead. On a busier server with 1400 connections SHOW PROCESSLIST still takes 0.00 seconds, and the select on sys.session took 1 minute 38 seconds.

    There has been some push to move us away from SHOW PROCESSLIST in favour of SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST I think as the latter triggers less locks and is thus better.

    I like the output of sys.processes but given the current runtime on a busy server performance is going to be an issue. I think similar performance issues have been resolved with other P_S queries which were slow so maybe it is possible to do the same on this query.

    Also looking at the EXPLAIN of the query gives some indication of why this takes time. See: http://merlin.wl0.org/201601/27/explain.txt

    This seems to indicate that some sort of indexing in P_S might be appropriate on tables with a large number of rows? I have discussed this with a couple of people and understand it complicates things so agree that avoiding if possible is better, but people will expect to get good performance out of P_S/sys and this shows that currently that is not always possible.

    1. Hi Simon,

      Indexes on p_s tables is something we are currently investigating. It would speed up the query on a busy system.

      To maybe clarify in more detail:
      – SHOW PROCESSLIST if fast, but it can cause stalls on new connection creation as it does require more locks to execute.
      – If a very slow ‘select * from sys.session’ translates to contention with other queries is something we would be interested in analyzing. Certainly if it saturates a shared resource (disk, CPU) it would, but my naive comment is that it should be reading from memory and consuming the resources of a single CPU core.

      1. Hi Morgan,

        If SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST _is_ faster than SHOW PROCESSLIST I’d be tempted to suggest you change the underlying SHOW PROCESSLIST implementation to use the I_S query instead. I note, and never understood why this was, that column names in the I_S query use upper case vs SHOW PROCESSLIST using CamelCase. Consistency would be good there as you could fall back to a single implementation:

        root@localhost [(none)]> select * from information_schema.processlist;
        +—-+——+———–+——+———+——+———–+———————————————-+
        | ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO |
        +—-+——+———–+——+———+——+———–+———————————————-+
        | 2 | root | localhost | NULL | Query | 0 | executing | select * from information_schema.processlist |
        +—-+——+———–+——+———+——+———–+———————————————-+
        1 row in set (0.00 sec)

        root@localhost [(none)]> show processlist;
        +—-+——+———–+——+———+——+———-+——————+
        | Id | User | Host | db | Command | Time | State | Info |
        +—-+——+———–+——+———+——+———-+——————+
        | 2 | root | localhost | NULL | Query | 0 | starting | show processlist |
        +—-+——+———–+——+———+——+———-+——————+
        1 row in set (0.00 sec)

        As for sys.processes I understand you are collecting and showing more data than before, but DBAs don’t expect to wait this amount of time for this type of information. In fact with ps-top I chose the approach of SELECTing from the various tables and doing the join in the app to avoid doing any joins on the DB where resources are relatively scarce. The other thing I’ve done for similar types of work joining tables with no indexes is to create temporary tables, load data, add indexes and do the final join that way. I don’t think that makes sense for sys as it’s likely to be too heavy, so look forward to seeing how a query such as this can be made to be more efficient in the future.

        Note: The information is good, and we want that, but indeed figuring out how to make the performance fast will be important if people want to substitute I_S or SHOW PROCESSLIST queries with sys.processes.

  2. Hi Tsubasa,

    Thank you for the note about progress πŸ™‚

    The change was made in 5.7.7. It is this note in the release notes:

    “Performance Schema stage event instruments that provide statement progress information now are enabled and timed by default. The affected instruments are those displayed by this statement [..]”

  3. Hi

    Thank you for your article.
    I translated this article into Japanese for users in Japan
    Translated one is as follows.
    https://yakst.com/ja/posts/3749

    If there is any problem, please get in touch with me.
    I also added footnotes for Tsubasa and your comment about progress.

    In short, instruments is enabled by default from 5.7.7 but consumer is not enabled by default.

    Thank you.

Leave a Reply

Your email address will not be published. Required fields are marked *

Please enter * Time limit is exhausted. Please reload CAPTCHA.