Database Query Performance Monitoring

pt-query-digest is a great tool from the team at Percona.

You can use it to sniff your MySQL traffic and report on lengthy queries. It's brill!

I use CentOS for most of my deployments, so the install guide is for that, but their repo can be found @ https://www.percona.com/doc/percona-repo-config/index.html

First of all, install them yum repository:

yum install http://www.percona.com/downloads/percona-release/redhat/0.1-6/percona-release-0.1-6.noarch.rpm

Then install the percona toolkit:

yum install percona-toolkit

Finally you'll also need tcpdump to sniff network traffic:

yum install tcpdump

To run pt-query-digest, you need to send it some data. The easiest way I've found is to dump all the traffic on port 3306 (which is what MySQL uses by default) and pipe it over to pt-query-digest using the query below.
Obviously you can change the 60s part to make it run for as long as you wish.

tcpdump -s 65535 -x -nn -q -tttt -i any port 3306 | pt-query-digest --type tcpdump --run-time 60s > results.txt

The results are really useful.

First of all, you get a list of the most important queries to look at (based upon time of execution, and the amount of times they're executed). Something like this:

# Profile
# Rank Query ID           Response time Calls  R/Call V/M   Item
# ==== ================== ============= ====== ====== ===== ==============
#    1 0xFF59A07A4F0E387A 31.6792 12.6% 123137 0.0003  0.00 ADMIN INIT_DB
#    2 0x40CDC4563A589BDA 27.0624 10.8%    601 0.0450  0.02 INSERT log
#    3 0xFBA67B02412E0D47 25.2034 10.1%   1905 0.0132  0.00 SELECT music
#    4 0x49E7A665CC89FAFE 21.1498  8.4%    459 0.0461  0.02 INSERT UPDATE wp_wfLeechers
#    5 0xEF4011276CCD96AA 19.1319  7.6%  13388 0.0014  0.00 SELECT Events
#    6 0x91CB4F415D9891E7 18.9118  7.6%      6 3.1520  0.00 SELECT UNION districts_area burglary districts districts_area burglarynodata districts
#    7 0x148D5BE310E1D013 17.8678  7.1%    864 0.0207  0.00 SELECT log
#    8 0x3647AFDFAED9F143  9.1985  3.7%  13270 0.0007  0.00 SELECT Newsletter
#    9 0xA8D7E337251A403A  8.5424  3.4%  12384 0.0007  0.00 SELECT Publicity
#   10 0xEE3EB8EF0A7AF985  7.2433  2.9%    527 0.0137  0.02 INSERT wp_wfHits
#   11 0x94350EA2AB8AAC34  7.2026  2.9%    540 0.0133  0.01 UPDATE wp_options
#   12 0x2323126481D61563  4.0180  1.6%      9 0.4464  0.00 SELECT kindle?
#   13 0x9B3A0257175AA174  3.8914  1.6%    870 0.0045  0.01 SELECT retailers music catalogue
#   14 0x761011A9807BC98F  3.8603  1.5%  13485 0.0003  0.00 SELECT Pages
#   15 0x3D386D065C90C1FC  2.0605  0.8%     50 0.0412  0.02 INSERT amazon
#   16 0xAAF91558B7627ADC  1.8858  0.8%    671 0.0028  0.00 INSERT offers
#   17 0xAD1969307AF6A50C  1.8252  0.7%   6988 0.0003  0.00 SELECT wp_wfConfig
#   18 0x7AEDF19FDD3A33F1  1.7202  0.7%   5835 0.0003  0.00 SELECT wp_options
#   19 0x5B8CFD81A11B28FE  1.5851  0.6%     40 0.0396  0.02 UPDATE wp_wfHits
#   20 0x7D18C5C503757896  1.5124  0.6%   3511 0.0004  0.00 SELECT wp_postmeta
# MISC 0xMISC             34.9239 13.9%  61512 0.0006   0.0 <213 ITEMS>

You then see an example of each of the queries so you can see their typical execution times (it gives a spread):

# Query 6: 0.02 QPS, 0.07x concurrency, ID 0x91CB4F415D9891E7 at byte 88421643
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2018-06-11 15:44:55.448756 to 15:49:28.681546
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          0       6
# Exec time      7     19s      3s      3s      3s      3s    86ms      3s
# Rows affecte   0       0       0       0       0       0       0       0
# Query size     0   2.60k     442     445  444.50  441.81       0  441.81
# Warning coun   0       0       0       0       0       0       0       0
# String:
# Databases
# Hosts        10.131.65.246 (3/50%), 10.131.66.80 (3/50%)
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s  ################################################################
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `1` LIKE 'districts_area'\G
#    SHOW CREATE TABLE `1`.`districts_area`\G
#    SHOW TABLE STATUS FROM `1` LIKE 'burglary'\G
#    SHOW CREATE TABLE `1`.`burglary`\G
#    SHOW TABLE STATUS FROM `1` LIKE 'districts'\G
#    SHOW CREATE TABLE `1`.`districts`\G
#    SHOW TABLE STATUS FROM `1` LIKE 'burglarynodata'\G
#    SHOW CREATE TABLE `1`.`burglarynodata`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT a.postcode, a.`centroid`, a.area, a.lat, a.long, d.figure, di.towns as location, di.region FROM districts_area a
                JOIN burglary d ON a.postcode=d.postcode
                JOIN districts di ON a.postcode=di.postcode
                UNION
                SELECT a.postcode, a.`centroid`, a.area, a.lat, a.long, '', di.towns as location, di.region FROM districts_area a
                JOIN burglarynodata d ON a.postcode=d.postcode
                JOIN districts di ON a.postcode=di.postcode
                LIMIT 2000,500\G

From the results you can see that this query was run 7 seconds, taking 19 seconds in total, and all the queries taking at least 1 second to execute.

You can then go into your MySQL client and run an explain on the query to see (a) whether it needs to be optimised and (b) how you can improve the query time.

If you find that pt-query-digest errors with "Terminating pipeline because process 2 (TcpdumpParser) caused too many errors." you can change the threshold in the pt-query-digest file (typically /usr/bin/pt-query-digest):

$self->{retries}->{$name} = $args{retry_on_error} || 100;
to something like:
$self->{retries}->{$name} = $args{retry_on_error} || 123456789;



Want to get in touch?