首页 > 数据库 > 文库 > 正文

一个left join SQL 简单优化分析

2024-09-07 22:12:42
字体:
来源:转载
供稿:网友
        有个关联查询的sql,需要2秒多,于是进行查看一番:
 
       SELECT
a.id,
a.brand_id,
a.series_id,
a.product_id,
a.material_id,
a.custom_category_id,
a.price,
a.product_url,
a.organ_id,
.....
FROM
pm_brand_xxxx a
LEFT JOIN pm_brand_yyyyy d ON a.series_id = d.id
WHERE
a.is_delete = 0
AND d.is_delete = 0
AND a.organ_id = 'Cxxx'
AND a.brand_id = 6491603
AND d.brand_id = 6491603
AND a.model_flag = 14;
mysql> show profile for query 4;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000072 |
| checking permissions | 0.000002 |
| checking permissions | 0.000002 |
| Opening tables       | 0.000011 |
| init                 | 0.000026 |
| System lock          | 0.000007 |
| optimizing           | 0.000016 |
| statistics           | 0.000142 |
| preparing            | 0.000018 |
| executing            | 0.000002 |
| Sending data         | 2.281192 |<<<<<<<执行的主要时间消耗
| end                  | 0.000007 |
| query end            | 0.000011 |
| closing tables       | 0.000011 |
| freeing items        | 0.000030 |
| logging slow query   | 0.000003 |
| logging slow query   | 0.000102 |
| cleaning up          | 0.000022 |
+----------------------+----------+
+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table | partitions | type        | possible_keys                                                                                                 | key                                                                       | key_len | ref   | rows  | filtered | Extra                                                                                                                                          |
+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | SIMPLE      | d     | NULL       | ref         | PRIMARY,idx_pm_yyyy_bid                                                                                        | idx_pm_yyyyy_bid                                                        | 9       | const |     1 |    10.00 | Using where                                                                                                                                    |
|  1 | SIMPLE      | a     | NULL       | index_merge | idx_pm_xxxx_sid,idx_pm_xxx_bid,idx_pm_brand_xxxx_organ                                                         | idx_pm_xxx_organ,idx_pm_brand_xxxx_bid                                   | 99,9    | NULL  | 11314 |     0.04 | Using intersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid); Using where; Using join buffer (Block Nested Loop)                                        |
+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
      从执行计划来看,d表是做了驱动表,a做了被驱动表
 
      d表 type = ref ,使用非唯一性索引或者唯一索引的前缀扫描,返回匹配某个单独值的记录行,这里使用了索引idx_pm_yyyyy_bid,该索引正是brand_id上的索引,
 
      即是说,在和a表的关联中d先通过brand_id来查找记录行,再通过相应记录的id去和a表的series_id做匹配。
 
      我查看相应的记录数,发现a表145万的大表,d表是4075的小表。
 
a表
 
mysql> select count(*) from pm_xxxxxx;
 
+----------+
 
| count(*) |
 
+----------+
 
|  1459777 |
 
+----------+
 
1 row in set (0.27 sec)
 
d表:
 
mysql> select count(*) from pm_yyyyyy;
 
+----------+
 
| count(*) |
 
+----------+
 
|     4075 |
 
+----------+
 
1 row in set (0.00 sec)
 
而 a表是type=index_merge 索引合并,这里走了idx_pm_xxx_organ(organ_id),idx_pm_brand_xxxx_bid(brand_id) ,extra 是
 
Using intersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid); Using where; Using join buffer (Block Nested Loop)
 
Using intersect正说明了这里使用了(idx_pm_xxxxx_organ,idx_pm_xxxx_bid)的交集
 
Using where 是用model_flag等这些其他条件的过滤
 
Using join buffer (Block Nested Loop) 说明使用BNL的算法进行匹配
 
 BNL 算法是将外层循环的行/结果集(驱动表)存入join buffer, 内层循环的每一行与整个buffer中的记录做比较,从而减少内层循环的次数.
 
举例来说,外层循环的结果集是100行,使用NLJ 算法需要扫描内部表100次,如果使用BNL算法,先把对Outer Loop表(外部表)每次读取的10行记录放到join buffer,然后在InnerLoop表(内部表)中直接匹配这10行数据,内存循环就可以一次与这10行进行比较, 这样只需要比较10次,对内部表的扫描减少了9/10。所以BNL算法就能够显著减少内层循环表扫描的次数.
 
在这里就是d表中取得结果集分批放入buffer中与a表进行匹配。
 
而这个语句无论如何都要2秒中,也在我们的认识中小表驱动大表并没错,我的猜想应该就是在进行BNL时消耗了时间,表现到过程中就是 Sending data 的时间消耗增多。
 
吐槽的是mysql中貌似没有什么办法来多方面看查询消耗了。

(编辑:武林网)

发表评论 共有条评论
用户名: 密码:
验证码: 匿名发表