使用索引的误区之五：空值的妙用

2024-07-21 02:06:29

字体：大中小

来源：转载

供稿：网友

使用索引的误区之五：空值的妙用
并不是因为完全为空的条目不被记录到索引中，就坚决不能使用空值，相反，有时候合理使用oracle的空值会为我们的查询带来几倍甚至几十倍的效率提高。

举个例子，加入有一个表，里面有个字段是“处理时间”，如果没有处理的事务，该列就为空，并且在大部分情况下，处理的事务总是在总记录数的10％或者更少，而等待处理的记录（“处理时间”这列为空）总是绝大多数的记录，那么在“等待时间”这列上建立索引，索引中就总是会保存很少的记录，我们希望的访问方式是，当访问表中所有代处理的记录（即10％或者更多的记录数目）时，我们希望通过全表扫描的方式来检索；然而，当我们希望访问已经处理的事务（即5％或者更少的记录数目）时，我们希望通过索引来访问，因为索引中的记录数目很少，请看下面的例子：

sql> create table tt as select * from sys.dba_objects;

table created

executed in 0.601 seconds

sql> alter table tt add (t int);

table altered

executed in 0.061 seconds

sql> select count(*) from tt;

count(*)

----------

      6131c

executed in 0.01 seconds

sql> update tt set t=1 where owner='demo';

10 rows updated

executed in 0.03 seconds

sql> commit;

commit complete

executed in 0 seconds

sql> select count(*) from tt where owner='demo';

count(*)

----------

        10 ――――――――――――――已经处理的数目

executed in 0.08 seconds

s

sql> select count(*) from tt;

count(*)

----------

      6131      ――――――――――――――总记录数目

executed in 0.01 seconds

下面的查询因为访问表中的大多数记录（代处理的记录，即10％以上的记录数目），可以看见，它如我们所希望的那样使用了全表扫描：

select object_name from tt where t is null;

plan_table_output

--------------------------------------------------------------------------------

--------------------------------------------------------------------

| id | operation            | name       | rows | bytes | cost |

--------------------------------------------------------------------

|   0 | select statement     |             |       |       |       |

|* 1 | table access full   | tt          |       |       |       |

--------------------------------------------------------------------

predicate information (identified by operation id):

---------------------------------------------------

   1 - filter("tt"."t" is null)

note: rule based optimization

14 rows selected

executed in 0.05 seconds

下面的查询因为要访问表中的少数记录，我们希望通过索引来访问：

select object_name from tt where t=1;

plan_table_output

--------------------------------------------------------------------------------

--------------------------------------------------------------------

| id | operation            | name       | rows | bytes | cost |

--------------------------------------------------------------------

|   0 | select statement     |             |       |       |       |

|* 1 | table access full   | tt          |       |       |       |

--------------------------------------------------------------------

predicate information (identified by operation id):

---------------------------------------------------

   1 - filter("tt"."t"=1)

note: rule based optimization

14 rows selected

executed in 0.06 seconds

请注意，这里并没有如我们所希望的那样使用索引，而是使用了全表扫描，这里有一个结论：

建立了索引后，要想在cbo下合理的使用索引，一定要定期的更新统计信息

下面我们分析一下索引，看看有什么效果：

sql> analyze index tt_idx validate structure;

index analyzed

executed in 0 seconds

sql> select lf_rows from index_stats;

   lf_rows

----------

        10 ――――――――――索引中总共有10行

executed in 0.05 seconds

sql> exec dbms_stats.gather_index_stats('demo','tt_idx');

pl/sql procedure successfully completed

executed in 0.03 seconds

sql> select distinct_keys from user_indexes;

distinct_keys

-------------

            1   ――――――――――只有一个键值

executed in 0.05 seconds

sql> select * from tt where t is null;

已选择6121行。

execution plan

----------------------------------------------------------

   0      select statement optimizer=choose

   1    0   table access (full) of 'tt'

statistics

----------------------------------------------------------

          0 recursive calls

          0 db block gets

        485 consistent gets

          0 physical reads

          0 redo size

     355012 bytes sent via sql*net to client

       4991 bytes received via sql*net from client

        410 sql*net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

       6121 rows processed

sql> select * from tt where t=5;

未选定行

execution plan

----------------------------------------------------------

   0      select statement optimizer=choose

   1    0   table access (by index rowid) of 'tt'

   2    1     index (range scan) of 'tt_idx' (non-unique)

statistics

----------------------------------------------------------

          0 recursive calls

          0 db block gets

          1 consistent gets

          0 physical reads

          0 redo size

        964 bytes sent via sql*net to client

        372 bytes received via sql*net from client

          1 sql*net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

          0 rows processed

sql> select * from tt where t=1;

已选择10行。

execution plan

----------------------------------------------------------

   0      select statement optimizer=choose

   1    0   table access (by index rowid) of 'tt'

   2    1     index (range scan) of 'tt_idx' (non-unique)

statistics

----------------------------------------------------------

          0 recursive calls

          0 db block gets

          4 consistent gets

          0 physical reads

          0 redo size

       1639 bytes sent via sql*net to client

        503 bytes received via sql*net from client

          2 sql*net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

         10 rows processed

sql> update tt set t=2 where t=1;

已更新10行。

execution plan

----------------------------------------------------------

   0      update statement optimizer=choose

   1    0   update of 'tt'

   2    1     index (range scan) of 'tt_idx' (non-unique)

statistics

----------------------------------------------------------

          0 recursive calls

         14 db block gets

          1 consistent gets

          0 physical reads

       3216 redo size

        616 bytes sent via sql*net to client

        527 bytes received via sql*net from client

          3 sql*net roundtrips to/from client

          2 sorts (memory)

          0 sorts (disk)

         10 rows processed

sql> set autotrace traceonly

sql> update tt set t=3 where t is null;

6121 rows updated.

execution plan

----------------------------------------------------------

   0      update statement optimizer=choose

   1    0   update of 'tt'

   2    1     table access (full) of 'tt'

statistics

----------------------------------------------------------

          0 recursive calls

      18683 db block gets

         80 consistent gets

          0 physical reads

    2583556 redo size

        618 bytes sent via sql*net to client

        533 bytes received via sql*net from client

          3 sql*net roundtrips to/from client

          1 sorts (memory)

          0 sorts (disk)

       6121 rows processed

sql>

上一篇：使用索引的误区之四：空值对索引的影响

下一篇：使用索引的误区之六：为索引列都建立索引