使用索引的误区之五:空值的妙用
2024-07-21 02:06:29
供稿:网友
使用索引的误区之五:空值的妙用
并不是因为完全为空的条目不被记录到索引中,就坚决不能使用空值,相反,有时候合理使用oracle的空值会为我们的查询带来几倍甚至几十倍的效率提高。
举个例子,加入有一个表,里面有个字段是“处理时间”,如果没有处理的事务,该列就为空,并且在大部分情况下,处理的事务总是在总记录数的10%或者更少,而等待处理的记录(“处理时间”这列为空)总是绝大多数的记录,那么在“等待时间”这列上建立索引,索引中就总是会保存很少的记录,我们希望的访问方式是,当访问表中所有代处理的记录(即10%或者更多的记录数目)时,我们希望通过全表扫描的方式来检索;然而,当我们希望访问已经处理的事务(即5%或者更少的记录数目)时,我们希望通过索引来访问,因为索引中的记录数目很少,请看下面的例子:
sql> create table tt as select * from sys.dba_objects;
table created
executed in 0.601 seconds
sql> alter table tt add (t int);
table altered
executed in 0.061 seconds
sql> select count(*) from tt;
count(*)
----------
6131c
executed in 0.01 seconds
sql> update tt set t=1 where owner='demo';
10 rows updated
executed in 0.03 seconds
sql> commit;
commit complete
executed in 0 seconds
sql> select count(*) from tt where owner='demo';
count(*)
----------
10 ――――――――――――――已经处理的数目
executed in 0.08 seconds
s
sql> select count(*) from tt;
count(*)
----------
6131 ――――――――――――――总记录数目
executed in 0.01 seconds
下面的查询因为访问表中的大多数记录(代处理的记录,即10%以上的记录数目),可以看见,它如我们所希望的那样使用了全表扫描:
select object_name from tt where t is null;
plan_table_output
--------------------------------------------------------------------------------
--------------------------------------------------------------------
| id | operation | name | rows | bytes | cost |
--------------------------------------------------------------------
| 0 | select statement | | | | |
|* 1 | table access full | tt | | | |
--------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("tt"."t" is null)
note: rule based optimization
14 rows selected
executed in 0.05 seconds
下面的查询因为要访问表中的少数记录,我们希望通过索引来访问:
select object_name from tt where t=1;
plan_table_output
--------------------------------------------------------------------------------
--------------------------------------------------------------------
| id | operation | name | rows | bytes | cost |
--------------------------------------------------------------------
| 0 | select statement | | | | |
|* 1 | table access full | tt | | | |
--------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("tt"."t"=1)
note: rule based optimization
14 rows selected
executed in 0.06 seconds
请注意,这里并没有如我们所希望的那样使用索引,而是使用了全表扫描,这里有一个结论:
建立了索引后,要想在cbo下合理的使用索引,一定要定期的更新统计信息
下面我们分析一下索引,看看有什么效果:
sql> analyze index tt_idx validate structure;
index analyzed
executed in 0 seconds
sql> select lf_rows from index_stats;
lf_rows
----------
10 ――――――――――索引中总共有10行
executed in 0.05 seconds
sql> exec dbms_stats.gather_index_stats('demo','tt_idx');
pl/sql procedure successfully completed
executed in 0.03 seconds
sql> select distinct_keys from user_indexes;
distinct_keys
-------------
1 ――――――――――只有一个键值
executed in 0.05 seconds
sql> select * from tt where t is null;
已选择6121行。
execution plan
----------------------------------------------------------
0 select statement optimizer=choose
1 0 table access (full) of 'tt'
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
485 consistent gets
0 physical reads
0 redo size
355012 bytes sent via sql*net to client
4991 bytes received via sql*net from client
410 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6121 rows processed
sql> select * from tt where t=5;
未选定行
execution plan
----------------------------------------------------------
0 select statement optimizer=choose
1 0 table access (by index rowid) of 'tt'
2 1 index (range scan) of 'tt_idx' (non-unique)
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1 consistent gets
0 physical reads
0 redo size
964 bytes sent via sql*net to client
372 bytes received via sql*net from client
1 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
sql> select * from tt where t=1;
已选择10行。
execution plan
----------------------------------------------------------
0 select statement optimizer=choose
1 0 table access (by index rowid) of 'tt'
2 1 index (range scan) of 'tt_idx' (non-unique)
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1639 bytes sent via sql*net to client
503 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed
sql> update tt set t=2 where t=1;
已更新10行。
execution plan
----------------------------------------------------------
0 update statement optimizer=choose
1 0 update of 'tt'
2 1 index (range scan) of 'tt_idx' (non-unique)
statistics
----------------------------------------------------------
0 recursive calls
14 db block gets
1 consistent gets
0 physical reads
3216 redo size
616 bytes sent via sql*net to client
527 bytes received via sql*net from client
3 sql*net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
10 rows processed
sql> set autotrace traceonly
sql> update tt set t=3 where t is null;
6121 rows updated.
execution plan
----------------------------------------------------------
0 update statement optimizer=choose
1 0 update of 'tt'
2 1 table access (full) of 'tt'
statistics
----------------------------------------------------------
0 recursive calls
18683 db block gets
80 consistent gets
0 physical reads
2583556 redo size
618 bytes sent via sql*net to client
533 bytes received via sql*net from client
3 sql*net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6121 rows processed
sql>