AUTO_SPACE_ADVISOR_JOB系统作业运行故障导致插入表被阻塞
(1)问题开始时间是20130918 22:40 从20次每秒,增加加0919号的00:20分的2500次每秒。
(2)应用程序插入SQL执行失败,程序缓冲池满了,由于插入失败,日志表中没有数据
故障分析:
(1)查看等待事件,看到SID 980和SID 1063 为阻塞源,阻塞了大量session,而1063的阻塞源又是980,所980是根源。
18-9月 -13 10.05.09.517 下午 980 p49fnjdb fcfzp8zgxx4gx Wait for shrink lock oracle@p49fnjdb (J002) UNKNOWN
18-9月 -13 10.11.20.224 下午 1063 njdmweb1 gs9qxx1pbvuw2 row cache lock JDBC Thin Client VALID 980
(2)对应SQL_ID为如下语句
alter index "CPE"."CPE_ACS_LOCALE" modify partition "P_CPE_ACTION_STATUS_201309" shrink space CHECK
INSERT /*+ append */ INTO CPE_ACTION_STATUS SELECT * FROM CPE_ACTION_STATUS_T
fcfzp8zgxx4gx
是数据库系统自动在每天10点(周六日为早上6点)运行的一个JOB,名为AUTO_SPACE_ADVISOR_JOB
该job的主要作用是是用于segment advisor,即收集表和索引的信息,并且提出优化建议。
*(3)查看JOB的情况可见平时运行 17分钟的JOB在18日22点时异常运行了8个小时并且运行失败了。
status start_date log_date run_duration
1 SUCCEEDED 2013-08-22 22:00:01 2013-08-22 22:16:51 +000 00:16:50
2 SUCCEEDED 2013-08-23 22:00:02 2013-08-23 22:16:35 +000 00:16:33
3 SUCCEEDED 2013-08-24 06:00:02 2013-08-24 06:05:11 +000 00:05:09
4 SUCCEEDED 2013-08-26 22:00:02 2013-08-26 22:13:45 +000 00:13:43
5 SUCCEEDED 2013-08-27 22:00:02 2013-08-27 22:18:16 +000 00:18:14
6 SUCCEEDED 2013-08-28 22:00:02 2013-08-28 22:17:59 +000 00:17:57
7 SUCCEEDED 2013-08-29 22:00:03 2013-08-29 22:18:55 +000 00:18:52
8 SUCCEEDED 2013-08-30 22:00:02 2013-08-30 22:19:39 +000 00:19:37
9 SUCCEEDED 2013-08-31 06:00:04 2013-08-31 06:08:42 +000 00:08:39
10 SUCCEEDED 2013-09-02 22:00:01 2013-09-02 22:17:14 +000 00:17:12
11 SUCCEEDED 2013-09-03 22:00:01 2013-09-03 22:18:14 +000 00:18:13
12 SUCCEEDED 2013-09-04 22:00:03 2013-09-04 22:16:18 +000 00:16:14
13 SUCCEEDED 2013-09-05 22:00:02 2013-09-05 22:15:35 +000 00:15:34
14 SUCCEEDED 2013-09-06 22:00:02 2013-09-06 22:14:55 +000 00:14:53
15 SUCCEEDED 2013-09-07 06:00:02 2013-09-07 06:05:39 +000 00:05:37
16 SUCCEEDED 2013-09-09 22:00:01 2013-09-09 22:17:58 +000 00:17:56
17 SUCCEEDED 2013-09-10 22:00:01 2013-09-10 22:18:45 +000 00:18:45
18 SUCCEEDED 2013-09-11 22:00:01 2013-09-11 22:16:20 +000 00:16:19
19 SUCCEEDED 2013-09-12 22:00:02 2013-09-12 22:18:17 +000 00:18:16
20 SUCCEEDED 2013-09-13 22:00:02 2013-09-13 22:18:07 +000 00:18:04
21 SUCCEEDED 2013-09-14 06:00:04 2013-09-14 06:08:16 +000 00:08:12
22 SUCCEEDED 2013-09-16 22:00:02 2013-09-16 22:22:21 +000 00:22:19
23 SUCCEEDED 2013-09-17 22:00:03 2013-09-17 22:17:25 +000 00:17:22
24 STOPPED 2013-09-18 22:00:03 2013-09-19 06:00:01 +000 07:59:58
25 SUCCEEDED 2013-09-19 22:00:02 2013-09-19 22:13:05 +000 00:13:03
26 SUCCEEDED 2013-09-20 22:00:01 2013-09-20 22:16:01 +000 00:16:00
27 SUCCEEDED 2013-09-21 06:00:04 2013-09-21 06:14:16 +000 00:14:12
(4)AUTO_SPACE_ADVISOR_JOB 执行的shrink space CHECK 与实际执行shrink space道理相同,都会给表或索引加上排他锁
,当执行完后释放锁,其他session会开始正常插入,那么为什么18日当天此JOB运行异常呢?我们分析可知有两种可能:
---If the default Oracle tasks AUTO_SPACE_ADVISOR_JOB & GATHER_STATS_JOB is started when the Activities server is heavily loaded, disk I/O and CPU utilization on the database server will become high resulting in performance degradation.
由此可知,P49F系统高峰和非高峰期,很难准确定位,18日系统JOB刚好遇到系统高峰,不想平时顺利执行成功,而是执行异常,产生了非常久时间的排它锁,导致对表CPE_ACTION_STATUS表的插入等待锁无法完成,从而导致应用缓冲堆满。应用故障。
---数据库bug,查看mos发现在10.2.0.4中已经修复,我们数据库版本是10.2.0.5,应该以解决此bug,不过也不排除bug的
(4)AUTO_SPACE_ADVISOR_JOB 执行的shrink space CHECK 与实际执行shrink space道理相同,都会给表或索引加上排他锁
,当执行完后释放锁,其他session会开始正常插入,那么为什么18日当天此JOB运行异常呢?我们分析可知有两种可能:
---If the default Oracle tasks AUTO_SPACE_ADVISOR_JOB & GATHER_STATS_JOB is started when the Activities server is heavily loaded, disk I/O and CPU utilization on the database server will become high resulting in performance degradation.
由此可知,P49F系统高峰和非高峰期,很难准确定位,18日系统JOB刚好遇到系统高峰,不想平时顺利执行成功,而是执行异常,产生了非常久时间的排它锁,导致对表CPE_ACTION_STATUS表的插入等待锁无法完成,从而导致应用缓冲堆满。应用故障。
---数据库bug,查看mos发现在10.2.0.4中已经修复,我们数据库版本是10.2.0.5,应该以解决此bug,不过也不排除bug的可能性。
解决方法:
(1)如果这个Shrink Space的检查对于环境显得多余,可以直接关闭:
execute dbms_scheduler.disable('AUTO_SPACE_ADVISOR_JOB');
(2)修改每天的运行时间,避开系统的各个高峰期即:
Schedule the Oracle tasks AUTO_SPACE_ADVISOR_JOB & GATHER_STATS_JOB to run during non-peak load times.
修改方法:
exec dbms_scheduler.disable('AUTO_SPACE_ADVISOR_JOB');
exec dbms_scheduler.set_attribute_null('AUTO_SPACE_ADVISOR_JOB','schedule_name');
exec dbms_scheduler.set_attribute('AUTO_SPACE_ADVISOR_JOB','start_date',to_date('2013/09/XX 02:00:00','yyyy-mm-dd hh24:mi:ss'));
exec dbms_scheduler.enable('AUTO_SPACE_ADVISOR_JOB');
----由于此JOB对我系统没有太大价值,所以我已经关闭该JOB可以继续观察
目录 返回
首页