Thursday, October 10, 2013

12c match_recognize feature for pattern matching

I though i will add a  practical example to help DBAs understand the power of 12c pattern matching.

In the OEM express manager view below, you will see a spike in load between 7.46 PM to 7.51PM.














The challenge is to find this starting point from v$active_session_history when the load spiked and had a sustained load for more than 5 minutes before it dropped down.
This pattern will help us capture start and end time of some sustained load on DB perhaps caused by some batch jobs.  The same query using 11g analytical window functions will not be easy to create and will have a lot more lines of code.




To capture this using 12c new feature match_recognize, i defined 2 points.
 1)  starting point. defined as UP in the example. This is when the service_wait_time values are continuously more than the starting point,  It need to be at least 200% more than the starting point (to be a reasonable spike) . Also  200% spike needs to be sustained for at least 5 minutes. 
2) end point : define as END_PT here. This is when the load tapers down and eventually less than the starting value or zero.

Below is the 12c Pattern Matching query that can capture this.

SELECT TO_CHAR(first_x,'MM/DD HH24:MI') first_time,
  TO_CHAR( last_z,'MM/DD HH24:MI') end_time,
  strt_value,
  end_value,
  ROUND(avg_t)
FROM
  (SELECT TRUNC(sample_time,'MI') time_t,
    SUM(time_waited) sum_time_waited
  FROM v$active_session_history
  WHERE wait_class != 'Idle'
  GROUP BY TRUNC(sample_time,'MI')
  ) aa MATCH_RECOGNIZE (
ORDER BY time_t MEASURES
                 FIRST(strt.time_t) AS first_x,
                 FIRST(strt.sum_time_waited) strt_value,
                 LAST(END_PT.sum_time_waited) end_value,
                 LAST(END_PT.time_t) AS last_z,
                 AVG(up.sum_time_waited) avg_t
                 ONE ROW PER MATCH
                 AFTER MATCH skip TO END_PT
                 PATTERN (STRT UP{5,} END_PT)   -- {5,} means 5 or more times.
                 DEFINE
                  UP  AS
                      up.sum_time_waited          > strt.sum_time_waited
                      AND strt.sum_time_waited   > 0
                      AND ( up.sum_time_waited -strt.sum_time_waited) *100/ strt.sum_time_waited >200,
                  END_PT AS END_PT.sum_time_waited < strt.sum_time_waited ) ; 

To see the raw data passed to this pattern matching query, you can query the following.

SELECT TO_CHAR(TRUNC(sample_time,'MI'),' MM/DD HH24:MI:') time_t,
  SUM(time_waited)
FROM v$active_session_history
WHERE wait_class != 'Idle'
GROUP BY TRUNC(sample_time,'MI')
ORDER BY TRUNC(sample_time,'MI');

In my test system, to generate some load, i ran the following load generator.

create or replace  procedure load_test (times in number) as
ct number;
begin
for i in 1..times loop
select count(*) into ct from abc;
end loop;
end load_test;
/
exec load_test(100000);

-----------------------------------------
Update: i added another blog about fraud detection using pattern matching. 

No comments:

Post a Comment

Feedback welcome