Thursday, October 31, 2013

SAS Solutions OnDemand on Oracle Exadata

If your company runs SAS products and is choosing a database, or wondering whether Oracle Exadata is a good fit, consider this: SAS runs its cloud solutions on Oracle Exadata. SAS Solutions OnDemand presented at Oracle OpenWorld 2013 and reported being happy with the ROI, performance, scalability, and simplicity of provisioning.
An interesting point from the presentation slides is that SAS-specific SQL is able to make use of Exadata features to improve performance. Some of the examples listed were:
  • Up to 45x performance increase with Exadata storage indexes.
  • Dropped some of the indexes to save space and make use of smart scans and storage indexes.
  • Daily ETL runs < 10 hours vs. > 20 hours
  • Interface in use by 33,000 users now returns all queries in less than 30 seconds vs. many selections timing out at 3 minutes.
For more info, check out http://www.oracle.com/sas
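
As an aside, if you run on Exadata yourself, you can get a rough idea of how much I/O the storage indexes and smart scans are saving you. A minimal check (these are standard Exadata statistics in v$sysstat; they stay at zero on non-Exadata storage):

select name, round(value/1024/1024/1024,2) gb
from   v$sysstat
where  name in ('cell physical IO bytes saved by storage index',
                'cell physical IO interconnect bytes returned by smart scan');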

Wednesday, October 23, 2013

12c Fraud detection using pattern matching. (MATCH_RECOGNIZE)

Just wanted to show a real-world scenario where 12c pattern matching could be used. This feature can be handy in detecting one of several common credit card fraud patterns.

One common case is a stolen credit card: the thief first purchases a small item, such as candy or gas, to make sure the card works, and then purchases a big-ticket item as quickly as possible, all within a 15-minute span, before suspicions arise.

With 12c match_recognize, the query is as shown below.

select *
from   transaction
MATCH_RECOGNIZE
(
  partition by seller_name
  order by tx_time
  MEASURES
    to_char(FIRST(low.tx_time),'DD-MON-YY HH24:MI') AS first_low_time,
    FIRST(low.tx_amount)                            AS first_tx_value,
    -- last(low.tx_amount)                          AS last_low_amount,
    SPIKE.tx_amount                                 AS big_tx,
    to_char(spike.tx_time,'DD-MON-YY HH24:MI')      AS big_tx_time,
    match_number()                                  AS match_num
  ONE ROW PER MATCH
  AFTER MATCH skip to next row
  PATTERN (STRT LOW* SPIKE)
  DEFINE
    LOW   as low.tx_amount < 50,
    SPIKE as spike.tx_amount > 500
          and (spike.tx_time - low.tx_time)*1440 < 15
);

SE FIRST_LOW_TIME    FIRST_TX_VALUE      BIG_TX  BIG_TX_TIME        MATCH_NUM
-- ----------------  --------------  ----------  ----------------   ---------
aa 22-OCT-13 19:25               32         503  22-OCT-13 19:28            1
aa 22-OCT-13 19:26               23         503  22-OCT-13 19:28            2
aa 22-OCT-13 19:27               12         503  22-OCT-13 19:28            3

This would be a challenge to build using 11g-style analytic functions.
Simply by defining patterns and then searching for one or more occurrences of them, we can build a complex query. Note that because of AFTER MATCH SKIP TO NEXT ROW, overlapping matches are reported: each row above starts the same spike match from a different small purchase. If we want to add an additional condition to the above query, we just define the condition and include it in the pattern matching, as in the sketch below.
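
For example, here is a hypothetical extension (my addition, not from the original query) that additionally requires the big purchase to come from a different zip code than the last small "test" purchase:

select *
from   transaction
MATCH_RECOGNIZE
(
  partition by seller_name
  order by tx_time
  MEASURES
    to_char(FIRST(low.tx_time),'DD-MON-YY HH24:MI') AS first_low_time,
    SPIKE.tx_amount                                 AS big_tx,
    match_number()                                  AS match_num
  ONE ROW PER MATCH
  AFTER MATCH skip to next row
  PATTERN (STRT LOW* SPIKE)
  DEFINE
    LOW   as low.tx_amount < 50,
    SPIKE as spike.tx_amount > 500
          and (spike.tx_time - low.tx_time)*1440 < 15
          and spike.zip != low.zip   -- the extra (hypothetical) condition
);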

This kind of pattern matching inside the database is sometimes described as a MapReduce-style technique.

Below is the sample data created to run this query.

-- drop table transaction purge;
create table transaction ( tx_amount number, zip number, tx_time date, seller_name varchar2(32)) ;
insert into transaction values (2, 75028,sysdate +1/1440, 'aa');
insert into transaction values (12, 75028,sysdate +2/1440, 'aa');
insert into transaction values (2, 75028,sysdate +3/1440, 'aa');
insert into transaction values (25, 75028,sysdate +4/1440, 'aa');
insert into transaction values (55, 75028,sysdate +4/1440, 'aa');
insert into transaction values (5, 75028,sysdate +5/1440, 'aa');
insert into transaction values (10, 75028,sysdate +6/1440, 'aa');
insert into transaction values (2, 75028,sysdate +7/1440, 'aa');
insert into transaction values (432, 75028,sysdate +8/1440, 'aa');
insert into transaction values (2, 75028,sysdate +9/1440, 'aa');
insert into transaction values (43, 75028,sysdate +10/1440, 'aa');
insert into transaction values (2, 75028,sysdate +11/1440, 'aa');
insert into transaction values (35, 75028,sysdate +12/1440, 'aa');
insert into transaction values (35, 75028,sysdate +13/1440, 'aa');
insert into transaction values (35, 75028,sysdate +15/1440, 'aa');
insert into transaction values (88, 75028,sysdate +16/1440, 'aa');
insert into transaction values (52, 75028,sysdate +17/1440, 'aa');
insert into transaction values (32, 75028,sysdate +18/1440, 'aa');
insert into transaction values (23, 75028,sysdate +19/1440, 'aa');
insert into transaction values (12, 75028,sysdate +20/1440, 'aa');
insert into transaction values (503, 75028,sysdate +21/1440, 'aa');
insert into transaction values (66, 75028,sysdate +21/1440, 'aa');
insert into transaction values (45, 75028,sysdate +22/1440, 'aa');
insert into transaction values (45, 75028,sysdate +23/1440, 'aa');
insert into transaction values (23, 75028,sysdate +21/1440, 'aa');
insert into transaction values (66, 75028,sysdate +21/1440, 'aa');
insert into transaction values (509, 75028,sysdate +40/1440, 'aa');
insert into transaction values (45, 75028,sysdate +23/1440, 'aa');

commit;

Friday, October 11, 2013

12c CDB PDB views (Metadata links and Object links)


One of the key features of the 12c container architecture is that it separates the data dictionary tables and views specific to the application from the data dictionary tables used to manage the database itself (now called the container).

However, Oracle has internal mechanisms for the containers to talk to each other and populate the views required for management and monitoring. These links work in both directions:
1) From CDB to PDBs: AWR runs at the CDB level, and internal object links populate each PDB's views with information specific only to that PDB.
2) From PDBs to CDB: dictionary views like DBA_USERS, DBA_TABLES and DBA_SOURCE, on the other hand, are local to each PDB; each has a superset view with a CDB_ prefix that aggregates the data from all of the PDBs through a mechanism called a metadata link.
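
For example, from the root container you can see how each open PDB contributes rows to a CDB_ view via its CON_ID column (CON_ID 1 is the root itself):

select con_id, count(*) user_count
from   cdb_users
group  by con_id
order  by con_id;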

The following graphic shows the relationship among the different categories of dictionary views: 

One important point to note is that these CDB_ views are populated ONLY when the PDBs are OPEN.

Below is a test to prove this.
-- from CDB, close all PDBs and select count(*) from CDB_USERS.

SQL> alter pluggable database all close immediate;

Pluggable database altered.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
         4 PDB2                           MOUNTED
         5 PDB_2                          MOUNTED
         6 PDB_1                          MOUNTED
         7 PDB_4                          MOUNTED
         8 PDB_3                          MOUNTED
         9 PDB_5                          MOUNTED
        10 PDB_1_DUP                      MOUNTED
SQL> select count(*) from cdb_users;

  COUNT(*)
----------
        70

-- from CDB, OPEN all PDBs and select count(*) from CDB_USERS.

SQL> alter pluggable database all open;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           READ WRITE NO
         5 PDB_2                          READ WRITE NO
         6 PDB_1                          READ WRITE NO
         7 PDB_4                          READ WRITE NO
         8 PDB_3                          READ WRITE NO
         9 PDB_5                          READ WRITE NO
        10 PDB_1_DUP                      READ WRITE NO
SQL> select count(*) from cdb_users;

  COUNT(*)
----------
       372


SQL> 


The count of users increased when we opened all the PDBs, proving that CDB_ views are populated through the metadata link only when the PDBs are OPEN.

To see tests on object-linked views (links from CDB to PDBs), click here.

Update:
To see a list of all objects that are local to a PDB and have a metadata link to the CDB, you can run the query below.

select distinct object_name from dba_objects where sharing ='METADATA LINK';

In a PDB cloned from the seed, I observed 84,206 objects that have a metadata link. This might change in future versions of 12c.
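
An analogous query (my addition, not from the original post) lists the object-linked objects, the mechanism used in the CDB-to-PDB direction:

select distinct object_name from dba_objects where sharing = 'OBJECT LINK';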

Thursday, October 10, 2013

12c match_recognize feature for pattern matching

I thought I would add a practical example to help DBAs understand the power of 12c pattern matching.

In the OEM Express performance view below, you will see a spike in load between 7:46 PM and 7:51 PM.

The challenge is to find, from v$active_session_history, the starting point when the load spiked and then sustained that level for more than 5 minutes before it dropped back down.
This pattern lets us capture the start and end time of a sustained load on the database, perhaps caused by a batch job. The same query written with 11g analytic window functions would not be easy to create and would take many more lines of code.

To capture this using the 12c match_recognize feature, I defined two points:
1) Starting point, defined as UP in the example. This is where the per-minute summed wait time stays continuously above the starting value; it needs to be at least 200% above the starting value (to be a reasonable spike), and the spike needs to be sustained for at least 5 minutes.
2) End point, defined as END_PT here. This is where the load tapers off and eventually falls below the starting value (or to zero).

Below is the 12c Pattern Matching query that can capture this.

SELECT TO_CHAR(first_x,'MM/DD HH24:MI') first_time,
       TO_CHAR(last_z,'MM/DD HH24:MI')  end_time,
       strt_value,
       end_value,
       ROUND(avg_t)
FROM
  (SELECT TRUNC(sample_time,'MI') time_t,
          SUM(time_waited)        sum_time_waited
   FROM   v$active_session_history
   WHERE  wait_class != 'Idle'
   GROUP BY TRUNC(sample_time,'MI')
  ) aa
MATCH_RECOGNIZE (
  ORDER BY time_t
  MEASURES
    FIRST(strt.time_t)           AS first_x,
    FIRST(strt.sum_time_waited)  AS strt_value,
    LAST(END_PT.sum_time_waited) AS end_value,
    LAST(END_PT.time_t)          AS last_z,
    AVG(up.sum_time_waited)      AS avg_t
  ONE ROW PER MATCH
  AFTER MATCH skip TO END_PT
  PATTERN (STRT UP{5,} END_PT)   -- {5,} means 5 or more times
  DEFINE
    UP AS up.sum_time_waited > strt.sum_time_waited
          AND strt.sum_time_waited > 0
          AND (up.sum_time_waited - strt.sum_time_waited) * 100 / strt.sum_time_waited > 200,
    END_PT AS END_PT.sum_time_waited < strt.sum_time_waited
);

To see the raw data passed to this pattern matching query, you can run the following.

SELECT TO_CHAR(TRUNC(sample_time,'MI'),'MM/DD HH24:MI') time_t,
       SUM(time_waited) sum_time_waited
FROM   v$active_session_history
WHERE  wait_class != 'Idle'
GROUP BY TRUNC(sample_time,'MI')
ORDER BY TRUNC(sample_time,'MI');

In my test system, to generate some load, I ran the following load generator.

create or replace procedure load_test (times in number) as
  ct number;
begin
  -- repeatedly full-scan a table to generate CPU and I/O load
  for i in 1..times loop
    select count(*) into ct from abc;
  end loop;
end load_test;
/
exec load_test(100000);
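
Here abc is just any table large enough to produce measurable wait time. If you need a stand-in (my assumption; the original post does not show its definition), something like this works:

create table abc as select * from all_objects;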

-----------------------------------------
Update: I added another blog about fraud detection using pattern matching.

Friday, October 4, 2013

12c CLONE PDBs online

In Oracle release 12.1.0.1, we can now clone PDBs within the same container. While this is really easy and good enough for most databases, it requires closing the source PDB and re-opening it in READ ONLY mode. So the question is: what do we do if we need to clone a PDB but do not want to take the PDB offline?

The solution is simple and includes the following steps.

1) Duplicate the PDB from its container to an auxiliary CDB.
2) Unplug the PDB from the auxiliary CDB.
3) Plug the PDB back into the original CDB with the AS CLONE clause (using MOVE or COPY) to move or copy the PDB and change its name at the same time.
4) Destroy the auxiliary CDB (optional).

Detailed steps below.

1) Duplicate the PDB from its container to an auxiliary CDB.

   To make a duplicate copy of a CDB with specific PDBs, the steps are similar to a regular RMAN DUPLICATE, with the addition of a list of PDB names. The source database need not be in archivelog mode to duplicate it.

The steps include:
  a) Edit listener.ora to add the auxiliary CDB in SID_LIST_LISTENER.

       This is a mandatory step even if you opt to use port 1521 and depend on dynamic registration.
       The reason is that the RMAN command will first shut down and restart the auxiliary instance.
       Note: the source container (CDB1 in my case) is not in SID_LIST_LISTENER because CDB1 registers itself automatically and RMAN will not restart it.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = dup)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME = dup)
    )
  )



 b) Add an entry in tnsnames.ora for the source and auxiliary CDBs.

      This again is a mandatory step, as RMAN connects to the auxiliary database using the connect string after it shuts the instance down internally.

dup =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dup)(UR=A)
    )
  )

c) Create a pfile for the DUP database.
    Connect to the source database CDB1 and create an editable pfile:
    SQL> conn / as sysdba
    SQL> create pfile='/tmp/initdup.ora' from spfile;

   Edit the file and replace every occurrence of CDB1 with DUP; also change the control file names so they do not overwrite the source's control files.

*.audit_file_dest='/u01/app/oracle/admin/dup/adump'
*.audit_trail='db'
*.compatible='12.1.0.0.0'
*.control_files='/u01/app/oracle/oradata/dup/controdup1.ctl','/u01/app/oracle/fast_recovery_area/dup/controdup2.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_files=800
*.db_name='dup'
*.db_recovery_file_dest='/u01/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=4800m
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=dup)'
*.enable_pluggable_database=true
*.memory_target=400m
*.open_cursors=300
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.temp_undo_enabled=FALSE
*.undo_tablespace='UNDOTBS1'

 Then move the edited file to the $ORACLE_HOME/dbs directory.

 mv /tmp/initdup.ora $ORACLE_HOME/dbs/.

d) Add a password file for the auxiliary database.
     cd $ORACLE_HOME/dbs

     export ORACLE_SID=dup
     orapwd file=$ORACLE_HOME/dbs/orapwdup password=oracle entries=10

  
 e) Connect to RMAN and run the DUPLICATE command.

  
  • To duplicate the PDB pdb1 from cdb1 to CDB cdb2, use the following command:
    DUPLICATE TARGET DATABASE TO cdb2 PLUGGABLE DATABASE pdb1;
    
  • To duplicate the PDBs pdb1, pdb3, and pdb4 from database cdb1 to cdb2, use the following command:
    DUPLICATE TARGET DATABASE TO cdb2 PLUGGABLE DATABASE pdb1,pdb3,pdb4;
    

For example:
 $>rman target sys@cdb1 AUXILIARY sys@dup

Recovery Manager: Release 12.1.0.1.0 - Production on Tue Oct 1 16:57:29 2013

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

target database Password:
connected to target database: CDB1 (DBID=792037445)
auxiliary database Password:
connected to auxiliary database: DUP (not mounted)


RMAN>

 run
{
SET NEWNAME FOR DATABASE TO '/u01/app/oracle/oradata/%U';
DUPLICATE TARGET DATABASE TO dup PLUGGABLE DATABASE PDB_1 FROM ACTIVE DATABASE NOFILENAMECHECK;

}
    

2) Unplug the PDB.

  We can now unplug the PDB from the duplicate database (close it in the auxiliary CDB first; a PDB must be closed before it can be unplugged).

  alter pluggable database pdb_1 unplug into '/u01/app/oracle/oradata/pdb_1.xml';
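
Optionally (this check is my addition, not part of the original flow), you can verify that the unplugged PDB is compatible with the target CDB before plugging it in, using DBMS_PDB:

set serveroutput on
declare
  is_compatible boolean;
begin
  -- check the unplugged PDB's XML manifest against the current CDB
  is_compatible := dbms_pdb.check_plug_compatibility(
                     pdb_descr_file => '/u01/app/oracle/oradata/pdb_1.xml',
                     pdb_name       => 'PDB_1_DUP');
  if is_compatible then
    dbms_output.put_line('Compatible');
  else
    dbms_output.put_line('Not compatible - see PDB_PLUG_IN_VIOLATIONS');
  end if;
end;
/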

3) Plug the PDB in with the AS CLONE clause (using MOVE or COPY) to move or copy the PDB and change its name at the same time.
Now this PDB can be plugged back into the original CDB.
This way you can clone PDBs without closing any PDBs; the original PDB pdb_1 can stay online all the while. The clone of pdb_1 is called pdb_1_dup.
SQL> create pluggable database pdb_1_dup
        AS CLONE using '/u01/app/oracle/oradata/pdb_1.xml'
        COPY
        FILE_NAME_CONVERT=('/u01/app/oracle/oradata/','/u01/app/oracle/oradata/cdb1/pdb_1_dup/')
SQL> /

Pluggable database created.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           MOUNTED
         5 PDB_2                          MOUNTED
         6 PDB_1                          READ WRITE NO
         7 PDB_4                          MOUNTED
         8 PDB_3                          MOUNTED
         9 PDB_5                          MOUNTED
        10 PDB_1_DUP                      MOUNTED
SQL>
SQL> 
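
The freshly plugged-in clone is left in MOUNTED mode, as the listing above shows. To start using it, open it:

SQL> alter pluggable database pdb_1_dup open;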

Tuesday, October 1, 2013

12c connect to multiple PDBs with the same name in different CDBs



When you create a Pluggable Database (PDB), Oracle automatically creates and starts a service for it inside the Container Database (CDB). Let us assume you have a PDB_1 database in CDB1. I created another container called DUP and created a PDB called PDB_1 in it as well.
In my case, I just used RMAN to duplicate CDB1 to DUP.

Now, if I try to connect to the PDB_1 service using easy connect or tnsnames.ora, which CDB --> PDB will I be connected to?

The easy connect syntax is:
sqlplus scott/tiger@//localhost:1521/pdb_1

The chances are that I could end up connected to either the CDB1-->PDB_1 database or the DUP-->PDB_1 database.
To prove the point, let us look at the listener status.

[oracle@unknown080027afa328 dup]$ lsnrctl stat

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 01-OCT-2013 17:03:22

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                01-OCT-2013 15:58:19
Uptime                    0 days 1 hr. 5 min. 3 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/unknown080027afa328/listener/alert/log.xml
Services Summary...

Service "cdb1" has 2 instance(s).
  Instance "cdb1", status UNKNOWN, has 1 handler(s) for this service...
  Instance "cdb1", status READY, has 1 handler(s) for this service...
Service "cdb1XDB" has 1 instance(s).
  Instance "cdb1", status READY, has 1 handler(s) for this service...
Service "dup" has 2 instance(s).
  Instance "dup", status UNKNOWN, has 1 handler(s) for this service...
  Instance "dup", status READY, has 2 handler(s) for this service...
Service "pdb_1" has 2 instance(s).
  Instance "cdb1", status READY, has 1 handler(s) for this service...
  Instance "dup", status READY, has 1 handler(s) for this service...
...

The service has the same name as the PDB, PDB_1, but observe that it has two instances.
Solution:
This is where I remembered that in easy connect and tnsnames.ora we have always had the option to specify an instance_name. I had used that functionality with RAC services and db links to connect to one particular node, even when the service spans multiple nodes for load balancing.
Using that existing functionality, I can now pick which PDB to connect to.

Examples below.

sqlplus scott/tiger@//localhost:1521/pdb_1/dup

and

sqlplus scott/tiger@//localhost:1521/pdb_1/cdb1

or, if you use tnsnames.ora, it would be

PDB_1_dup =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdb_1) (instance_name = dup)
    )
  )
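
To confirm which container you actually landed in, a quick check after connecting (SYS_CONTEXT works for any user):

SQL> select sys_context('USERENV','CDB_NAME') cdb_name,
            sys_context('USERENV','CON_NAME') con_name
     from dual;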