Thursday, November 13, 2014

"CPU :In-memory query" in Oracle 12c

 In 12c EM, we have a  " CPU :in-memory query" displayed in the  graphical report of performance tab. Since Oracle Database In-Memory sessions access only memory, it makes perfect sense to track the CPU utilized for such queries.  I did not see any separate event or  info of how it is derived. 
However, there are new columns called "in_memory_query" and "in_memory_populate" which have values 'Y' and 'N'
I think  "CPU :In-memory query" is derived from the following logic.

 From the v$active_session_history columns session_state, session_type and in_memory_query or in_memory_populate values and its meaning.

      session_state + session_type       + in_inmemory_xxx                                = "Wait-class"
     "ON CPU"        + "FOREGROUND" + in_inmemory_query ='Y'                      =  CPU in-memory query
      "ON CPU"       + "FOREGROUND" + in_inmemory_query ='N'                      =  CPU
      "ON CPU"       + "BACKGROUND" + in_inmemory_populate ='Y'                  =  CPU: in-memory populate
     "ON CPU"        + "BACKGROUND" + in_memory_populate='N'                     =  BCPU /* background CPU */

else
      "WAITING"                                                                                                 = "wait_class value"


the query for the above logic will be something like
 
SELECT sample_time,
DECODE (session_state,'ON CPU',
   DECODE(session_type,'FOREGROUND',
        DECODE( in_inmemory_query,'Y','CPU in-memory query','CPU'),'BACKGROUND',DECODE(in_inmemory_populate,'Y','CPU: in-memory populate','BCPU')),wait_class)
        FROM v$active_session_history order by sample_time desc;

Tweaks for Linux VirtualBox (VBOX) to use SIMD for 12c InMemory.


  Oracle InMemory uses  SSE4.2 extensions to run SIMD.
  However, by default, Virtualbox(VBOX) may not configure this. You can verify if the extensions are enabled
by checking  the flag in cpuinfo as follows in your Linux guest OS

  cat /proc/cpuinfo | grep sse4

If it is not set,  then no rows will be returned.

 you have to do the following tweak to enable it.
1)  Shutdown the guest VM, and the GUI VM manager.
2) cd to VM home directory.
    eg.  cd  D:\software\virtualbox
3) run the following command  Allowing VirtualBox to pass the SSE 4.2 extensions
( supported in VirtualBox starting in Versions 4.3.8)

eg.
D:\software\virtualbox>VBoxManage setextradata "VM name" VBoxInternal/CPUM/SSE4.2 1

Where "VM Name" is the name of your guest Linux VM.

4) start the GUI VManager  and start the guest OS.

5) verify if the  flag is set.
cat /proc/cpuinfo | grep sse4

[oracle@bigdatalite ~]$ cat /proc/cpuinfo | grep sse4
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl pni ssse3 sse4_2 lahf_lm

To see if the library is linked dynamically , you can run the following 
ps -ef | grep ora_p | awk '{system ( "pmap "$2 " |grep libshpk " ) ; print "-->"$8 }'

[oracle@bigdatalite ~]$  ps -ef | grep ora_p | awk '{system ( "pmap "$2 " |grep libshpk " ) ; print "-->"$8 }'
00007f732897d000   1500K r-x--  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so
00007f7328af4000   2044K -----  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so
00007f7328cf3000     72K rw---  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so
-->ora_pmon_orcl
00007f8e01a71000   1500K r-x--  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so
00007f8e01be8000   2044K -----  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so
00007f8e01de7000     72K rw---  /u01/app/oracle/product/12.1.0/dbhome_1/lib/libshpksse4212.so

Tuesday, September 30, 2014

12c PDBs and SPFILE

In 12c multi-tenant database, the spfile is for CDB only. A newly created PDB will inherit the initialization values from the container database if they are created from the pdb$seed ( default empty PDB). 
 There are about 184 parameters that can be modified at the PDB level and are stored in the sys table of the PDB. If you altered the parameter in a PDB  and then plugged in to a different Container database, these modified parameters go with the PDB.

The v$system_parameter shows the different configured parameters of all plugged in database. To see all non default parameters, you can run the following query.
select  name,value, con_id from v$system_parameter where con_id!=0;


One interesting fact is, once a parameter has been set, later even if you reset the value equal to container default, the PDB will still maintain the parameter values.

Tuesday, January 21, 2014

Oracle In-database MapReduce in 12c (big data)

There is some interest from the field about what is In-database map-reduce option and why and how it is different than hadoop solution.
I though I will share my thoughts on it.

 In-database map-reduce is an umbrella term that includes two features.
  •             "SQL Map-reduce" or  "SQL pattern matching".
  •              In database container for Hadoop.  to be released in future release. 


  • "SQL MapReduce" : Oracle database 12c introduced a new feature called PATTERN MATCHING using "MATCH_RECOGNIZE" clause in SQL. This is one of the latest ANSI SQL standards proposed and implemented by Oracle. The new sql syntax helps to intuitively solve complex queries that are not easy to implement using 11g analytical functions alone. Some of the use cases are fraud detection, gene sequencing, time series calculation, stock ticker pattern matching . Etc.  I found most of the use case for Hadoop can be done using match_recognize in database on structured data. Since this is just a SQL enhancement , it is there in both Enterprise & Standard Edition database.


  • "In database container for Hadoop  (beta)" : if you have your development team more skilled at Hadoop and not SQL , or want to implement some complex pre-packaged Hadoop algorithms, you could use oracle container for Hadoop (beta). It is a Hadoop prototype APIs  which run within the java virtual machine in the database. It implements Hadoop Java APIs and interfaces with database using parallel table functions to read data in parallel. One interesting fact about parallel table functions is that it can run in parallel across RAC cluster and also can also route data to a specific parallel processes . This functionality is the key in making Hadoop scale across clusters and  this functionality exited in database for over 15 years now.  Advantage of in-database Hadoop  is 
  1.  No need to move data out of database for running Mapreduce functions and hence save time and resources.
  2.  More  real time data could be used.
  3.  Less redundant copies of data and hence better security & less disk space used.
  4.  The servers could be used for not just MapReduce work, but also used to run the database making better resource utilization,
  5. The output of the MapReduce is immediately available for analytic tools and can combine this functionality along with database features like "in-memory option (beta) to get near real time analysis of Big Data. 
  6. Combine db features for security. Backup, auditing, performance with MapReduce. API.
  7. The ability to stream the output of one parallel table function as input to the next parallel table function has an advantage of not needing to maintain any intermediate stages.
  8. Features like graphical, test, spacial and semantic within oracle database can be used for further analysts.
In addition to this, Oracle 12c will support schema less access using JSON protocol. That will help big data use cases of NOSQL to run on data within Oracle database as well.  

Conclusion.
Having these features will help to solve MapReduce challenges when the data is mostly within database and reduce data movement and make better use of available resources.. 
If Most of your data is outside the DB, then sql Connectors for hadoop and Oracle Loader for Hadoop could be used. 


Reference

1) presentation from Kuassi Mensah 

Friday, January 3, 2014

Sql* developer guage sample report

 I had tried to show something like guage in sqlplus using RPAD trick.
Howevere, if you are running the query in sql*developer , using GUAGE will be a good way to display the graph. 

below is the  output in sql*developer and sqlplus.


Sql*Developer OUTPUT
select name,con_id, bufferCache_MB, 'SQLDEV:GAUGE:0:100:0:100:' ||PCT as PCT from
(  SELECT NVL(pdb.name,'CDB') name,b.con_id,
  ROUND(b.subtotal *
  (SELECT value FROM V$PARAMETER WHERE name = 'db_block_size'
  )                        /(1024 *1024)) bufferCache_MB,
  ROUND(b.subtotal         /b.total * 100) PCT,
  rpad('*',ROUND(b.subtotal/b.total * 100),'*') PCT_BARCHART
FROM
  ( SELECT DISTINCT a.*
  FROM
    (SELECT con_id,
      COUNT(*) over (partition BY con_id ) subtotal ,
      COUNT(*) over () total
    FROM sys.gv$bh
    ) a
  )b, v$pdbs pdb
  where b.con_id=pdb.con_id(+)
ORDER BY con_id);
Sqlplus Output 
col PCT_BARCHART format a30
set linesize 200
 col name format a10
 col con_id format 99
 col PCT format 999
    SELECT NVL(pdb.name,'CDB') name,b.con_id,
  ROUND(b.subtotal *  
 (SELECT value FROM V$PARAMETER WHERE name = 'db_block_size'
  )                        /(1024 *1024)) bufferCache_MB,
  ROUND(b.subtotal         /b.total * 100) PCT,
  rpad('*',ROUND(b.subtotal/b.total * 100),'*') PCT_BARCHART
FROM
  ( SELECT DISTINCT a.*
  FROM
    (SELECT con_id,
      COUNT(*) over (partition BY con_id ) subtotal ,
      COUNT(*) over () total
    FROM sys.gv$bh
    ) a
  )b, v$pdbs pdb
  where b.con_id=pdb.con_id(+)
ORDER BY con_id;