DBD::Oracle module for Perl

This is a shameless post as my friend, Pawel Krol, pointed this out to me so I must give all credit to him. As I am a big fan of the Perl-DBI I thought this was too good to share.

Oracle 11.2.0.1 on Linux and possibly earlier versions is shipped with Perl environment complete with Perl-DBI and the “database dependent” Oracle::DBD module. I would expect it is included in other Oracle releases too, e.g. for AIX. I think Perl is a really useful skill for any DBA working on a Unix-like platform, maybe even Windows as well, and it’s even better when you can embed SQL statements in your code instead of calling SQL*Plus and trying to parse the output. Perl also gives you the ability to use bind variables properly and re-use statement handles, something you would need to write PL/SQL for, if using SQL*Plus.

There are two parts to the Perl database interface:

  1. The database-independent part, the Perl-DBI, which is commonly included in many Linux distributions.
  2. The database-dependent part, in Oracle’s case DBD::Oracle. This is rarely installed by default in a Linux distribution; you will usually just find the MySQL-specific module and one for handling CSV files.

Of course you could download and compile DBD::Oracle yourself but this makes life much easier in environments you don’t have control over as you can use it by just setting a few environment variables.

So how do you use it? There are cleverer ways, I know Pawel has one, but I just have a quick bash script I source into my environment:

#!/bin/bash

if [ -n $ORACLE_HOME ]; then
echo "Setting up Perl environment. ORACLE_HOME set to $ORACLE_HOME"
export ORACLEPERL=$ORACLE_HOME/perl
export PERL5LIB=$PERL5LIB:$ORACLEPERL/lib/5.10.0:$ORACLEPERL/lib/site_perl/5.10.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib32:$ORACLE_HOME/lib
export LIBPATH=$ORACLE_HOME/lib32
export PATH=$ORACLEPERL/bin:$PATH
fi

And then I can run my test script which just selects the current date from the dual table:

#!/u01/app/oracle/product/11.2.0/db_1/perl/bin/perl

use strict;
use warnings;
use DBI;

&main;

sub main () {

my %params = (
'database' => 'ORCL',
);

my $dbh = &dbconnect(\%params);

my ($sysdate) = $$dbh->selectrow_array("SELECT sysdate FROM dual");
print "Current date is $sysdate\n\n";

&dbdisconnect($dbh);

}

sub dbconnect () {

my $params = shift;
my $dbh = DBI->connect("dbi:Oracle:$$params{'database'}", 'username', 'password') or die ("$DBI::errstr\n\n");
return (\$dbh);
}

sub dbdisconnect () {

my $dbh = shift;
$$dbh->disconnect();
}

The world of Perl is now hopefully your oyster and there is plenty of documentation here.

Ulimits and ORA-04030

Oracle database processes are to many systems administrators just another service that runs on the machines they look after. Understandably sys admins like to be able to put some controls in to prevent these processes from consuming too many CPU resources, filling up the disks or hogging all the memory on the system. This can be done on UNIX and Linux systems with ulimits which control the resources available to your shell and the programs it starts.

You can check your ulimits that currently apply to you by running the command ulimit -a. On Linux it looks like this and it’s similar with AIX:

[ben@linux ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) 32768
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 23551
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 32768
cpu time               (seconds, -t) unlimited
max user processes              (-u) 23551
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

You’ll note that I ran the above commands as my own user. Oracle makes some recommendations for the settings for the oracle user, generally found in the release notes or prerequisites for your platform. Because Oracle makes no recommendations about shell limits on a DBA account, the limits applying to you personally will often be much lower than those applying to the Oracle user.

So when might this be a problem? Let’s take how you start up your Oracle service. Here are two ways it’s possible to do it:

  • Using srvctl, e.g. srvctl start database -d dbname
  • Using SQL*Plus, e.g. sqlplus / as sysdba and then issue command startup.

The problem comes with the second method. When invoking Oracle using SQL*Plus, the shell limits that apply to all the Oracle processes are inherited from your shell. Therefore you need to make sure you’re logged in as the Oracle user. There are some subtleties around this: what happens if you use sudo, e.g. sudo -s -u oracle to invoke an Oracle user’s shell? On some systems you might find that your shell limits don’t change and you’re still stuck with the lower limits of your own shell: you can check by running ulimit -a. You may be setting yourself up for ORA-04030 errors; limiting the PGA memory area allowed for sorting data; and imposing other operating system limits.

This problem doesn’t exist if you invoke sudo su - oracle to change to the oracle user.

What happens if Oracle is already running and you’re not sure if it was started correctly? With Linux it is quite easy to find out, provided you have the right privileges. First get the process id of your smon process, in this case 4057:

[ben@linux ~]$ ps -ef | grep smon
root      3790     1  1 Feb05 ?        08:04:14 /u01/app/grid/11203/home_1/bin/osysmond.bin
grid      4057     1  0 Feb05 ?        00:02:02 asm_smon_+ASM1
oracle    4495     1  0 Feb05 ?        00:04:50 ora_smon_beverley1
ben       7468  7438  0 11:17 pts/0    00:00:00 grep smon

And then take a look in /proc/4057/limits:

[ben@linux 4057]$ sudo cat limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            33554432             unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             23551                23551                processes 
Max open files            65536                65536                files     
Max locked memory         unlimited            unlimited            bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       23551                23551                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

You can compare the above with the ulimits set for the oracle user.

With AIX it’s harder since there is no /proc filesystem. You have to run a debugger against the process. There’s a nice blog post on one method here, although it is not suitable for a production system because it interrupts the Oracle process and kills it afterwards.

So is that everything? Unfortunately not, ulimits don’t just affect the server processes like smon; they also affect server processes for connections using SQL*Plus. Below are two server processes running, each serving a connection from SQL*Plus, one for my ben user and another for the oracle user:

SQL> select spid, username, program from v$process where spid in (9351, 9798);

SPID			 USERNAME	 PROGRAM
------------------------ --------------- ------------------------------------------------
9798			 oracle 	 oracle@o112-c2n1.ora.pwk (TNS V1-V3)
9351			 ben		 oracle@o112-c2n1.ora.pwk (TNS V1-V3)

The below shows these processes:

SQL> !ps -ef | egrep '(9350|9351|9797|9798)'
ben       9350  7438  0 13:08 pts/0    00:00:00 sqlplus
oracle    9351  9350  0 13:09 ?        00:00:00 oraclebeverley1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    9797  9762  0 13:34 pts/0    00:00:00 sqlplus
oracle    9798  9797  0 13:34 ?        00:00:00 oraclebeverley1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

As I am running Linux, I can easily compare the ulimits applying to processes 9351 and 9798:

[ben@linux 9351]$ sudo cat limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             131072               131072               bytes     
Max stack size            33554432             unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             23551                23551                processes 
Max open files            1024                 1024                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       23551                23551                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

[ben@linux 9798]$ sudo cat limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             131072               131072               bytes     
Max stack size            33554432             unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             131072               131072               processes 
Max open files            131072               131072               files     
Max locked memory         51200000000          51200000000          bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       23551                23551                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

The differences are highlighted.

All this brings me onto the ORA-04030 problem, which can be related to the value of max data size. This parameter limits the amount of PGA memory that your session can use. Checking the ulimits is quick and easy, at least on Linux, so if you encounter such a problem this is one place you can easily check. It may save you from having to experiment with undocumented Oracle parameters like _pga_max_size which may not be the cause.

UK OUG conference: RAC & HA special interest group round table

The RAC and HA roundtable gives users a chance to share experience and gotchas. Often the discussion points that generate most interest are when something unusual or unexplained has happened, with the participants attempting to drill down to where the problem was by firing questions at the DBA telling the tale. The discussion this time there was such a discussion around a RAC stretch cluster where the storage connection had been deliberately broken during a test (but not the RAC interconnect between the two sites). The end result of this was a failure of the cluster but after a restart, the database had been rolled back to a point in time several minutes before the failure. This is discussed here: http://davidburnham.wordpress.com/category/oracle-rac-stretched-cluster-testing/

Some participants suggested that Active Data Guard was a better option than a stretch RAC cluster with one saying that Active Data Guard should be the default position with RAC only deployed if Active DataGuard didn’t fit the bill. Unfortunately as a presentation on DataGuard was taking place at the same time, most people using it were presumably elsewhere.

Another discussion point was whether RAC actually reduces the volume of outages, ensuring high availability. Participants generally agreed that it is great for installing rolling updates and so on without disruption – in other words, planned maintenance work – and dealing with some other modes of failure. But sometimes the added complexity could cause unplanned outages that may never have happened on a single instance system. One participant argued that some of the outages cited should not happen, since they were down to human error.

Talk moved on to whether an Oracle Exadata box represents good value for money. One participant mentioned that he could easily build a similar set up from commodity hardware for a fraction of the cost, but it would not include a high-speed interconnect or the Exadata software. He cited some data-warehouse type queries that were sped up by an order of magnitude when implemented using Exadata. This had transformed the attitude to these queries in the company concerned: because they were quick to run, users were more likely to use them. So the answer is that it depends.

A somewhat inconclusive discussion took place on how best to deploy ASM when there are multiple databases: should all databases share the same ASM instance or diskgroup? One participant suggested that ASM was effectively a single point of failure which could cause all databases to fail in the event on a software bug. Another responded that he had been using ASM for four years without any such disruption. I guess ASM should just be viewed as another filesystem for this point of view and what are the chances of GPFS or even raw devices failing?

The number of RAC nodes in a cluster was another talking point. With a two node cluster, in the event of a failure you are left with just a single node which ought to be able to manage the full system load. Many participants seemed to prefer a three or four node cluster as 4 to 3 or 3 to 2 isn’t as dramatic a change as 2 to 1.

So, it was an interesting discussion and definitely a refreshing change from sitting passively through presentations.