How are Oracle background processes renamed?

We all know that when an Oracle instance starts, the background processes are all (or perhaps mostly?) copies of the same executable – $ORACLE_HOME/bin/oracle.

We can see this using the “args” and “comm” format options of the Unix “ps” command. Below we can see the name of the SMON process (“args”) and the name of the source executable (“comm”):

$ ps -p 6971 -o args,comm
COMMAND                     COMMAND
ora_smon_orcl1              oracle

This is common knowledge and when in the past I’ve wondered how this works I’ve always just accepted this Oracle magic and moved on to more important thoughts such as “what’s for lunch?”. Well today my belly is full and I have time to let such thoughts develop!

It turns out this is a quite simple trick. If you have a basic knowledge of the C programming language you’ll be familiar with the argc and argv parameters. e.g.

int main (int argc, char *argv[])

argc – the number of command line arguments
argv – an array of pointers to the command line arguments with argv[0] being the process name.

Surprisingly the values of this array are modifiable. So we can manipulate the name of a running process with a simple sprintf() command (other options are available). e.g.

sprintf(argv[0], "renamed");

My description of this is high level and I’m sure there are risks in doing this, such as overflowing argv[0] into argv[1], but by way of a simple example here is a demo C program:

#include <stdio.h>
void main (int argc, char *argv[])
{
  int pid;
  pid = (int) getpid ();
  printf("My name is '%s'\n",argv[0]);
  printf("Sleeping for 10 seconds (quick - 'ps -p %d -o args,comm')...\n",pid);
  sleep(10);
  sprintf(argv[0], "renamed");
  printf("My name is now '%s'\n",argv[0]);
  printf("Sleeping for 10 seconds (quick - 'ps -p %d -o args,comm')...\n",pid);
  sleep(10);
}

This program sleeps so I can capture process details from another session using “ps”, renames the running process and then sleeps again so I can capture more details. So let’s compile the C program and see what happens when I run it from one session and capture ps details from another session:

Session 1

$ cc newname.c -o newname
$ ./newname
My name is './newname'
Sleeping for 10 seconds (quick - 'ps -p 7930 -o args,comm')...
My name is now 'renamed'
Sleeping for 10 seconds (quick - 'ps -p 7930 -o args,comm')...

Session 2

$ ps -p 7930 -o args,comm
COMMAND                     COMMAND
./newname                   newname
$ ps -p 7930 -o args,comm
COMMAND                     COMMAND
renamed e                   newname

Pretty cool. Notice how my renamed process is now called “renamed e”. The trailing “e” is left over from the original name of “./newname” thus proving it’s not quite as simple as I suggest and that my C skills are basic. None-the-less I think this is pretty cool.

Protect single instance with GI (without RAC-One-Node)

This is my first post since I meet Martin Nash in London 2 years ago on a Storage seminar with James Morle. He invited me on ORAganism blog and since then I wanted to start blogging but didn’t take the time to do so.
I spent one year working as a Consultant for a company called Digora where I have had great time and learnt a lot with all the different consultants, Laurent Leturget made me felt guilty not to blog ( laurent-leturgez.com ) !

Anyway, I discovered lately that it is now really simple to protect a single instance with “Cold failover” technique  through the Grid Infrastructure. Before discovering this technique, it was not this easy. There was an article on Oracle blogs talking about an equivalent technique :
https://blogs.oracle.com/xpsoluxdb/entry/clusterware_11gr2_setting_up_an_activepassive_failover_configuration

Compared to this post, the new technique can’t be much easier, there are 2 commands :

- Check the current properties of your database - dbtest here
 crsctl status resource ora.dbtest.db -p
- Change the placement to FAVORED for your database
 crsctl modify resource ora.dbtest.db -attr "PLACEMENT=favored"
- Change the list of HOSTs that can receive the instance of your database (here host1 or host2)
 crsctl modify resource ora.dbtest.db -attr "HOSTING_MEMBERS=host1 host2"
- Check back the new properties of your database - dbtest here
 crsctl status resource ora.dbtest.db -p

If you want your client connection to be handle the failover automatically and the SCAN listener to be aware of the host change, you need to modify your remote_listener parameter so that it points to the SCAN address of our GI setup

sqlplus / as sysdba 
alter system set remote_listener="host-scan:1521" scope=both;

That’s it !!

You now have a single-instance database that is protected again hardware failure. So next question is why using this technique when you can use the RAC-One Node and what are the differences ?

- Reason – Cost
This is no additional cost and no extra licence compared to the RAC-One Node solution and moreover, you can use this technique with a Standard Edition database.

- Difference 1 – No RAC feature

Compared to the RAC-One Node solution, the database cannot failover smoothly to the other host (online failover). When you use RAC-One Node, if you choose in advance that you want to failover to another host (host2), you can use the following command :

srvctl relocate database -d dbtest -n host2 -v

When you launch this command, while they are some active transactions on the first host, the instance will be up and one second instance will start on the future host, so you are in the case of RAC features since both instances are up at the time, that’s why we have 2 redolog threads and 2 undo on RAC-One Node configuration.

If you want to learn more on RAC One Node, Marcin Przepiorowski has a serie on the subject (http://oracleprof.blogspot.fr/2009/12/oracle-rac-one-node-part-1.html)

If you try to use the relocate command with “poor’s man” failover solution, you’ll get this error :
   PRCD-1027 : Failed to retrieve database orcltest
   PRCD-1146 : Database orcltest is not a RAC One Node database

- Difference 2 – No native support in Cloud Control 12c

If you want to add this database to your beloved Cloud Control, you will need to add a wee extra script in case of failover so that CC12c can know it has to connect to the new host, here is an example below. If needed I wrote a script which handle this automatically (ask for it).

- Step 1 : create blackout (better but not compulsary)

emcli create_blackout -name="relocating active passive targets" -add_targets=${DB_NAME}_${INSTANCE_NAME}:oracle_database -schedule="frequency:once;duration:0:30" -reason="DB relocate"

- Step 2 – relocate the target from host1 to host2

emcli relocate_targets -src_agent=${SERVER_SOURCE}:3872 -dest_agent=${SERVER_DEST}:3872 -target_name=${DB_NAME}_${INSTANCE_NAME} -target_type=oracle_database -copy_from_src -force=yes -changed_param=MachineName:${VIP_NAME}

- Step 3 : stop blackout

emcli stop_blackout -name="relocating active passive targets"

- Difference 3 – No “Omotion like” relocation

At the beginning, RAC-One Node was named “Omotion“, it could be used almost as “Vmotion” from Vsphere. In the configuration using the Grid Infrastructure, you can only failover if you shutdown the host or shutdown the instance down on host1 and start it again on the second host :

-- From Host 1
srvctl stop database -d dbtest
-- From Host 2
srvctl start database -d dbtest -n host2

You are now ready to test this out !! This is amazing how Oracle handle this configuration, even if you delete the spfile from host1, the GI will recreate it automatically.

Hope this helps. I did not find any blog article talking about this type of configuration so this is my first contribution to the Oracle community from which I learned so much.

 

 

 

 

Unexpected Database Directory Permissions

In some secure environments code deployment privileges are split away from data modification ones. Below is an example user created only to execute DDL for application schema changes.

create user deploy identified by deploy;
grant create any directory to deploy;
grant create any table to deploy;
grant create session to deploy;

And here we see the account being used to make some changes to an application schema, creating a new table and a database directory.

conn deploy/deploy@pdb
create table app.secrets
( cardnum varchar2(15)
, holder varchar(40)
, balance number);
create directory sensitive_files as '/tmp/secrets';

Two weeks later my code monkey technician decides to have a cheeky look at the contents of the secrets table created previously.

select * from app.secrets
                  *
ERROR at line 1:
ORA-01031: insufficient privileges

Lovely – no access and my secrets are safe. However the technician happens to be aware of the file naming convention used by the ETL process and tries to read a dump file using UTL_FILE.

set serveroutput on
declare
  fh utl_file.file_type;
  str varchar2(1024);
begin
  fh := utl_file.fopen('SENSITIVE_FILES', 'secrets.dat', 'R');

  begin
    loop
      utl_file.get_line(fh, str);
      dbms_output.put_line(str);
    end loop;
  exception
    when no_data_found then
      null;
  end;
  
  utl_file.fclose(fh);    
end;
/

cardnum,holder,balance
1234-1234-4321-4321,Neil,0.05

Oh my, the “CREATE ANY DIRECTORY” privilege also gives me read permissions on the “contents” of the directory. And…

set serveroutput on
declare
  fh utl_file.file_type;
begin
  fh := utl_file.fopen('SENSITIVE_FILES', 'secrets.dat', 'W');

  utl_file.put_line(fh,'cardnum,holder,balance');
  utl_file.put_line(fh,'1234-1234-4321-4321,Neil,999999.99');
  utl_file.fclose(fh);    
end;
/

PL/SQL procedure successfully completed.

We can modify the file too. Here are the new contents when read again using UTL_FILE.

cardnum,holder,balance
1234-1234-4321-4321,Neil,999999.99

Looking at the permissions it may not be clear how the file contents were changed, “CREATE ANY DIRECTORY” is more powerful than it looks.

  1* select * from dba_sys_privs where grantee = 'DEPLOY'
SQL&gt; /

GRANTEE    PRIVILEGE            ADM COM
---------- -------------------- --- ---
DEPLOY     CREATE SESSION       NO  NO
DEPLOY     CREATE ANY DIRECTORY NO  NO
DEPLOY     CREATE ANY TABLE     NO  NO

It’s probably an edge case but I found it surprising.

INS-30502: No ASM disk group found

I’ve been mostly looking busy Informix instances lately so it was nice to have a 12c installation to do at work today. I quickly reacquainted myself of some of my personal rules of Oracle problem-solving:

  1. There is a lot of rubbish posted in online Oracle forums.
  2. Using MOS is generally a better bet.

However, both of these can be lazy ways of solving problems and neither provided the answer to the problem I had. There’s often no substitute for working through a problem yourself.

My 12c instance is a stand-alone one including Grid Infrastructure, ASM and role separation. This means installing Grid Infrastructure as user grid and the database software as user oracle. Having successfully installed Grid Infrastructure and browsed around my working ASM instance using SQL*Plus, I was slightly perturbed to get this error during the installation of the database software when it came to configuring the storage to be used:

INS-30502: No ASM disk group found

In the installActions log I had:

SEVERE: [FATAL] [INS-30502] No ASM disk group found.
CAUSE: There were no disk groups managed by the ASM instance +ASM.
ACTION: Use Automatic Storage Management Configuration Assistant to add disk groups.

But this wasn’t true!

[grid@oraserver ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM
The Oracle base has been set to /u01/app/grid
[grid@oraserver ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Tue Sep 10 11:28:49 2013

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Automatic Storage Management option

SQL> select name, state from v$asm_diskgroup;

NAME STATE
------------------------------ -----------
FRA MOUNTED
DATA MOUNTED

So what was going on? I vageuly remembered that there were some old bugs related to role-separation so I checked the ownership of key files and the group memberships of the oracle and grid users. Everything was ok!

There is a MOS note, 1069517.1, that offers some suggestions but none of the solutions fit, in particular the recommendation to check the permissions on $ORACLE_HOME/bin/oracle. Such a file doesn’t exist yet because I haven’t installed any database software!

Despite this, the problem did smell of a permissions issue. As the installer runs as user oracle, I decided to have a poke around as this user and found this whilst connected to my ASM instance:

oracle@oraserver ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Tue Sep 10 10:47:55 2013

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Automatic Storage Management option

SQL> select * from v$asm_diskgroup;

no rows selected

Is this the same ASM instance as the one above with its two disk groups? Isn’t it only with user_ views or VPD where the output depends on the user?

Here the documentation comes to the rescue:

While an ASM instance is initialized, ASM discovers and examines the contents of all of the disks that are in the paths that you designated with values in the ASM_DISKSTRING initialization parameter. Disk discovery also occurs when you:

  • Run the ALTER DISKGROUP…ADD DISK and ALTER DISKGROUP…RESIZE DISK commands
  • Query the V$ASM_DISKGROUP and V$ASM_DISK views

So the v$asm_diskgroup view actually gets the OS to do something under the covers.

At which point the problem became somewhat obvious:

SQL> show parameter diskstring

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring string /dev/asm/

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Automatic Storage Management option

On this system, /dev/asm contains the block devices set up in udev. A permission check revealed all:

[oracle@oraserver ~]$ cd /dev/asm
-bash: cd: /dev/asm: Permission denied

The directory could only be viewed by the grid user and members of the asmadmin group. A quick chmod 775 on the directory resolved the problem.

How Eter Pani became Oracle Certified Master

I got my OCM. Hurray!
Now one of the common question people ask me is how to pass this OCM exam.
Firstly I am very grateful to Oracle to introduce this exam. Finally it is the real exam, but not test with set of answers. I clearly see complicatedness for examiners, but it is the real proof of DBA skills, not general memory skills.
Secondly this exam does not prove your exceptional knowledge of Oracle. It is the proof that you are fluent in all basic skills. During the exam everything works according to documentation. Personally I have collection of my favourite Metalink Notes and Advanced instructions that are used at least once a week. You do not need such things on exam.
The environment is ready. You do not need to reconfigure switches for RAC or install additional OS packages. What you are doing is only Oracle Database Administration, no OS specific. If something is really different between different UNIX OS, forget about it. It should not be part of the exam.
When I came to client site I frequently have no access to internet thus have a copy of Oracle Documentation locally. Moreover reading through local copy is actually faster then browsing through internet copy. I used to use local copy and it was really useful during exam.
Another my habit that I find useful is preparing all scripts in text file and then copy it to SQL*Plus or Shell window. If you need to rerun script or slightly alter it for a different skill-set you can reuse your one history. E.g. I store in this file backup/restore scripts.
You have 2 days, 14 hours, including lunch and breaks for 7 skill-sets. None of skill-set takes more then 2 hours. If you do not believe you can do something in less then 2 hours forget about it. Even if it would be on exam you would not be able to do it in time. Focus on things that you would be able to do.
The exam is based on 11.2g database. If something is different between patch sets again forget about it. Asking information specific for patch set is unfair to people who used to basic one, thus this question would not arrive on exam.
When you read through skill-set task at the beginning, read it up to the end. Mark for yourself tasks that would require you some investigation through the documentation. Mark for yourself tasks that you doubt to solve. Estimate time for each task. Start from the short and easy one and if you significantly overflow the time frame you set switch to the next task in your ordered list. If you have time you can came back and fix the issues later.
I recommend 15 minutes before end of skill-set to check the whole environment, there is special button for end state of the skill-set. 15 minutes should be enough to bring it to correct state.
Read tasks carefully, frequently tasks include markers how to avoid hidden rocks of the environment, e.g. check all options of the objects to create. If you would not follow it exactly the problems would make your life significantly harder.
Some tasks are not clear, you can ask your proctor for clarification. But proctor not always can rephrase task without violation of exam rules, if he could not provide explanation what is requested in a task follow “best practice”.
In general be concentrated, careful and have a lot of practice before exam. I passed preparation courses but honestly it was just way to guarantee time and environment for training. You can do preparation yourself if your management and family would grant the opportunity to do it. If you have no such generous option apply for a preparation course, it is really value for money, very useful and informative. Course provide to you experience of working on the same machines that you would use on exam. In my case the machines was really physically the same, just different OS image. BTW try to became used to local keyboards of the country where you are going to pass the exam. English and US keyboards are different and this difference can be that point which consume the vital time on exam.
Good Luck.
Eter

Checking Oracle Directory Object Path Validity

Bit of a mouthful that title. Today I was asked to check which, if any, directory objects had an invalid path. Normally this would be trivial but on the system I was on there were more than 90. So I knocked up a quick bit of SQL/KSH to do it for me and thought I’d share it here. I could have done something with PL/SQL but I was happy with quick and dirty.

In summary the snippet below generates shell test commands to a temporary file and then uses “sed” to chop off the first and last lines before running the resulting shell script. I just pasted this in to a putty window SQL*Plus prompt – your mileage may vary.


set define off heading off pages 0 lines 200 trimspool on feedback off echo off
spool /tmp/dirchk.tmp
select '[[ -d '||directory_path||' ]] || echo "'||directory_name||' missing ('||directory_path||')"' from dba_directories order by 1;
spool off

!ls -l /tmp/dirchk.tmp
!sed -e '1d;$d' /tmp/dirchk.tmp > /tmp/dirchk.sh
!chmod 700 /tmp/dirchk.sh
!/tmp/dirchk.sh

On Linux I could have used the sed “-i” switch the edit in place but this was AIX.

Example output:

NJTEMP missing (/home/nj)
RESPONSE_DIR missing (/mnt/app/files/response)
SCHEMA_DIR missing (/mnt/app/schema)
CONVERSION missing (/mnt/app/conversion)
FTP_DIR missing (/mnt/app/ftp_files)