After more than 4 years happy blogging on this site I’ve decided to setup an individual blog. If you want to keep reading what I write then please head over to blog.nasmart.me. There’s not much there at the moment, but all being well that should change reasonably quickly. You can access the RSS feed via blog.nasmart.me/feed.
I arrived back in the UK yesterday after my second time at the Oracle User Group Norway’s Spring Seminar. I had a great time and even those that suffered with sea-sickness enjoyed themselves when they weren’t praying to the porcelain god. It was definitely a rougher sea on the first night this year compared to last, but lucky for me I was pretty much unaffected. However, the “Martin Cluster” suffered some major node failures with outages from Bach and Widlake.
The first day of the conference is on land in Oslo, and some guy called Justin Bieber did a really good job of making sure that hotels in Oslo were in demand. I heard reports that he has fans who booked rooms in multiple hotels in the hope that one of them would be the hotel Justin was staying in… Madness, and an inconvenience for some of the conference attendees.
On day 1 Martin Bach and I ran a workshop on client connectivity to RAC databases under the banner of “RAC Attack II”. We covered Fast Connection Failover (FCF) for both Java and C# clients with particular focus on the bugs and gotchas that await those attempting to use the feature. On day 2 I gave a presentation entitled “How Virtualisation Changed My Life” that aims to encourage attendees to make active use of free virtualisation products on their own hardware in order to increase their knowledge of, and hands-on experience with, the technology they work with or want to work with.
Outside of my speaking commitments I attended some great sessions and the following is a selection of my notes:
“Happiness is a state change” – Cary Millsap. Without the context of the rest of the keynote presentation (“Learning about Life through Business and Software”) this quotation might not make much sense. The point Cary is making is that it is development and progression that we humans find rewarding, rather than our state at any specific point.
e-Vita employees Cato Aune and Jon Petter Hjulstad co-presented a session on “Weblogic 12c – experiences”. My only exposure to Weblogic is when installing or managing Oracle Enterprise Manager and Oracle Identity Management products, neither of which uses/supports Weblogic 12c at this time, but I wanted to hear about what the latest Weblogic will surely bring my way in due course.
Joel Goodman gave a very good presentation on “RAC Global Resource Management Concepts” revealing the complexity of what goes on under the covers of your RAC database. Unfortunately the slides are not available even to conference attendees.
Connor McDonald‘s “Odds & Ends” was very enjoyable and it’s definitely worth grabbing the slides. My notes include:
- Use of oradebug suspend/resume as an alternative to killing a resource hungry session is an appealing idea
- I wasn’t aware of the use of “#” to run a SQL*Plus command midway through typing a SQL statement in SQL*Plus
- Making use of “set errorlogging on” isn’t something I currently do, but will look at
- The unsupported, but interesting “overlaps” clause in SQL is worth being aware of and Connor provides an associated MOS note ID in the slides
Frits Hoogland gave 3 presentations during the conference. Unfortunately the first (“Exadata OLTP”) was at the same time as mine. Fortunately I saw the other 2: “About multiblock reads” and “Advanced Profiling of Oracle Using Function Calls—A Hacking Session”. These work very well together and the hacking session was the highlight of the conference for me. There were no slides, so you can’t download them, but Frits has documented what he covers in “Profile of Oracle Using Function Calls (PDF)“. Notes from the sessions include:
- Frits prefers to set db_file_multiblock_read_count manually rather than unset or setting to zero
- The “physical reads” in autotrace output is the number of blocks read, not the number of IOs, which is a mistake he sees others making
- Direct path reads don’t stop at extent boundaries and a single request can read multiple [contiguous] extents
- Use perf to break out what CPU is being used for
Kai Yu presented “Optimizing OLTP Oracle Database Performance using PCIe SSD”. He shared his experiences and covered the use cases for this type of storage in an Oracle database infrastructure. Very significant performance improvements are available, but as always it depends on your implementation/workload.
Bjoern Rost‘s “The ins and outs of Total Recall” covered his experiences using Total Recall aka Flashback Data Archive (FBA). Does it really need 2 names? He showed how it had been used for what I understood to be a slowly changing dimension use case without the need to change existing parts of the application. They had been bitten by the change covered by MOS Note “Initial Extent Size of a Partition Changed To 8MB From 64KB [ID 1295484.1]”. The most interesting part of the presentation was detailed coverage of DISSOCIATE_FBA, so grab the slides if you use FBA. It’s also worth noting that Total Recall/Flashback Data Archive is included in Advanced Compression, so you might find you have the option of using it without specifically purchasing it.
Cary Millsap‘s “Millsap’s Grand Unified Theory of ‘Tuning’” emphasised the point that end user experience is what really matters and covered which tools are appropriate in specific phases of performance analysis.
If the agenda for next year is anything like this year then it’s definitely worth considering a trip to Oslo for a boat ride to Kiel and back.
A massive thank you to OUGN for putting on the seminar, accepting my presentations, excellent organisation and fantastic hospitality.
When the question of what starts OSWatcher (OSW) on Exadata was raised at a client site I thought I’d take a quick look. It took me a little longer than I expected to work out the detail and therefore it seems worth sharing.
If you’re simply looking to change the “snapshot interval”, “archive retention” or “compression command” then /opt/oracle.cellos/validations/init.d/oswatcher is what you need to modify and you’ll find a line with ./startOSW.sh X Y Z. Where X is the snapshot interval, Y is the archive retention and Z is the compression command used to compress the output files.
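To make the three arguments concrete, here is a sketch of editing that line. The values (15-second interval, 168-hour retention, gzip) are illustrative assumptions, not Exadata defaults — check your own oswatcher script before changing anything; the edit is demonstrated on a sample string rather than the real file.

```shell
# Illustrative only: values and pattern are assumptions, not Exadata defaults.
# A startOSW.sh invocation of the form "./startOSW.sh X Y Z" might look like:
line='./startOSW.sh 15 168 gzip'   # 15s interval, 168h retention, gzip

# Changing the snapshot interval from 15 to 30 seconds is a one-word edit:
new=$(echo "$line" | sed 's/startOSW\.sh 15/startOSW.sh 30/')
echo "$new"
```

The same substitution pointed at /opt/oracle.cellos/validations/init.d/oswatcher (after taking a backup) would change the running configuration on the next start.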
If you’re curious to know the details of what starts and restarts OSWatcher then read on.
The following is applicable to the X2-2 I regularly get my hands on which is running 126.96.36.199.2 and I don’t know if things change with later versions, so apologies if this isn’t applicable to your Exadata environment.
Startup of OSWatcher on boot is indirectly handled by /etc/init.d/rc.local, which includes:
########### BEGIN DO NOT REMOVE Added by Oracle Exadata ###########
if [ -x /etc/rc.d/rc.Oracle.Exadata ]; then
  . /etc/rc.d/rc.Oracle.Exadata
fi
########### END DO NOT REMOVE Added by Oracle Exadata ###########
# Perform validations step
/opt/oracle.cellos/vldrun -all
The main purpose of /opt/oracle.cellos/vldrun and the Perl script /opt/oracle.cellos/validations/bin/vldrun.pl appears to be ensuring configuration changes are made on initial boot and after upgrades, although I haven’t looked into all the detail yet. The part of /opt/oracle.cellos/vldrun that is relevant in the context of starting OSWatcher on every boot is:
$VLDRUN_PL -quiet "$@"
This executes /opt/oracle.cellos/validations/bin/vldrun.pl with the -quiet and -all arguments (-all having been passed through from /opt/oracle.cellos/vldrun).
The “quiet” argument is pretty obvious and a little reading reveals that “all” simply means that all scripts in /opt/oracle.cellos/validations/init.d/ should be executed.
So off to /opt/oracle.cellos/validations/init.d/ we go:
[root@my-host ~]# ls -1 /opt/oracle.cellos/validations/init.d/
beginfirstboot
biosbootorder
cellpreconfig
checkconfigs
checkdeveachboot
checklsi
diskhealth
ipmisettings
misceachboot
misczeroboot
oswatcher
postinstall
sosreport
syscheck
[root@my-host ~]#
… and in oswatcher, as already mentioned in the second paragraph of the post, you’ll find ./startOSW.sh X Y Z, where X is the snapshot interval, Y is the archive retention and Z is the compression command used to compress the output files.
OK, so that’s what starts OSWatcher on boot, but you should also know that OSWatcher is restarted daily by /etc/cron.daily/cellos, which includes:
/opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher > /dev/null 2>&1
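A quick way to confirm OSWatcher is actually running after boot or after the daily cron restart is to look for its processes. This is a generic check, not something Exadata-specific; the character-class trick stops the grep process itself showing up in the results.

```shell
# Check for running OSWatcher processes (output varies per system).
# The pattern [O]SWatcher matches "OSWatcher" but not the literal command-line
# string "[O]SWatcher", so the grep process doesn't match itself.
ps -ef | grep '[O]SWatcher' || echo 'OSWatcher not running'
```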
The only bit of all this that doesn’t really sit right with me is that OSWatcher is included with “validations”. That doesn’t seem like an appropriate description to me.
Trivial as it may be, I hope that later versions of the Exadata software move from what is described above to the “service” based approach used on non-Exadata platforms and documented in How To Start OSWatcher Black Box Every System Boot [ID 580513.1]. This feels like a much more standard approach and allows control of the service using the /sbin/service and /sbin/chkconfig commands.
I’m not sure exactly when this change happened, but in Oracle [Enterprise] Linux 5 days a default installation would result in the root file system being created as a LVM logical volume (LV) named LogVol00 in a volume group (VG) named VolGroup00. I must confess that I wasn’t paying too much attention to LV and VG default names in the days I was playing with OL5 a lot, but that’s partly because there was nothing to drag my attention to them.
Along comes Oracle Linux 6 and off I go creating VMs, cloning them and then really not liking the fact that the default VG created during the installation, and holding the LV for the root file system, is named vg_<hostname> where <hostname> is the hostname of the original VM I installed. If I clone a VM the next thing I do is change the hostname, which means that I’m left with an inconsistent and somewhat confusing VG name. I think I screwed up one VM before realising that it wasn’t simply a case of renaming the VG and updating /etc/fstab. I asked a friend who does much more Linux admin than I do how to achieve what I wanted, and didn’t take it any further when he said words to the effect of, “Yeah, it’s more complicated than that.”
Update [5th August 2014]
It turns out that renaming the volume group that holds the logical volume the root file system is on is not as complex as I had previously thought. Comments from Brian suggest that there is no need to recreate the initramfs and that it can be done without booting into rescue mode. I’ve just tested Brian’s suggestions and he’s right. It is as simple as:
- Rename Volume Group
- Update /etc/fstab
- Update /boot/grub/grub.conf
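The three steps above can be sketched in shell. The VG names below are illustrative; the vgrename step needs root and a real LVM setup so it is shown commented out, and the file edits are demonstrated on a sample copy rather than the live files — point the same sed at the real /etc/fstab and /boot/grub/grub.conf on your system.

```shell
# Sketch only: vg_oldhost/vg_newhost are made-up names.
OLD=vg_oldhost
NEW=vg_newhost

# 1) Rename the volume group (requires root; shown for reference):
#    vgrename "$OLD" "$NEW"

# 2) and 3) Replace every reference to the old VG name. Demonstrated here
#    against a sample line; run the same sed on /etc/fstab and
#    /boot/grub/grub.conf for real.
echo "/dev/mapper/${OLD}-lv_root / ext4 defaults 1 1" > /tmp/fstab.sample
sed -i "s/${OLD}/${NEW}/g" /tmp/fstab.sample
cat /tmp/fstab.sample
```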
Brian – Thanks a lot for your comments and pointing out unnecessary steps.
This update makes the rest of the post mostly useless, but I’ll leave it all there for context.
End of update [5th August 2014]
Fairly recently I walked into the same situation again, only this time I decided that I wasn’t going to take “more complicated” for an answer🙂. I searched, found a few articles that seemed to have logic in their approach and figured I had nothing to lose. I also thought there were some redundant steps in the posts I was following, hence feeling it’s worth blogging my slimmed down approach.
Well, that’s enough preamble. Here are the details:
1) Boot Into Rescue Mode
For me booting from CD/iso was the easiest way to get into “rescue mode”. To do this select “Rescue installed system” when the welcome screen is presented during the boot process. You will then be prompted with:
Choose a Language – You know better than I what’s the best choice for you. Select it and then OK.
Keyboard Type – Again, pick what you think best matches your keyboard. Then select OK.
Rescue Method – Select “Local CD/DVD”, then OK.
Setup Networking – Select “No”
Rescue – Select “Continue”
Rescue (message about your system being mounted under /mnt/sysimage and use of chroot) – OK.
Rescue (another message about having mounted your system under /mnt/sysimage) – OK.
First Aid Kit quickstart menu – Select “shell Start shell”, then OK.
The above will get you to a prompt so you can actually do what you came here for!
2) Rename Volume Group
The LVM commands you issue are the same as usual, only they need to be prefixed with lvm. I suggest listing the VGs to be sure the state of the system is as you expect, and using more is a good idea as you don’t have a scrollbar, i.e.:
lvm vgdisplay | more
Once you’re happy, rename the VG as below:
lvm vgrename <original> <desired>
You should get a success message after this command.
3) Update All References
Change the root directory to that of your installed system (mounted under /mnt/sysimage, as the rescue messages noted) using chroot:
chroot /mnt/sysimage
The following files need to be modified to replace references to the old VG name with the new VG name:
- /etc/fstab
- /boot/grub/grub.conf
There will be multiple references per line in grub.conf, so a bit of “global replace” is in order.
4) Create New Ramdisk Image
Run the following command to make a new initial ramdisk image
mkinitrd --force /boot/initramfs-<kernel version>.img <kernel version>
Note that the force option is only required because there is already an existing image with the same name. You could use a different name if you want, but you’d need to add an appropriate entry to the grub.conf to reflect this.
If the above command completes without error messages and you didn’t make any errors in the editing of the files earlier then you should be all set… Only one way to find out!
5) Reboot Machine
Exit out of the chroot environment (type “exit”).
Exit out of the shell to return to the “First Aid Kit quickstart menu” (type “exit”).
First Aid Kit quickstart menu – Select “reboot Reboot”, then OK.
At the welcome screen select “Boot from local drive”.
If all goes well then remember to remove the CD/iso from your [virtual] CD drive.
The 2 articles that helped me with this are the following, so thanks to the authors:
Update [27th January 2014]
I have just noticed that the default volume group name when installing Oracle Linux 6.5 has changed from “vg_<hostname>” to “VolGroup”.
At least one person asked me why I did this, so I’ll start by explaining the motivation for setting up my own mirror of Yum repositories freely available on public-yum.oracle.com.
It comes down to 5 main reasons:
- Wanting my Oracle Linux 6 installations to be able to take advantage of the “latest” repositories
- Wanting the ability to update to a consistent version by using repositories I control
- Reducing the amount of data I download over the internet
- A desire to learn how to set up a Yum repository [mirror]
- Making my updates faster as they only have to retrieve packages from the LAN
When I first looked into setting up a Yum mirror I found a number of articles covering how to do so via rsync, and I then found this post where one of the comments suggests that allowing rsync access to public-yum.oracle.com would be nice. This made me realise that the rsync approach wasn’t going to work for the Oracle Linux repositories (it seems the suggestion was well received so this may change in the future). I also found an OTN article covering “How to Create a Local Yum Repository for Oracle Linux“. I eagerly started to read and quickly hit a snag in the prerequisites section:
Have valid customer support identifier (CSI)
I don’t have a CSI. My customers all have CSIs, because they run Oracle in production. I don’t have a CSI as I only run the OTN versions of Oracle software in my lab so that I can test out things I don’t have opportunity, access or time to test on client sites.
Anyway, with a little bit of reading around I found a way to create local mirrors of the Oracle Linux 6 Latest and Oracle UEK Latest repositories.
What follows was carried out on a VM, but there is no reason why any of this won’t work equally well on a physical host. If you encounter any problems replicating what I’ve done here then please comment and I’ll gladly try to help.
1) Allocating Storage
You’re going to need a reasonable amount of storage for this. My “repos” file system currently holds 24G of data and that is just for Oracle Linux 6 Latest and Oracle UEK Latest. I created a dedicated file system for my repositories on a LVM volume, but won’t cover that here. Allocate the storage as you see fit, but you’re going to want at least the 24G quoted.
2) Create Directory For Repos
As mentioned above, I have a dedicated file system for my repositories. It’s mounted under /repos and I’ll include that in all the code listings that follow. If you choose to use a different directory structure then clearly you’ll need to make the required changes.
# mkdir -p /repos/x86_64/
3) Install yum-utils and createrepo
You’re going to need a couple of commands for this: reposync, which is part of yum-utils, and createrepo, which comes from the createrepo package.
# yum -y install yum-utils createrepo
4) Setup Repositories
Follow the instructions on public-yum.oracle.com in order to set up the Oracle repositories.
By default reposync will create a local copy of all your enabled repositories, but it is also possible to specify the name of the repo[s] you want to sync on the command line using the “--repoid” (or “-r”) flag. I use this option as I want to have my local repositories enabled on all my Oracle Linux 6 hosts, including the repository machine, but only want reposync to run for the public-yum.oracle.com repositories I want to mirror locally. This means that I do not enable any of the repositories in the public-yum-ol6.repo file downloaded from Oracle, and instead create a new .repo file for my local repositories that I can distribute to all machines.
5) Run reposync
Running reposync is as simple as the command below:
# /usr/bin/reposync --repoid=ol6_UEK_latest --repoid=ol6_latest -p /repos/x86_64
6) Run createrepo
Once the repositories are downloaded to the local file system you need to run createrepo in order to create the repository metadata:
# createrepo /repos/x86_64/ol6_UEK_latest/getPackage/
# createrepo /repos/x86_64/ol6_latest/getPackage/
The “update” option for createrepo looked attractive in the man page, but whenever I used it the process was killed by the OOM Killer and I haven’t investigated in detail.
7) Allowing Web Access
In order to make use of the repositories they need to be exposed to the machines requiring access. HTTP is as good a way as any for my purposes, so I installed Apache (yum -y install httpd), ensured it would restart on reboot (chkconfig httpd on) and created symbolic links to my repositories:
# cd /var/www/html/repo/OracleLinux/OL6
# ln -s /repos/x86_64/ol6_UEK_latest/getPackage/ ./UEK/latest/x86_64
# ln -s /repos/x86_64/ol6_latest/getPackage/ ./latest/x86_64
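One small gotcha worth calling out: the parent directories for those symlinks have to exist before ln -s will succeed. A sketch of creating them follows; DOCROOT is a stand-in variable I’ve introduced so the commands are safe to dry-run — on the real web server it would simply be /var/www/html.

```shell
# Create the parent directories the symlinks live in.
# DOCROOT is a hypothetical stand-in for /var/www/html.
DOCROOT=${DOCROOT:-/tmp/docroot}
mkdir -p "$DOCROOT/repo/OracleLinux/OL6/UEK/latest"
mkdir -p "$DOCROOT/repo/OracleLinux/OL6/latest"
```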
8) Script for Updating Mirrors
Once I’d got it working I created a very simple shell script to allow me to update whenever appropriate:
#!/bin/bash
LOG_FILE=/repos/logs/repo_cron_$(date +%Y.%m.%d).log
/usr/bin/reposync --repoid=ol6_UEK_latest --repoid=ol6_latest -p /repos/x86_64 >> $LOG_FILE 2>&1
/usr/bin/createrepo /repos/x86_64/ol6_UEK_latest/getPackage/ >> $LOG_FILE 2>&1
/usr/bin/createrepo /repos/x86_64/ol6_latest/getPackage/ >> $LOG_FILE 2>&1
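The log file name (repo_cron_…) suggests running this from cron. If you want to schedule it, a crontab entry might look like the following — the script path is purely an assumption; use wherever you saved yours:

```
# Hypothetical crontab entry: sync the mirrors nightly at 03:30.
# /repos/bin/update_repos.sh is an assumed location for the script above.
30 3 * * * /repos/bin/update_repos.sh
```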
It’s then just a matter of pointing my Oracle Linux 6 installations at my local repository.
For reference my repo file is as follows (with hostnames removed)
[ol6_latest_local]
name=Oracle Linux $releasever Latest ($basearch)
baseurl=http://<hostname removed>/repo/OracleLinux/OL6/latest/$basearch/
gpgkey=http://<hostname removed>/RPM-GPG-KEY-oracle-ol6
gpgcheck=1
enabled=1

[ol6_UEK_latest_local]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://<hostname removed>/repo/OracleLinux/OL6/UEK/latest/$basearch/
gpgkey=http://<hostname removed>/RPM-GPG-KEY-oracle-ol6
gpgcheck=1
enabled=1
I recently found myself wanting to set up a DNS slave for the DNS server I run in my lab environment; and taking the view that it can’t be that hard I jumped into achieving that goal. It was pretty straightforward and this post is just a few references and hopefully enough information on the error messages I encountered (due to misconfiguration) to bring someone here that has made the same mistake. The existing DNS (master) server runs on Oracle Linux 6 and I wanted to setup a slave on Ubuntu 12.04. The site that I found most useful as a reference for someone that hadn’t done this before was www.server-world.info. Not a site I’m aware of visiting before, but it seems like a great reference from what I’ve looked at so far.
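For reference, the heart of the slave configuration on the Ubuntu side is a zone stanza in /etc/bind/named.conf.local along these lines. This is a hedged sketch rather than my exact config: the zone name and file path are illustrative, while 192.168.1.3 is the master’s address as it appears in the syslog output later in this post.

```
// Minimal slave zone sketch for /etc/bind/named.conf.local on Ubuntu 12.04.
// "example.lab" and the file path are illustrative assumptions;
// 192.168.1.3 is the Oracle Linux 6 master from this setup.
zone "example.lab" {
    type slave;
    file "/var/cache/bind/db.example.lab";
    masters { 192.168.1.3; };
};
```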
After setting things up I found I was getting the following messages in /var/log/syslog on the Ubuntu (slave) machine:
Feb 10 10:36:26 <hostname> named: running
Feb 10 10:36:26 <hostname> named: zone <zone file 1>/IN: Transfer started.
Feb 10 10:36:26 <hostname> named: transfer of '<zone file 1>/IN' from 192.168.1.3#53: failed to connect: host unreachable
Feb 10 10:36:26 <hostname> named: transfer of '<zone file 1>/IN' from 192.168.1.3#53: Transfer completed: 0 messages, 0 records, 0 bytes, 0.001 secs (0 bytes/sec)
Feb 10 10:36:27 <hostname> named: zone <zone file 2>/IN: refresh: skipping zone transfer as master 192.168.1.3#53 (source 0.0.0.0#0) is unreachable (cached)
Feb 10 10:36:27 <hostname> named: zone <zone file 3>/IN: refresh: skipping zone transfer as master 192.168.1.3#53 (source 0.0.0.0#0) is unreachable (cached)
Feb 10 10:36:27 <hostname> named: zone <zone file 4>/IN: refresh: skipping zone transfer as master 192.168.1.3#53 (source 0.0.0.0#0) is unreachable (cached)
Feb 10 10:36:27 <hostname> named: zone <zone file 5>/IN: refresh: skipping zone transfer as master 192.168.1.3#53 (source 0.0.0.0#0) is unreachable (cached)
Feb 10 10:36:27 <hostname> named: zone <zone file 6>/IN: refresh: skipping zone transfer as master 192.168.1.3#53 (source 0.0.0.0#0) is unreachable (cached)
While investigating I found myself reading the following articles:
- http://ubuntuforums.org/showthread.php?t=1805880 – Very similar errors apparently caused by missing reverse DNS for the slave name server
- http://ubuntuforums.org/showthread.php?t=767138 – Different errors, but one I hit along the way as a result of attempting to mix and match Red Hat configuration with Ubuntu configuration
I’ve included them here in case they are applicable to anyone else’s issues.
The last thing I read on the subject was http://firstname.lastname@example.org/msg03151.html. The letters TCP jumped out at me. I run iptables on the Oracle Linux 6 host (DNS master) and it was fresh in my mind that I had port 53 open for UDP traffic for DNS lookup. I knew DNS lookups worked against that host as I’d been testing from various locations minutes before. It had to be worth a quick try to see if it was something so simple. It was! I’d been able to do DNS lookup on the master DNS from the slave as port 53 was open for UDP traffic, but as I’d just learnt: zone transfers are carried out using TCP as covered on Wikipedia.
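For anyone hitting the same wall, the fix on the master was simply to allow TCP as well as UDP on port 53. On Oracle Linux 6 with iptables that means a rule pair like the following in /etc/sysconfig/iptables — a sketch only; the exact chain names and rule position depend on your existing rule set:

```
# Excerpt for /etc/sysconfig/iptables on the master (illustrative).
# The UDP rule was already present; the TCP rule is what was missing,
# and it is what zone transfers need.
-A INPUT -p udp --dport 53 -j ACCEPT
-A INPUT -p tcp --dport 53 -j ACCEPT
```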
When I was doing some testing of service failover I ran into something that I think is interesting behaviour. If I issue an “abort” command I expect an abort, not a bit of tidying up before aborting, which is what I found the following command doing:
srvctl shutdown instance -d <database name> -i <instance name> -o abort
Alert log from “shutdown abort” of instance via srvctl
2012-07-18 10:34:53.067000 +01:00
ALTER SYSTEM SET service_names='DB_TST_SVC2','DB_TST_SVC3','DB_TST_SVC5','DB_TST_SVC4' SCOPE=MEMORY SID='DB_TST1';
ALTER SYSTEM SET service_names='DB_TST_SVC2','DB_TST_SVC5','DB_TST_SVC4' SCOPE=MEMORY SID='DB_TST1';
ALTER SYSTEM SET service_names='DB_TST_SVC5','DB_TST_SVC4' SCOPE=MEMORY SID='DB_TST1';
ALTER SYSTEM SET service_names='DB_TST_SVC5' SCOPE=MEMORY SID='DB_TST1';
ALTER SYSTEM SET service_names='DB_TST' SCOPE=MEMORY SID='DB_TST1';
2012-07-18 10:34:54.145000 +01:00
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 3008): terminating the instance
2012-07-18 10:34:55.158000 +01:00
Instance terminated by USER, pid = 3008
Instance shutdown complete
Alert log from “shutdown abort” of instance via SQL*Plus
2012-07-18 10:41:02.663000 +01:00
Shutting down instance (abort)
License high water mark = 8
USER (ospid: 19176): terminating the instance
Instance terminated by USER, pid = 19176
2012-07-18 10:41:03.812000 +01:00
Instance shutdown complete
The tests were done using Oracle 188.8.131.52
This probably isn’t going to change anyone’s life, but no harm in knowing it🙂