Category Archives: Troubleshooting

expdp not consistent

Came across this small oddity that documentation of 10.2 and 11.2 states that expdp by default takes consistent image of the database. But actually it is not so. You need to use flashback_scn/flashback_time for that. Metalink doc 377218.1 explains the scenario.

dbc_min_pct and dbc_max_pct in HP-UX

It was a 10g (10.2.0.5 on HP-UX 11.23 RISC) database which was recently upgraded from 9.2.0.8. The CPU and memory utilization was going really high. After tuning few of the queries coming in top, CPU usage was coming within accetable limits but the memory usage was still high. There was a total of 16 GB of RAM on the server and the usage was above 90%, constantly. One of the reasons behind high usage was increase in the SGA size. It was increased from 2.5 GB (in 9i) to around 5 GB (in 10g). Another major chunk was being eaten by OS buffer cache. While looking at the memory usage with kmeminfo:

Buffer cache        =  1048448    4.0g  25%  details with -bufcache

In HP-UX, The memory allocated to (dynamic) buffer cache is controlled by two parameters dbc_min_pct and dbc_max_pct. It can vary between dbc_min_pct and dbc_max_pct percent of the total RAM. They default to 5 and 50 respectively. For a system that is running an Oracle database value of 50 for dbc_max_pct is way too high. That means half of the memory is going to be allocated to OS buffer cache. As Oracle has got its own buffer cache so the OS cache is not of much use. As mentioned in the metalink note 726652.1, the value of dbc_max_pct can be safely lowered without impacting the Oracle database performance. In many of the threads (on HP website) people have suggested the value of 10 for db_max_pct. Not sure if it is more like a thumb rule but in the same metalink note (726652.1) it is mentioned that if %rcache in sar -b is above 90, that means your OS buffer cache is adequately sized.

After setting the value of dbc_max_pct to 15 (It will be changed to 10, finally), around 1.6 GB more memory was freed. Also there was no impact on the database or OS performance. Here are few of the metalink notes and threads on HP-UX website that talk about these parameters in detail:

Oracle Shadow Processes Are Taking Too Much Memory (Doc ID 434535.1)

How OS Buffer Cache Size Affects Db Performance (Doc ID 726652.1)

Commonly Misconfigured HP-UX Kernel Parameters (Doc ID 68105.1)

http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1306231311459+28353475&threadId=1266914

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=727618

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=467288

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=750342

 

ORA-12547: TNS:lost contact

Very simple issue but took some amount of time in troubleshooting so thought about posting it here. May be it proves to be useful for someone.

Scenario was: Oracle is installed from “oracle” user and all runs well. There is a new OS user “test1” that also needs to use sqlplus. So granted the necessary permissions on ORACLE_HOME to test1. Tried to connect sqlplus scott/tiger@DB and yes it works. But while trying sqlplus scott/tiger it throws:

$ sqlplus scott/tiger

SQL*Plus: Release 10.2.0.5.0 - Production on Wed May 18 09:32:35 2011

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

ERROR:
ORA-12547: TNS:lost contact


Enter user-name: ^C
$

Did a lot of troubleshooting including checking tnsnames.ora, sqlnet.ora, listener.ora and so on. Nothing was hitting my mind so finally raised an SR. And it has to do with the permissions of the $ORACLE_HOME/bin/oracle binary. The permissions of oracle executable should be rwsr-s–x or 6751 but they were not. See below:

$ id
uid=241(test1) gid=202(users) groups=1(staff),13(dba)
$

$ cd $ORACLE_HOME/bin
$ ls -ltr oracle
-rwxr-xr-x    1 oracle   dba       136803483 Mar 16 20:32 oracle
$

$ chmod 6751 oracle
$ ls -ltr oracle
-rwsr-s--x    1 oracle   dba       136803483 Mar 16 20:32 oracle
$

$ sqlplus scott/tiger

SQL*Plus: Release 10.2.0.5.0 - Production on Wed May 18 10:23:27 2011

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show user
USER is "SCOTT"
SQL>

waiting for resmgr:become active – can’t login

Some time back, I was at a client where the customer complained that no one was able to log in to the database. It was Oracle 10.2.0.4 running on HP-Ux. I logged in to the database and checked the wait events:

SQL> @wait

EVENT                                                              COUNT(*)
---------------------------------------------------------------- ----------
wait for possible quiesce finish                                          1
Streams AQ: qmn coordinator idle wait                                     1
Streams AQ: qmn slave idle wait                                           1
Streams AQ: waiting for time management or cleanup tasks                  1
SQL*Net message to client                                                 1
smon timer                                                                1
pmon timer                                                                1
jobq slave wait                                                           4
rdbms ipc message                                                        11
SQL*Net message from client                                              27
resmgr:become active                                                    322

11 rows selected.

SQL>

Tanel’s snapper showed something like:

SQL> @snapper ash 5 1 all
Sampling with interval 5 seconds, 1 times...

-- Session Snapper v3.11 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
 26322% | 4ffu7nb93c2c9   | resmgr:become active      | Scheduler
  1900% | 2wn958z7gzh57   | resmgr:become active      | Scheduler
  1400% | 9d9bg2r538nd2   | resmgr:become active      | Scheduler
   600% | 4d3k70q6y344k   | resmgr:become active      | Scheduler
   500% | d6vwqbw6r2ffk   | resmgr:become active      | Scheduler
   500% | 4tsrz92mmshbw   | resmgr:become active      | Scheduler
   200% | 37td1bbvc1a69   | resmgr:become active      | Scheduler
   100% | ftdjfxws0s8q9   | resmgr:become active      | Scheduler
   100% | 41apc1bjqrfbv   | resmgr:become active      | Scheduler
   100% | af9d8aqtkvn02   | resmgr:become active      | Scheduler

--  End of ASH snap 1, end=2011-02-10 11:06:40, seconds=5, samples_taken=23

PL/SQL procedure successfully completed.

SQL>

If we check the description of the wait event, it says:

The session is waiting for a resource manager active session slot. This event occurs when the resource manager is enabled and the number of active sessions in the session’s current consumer group exceeds the current resource plan’s active session limit for the consumer group. To reduce the occurrence of this wait event, increase the active session limit for the session’s current consumer group.

But if we check the resource_limit settings:

SQL> show parameter resource

NAME_COL_PLUS_SHOW_PARAM   TYPE		VALUE_COL_PLUS_SHOW_PARAM
----------------------- -----------     --------------------------
resource_limit            boolean		FALSE
resource_manager_plan     string

SQL>

What ? Resource manager is not enabled. But why all the sessions are waiting for resmgr:become active and nobody is able to login ?

A bit of googling lead me to this page from where I got the clue.

Generally, this wait situation occurs when you execute certain EMCA operations such as the operation for creating the EM repository. As a result of these operations, the systems implicity switches to QUIESCE mode. Therefore, all database connections (except users SYS and SYSTEM) must wait for “resmgr:become active”. In this case, refer to Note 1044758 and execute the following command if necessary:

ALTER SYSTEM UNQUIESCE;

I asked around in the DBA team and one of the guys was trying to configure EM for the database due to which system switched tto QUIESCE mode and all the sessions were waiting on resmgr:become active.

After canceling the operation, the wait event was gone and everything was working normally.

Issues in upgrading from 9i to 10g

Last week I had a chance to upgrade a 9.2.0.7 database to 10.2.0.5. The size of the database was around 800 GB. The major applications connecting to the database were developed in Pro*C and Oracle Forms. The upgrade itself pretty smooth but there were few glitches around that needed to be handled. Just thought about documenting all the issues:

  • Few users in the database were assigned the CREATE SESSION privilege through a password protected role (That role was the default role for that user). 10.2.0.5 onwards, password protected roles can’t be set as default roles. The alternate is to either disable the password for the role or assign CREATE SESSION directly to the user, not through a role.
  • After the upgrade, few procedures became invalid and while compiling started giving ORA-00918: COLUMN AMBIGUOUSLY DEFINED. The issue was bug 2846640 which is fixed in 10.2. Actually, in few of the queries using ANSI syntax, the developer didn’t qualify the column names with table names. It worked fine in 9i but due to the bug getting fixed in 10g, it started giving ORA-00918. The simple solution is to prefix the column name with the table name.
  • Few of the application schema owner users complained that they were not able to modify the procedures/packages in their own schemas. The schemas were not assigned CREATE PROCEDURE privilege but as per documentation, they should be able to modify the existing procedures/packages owned by them. This again is a documentation bug. It worked fine in 9i but in 10g onwards you need to have either a CREATE PROCEDURE or ALTER ANY PROCEDURE privilege (a risky one) to be able to edit the PL/SQL units in your own schema.

These were few of the issues encountered, rest of the upgrade was super smooth !

Happy upgrading !