I was working on configuring a new database for backup to ZDLRA and hit this issue while testing a controlfile backup via Enterprise Manager -> Schedule backup. It could happen in any environment.
Unable to connect to the database with SQLPlus, either because the database is down or due to an environment issue such as incorrectly specified...
If the database is up, check the database target monitoring properties and verify that the Oracle Home value is correct.
The 2nd line clearly tells the problem but since the Cluster Database status in EM was green, so it took me a while to figure it out. Issue turned out to be a missing / in the end of ORACLE_HOME specified in monitoring configuration of the cluster database. The DB home specified was /u01/app/oracle/product/22.214.171.124/dbhome_1 instead of /u01/app/oracle/product/126.96.36.199/dbhome_1/.
On the server the bash_profile has the home set as /u01/app/oracle/product/188.8.131.52/dbhome_1. When I tried to connect as sysdba there, it gave an error TNS : lost contact. Then I set the environment with .oraenv and I was able to connect. /etc/oratab had correct home specified as /u01/app/oracle/product/184.108.40.206/dbhome_1/. After comparing the value of ORACLE_HOME in these two cases, the issue was identified. Then I updated the ORACLE_HOME value in the target monitoring configuration in Enterprise Manager and it worked as expected.
Yesterday I was configuring EM 12c for a Sun Super Cluster system. There were a total of 4 LDOMs where I needed to deploy the agent (Setup –> Add targets –> Add targets manually). Out of these 4 everything went fine for 2 LDOMs but for the other two it failed with an error message. It didn’t give much details on the EM screen but rather gave a message to try to secure/start the agent manually. When I tried to do that manually the secure agent part worked fine but the start agent command failed with the following error message:
oracle@app1:~$emctl start agent Oracle Enterprise Manager Cloud Control 12c Release 2 Copyright (c) 1996, 2012 Oracle Corporation. All rights reserved. Starting agent ………………………………………………………. failed. HTTP Listener failed at Startup Possible port conflict on port(3872): Retrying the operation… Failed to start the agent after 1 attempts. Please check that the port(3872) is available.
I thought that there was something wrong with the port thing so I cleaned the agent installation, made sure that the port wasn’t being used and did the agent deployment again. This time it again failed with the same message but it reported a different port number ie 1830 agent port no:
oracle@app1:~$emctl start agent Oracle Enterprise Manager Cloud Control 12c Release 2 Copyright (c) 1996, 2012 Oracle Corporation. All rights reserved. Starting agent ……………………………………………. failed. HTTP Listener failed at Startup Possible port conflict on port(1830): Retrying the operation… Failed to start the agent after 1 attempts. Please check that the port(1830) is available.
Again checked few things but found nothing wrong. All the LDOMs had similar configuration so what worked for the other two should have worked for these two also.
Before starting with the installation I had noted the LDOM hostnames and IPs in a notepad file and had swapped the IPs of two LDOMs (actually these two only ). But later on I found that and corrected. While looking at the notepad file it occurred to me that the same stuff could be wrong in /etc/hosts of the server where EM is deployed. Oh boy that is what it was. While making the entries in /etc/hosts of EM server, I copied it from the notepad and the wrong entries got copied. The IPs for these two LDOMs got swapped with each other and that was causing the whole problem.
deinstalled the agent, correct the /etc/hosts and tried to deploy again…all worked well !