Trace Files Generated in Response to Internal Errors

From the document SECOND EDITION (pages 140-145)

I’d like to close this section with a discussion about those other kinds of trace files—the ones we did not expect that were generated as a result of an ORA-00600 or some other internal error. Is there anything we can do with them?

The short answer is that, in general, they are not for you and me. They are useful to Oracle Support.

However, they can be helpful when we file a service request with Oracle Support. That point is crucial: if you are getting internal errors, the only way they will ever be corrected is if you file a service request. If you just ignore them, they will not get fixed by themselves, except by accident.

For example, in Oracle 10g Release 1, if you create the following table and run the query, you may well get an internal error (or not; it was filed as a bug and is corrected in later patch releases):

ops$tkyte@ORA10G> create table t ( x int primary key );

Table created.

ops$tkyte@ORA10G> insert into t values ( 1 );

1 row created.

ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select count(x) over ()
  2    from t;
  from t
       *
ERROR at line 2:
ORA-00600: internal error code, arguments: [12410], [], [], [], [], [], [], []

Now, suppose you are the DBA and all of a sudden this trace file pops up in the trace area. Or you are the developer and your application raises an ORA-00600 error and you want to find out what happened. There is a lot of information in that trace file (some 35,000 lines, in fact), but in general it’s not useful to you and me. We would generally just compress the trace file and upload it as part of our service request processing.
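The compression step mentioned above could be scripted along the following lines. This is only a minimal sketch: the function name compress_trace is hypothetical, and it assumes you want a gzip copy written next to the original trace file, with the original left in place.

```python
import gzip
import shutil
from pathlib import Path


def compress_trace(trace_path):
    """Write a gzip copy of the trace file next to it and return the new path.

    Hypothetical helper for illustration; the original file is left untouched.
    """
    source = Path(trace_path)
    target = source.with_name(source.name + ".gz")
    # Stream the copy so a large trace file is never held in memory at once.
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Calling compress_trace on a file named ora10gr1_ora_2578.trc would produce ora10gr1_ora_2578.trc.gz in the same directory, ready to attach to the service request.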

Starting in Oracle Database 11g, the process of gathering the trace information and uploading it to support has been modified (and made significantly easier). A new command-line tool, in conjunction with a user interface via Enterprise Manager, allows you to review the trace information in the Automatic Diagnostic Repository (ADR), and package and transmit it to Oracle Support.

The ADRCI tool allows you to review “problems” (critical errors in the database) and incidents (occurrences of those critical errors) and to package them up for transmission to support. The packaging step includes retrieving not only the trace information, but also details from the database alert log and other configuration/test case information. For example, I set up a situation in my database that raised a critical error (no, I won’t say what it is; you have to generate your own critical errors). I knew I had a “problem” in my database because the ADRCI tool told me so:

CHAPTER 3 ■ FILES

$ adrci

ADRCI: Release 11.2.0.1.0 - Production on Wed Jan 20 14:15:16 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

ADR base = "/home/ora11gr2/app/ora11gr2"

adrci> show problem

ADR Home = /home/ora11gr2/app/ora11gr2/diag/rdbms/orcl/ora11gr2:

*************************************************************************

PROBLEM_ID  PROBLEM_KEY  LAST_INCIDENT  LASTINC_TIME
----------  -----------  -------------  ---------------------------------
1           ORA 4031     7228           2009-12-15 03:32:51.964000 -05:00
1 rows fetched

On December 15, 2009 I caused an ORA-4031, a serious problem, in the database. I can now see what was affected by that error by issuing the show incident command:

adrci> show incident

ADR Home = /home/ora11gr2/app/ora11gr2/diag/rdbms/orcl/ora11gr2:

*************************************************************************

INCIDENT_ID  PROBLEM_KEY  CREATE_TIME
-----------  -----------  ---------------------------------
6201         ORA 4031     2009-12-15 03:22:54.854000 -05:00
6105         ORA 4031     2009-12-15 03:23:05.169000 -05:00
6177         ORA 4031     2009-12-15 03:23:07.543000 -05:00
6202         ORA 4031     2009-12-15 03:23:12.963000 -05:00
6203         ORA 4031     2009-12-15 03:23:21.175000 -05:00
5 rows fetched

I can see there were five incidents, and I can identify the information related to each incident via the show tracefile command:

adrci> show tracefile -I 6177

diag/rdbms/orcl/ora11gr2/incident/incdir_6177/ora11gr2_ora_26528_i6177.trc
adrci>

This shows me the location of the trace file for incident number 6177. Further, I can see a lot of detail about the incident if I so choose:

adrci> show incident -mode detail -p "incident_id=6177"

ADR Home = /home/ora11gr2/app/ora11gr2/diag/rdbms/orcl/ora11gr2:

*************************************************************************

**********************************************************

INCIDENT INFO RECORD 1

**********************************************************

INCIDENT_ID  6177
STATUS       ready

And, finally, I can create a “package” of the incident that is useful for support. The package will contain everything a support analyst needs to begin working on the problem. Here’s an example command to create such a package:

adrci> ips create package incident 6177

Created package 1 based on incident id 6177, correlation level typical

This section is not intended to be a full overview or introduction to the ADRCI command, which is documented fully in the Oracle Database Utilities 11g Release 2 (11.2) manual. Rather, I just wanted to introduce the existence of the tool—a tool that makes using trace files easy.

Prior to ADRCI in 11g, was there anything you could do with the unexpected trace files beyond sending them to support? Yes, there is some information in a trace file that can help you track down the who, what, and where of an error. The trace file can also help you find out if the problem is something others have experienced.

The previous example shows that ADRCI is an easy way to interrogate the trace files in Oracle Database 11g (I showed just a small fraction of the commands available). In 10g and before, you can do the same thing, albeit a bit more manually. For example, a quick inspection of the very top of a trace file provides some useful information. Here’s an example:

/home/ora10gr1/admin/ora10gr1/udump/ora10gr1_ora_2578.trc
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /home/ora10gr1
System name: Linux
Node name: dellpe
Release: 2.6.9-11.ELsmp
Version: #1 SMP Fri May 20 18:26:27 EDT 2005
Machine: i686
Instance name: ora10gr1
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 2578, image: oracle@dellpe (TNS V1-V3)

The database information is important to have when you go to http://metalink.oracle.com to file the service request or to search to see if what you are experiencing is a known problem. In addition, you can see the Oracle instance on which the error occurred. It is quite common to have many instances running concurrently, so isolating the problem to a single instance is useful.
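If you look at many trace files, pulling these identifying fields out of the header can be automated. The following is a rough sketch only: the function name summarize_header and the dictionary keys are hypothetical, and the patterns assume the header labels look exactly like the ones in the sample above, which may not hold for every release or platform.

```python
import re

# Header text copied from the 10g trace file shown above.
SAMPLE_HEADER = """\
/home/ora10gr1/admin/ora10gr1/udump/ora10gr1_ora_2578.trc
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /home/ora10gr1
System name: Linux
Node name: dellpe
Release: 2.6.9-11.ELsmp
Version: #1 SMP Fri May 20 18:26:27 EDT 2005
Machine: i686
Instance name: ora10gr1
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 2578, image: oracle@dellpe (TNS V1-V3)
"""

# Key -> pattern. The keys are names invented for this sketch.
# "Release " followed by a space matches the database banner, not the
# operating system line, which reads "Release:" with a colon.
PATTERNS = {
    "db_release": r"Release (\d+(?:\.\d+)+)",
    "node": r"Node name:\s*(\S+)",
    "instance": r"Instance name:\s*(\S+)",
    "pid": r"Unix process pid:\s*(\d+)",
}


def summarize_header(text):
    """Return the identifying header fields as strings (None when absent)."""
    summary = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        summary[key] = match.group(1) if match else None
    return summary


print(summarize_header(SAMPLE_HEADER))
```

For the sample header this reports release 10.1.0.4.0, node dellpe, instance ora10gr1, and process ID 2578, which are the details you would quote when filing or searching for a service request.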

Here’s another section of the trace file to be aware of:

*** 2010-01-20 14:32:40.007

*** ACTION NAME:() 2010-01-20 14:32:39.988

*** MODULE NAME:(SQL*Plus) 2010-01-20 14:32:39.988

*** SERVICE NAME:(SYS$USERS) 2010-01-20 14:32:39.988

This part of the trace file is new with Oracle 10g and above and won’t be there in Oracle9i and before. It shows the session information available in the columns ACTION and MODULE from V$SESSION.

Here we can see that it was a SQL*Plus session that caused the error to be raised (you and your developers can and should set the ACTION and MODULE information; some environments such as Oracle Forms and APEX already do this for you).

Additionally, we have the SERVICE NAME. This is the actual service name used to connect to the database—SYS$USERS, in this case—indicating we didn’t connect via a TNS service. If we logged in using user/pass@ora10g.localdomain, we might see:

*** SERVICE NAME:(ora10g) 2010-01-20 14:32:39.988

where ora10g is the service name (not the TNS connect string; rather, it’s the ultimate service registered in a TNS listener to which it connected). This is also useful in tracking down which process or module is affected by this error.

Lastly, before we get to the actual error, we can see the session ID and related date/time information (all releases) as further identifying information:

*** SESSION ID:(19.27995) 2010-01-20 14:32:39.988
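The marker lines shown above all follow the same shape: three asterisks, a label, a value in parentheses, and a timestamp. A small sketch like the following could collect them into a dictionary. The function name session_markers is hypothetical, and the pattern is an assumption based only on the sample lines in this section.

```python
import re

# Marker lines copied from the trace file excerpts above.
SAMPLE_LINES = [
    "*** 2010-01-20 14:32:40.007",
    "*** ACTION NAME:() 2010-01-20 14:32:39.988",
    "*** MODULE NAME:(SQL*Plus) 2010-01-20 14:32:39.988",
    "*** SERVICE NAME:(SYS$USERS) 2010-01-20 14:32:39.988",
    "*** SESSION ID:(19.27995) 2010-01-20 14:32:39.988",
]

# "*** LABEL:(value) YYYY-MM-DD HH:MM:SS.fff" -- layout assumed from the samples.
MARKER = re.compile(
    r"\*\*\* ([A-Z ]+):\((.*?)\) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"
)


def session_markers(lines):
    """Map each marker label (for example 'MODULE NAME') to its value."""
    found = {}
    for line in lines:
        match = MARKER.fullmatch(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


print(session_markers(SAMPLE_LINES))
```

The bare timestamp line carries no label and is skipped; an empty ACTION NAME comes back as an empty string, which is itself a hint that the application never set it.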

Now we are ready to get into the error itself:

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [12410], [], [], [], [], [], [], []
Current SQL statement for this session:
select count(x) over () from t
--- Call Stack Trace ---
_ksedmp+524
_ksfdmp.160+14
_kgeriv+139
_kgesiv+78
_ksesic0+59
_qerixAllocate+4155
_qknRwsAllocateTree+281
_qknRwsAllocateTree+252
_qknRwsAllocateTree+252
_qknRwsAllocateTree+252
_qknDoRwsAllocate+9
...

Here we see a couple of important pieces of information. First, we find the SQL statement that was executing when the internal error was raised, which is very useful for tracking down what application(s) was affected. Also, since we see the SQL here, we can start investigating possible workarounds—trying different ways to code the SQL to see if we can quickly work around the issue while working on the bug.

Furthermore, we can cut and paste the offending SQL into SQL*Plus and see if we have a nicely reproducible test case for Oracle Support (these are the best kinds of test cases, of course).

The other important pieces of information are the error code (typically 600, 3113, or 7445) and other arguments associated with the error code. Using these, along with some of the stack trace information that shows the set of Oracle internal subroutines that were called in order, we might be able to find an existing bug (and workarounds, patches, and so on). For example, we might use the search string

ora-00600 12410 ksesic0 qerixAllocate qknRwsAllocateTree

Using MetaLink’s advanced search (using all of the words, search the bug database), we immediately find the bug 3800614, “ORA-600 [12410] ON SIMPLE QUERY WITH ANALYTIC FUNCTION”.

If we go to http://metalink.oracle.com and search using that text, we will discover this bug, see that it is fixed in the next release, and note that patches are available—all of this information is available to us. I often find that the error I receive is one that has happened before and there are fixes or workarounds for it.
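Building a search string like the one used above can be sketched in a few lines. Treat this as an illustration under stated assumptions rather than a supported technique: the function name build_search_string is hypothetical, the set of skipped frames is my own guess (they are simply the leading frames that the example search string leaves out), and the pattern handles only the ORA-00600 message layout shown in this section.

```python
import re

# Error section copied from the trace file excerpt above.
TRACE_EXCERPT = """\
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [12410], [], [], [], [], [], [], []
Current SQL statement for this session:
select count(x) over () from t
--- Call Stack Trace ---
_ksedmp+524
_ksfdmp.160+14
_kgeriv+139
_kgesiv+78
_ksesic0+59
_qerixAllocate+4155
_qknRwsAllocateTree+281
_qknRwsAllocateTree+252
_qknRwsAllocateTree+252
_qknRwsAllocateTree+252
_qknDoRwsAllocate+9
"""

# Assumption: these leading frames are omitted only because the example
# search string in the text omits them.
SKIPPED_FRAMES = {"ksedmp", "ksfdmp", "kgeriv", "kgesiv"}


def build_search_string(trace_text, max_frames=3):
    """Combine the error code, its first argument, and a few stack frames."""
    error = re.search(
        r"ORA-(\d+): internal error code, arguments: \[([^\]]*)\]", trace_text
    )
    if error is None:
        raise ValueError("no ORA-00600 style error line found")
    terms = ["ora-" + error.group(1)]
    if error.group(2):
        terms.append(error.group(2))
    frames = []
    # A frame looks like _name+offset, optionally _name.NNN+offset.
    for name in re.findall(r"_([A-Za-z]\w*)(?:\.\d+)?\+\d+", trace_text):
        if name not in SKIPPED_FRAMES and name not in frames:
            frames.append(name)
    return " ".join(terms + frames[:max_frames])


print(build_search_string(TRACE_EXCERPT))
# prints: ora-00600 12410 ksesic0 qerixAllocate qknRwsAllocateTree
```

Repeated frames such as _qknRwsAllocateTree are collapsed to a single term, and max_frames keeps the string short enough to be a useful query.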
