Professional Oracle® Programming

(1)

(2)

Professional Oracle® Programming

Rick Greenwald, Robert Stackowiak, Gary Dodge, David Klein, Ben Shapiro,

Christopher G. Chelliah

(3)

(4)

Professional Oracle® Programming

(5)

(6)

Professional Oracle® Programming

Rick Greenwald, Robert Stackowiak, Gary Dodge, David Klein, Ben Shapiro,

Christopher G. Chelliah

(7)

Published by Wiley Publishing, Inc., Indianapolis, Indiana Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8700. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at http://www.

wiley.com/go/permissions_.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY:THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT.

NEITHER THE PUBLISHER NOT THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HERE- FROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAP- PEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Trademarks:Wiley, the Wiley Publishing logo, Wrox, the Wrox logo, and Programmer to Programmer are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data:

Professional Oracle programming / Rick Greenwald ... [et al.].

p. cm.

Includes indexes.

ISBN-13: 978-0-7645-7482-5 (paper/website) ISBN-10: 0-7645-7482-5 (paper/website)

1. Oracle (Computer file) 2. Relational databases. I. Greenwald, Rick.

QA76.9.D3P76646 2005 005.75'85--dc22

2005010511 ISBN 13: 978-076-457482-5

ISBN 10: 0-7645-7482-5

Printed in the United States of America

(8)

About the Authors

Rick Greenwaldhas been in the technology industry for over 20 years and is the author of 12 previous books, most of them on Oracle. He has been involved with development and databases for his entire career, including stops at Data General, Cognos, Gupta Technologies, and his current employer, Oracle.

Computers and computing are a sideline for Rick — his real job is father to his three wonderful girls, with a primary hobby of music appreciation.

Robert Stackowiakis Senior Director of Business Intelligence (BI) in Oracle’s Technology Business Unit.

He is recognized worldwide for his expertise in business intelligence and data warehousing in leading the North American BI team at Oracle. His background includes over 20 years in IT related roles at Oracle, IBM, Harris Computer Systems, and the U.S. Army Corps of Engineers including management of technical teams, software development, sales and sales consulting, systems engineering, and business development.

Gary Dodge has been focused on database technology since his first COBOL programming job with IMS DB/DC in 1976. He joined Oracle Corporation in 1987 and has served in various management and technical positions within both the sales and consulting divisions. He has been a frequent speaker on database topics at many local and national information technology conferences. In addition to several magazine articles, he is co-author (with Tim Gorman) of Oracle8 Data Warehousing and Essential Oracle8i Data Warehousing, both published by John Wiley & Sons.

David Kleinhas been in the technology industry for over 20 years with a variety of companies, including Data General, Wang Laboratories, Gupta Technologies, Oracle, and a few consulting services companies.

He has had many roles, including management of application development and database design teams, sales and sales consulting, systems engineering and marketing. Recently, he has focused on developing classroom and online training courses. An active wife and two boys and a 200-year-old house take up any free time.

Ben Shapirois the president of ObjectArts Inc., a New York City-based technology consulting company, and has been designing database systems with Oracle since 1997. ObjectArts has worked with many large corporate clients developing XML-based publishing tools and web-based applications. Before ObjectArts, Ben worked with several NYC-based startup companies as a technical lead building content management software.

Christopher G. Chelliahjoined Oracle as a Consultant in 1995. He brings with him project management, architecture, and development experience from a number of large, multinational sites in the mining, oil and gas, and telecom industries. Chris has been involved with emerging and database technologies for his entire career and is an accomplished software architect with a flair for business development. His expertise has been actively sought by a number of major Oracle clients in Australia, Europe, and the United States. Chris, his wife and two kids are currently in Singapore, where he leads a team defining and executing on innovative E-Government strategies for Oracle’s Public Sector industry in Asia Pacific.

(9)

(10)

Credits

Vice President and Executive Group Publisher Richard Swadley

Vice President and Publisher Joseph B. Wikert

Executive Editor Robert Elliott Editorial Manager Mary Beth Wakefield Senior Production Editor Geraldine Fahey

Development Editor Sharon Nash Production Editor Felicia Robinson Technical Editors

Michael Ault, Tim Gorman, Wiley-Dreamtech India Pvt Ltd Text Design & Composition Wiley Composition Services

(11)

(12)

Acknowledgments

Although there are a number of names listed on the title page of this book, there are many other people involved in the creation of the work you are currently reading.

First and foremost, all of the authors would like to thank Bob Elliott and Sharon Nash for their steady hands on the throttle of this project. Although the birthing of this baby had more than the normal share of labor pains, the worth and beauty of this offspring owes an enormous amount to them.

Rick Greenwald, in particular, would like to thank two people. First of all, the process of writing three (is it three already) titles and several revisions of books with Bob Stackowiak has been a wonderful experience, both professionally and personally. I am delighted to count Bob as a friend, and, thanks to him, I still don’t know what BI stands for.

Secondly, throughout my writing career, Steven Feurstein has been a mentor as well as a sterling example of what an author and person should be like. I hope to someday reach his level in terms of writing and, even more importantly, humanity.

And, of course, I would like to thank my family — LuAnn for helping me have a life that supports writing and working, and Elinor Vera, Josephine, and Robin Greenwald for giving me all the inspiration anyone would ever need.

In addition, Rick would like to acknowledge all those people who came through with suggestions and knowledge at crucial times, including Richard Foote, Raj Mattamal and Tyler Muth.

(13)

(14)

Introduction

Professional Oracle Programming is intended for application developers who use Oracle as their database.

As such, this book does not devote much space to topics which are primarily of interest to database administrators, such as backup and recovery. Most importantly, this book is designed to help a programmer understand the database issues that can have a direct impact on the operation and performance of their applications.

What Does This Book Cover?

Although it is impossible to cover all the topics that a professional developer could use to the level of depth required in a single book, this volume attempts to provide a foundation for these developers, as well as guidance and examples of some of the key areas of interest and necessity. These areas include accessing data using SQL, handling issues such as multi-user concurrency and data integrity and security, the basics of programming with Java, PL/SQL and XML, and data warehousing. For a more complete list of topics, please refer to the section called “How Is This Book Structured?” later in this introduction.

Who Is This Book For?

The entire topic of Oracle databases is certainly not a new one. Oracle has been one of, if not the, leading databases in the world for almost two decades now. The standard Oracle documentation runs over 13,000 pages, and leaves out a lot of the good stuff. Determining the proper audience for this book, and writing at a level appropriate for that audience, was the biggest conceptual challenge for all of the authors.

Through discussions early in this project, the authors and publishers of this book came up with a fairly distinct profile of our presumed reader. This book is intended for people whose primary job is to create application systems. They are developers, rather than database administrators.

Following through on this profile, we believe that you, the professional developer, are well informed and skilled in the creation of these application systems. In our experience, even professional developers may think of the database they use as nothing more than a place to store data.

This description of you, the target reader, has guided the selection of topics as well as the actual content of this book.

(31)

What You Need to Use This Book

Although you do not necessarily need to have a working instance of an Oracle database to benefit from this book, the book includes many examples that can be run against your Oracle database.

All of the examples have been created against Oracle Database 10g, although, with a few noted exceptions, you could run the examples against Oracle9i. Since Oracle is a cross-platform database, the actual operating system that your version of Oracle runs on is unimportant.

All of the examples can be run in either the standard Oracle utilities, such as iSQL*Plus, that come with the product, or a standard development environment, including tools such as TOAD. Directions on how to run the examples are included in Chapter 7 of this book on installing an Oracle database, or in the chapter in which the examples are given.

Most of the screen shots in this book have used a Windows client platform, but that platform is simply a standard used for consistency, not a requirement.

How Is This Book Str uctured?

In a broad sense, this book is divided up into three main areas — information you need to know about how the Oracle database works, information that can help you to use the capabilities of the Oracle database in creating your application systems, and information to help you achieve optimal performance with your application systems. Although these three areas are covered in the order listed in the book, you will find relevant information for each of these areas throughout all of the chapters, along with tips, techniques, and code samples.

Part I: Oracle Essentials

Although the Oracle database is a standards-compliant SQL database, the architecture of the Oracle database and the way it operates include some unique features that are crucial to your effective use of Oracle. The first part of Oracle for Professional Developers covers topics that every developer must understand in order to use the Oracle database effectively. Please don’t think of these chapters as simply introductory material, since some of the information in these chapters can have an enormous effect on the eventual operation of your applications.

The topics covered in this section include:

❑ A brief introduction to the architecture and components of the Oracle database

❑ A detailed look at how Oracle processes your SQL requests and the impact this can have on your application code

❑ A description of how Oracle handles data access by multiple users, and how this impacts your application design

❑ A primer on effective database design

❑ A discussion on how Oracle administers security and how you can use this for your applications

(32)

xxix

Introduction

❑ An overview of the Oracle data dictionary, which contains information about the objects and operations of a specific Oracle database

❑ A brief tutorial on installing an Oracle database

Part II: Data Topics

The second section of this book focuses on how to access and manipulate the data in the Oracle database. The topics in this section are relevant to all developers, regardless of the language they are using to implement their applications.

The topics covered in this section include:

❑ An introdution to both basic SQL and some of the more sophisticated uses of this powerful data access language

❑ Indexes, which help to speed the performance of data retrieval

❑ Constraints, which are used to both validate and limit the data added to your Oracle database

❑ A discussion of a variety of other database objects, such as sequences

❑ The use of functions, both built-ins that come with the Oracle database and your own custom defined functions

❑ A discussion of data manipulation as a distributed database environment

Part III: Database Programming Languages

The Oracle database allows you to include logic that runs in the database itself.Traditionally, the language used for this work is PL/SQL — a set of procedural extensions based on the SQL language. Oracle8i introduced a Java engine into the database, which allowed you to use this language for implementing procedural logic in the database.

Part III of this book includes three chapters on PL/SQL and one on Java. The PL/SQL chapters are divided into one on the basics of PL/SQL, one on using PL/SQL in conjunction with SQL, and one of PL/SQL packages, which are a way of organizing and exposing the logic you create with PL/SQL.

Part IV: Programming Techniques

Part IV of this book dives into the wide and varied area of implementing logic with various programming techniques to use in the database and your applications.

The topics covered include:

❑ The use of triggers, which contain logic that is automatically executed whenever certain types of data access and manipulation are performed

❑ Regular expressions, which are a powerful syntax for implementing matching conditions, and the expression filter — both of these features are new with Oracle Database 10g

(33)

❑ The use of your Oracle database with object types, both for storage and for interactions with objects in your applications

❑ The use of Oracle with XML

❑ HTML-DB, a development environment that comes as part of Oracle Database 10g

Part V: Business Intelligence Techniques

Business intelligence describes the process of using raw data in a database as the raw material for analysis that can affect the direction of business operations. Business intelligence in the Oracle database encompasses the use of several specific features and techniques, which are covered in this section.

The topics covered include:

❑ Moving data between different databases, which is often used to take data from an OLTP system to a system used for business intelligence

❑ Data loading and management, which covers the manipulations that are sometimes required to prepare data for use in business intelligence operations

❑ And specific query features used for business intelligence, such as analytic functions

Part VI: Optimization

Creating an application is the first part of any programming project. But no application will be acceptable to your constituents if it does not run efficiently.

This final section of the book introduces you to the concepts behind optimization in the Oracle database.

The chapter includes a discussion of how Oracle determines the optimal access paths to your data, how to see what Oracle has determined, and how to shape the behavior of your Oracle database in this area.

Although database optimization is a subject worthy of many excellent volumes, this chapter is intended to give you all the information you need to start to make your applications perform well.

The Bigger Picture

As mentioned at the start of this introduction, no one book is complete enough to cover everything you will need to know to use Oracle as an effective database for your applications. We have tried to give you a map that can be used over the course of many years of working with an Oracle database. If you believe that there are destinations we have left off of this map, or descriptions that could be more useful or informative, we welcome your comments.

Conventions

To help you get the most from the text and keep track of what’s happening, we’ve used a number of conventions throughout the book.

(34)

xxxi

Introduction

Tips, hints, tricks, and asides to the current discussion are offset and placed in italics like this.

As for styles in the text:

❑ We highlight important words when we introduce them

❑ We show keyboard strokes like this: Ctrl+A

❑ We show file names, URLs, and code within the text like so: persistence.properties

❑ We present code in two different ways:

In code examples we highlight new and important code with a gray background.

The gray highlighting is not used for code that’s less important in the present context, or has been shown before.

Source Code

As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at http://www.wrox.com. Once at the site, simply locate the book’s title (either by using the Search box or by using one of the title lists) and click the Download Code link on the book’s detail page to obtain all the source code for the book.

Because many books have similar titles, you may find it easiest to search by ISBN; for this book the ISBN is 0-7645-7482-5.

Once you download the code, just decompress it with your favorite compression tool. Alternately, you can go to the main Wrox code download page at http://www.wrox.com/dynamic/books/

download.aspxto see the code available for this book and all other Wrox books.

Errata

We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata you may save another reader hours of frustration and at the same time you will be helping us provide even higher quality information.

Boxes like this one hold important, not-to-be forgotten information that is directly relevant to the surrounding text.

(35)

To find the errata page for this book, go to http://www.wrox.comand locate the title using the Search box or one of the title lists. Then, on the book details page, click the Book Errata link. On this page you can view all errata that has been submitted for this book and posted by Wrox editors. A complete book list including links to each book’s errata is also available at www.wrox.com/misc-pages/booklist.shtml. If you don’t spot “your” error on the Book Errata page, go to www.wrox.com/contact/techsupport.shtml and complete the form there to send us the error you have found. We’ll check the information and, if appropriate, post a message to the book’s errata page and fix the problem in subsequent editions of the book.

p2p.wrox.com

For author and peer discussion, join the P2P forums at p2p.wrox.com. The forums are a Web-based system for you to post messages relating to Wrox books and related technologies and interact with other readers and technology users. The forums offer a subscription feature to e-mail you topics of interest of your choosing when new posts are made to the forums. Wrox authors, editors, other industry experts, and your fellow readers are present on these forums.

At http://p2p.wrox.comyou will find a number of different forums that will help you not only as you read this book, but also as you develop your own applications. To join the forums, just follow these steps:

1.

Go to p2p.wrox.comand click the Register link.

2.

Read the terms of use and click Agree.

3.

Complete the required information to join as well as any optional information you wish to provide and click Submit.

4.

You will receive an e-mail with information describing how to verify your account and complete the joining process.

You can read messages in the forums without joining P2P but in order to post your own messages, you must join.

Once you join, you can post new messages and respond to messages other users post. You can read messages at any time on the Web. If you would like to have new messages from a particular forum e-mailed to you, click the Subscribe to this Forum icon by the forum name in the forum listing.

For more information about how to use the Wrox P2P, be sure to read the P2P FAQs for answers to questions about how the forum software works as well as many common questions specific to P2P and Wrox books. To read the FAQs, click the FAQ link on any P2P page.

(36)

Oracle Architecture and Storage

The core theoretical purpose of a relational database is to separate logical interaction with data within the database from the physical storage of that data. As a user or developer, you shouldn’t really have to know how an Oracle database works internally, or stores its data, in order to use it properly. At a high level, this is certainly true. You don’t need to know anything about the Oracle database to write SQL, the access language used to interact with the database, or to create programs that use SQL.

The Oracle database automatically takes care of translating your logical requests to access the data within its control. But, like all applications, the way your Oracle database is actually implemented can have an impact on what you do and do not do in the creation and maintenance of your application systems. Having a basic understanding of how Oracle does its jobs provides a foundation for the rest of this book. And understanding the ways that Oracle stores your data, which is also covered in this chapter, can have a direct impact on the way you write your applications that use that data.

This first chapter introduces you to a lot of the behind-the-scenes underpinnings of the Oracle database — the architecture of the database and how the data is stored within it. You very well may not ever have to use this information in your day-to-day tasks, but the substance of this chapter is necessary for the more detailed information about programming.

This chapter covers the architecture of the Oracle database as well as the storage options for your data. Depending on how the responsibilities are divided in your particular environment, there may be database administrators (DBAs) who do most of the work of translating your logical requirements into physical storage, or you may be called upon to perform some of these tasks yourself. In either case, this chapter gives you some familiarity with the way your Oracle database handles your data.

(37)

Architecture

The Oracle database is a relational database. The theory of relational databases was first put forward by Dr. E. F. Codd, a research scientist working for IBM, in the mid-1970s. Codd proposed a number of rules for a database to follow to be considered a relational database. For the first decade or so after these rules were proposed, early relational databases fought to show that they were the most compliant with Codd’s rules. Those early days are far behind us now, and the basic features of a relational database, such as guar- anteeing transactional integrity and allowing ad hoc access to data, are well established in all major relational databases.

The Oracle database consists of two main parts: the instance, which is the software service that acts as an intermediary between application requests and its data, and the actual data files, which are where the data is kept. The instance is a dynamic process and uses a variety of tasks and memory to support its operations. The data files are stored on disk, so the data itself will survive most service interruptions, except for catastrophic media failure.

The Instance

The Oracle instance is the collection of processes that handle requests from a client for a data. Some of these processes are shown in Figure 1-1. Figure 1-1 also includes some of the main areas of memory that an instance uses.

Figure 1-1: Oracle instance architecture.

An Oracle instance is either started as part of the process of booting a server or can be started explicitly with commands. Although you start an instance with a single command, there are actually three distinct steps in the startup process:

❑ Starting the instance process itself

Processes

Memory

Database buffers

SMON PMON

DBWR

LGWR ARC

Shared pool

SGA

PGA

User User

User

Redo Log buffers

(38)

❑ Mounting the database, which consists of opening the control files for the instance

❑ Opening the database, which makes the database available for user requests

An instance can be stopped with another command or via the system console, which follows the same sequence of events in reverse. You can stop an instance gracefully by stopping users from logging on to the database and only shutting down when the last active user has logged off, or you can simply stop the instance, which may result in incomplete transactions (described in detail in Chapter 3, “Handling Multiple Users”).

Processes Supporting an Instance

A number of processes are associated with an instance and perform specific tasks for the instance.

Listener

The Listener is a process that listens on the network for requests coming in to an Oracle instance. A Listener process can support one or more Oracle instances. The Listener acts as the intermediary between user requests from remote machines and the instance. The failure of the Listener will mean that the instance it supports is not accessible by a remote client — such as any application that accesses the instance.

Background Processes

A whole set of processes run in the background on an Oracle server, supporting the actions of the Oracle instance. The main processes are listed here, with the standard acronyms of the processes in parentheses:

❑ Database Writer (DBWR). Writes data blocks from the database buffers, as described in the fol- lowing section.

❑ Log Writer (LGWR). Writes redo log information from the redo log buffer in memory to the redo log on disk.

❑ System Monitor (SMON). Monitors the health of all components of an Oracle instance, and helps to recovery the database when it is restarted after a crash.

❑ Process Monitor (PMON).Watches individual user processes that access the Oracle instance and cleans up any resources left behind by an abnormal termination of a user process.

❑ Archiver (ARC). Writes a copy of a filled redo log file to an archive location. An Oracle instance can have up to 10 Archiver processes.

Files Supporting an Instance

There are a number of files that are used by an instance. The basic files for storing data are discussed in the next section on data organization. An Oracle instance has its own set of files that do not store user data but are used to monitor and manage the Oracle instance itself.

Initialization Files

Many parameters exist that affect the way that your Oracle instance operates. These parameters are typically set and maintained by a database administrator and, as such, are beyond the scope of this book.

The initial values for these parameters are kept in initialization files.

3 Oracle Architecture and Storage

(39)

Prior to Oracle9i, each instance had its own specific initialization file called INIT.ORA, as well as an optional file with additional configuration information called CONFIG.ORA. Starting with Oracle9i, you could have a virtual initialization file called the SPFILE, which could be shared with multiple instances.

SPFILE was especially handy for two reasons: You could change a configuration parameter for a running instance and not have to save the change to the SPFILE, which means that the value of the parameter would be reset to the stored value on the next startup. You can also use the SPFILE with Real Application Clusters, discussed later in this chapter, where multiple instances worked together in a cluster and shared the same initialization procedures.

Control File

The control file is used to store key information about an instance, such as the name of the instance, the time the database was created, and the state of backup and log files for the database. An Oracle instance requires a control file in order to operate. Although a control can be rebuilt, most Oracle installations use multiple copies of the control file to avoid this possibility.

Redo Log Files

One of key features of a relational database is its ability to recover to a logically consistent state, even in the event of a failure. Every relational database, including Oracle, uses a set of redo log files. These files keep track of every interaction with the database. In the event of a database failure, an administrator can recover the database by restoring the last backup and then applying the redo log files to replay user interactions with the database.

Redo log files eventually fill up and roll over to start a new volume. You can set up Oracle to avoid writing over existing logfiles by creating the database to automatically archive log files in ARCHIVELOG mode, which is discussed in detail in the Oracle documentation.

Since redo logs are crucial for restoring a database in the event of a failure, many Oracle shops set up an instance to keep multiple copies of a redo log file.

Rollback Segments

Unlike all other major databases, Oracle also uses rollback segments to store previous versions of data in the database. The use of rollback segments makes it possible for the Oracle database to avoid the use of read locks, which can significantly reduce performance degradation based on multiuser access as well as providing a consistent view of data at any particular point in time. For more on rollback segments and how they support multiuser concurrency, see Chapter 3, “Handling Multiple Users,” which covers this important topic in detail.

Because rollback segments track every change to data, the rollback segments are updated as soon as a change is made. This real-time update lets the Oracle database delay writing its redo log files until a time when these writes can be done efficiently.

Oracle 10g includes the ability to designate an automatic UNDO tablespace, which assigns the responsibility for managing rollback segments to the Oracle database itself.

Memory Used by an Instance

Figure 1-1, shown previously, includes several areas of memory that are used by an individual Oracle instance.

(40)

System Global Area

The System Global Area (SGA) is an area of memory that is accessible to all user processes of an Oracle instance. Three main areas are used by the SGA:

❑ The redo log buffer holds information used for recovery until the information can be written to the redo log.

❑ The shared pool holds information that can be shared across user processes, such as execution plans for SQL statements, compiled stored procedures, and information retrieved from the Oracle data dictionary.

❑ The database buffer pools are a group of memory areas that are used to hold data blocks. A data block held in a buffer pool can be accessed much more rapidly than going to disk to retrieve the block, so efficient use of these memory pools is essential for achieving optimal performance.

Following are the three basic database buffer pools:

❑ The DEFAULT pool holds all database objects, unless otherwise specified.

❑ The KEEP pool is used to hold objects in memory if specified for a table or index.

❑ The RECYCLE pool is used for objects that are not likely to be reused again.

Keep the following two points in mind:

❑ For all of these pools, the Oracle instance uses a Least Recently Used (LRU) algorithm to determine what data blocks to swap out.

❑ In addition to these main pools, the Oracle instance can also use a large pool, which is used for shared server, backup and recovery, and I/O operations. You can either configure a large pool or Oracle can create it automatically, if you configure adaptive parallelism, which is described further in Chapter 2.

Program Global Area

The Program Global Area (PGA) is an area of memory that is just available to a single server process.

The PGA contains items like user variables and cursor information for an individual user’s SQL statement, such as the number of rows that have been retrieved so far. The PGA is also used for sorting data for an individual user process.

Each time a new SQL statement is received for a user, a new space in the PGA has to be initialized. To avoid this overhead, transaction-intensive applications, such an online transaction processing (OLTP) applications, typically use a small set of SQL statements that are continually reused by a user process, reducing this potential source of overhead.

You can have a single server process handle more than one user process by configuring shared servers, which are described in Chapter 2, “Using SQL.”

The database administrator typically does the memory allocations for these areas, but setting these values inappropriately can have a dramatic affect on the perfor- mance of your applications. Oracle 10g makes it easier to manage these portions of memory, since it eliminates the need to specify an allocation for the individual com- ponents of each of these pools. With Oracle 10g, you can simply set a maximum size for the SGA and the PGA, and Oracle will take care of the allocation for the individ- ual components of these areas.

5 Oracle Architecture and Storage

(41)

Real Application Clusters

As you know, a single server can have one or more CPUs within it. You can also use a technique called clustering to combine more than one server into a larger computing entity. The Oracle database has sup- ported the ability to cluster more than one instance into an single logical database for over 15 years.

Oracle combines multiple instances into a cluster, shown in Figure 1-2, which uses underlying hardware and software to create a single database image from multiple separate servers.

Since Oracle9i, Oracle has called these clusters Real Application Clusters, commonly referred to as RAC.

As Figure 1-2 shows, RAC instances share a common source of data, which allows RAC to provide greater database horsepower for scalability while also providing higher levels of availability.

Figure 1-2: A Real Application Cluster deployment.

One of the best features of RAC is that you, as an application developer, don’t even have to think about it, since a RAC implementation uses the same API and SQL that a single instance uses. You also don’t have to worry about whether your Oracle application will be deployed to a single instance or a cluster when you are designing and creating your application, since there are virtually no specific performance considerations that only affect a RAC installation. As a developer, you don’t really have to know anything more about Real Application Clusters.

Data Organization

The previous section focused on the organization of the Oracle instance. As an application developer, you might find this type of material interesting (and it lays the foundation for a deeper understanding of the Oracle database that will come in handy later), but what you really care about is your data.

Data in a relational database is organized on two different levels — a logical level and a physical level. On a logical level, Oracle organizes your data into tables, rows, and columns, like other relational databases.

A table is a collection of information about a specific entity, such as an employee or a department. In non- relational systems, the equivalent of a table is a file. A row is an individual occurrence of a set of data, such as information about an individual employee or department. A column is one piece of data within a row, such as an employee’s name or the identifier for a department.

One of the key qualities of a relational database is to separate the logical organization of data, such as tables and rows, from the physical storage of data. Although you will be accessing data through the logical structures of tables, rows, and columns, this section introduces you to additional storage structures, both physical and logical, used by Oracle to organize your data. Since the care and feeding of storage is normally the responsibility

(42)

of the database administrator, this section does not provide detailed descriptions on how to implement and optimize these storage structures. If you will be acting as the DBA for any of your databases, we would suggest that you also get an Oracle book that is oriented towards DBAs, such as Oracle Administration And Management by Michael R. Ault (Wiley, 0-471-21886-3).

The hierarchy of logical and physical storage units is shown in Figure 1-3.

Figure 1-3: Data organization.

Tablespaces

The tablespace is another logical organizational unit used by the Oracle database, which acts as the intermediary between logical tables and physical storage. Every table or index, when created, is placed into a tablespace. A tablespace can contain more than one table or index, or the table or index can be partitioned across multiple tablespaces, as described later in the section on partitions. A tablespace maps to one or more files, which are the physical entities used to store the data in the tablespace. A file can only belong to one tablespace. A tablespace can contain row data or index data, and a table and an index for a table can be in different tablespaces.

A tablespace has a number of attributes, such as the block size used by the tablespace. This block size is the smallest amount of data retrieved by a single I/O (input/output) operation by the Oracle database. Your database administrator (hopefully) selected an appropriate block size for a tablespace, which depends on a wide range of factors, including the characteristics of the data and the block size of the underlying operating system.

Prior to Oracle9i, you could only have a single block size for an entire database. With Oracle9i, multiple block sizes are supported within a single database, although each block size must have its own data buffers in the SGA.

Row

Table

Logical storage units Physical storage units

Tablespace

Column

Data file Segment

Extent

Data block Data

block

7 Oracle Architecture and Storage

(43)

A tablespace is the basic administrative unit in an Oracle database. You can take a tablespace online or offline (that is, make it available or unavailable through the Oracle database instance) or back up and recover a tablespace. You can make a tablespace read-only to prevent write activity to the tablespace.

You can also use a feature called transportable tablespaces as a fast way to move data from one database to another. The normal way to take data out of one Oracle database and move it to another Oracle database is to export the data and then import the data. These processes operate by reading or writing each row. A transportable tablespace allows you to move the entire tablespace by simply exporting descriptive information about the tablespace from the data dictionary. Once you have exported this information, you can move the files that contain the tablespace and import the descriptive information into the data dictionary of the destination database (this is described in detail in Chapter 6, “The Oracle Data Dictionary.” This method of moving data is significantly faster than importing and exporting data.

Data movement is discussed in more detail in Chapter 24, “High-Speed Data Movement.”

The 10g release of the Oracle database includes the ability to use transportable tablespaces between Oracle databases on different operating system platforms.

Segments and Extents

Individual objects in the database are stored in segments, which are collections of extents. Data blocks are stored in an extent. An extent is a contiguous piece of disk storage. When Oracle writes data to its data files, it writes each row to a data block that has space available for the data. When you update information, Oracle tries to write the new data to the same data block. If there is not enough space in that block, the updated data will be written to a different block, which may be in a different extent.

If there are no extents that have enough space for new information, the Oracle database will allocate a new extent. You can configure the number and size of extents initially and subsequently allocated for a tablespace.

Partitions

As mentioned previously, you can use partitions to spread data across more than one tablespace. A partition is a way to physically segregate the data in a table or index based on a value in the table or index. There are four ways you can partition data in an Oracle database:

❑ Range partitioning, which divides data based on a range of values, such as putting customers with a last name starting with the letters A–G in one partition, the customers with last names starting with the letters H–M in another partition, and so on.

❑ Hash partitioning, which divides data based on the result of a hashing algorithm. Hash partitioning is used when range partitioning would result in unbalanced partitions. For instance, if you are partitioning a table based on a serial transaction identifier, the first partition will com- pletely fill before the second partition is used. A hash algorithm can spread data around more evenly, which will provide greater benefits from partitioning.

❑ Composite partitioning, which uses a hash algorithm within each specific range partion.

❑ List partitioning, which partitions based on a list of values. This type of partitioning is useful when a logical group, such as a list of states in a region, cannot be easily specified with a range description.

Partitions are defined when you create a table or an index. Since each partition of a table can be placed in a different tablespace, you can perform maintenance on a single partition, such as backup and recovery or moving tablespaces.

(44)

Partitions can also contribute to the overall performance of your Oracle database. The Oracle query optimizer is aware of the use of partitions, so when it creates a plan to execute a query, the plan will skip partitions that have been eliminated from the query with selection criteria. In addition, partitions can be in different

tablespaces, which can be in different parts of your disk storage, which can reduce disk head contention during retrievals. Query optimization and parallel execution are covered in more detail in Chapter 28, “Optimization.”

Data Types

One aspect of data storage that does significantly affect application development is data types. Each column in an Oracle database (or variable in a PL/SQL program, as described in Chapter 16, “PL/SQL and SQL,” is designated as a particular data type. The data type dictates some of the characteristics of the data. At the most basic level, the data type specifies how data will be stored and, more importantly, interpreted by the Oracle database. If you define a column as a number, Oracle interprets all data entered into that column as a number. Oracle stores that data as a number and returns the values as numbers.

Obviously, this type of gross distinction is crucial in defining data. But there are other differences between data types that can affect how you use them in your data design and applications. The remainder of this section covers the data types supported in the Oracle database.

You specify the data type for a column when you define the column, such as:

CREATE TABLE (ID NUMBER, EMP_NAME VARCHAR2(256));

Some data types can have a length indicated after the data type name, while others either do not take a length or have a default value for length, which makes the length specification optional. You can query the data dictionary, which is covered in Chapter 6, to retrieve the data type for any particular column.

Character Data Types

The Oracle database supports a number of different data types for storing character data. All of these data types interpret data entered and retrieved as character values, but each of the data types has additional characteristics as well.

CHAR

CHARcolumns can have a length specification after the data type or are assigned the default length of 1.

A^CHARcolumn will always use the assigned space in the database. If a column is defined as ^CHAR(25), every instance of that column will occupy 25 bytes in the database. If the value entered for that column does not contain 25 bytes, it will be padded with spaces for storage and retrieval.

VARCHAR2 and VARCHAR

VARCHAR2columns also can have a length specification, but this data type only stores the number of characters entered by the user. If you define a column as VARCHAR2(25)and the user only enters 3 characters, the database will only store those three characters. Because of this distinction, there may be some issues with comparison between a ^CHARcolumn and a ^VARCHAR2column.

The following SQL code snippet adds the same value, ‘ÂBC’, to two columns: ÂBCCHAR, which is described as a ^CHARcolumn with 20 characters, and ÂBCVARCHAR, which is described as a ^VARCHAR2column with 20 characters.

Professional Oracle® Programming

Professional Oracle® Programming

Rick Greenwald, Robert Stackowiak, Gary Dodge, David Klein, Ben Shapiro,

Christopher G. Chelliah

Professional Oracle® Programming

Professional Oracle® Programming

Rick Greenwald, Robert Stackowiak, Gary Dodge, David Klein, Ben Shapiro,

Christopher G. Chelliah

About the Authors

Credits

Acknowledgments

Contents

About the Authors v

Acknowledgments ix Introduction xxvii Introduction xxvii What Does This Book Cover? xxvii

Who Is This Book For? xxvii

What You Need to Use This Book xxviii How Is This Book Structured? xxviii

Part I: Oracle Essentials xxviii

Part II: Data Topics xxix

Part III: Database Programming Languages xxix

Part IV: Programming Techniques xxix

Part V: Business Intelligence Techniques xxx

Part VI: Optimization xxx

The Bigger Picture xxx

Conventions xxx

Source Code xxxi

Errata xxxi

p2p.wrox.com xxxii

Chapter 1: Oracle Architecture and Storage 1

Architecture 2

The Instance 2

Data Organization 6

Data Types 9

Character Data Types 9

BFILE 10

Numeric Data Types 11

Date Data Types 12

RAW and LONG RAW 12

Other Data Types 13

Summary 13

Chapter 2: Using SQL 15 The Processing Cycle for SQL Statements 15

Connecting to a Database 16

Establishing a Cursor 18

Submitting SQL Statements 18

Receiving Data 21

Performance Considerations 21

Retrieval Performance 22

Using Bind Variables 23

Parallel Operations 25

Summary 26

Chapter 3: Handling Multiple Users 27

Goals 28

Data Integrity 28

Isolation 29

Serialization 29

Transactions 29 Concurrent User Integrity Problems 30

Locks 33

Contention 34

The Oracle Solution 35

Multiversion Read Consistency 35

Integrity Implications 37

Performance Implications 37

Isolation Levels 38

Implementation Issues 39

Write Contention 39

Avoiding Changed Data 40

Summary 41

Chapter 4: Database Design Basics 43

Database Design Phases 45

Conceptual Database Design 45

Logical Database Design 45

Physical Design 46

Practical Design 47

Database Design Tools 48

Database Design Techniques 49

Database Design Case Study Example 53

xiii

Contents

Normalization 54

First Normal Form 56

Second Normal Form 59