• Aucun résultat trouvé

In the Internet's beginning, mapping hostnames to IP addresses was a simple function and was easy to maintain. The /etc/hosts file was used for this purpose, as it still is today. As new hosts came onto the Internet landscape, you simply added the hostname and its associated IP address. In those early days, adding ten or twenty new host addresses was a routine task. Yet that routine task can be time consuming for a system administrator.

As time passed and the Internet began to grow, the task of maintaining the /etc/hosts file became unwieldy. Another issue is the sheer size of the /etc/hosts file. Imagine a host file with hundreds and hundreds of entries. A better method of maintaining and performing host-to-IP-address mapping needed to be developed.

Sun Microsystems developed a solution called the Network Information System (NIS), also known as the Yellow Pages. The master host maintains the /etc/hosts file, allowing any client (host) to obtain the /etc/hosts file as required. NIS is really effective only with small- to medium-sized local-area networks.

For the Internet, the Network Information Center (NIC) implemented a similar scheme. A single database of hostnames and addresses was hosted by the NIC. All Internet sites had to download this master hosts file from the NIC. Imagine the load that was realized by the NIC servers as remote machines logged in to download, via FTP, the newest hosts file.

To resolve these issues, a new system was required. That system is known as the Berkeley Internet Domain System (BIND) and is synonymous with DNS. Actually, BIND is an implementation of DNS that includes the resolver and name server. The DNS system is described by a number of Request For Comments (RFCs). A good source for RFC lookup and retrieval can be found at http://www.cis.ohio-state.edu/hypertext/information/rfc.html.

With Linux (and UNIX) applications, the resolver is accessed using two library functions,

gethostbyaddress and gethostbyname. The first library function, gethostbyaddress, returns a hostname if given an IP address. The second function, gethostbyname, performs the inverse: given a hostname, it returns an IP address. The resolver communicates with the name server (called named) to perform the actual mapping. If you are a software developer, this is important to know because you must have an IP address before establishing a connection using TCP. Technically, the resolver is not

Each DNS is responsible for host information within its area of responsibility, or zone. In addition, each DNS maintains information about other DNS entities and their zones. At the root is the NIC, which maintains the top-level domains, called organizational domains in this book. Figure 10.1 provides a graphical representation of the organizational domains.

Figure 10.1. Top-level DNS.

Take note of the top-level domains that are connected to the root node. The root node, by the way, is actually unnamed; I've named it only as a point of reference. You have undoubtedly seen HTTP and FTP sites with some of these domains. Most people think that these top-level domains are exclusive to the United States; many people do not realize that a U.S. ("us") domain exists. Take a look at the example provided in the previous diagram; us.tx.houston is an actual site (at the time of this writing).

Try it out - enter the following URL into your favorite browser: http://www.houston.tx.us.

This brings up a point about the organizational domains. To set the record straight, .gov and .mil are the only two organizational domains that are bound to the United States. The other organizational domains, such as .com and .org are not specific to the United States; many non-U.S. entities use them.

Popular among non-U.S. entities is appending second-level (under top-level) domains. One well-known one is .co.uk, which relates to commercial entities within the United Kingdom (UK).

Domain names are not case sensitive; you can supply a domain name in uppercase, lowercase, or mixed case letters. A domain name consists of an individual domain name from a different level; each domain level is delimited by a period. Look again at the previous figure. Notice that the levels are maintained as a hierarchy - but in reverse when compared to the Linux file system. With the file system, the root is the start of a pathname, whereas the root is the end node in a fully qualified domain name.

A fully qualified domain name (FQDN) is a complete domain name that ends with a period. If the terminating period is absent from the domain name, the domain name is not fully qualified and requires completion.

Table 10.1 describes the organizational domains as shown in the previous figure. The special case (four letter) domain, arpa, is required for mapping an address to a name. The two-letter domains are designated as country codes. These country codes are derived from the International Standards Organization (ISO) 3166 standard.

Table 10.1. Organizational Domains and Descriptions

Organizational Domain Description

com Commercial organization

edu Educational organization

gov U.S. government organization

int International organization

mil U.S. military

net Network organization

org Other organization

How is a name resolved and an IP addressed revealed, based on a domain name? Most people do not know, or even care, how a Web page appears when they type http://www.whitehouse.gov; they just want the page to display. But for us, it is helpful to know just how a name is discovered and the IP address returned. For each top-level domain, a master name server exists. When a domain name inquiry is made, the request goes to a root server to return the address of a master name server for the top-level domain. A DNS server performs this inquiry.

After we have that address, the DNS server contacts a master name server and requests the name server responsible for the domain name we are requesting.

Next, DNS uses the name server's address, contacts that server, and requests the server's address that maintains the requested domain name. Finally, this last name server resolves the name and returns an IP address.

This process may seem time consuming and bandwidth intensive, but it is actually quite efficient - and also quite clever. One feature of DNS is result caching. If a mapping is obtained, it is cached for subsequent queries, thereby reducing "manual" lookups by DNS.

Let's look at an example to see how this works. Figure 10.2 graphically depicts the dialog. This diagram is a Unified Modeling Language (UML) sequence diagram. The entities are listed horizontally at the top of the diagram. Each entity has a vertical line associated with it; this vertical line is known as the entity's lifeline. A horizontal line that "connects" two lifelines can be an event, a process, or some other functionality. These events are shown as a sequence of events over time. The first event in the sequence is always at the top, with subsequent events progressing to the bottom of the diagram.

Suppose you need the address for http://www.mydomain.com. The DNS server interrogates the root server for the IP address of the server that handles the .com domain. That IP address is returned to the DNS server. Next, the DNS server asks the master name server for the IP address for the domain server responsible for http://mydomain.com; the IP address is then returned. Finally, the http://mydomain.com name server is interrogated for the IP address for http://www.mydomain.com server.

Figure 10.2. DNS and name server dialog.

As mentioned previously, the NIC is responsible for the top-level domains, whereas other entities are responsible for zones within the lower-level domains. The word "zone" has appeared in this chapter without a clear definition. A zone is a branch (or subtree) within the DNS tree. A majority of the zones branch from the second-level domain from the DNS tree. In Figure 10.1, a subtree zone might be found at the nasa.gov domain. Each zone is maintained by the organization responsible for the zone.

Furthermore, each zone can be divided into one or more smaller zones. How these zones are logically organized is determined by the responsible organization. A zone can represent a business unit, such as a department, or by graphical region, such as an organizational branch.

Using our fictitious domain, http://www.mydomain.com, we can divide this domain into zones based on departments. For example, the zone named http://sales.mydomain.com can represent the sales department and http://research.mydomain.com can represent the research and development department. Each host machine within each zone will have its name prepended to the domain name.

For example, if the research and development department has a host named stimpy, the domain name is http://stimpy.research.mydomain.com. A domain includes all hosts within the (entire) organization, whereas a zone includes only the hosts within the zone's area of responsibility.

Organizationally, responsibility can be delegated from the root administrator to a zone administrator.

Therefore, the administrator for http://research.mydomain.com is responsible for the host http://stimpy.research.mydomain.com and any other hosts within the administrator's zone of responsibility.

After a zone is established, name servers must be provided to support that zone. These name servers maintain a domain name space. The domain name space contains information concerning hosts and information about those hosts. The information contained in a domain name space persists in a database and is manipulated by a name server. Redundancy is provided for by the use of multiple name servers. A primary name server and at least one secondary name server is required (there can be many secondary name servers). The primary name server and secondary name server(s) must be independent of each other in the event of an outage. They should be on their own (dedicated) machine. This alleviates any DNS downtime if one of the machines fail to run. Another reason you must have more than one name server is because InterNIC requires it before approving your domain name.

The primary name server reads domain name space data from configuration files found on disk. If the configuration files change, such as when a new server is added to the system, a signal (normally SIGHUP) is sent to the primary name server to reread the configuration files. The secondary name server(s) retrieves domain name space data from the primary name server, rather than from disk. The secondary name server(s) polls, on a regular basis, the primary name server for new domain name space data. If the primary name server has new data, the data is transferred to the secondary name server(s).

An important feature of DNS is information caching. If a name server obtains a mapping, that mapping is cached for subsequent queries. Caching can be turned off for a name server, but it is highly recommended that caching be enabled.