Understanding How the Internet Is Structured


To operate, the Internet relies on maintaining a unique set of names and numbers. The names are domain names and hostnames, which enable the computers connected to the Internet to be identified in a hierarchy. The numbers are Internet Protocol (IP) addresses, which enable computers to be grouped together into interconnected sets of subnetworks yet remain uniquely addressable by the Internet, and port numbers, which identify the services offered on each computer.

An Internet service provider (ISP) will give you the information you need to set up a connection to the Internet. You plug that information into the programs used to create that connection, such as scripts to create a PPP connection over telephone lines. See the "Using Dial-up Connections to the Internet" section later in this chapter for descriptions of the information needed from your ISP and the procedures for configuring PPP to connect to the Internet. With a broadband connection (and the default DHCP configuration), you might just be able to plug in your Ethernet card and go.

The following list describes the basic Internet structure in more detail:

  • IP addresses - These are the numbers that uniquely define each computer known to the Internet. Internet authorities assign pools of IP addresses (along with network masks, or netmasks) so that network administrators can assign addresses to each individual computer that they control. An alternative to assigned addresses is to use a reserved set of private IP addresses, so multiple computers in your home or business can share a single public IP address.

    Cross-Reference 

    See Chapter 15 for a further description of IP addresses.

  • Port numbers - Port numbers provide access points to particular services. Port numbers are like channels on a television. While all ports are inherently the same, many ports are associated with a particular service. A server computer will listen on the network for packets that are addressed to its IP address, along with one or more port numbers. For example, a Web server listens to port 80 to respond to requests for HTTP content. A Web browser will know, by default, to make its request for HTTP service (in other words, a Web page) to port 80 on the server.

    Note 

    In reality, you can assign any service to any port you choose. In fact, system administrators sometimes assign services to unusual ports for the express purpose of keeping most outsiders from finding them. If you decide to use a non-standard port for a service, you should try to avoid port numbers that are already assigned to common services, since those ports might end up being scanned by an intruder anyway.

  • Domain names - On the Internet, computer names are organized in a hierarchy of domain names and hostnames. If you want to have and maintain your own Internet domain, you need to be assigned one that fits into one of the top-level domains (domains such as .com, .org, .net, .edu, .us, and so on).

  • Hostnames - If a domain name is assigned to your organization, you are free to create your own hostnames within that domain (sometimes called subdomains). This is a way of associating a name (hostname) with an address (IP address). When you use the Internet, you use a fully qualified domain name to identify a host computer. For example, in the domain handsonhistory.com, a host computer named baskets would have a fully qualified domain name of baskets.handsonhistory.com.

    Within an organization, you should choose a host-naming scheme that makes sense to you. For example, for handsonhistory.com, you could have hostnames dedicated to different crafts (baskets, decoys, weaving, and so on). Some organizations use names from Norse mythology, such as thor, odin, and loki, or beer brands, such as summit, jamespage, and guinness.

    Tip 

    For some naming schemes, see http://c2.com/cgi/wiki?NamesGivenToComputers .

  • Routers - If you have a LAN or other type of network in your home or organization that you want to connect to the Internet, you can share an Internet connection. You do this by setting up a router. The router connects to both your network and the Internet, providing a route for data to pass between your network and the Internet. This is especially useful if you connect to the Internet over a dial-up or broadband connection, since your Linux box can act as a router and you can share the connection among all your computers.

  • Firewalls and IP masquerading - To keep your private network somewhat secure, yet still allow some data to pass between it and the Internet, you can set up a firewall. The firewall restricts the kind of data packets or services that can pass through the boundary between the private and public networks. If your network uses private addresses, or if you just want to protect the addresses of computers behind your firewall, you can use techniques such as Network Address Translation (NAT) or IP masquerading.

    Note 

    Although you can set up a firewall to filter packets on any computer on your private network, firewalls are typically configured most stringently on the machine that routes packets between the public and private networks. In this way, intruders can be stopped before they get on your private network and security can be relaxed somewhat between your computers behind the firewall.

  • Proxies - You can bypass some of the configuration required to allow the computers on your LAN to communicate directly with the Internet by configuring a proxy server. A proxy server can store (referred to as caching) Internet objects (such as data from Web and FTP servers) so that clients of that proxy server don't have to contact the server originating that data each time it is requested. With a proxy server, a computer on your LAN can also run Internet applications (such as a Web browser) and have them appear to the Internet as if they are actually running on the proxy server.

Cross-Reference 

You can read about firewalls in Chapter 14. IP masquerading and proxy servers are described in the "Enable forwarding and masquerading" and "Setting up Linux as a Proxy Server" sections later in this chapter.
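The IP address ideas in the list above (netmasks and reserved private addresses) can be explored with Python's standard ipaddress module. The following is only a sketch: the 192.168.1.0 network, its netmask, and the sample addresses are assumptions chosen for illustration and do not come from this chapter's configuration.

```python
import ipaddress

# Hypothetical LAN: a private network with a 255.255.255.0 netmask.
lan = ipaddress.ip_network("192.168.1.0/255.255.255.0")


def describe(address):
    """Return (is_private, is_on_lan) for a dotted-quad address string."""
    ip = ipaddress.ip_address(address)
    return ip.is_private, ip in lan


if __name__ == "__main__":
    print(lan.prefixlen)      # number of network bits in the netmask
    print(lan.num_addresses)  # how many addresses the netmask covers
    for addr in ("192.168.1.25", "10.0.0.10", "8.8.8.8"):
        print(addr, describe(addr))
```

An address can be private without being on your own LAN, which is why the two checks are reported separately.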

Internet Domains

You can't read a magazine, watch a TV commercial, or open a cereal box these days without coming across a "something.com." When a company, organization, or person wants you to connect to them on the Internet, it relies on the uniqueness of its particular domain name. However, within that domain name, the company or organization to which it has been assigned can arrange its content however it chooses.

Internet domains are organized in a structure called the domain name system (DNS). At the top of that structure is a set of top-level domains (or TLDs). Some of the top-level domains are used commonly in the United States, although they are available for worldwide use. TLDs such as edu (for colleges and universities), gov (for United States government), and mil (for United States military sites) were among the most used TLDs in the early Internet. In more recent years, com (for commercial sites) has experienced the most growth.

The us domain was added to include U.S. institutions, such as local governments and elementary schools, as well as individuals within a geographical region of the United States. Recently, new domains such as info (for people and businesses to publish information about themselves) and biz (an alternative to com for businesses) have been added.

To facilitate the entry of other countries to the Internet, the International Organization for Standardization (ISO) has defined a set of two-letter codes that are assigned to each country; examples include codes such as tv for Tuvalu and de for Germany (Deutschland in German). Within each country are naming authorities responsible for organizing the subdomains. Some subdomains are organized by categories, while others are structured by geographic location.

Domain names are hierarchical, which means there can be subdomains beneath second-level domains, as well as host computers. (Second-level domains are the names directly below the TLDs that are assigned to individual people and organizations.) Each subdomain is separated by a dot (.), starting with the top-level domain on the right and with the second-level domain and each subsequent subdomain appearing to the left. Here is an example of a fully qualified domain name for a host:

 baskets.crafts.handsonhistory.com 

In this example, the top-level domain is .com. The second-level domain name assigned to the organization that controls the domain is handsonhistory. Within that domain is a subdomain, or third-level domain, called crafts. The last name (baskets) refers to a particular computer within that third-level domain. From other hosts in the third-level domain, the host can be referred to simply as baskets. From the Internet, you would refer to it as baskets.crafts.handsonhistory.com.
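To make the right-to-left hierarchy concrete, here is a minimal sketch that splits a fully qualified domain name into the parts just described. The function name split_fqdn is hypothetical, and the sketch assumes a single-label top-level domain such as com; it does not handle country schemes where the registered name sits deeper in the hierarchy.

```python
def split_fqdn(fqdn):
    """Split a fully qualified domain name into its hierarchical parts."""
    labels = fqdn.rstrip(".").split(".")
    if len(labels) < 3:
        raise ValueError("expected at least host.domain.tld")
    return {
        "host": labels[0],            # leftmost label: the computer itself
        "subdomains": labels[1:-2],   # third-level domains and below
        "second_level": labels[-2],   # name assigned to the organization
        "top_level": labels[-1],      # rightmost label: the TLD
    }


if __name__ == "__main__":
    print(split_fqdn("baskets.crafts.handsonhistory.com"))
```

A name with no subdomain, such as baskets.handsonhistory.com, simply yields an empty subdomains list.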

Cross-Reference 

For more details on how the domain-name system is structured, and for information on how to set up your own DNS server in Linux, see Chapter 25.

Tip 

Several RFCs (Requests for Comments) define the domain name system. RFC 1034 covers domain name concepts and facilities. RFC 1035 is a technical description of how DNS works. RFC 1480 describes the us domain. For a more general description of DNS, there is RFC 1591. You can view RFCs at the RFC Database (www.rfc-editor.org/rfc.html).

Hostnames and IP Addresses

In the early days of the Internet, every known host computer name and address was collected into a file called HOSTS.TXT and distributed throughout the Internet. This quickly became cumbersome because of the size of the list and the constant changes being made to it. The solution was to distribute the responsibility for resolving hostnames into IP addresses to many DNS servers throughout the Internet.

To make the domain names friendly, the names contain no network addresses, routes, or other information needed to deliver messages. Instead, each computer must rely on some method to translate domain names and hostnames into IP addresses. The DNS server is the primary means of resolving the names to addresses. If you request a service from a computer using a fully qualified domain name (including all domains and subdomains), the request will go to a DNS server to resolve that name into an IP address. The answer comes either directly from the DNS server that owns that information or, more likely, from another DNS server along the way that has already gathered it.

If you have a private LAN or other network, you can keep your own list of hostnames and IP addresses. For the computers you work with all the time, it's easier to type baskets than baskets.crafts.handsonhistory.com. There are a couple of ways (besides DNS) that your computer can resolve the IP address for computers for which you give only the hostname:

  • Check the /etc/hosts file. In your computer's /etc/hosts file, you can place the names and IP addresses for the computers on your local network. In this way, your computer doesn't need to query the DNS server to get the address (which may not be there anyway if you are on a private network and don't have your own DNS server). Another use of the /etc/hosts file is to override an address that a DNS server is giving you, for example if it is giving you an errant IP address for that host, or if you wanted to point your system to a different mail server.

  • Check specified domains. You can specify that if the hostname requested doesn't include a fully qualified domain name and the hostname is not in your /etc/hosts file, then your computer should check certain specified domain names. You can do this in the /etc/resolv.conf file.
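The first method above, looking names up in a hosts file, can be sketched as follows. The sample contents, addresses, and the function name parse_hosts are assumptions for illustration; the sketch reads hosts-format text from a string instead of opening /etc/hosts, and it lets the first entry for a name win, which is a simplification of what a real resolver does.

```python
# Hypothetical hosts-format contents for a small private LAN.
SAMPLE_HOSTS = """\
# address    canonical name                      aliases
127.0.0.1    localhost.localdomain               localhost
10.0.0.21    baskets.crafts.handsonhistory.com   baskets
10.0.0.22    decoys.crafts.handsonhistory.com    decoys   # inline comment
"""


def parse_hosts(text):
    """Map every name and alias in hosts-format text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


if __name__ == "__main__":
    hosts = parse_hosts(SAMPLE_HOSTS)
    print(hosts.get("baskets"))   # short alias found locally
    print(hosts.get("weaving"))   # not listed, so DNS would be needed
```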

On your Linux system, when you make a request to resolve a hostname into an IP address, the contents of the /etc/resolv.conf file will most likely determine where your computer searches for that information. That file can specify your local domain, an alternative list of domains, and the location of one or more DNS servers. Here is an example of an /etc/resolv.conf file:

 domain crafts.handsonhistory.com
 search crafts.handsonhistory.com handsonhistory.com
 nameserver 10.0.0.10
 nameserver 10.0.0.12

In this example, the local domain is crafts.handsonhistory.com. If you try to contact a host by giving only its hostname (with no domain name), your computer can check in both the crafts.handsonhistory.com and handsonhistory.com domains to find the host. If you give the fully qualified domain name, it can contact the name servers (first 10.0.0.10 and then 10.0.0.12) to resolve the address. (You can specify up to three name servers, which your computer will query in order until the address is resolved. The search line is limited to six domains and a total of 256 characters, however.)
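The way a search list expands a bare hostname can be sketched in a few lines. The function names parse_resolv_conf and candidate_names are hypothetical, and the logic is deliberately simplified: a real resolver also honors settings such as the ndots option and may still try the search list for names that contain dots.

```python
SAMPLE_RESOLV_CONF = """\
domain crafts.handsonhistory.com
search crafts.handsonhistory.com handsonhistory.com
nameserver 10.0.0.10
nameserver 10.0.0.12
"""


def parse_resolv_conf(text):
    """Collect the domain, search list, and name servers from the text."""
    config = {"domain": None, "search": [], "nameservers": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":  # skip blanks and comments
            continue
        keyword, *values = line.split()
        if keyword == "domain" and values:
            config["domain"] = values[0]
        elif keyword == "search":
            config["search"] = values
        elif keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
    return config


def candidate_names(hostname, config):
    """List the names that would be tried for hostname, in order."""
    if "." in hostname:
        return [hostname]  # treated here as already fully qualified
    return [f"{hostname}.{domain}" for domain in config["search"]]


if __name__ == "__main__":
    cfg = parse_resolv_conf(SAMPLE_RESOLV_CONF)
    print(cfg["nameservers"])
    print(candidate_names("baskets", cfg))
```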

If your system uses DHCP, where another server on your network assigns your Linux system an IP address, your /etc/resolv.conf file can look more like the following:

 ; generated by /sbin/dhclient-script
 search ce1.client2.big_isp.com
 nameserver 10.0.0.10
 nameserver 10.0.0.12

In this example, the /etc/resolv.conf file was created by the DHCP client code, based on information from the DHCP server. Note that big_isp.com is an alias for a large communications company.

Tip 

Your resolver knows to check your /etc/hosts file first because of the contents of the /etc/host.conf and /etc/nsswitch.conf files. By default, the nsswitch.conf file has your resolver check local files first, followed by DNS, to resolve addresses. The host.conf file indicates that local files (hosts) be checked first for the address, followed by the DNS system (bind). You can change that behavior by modifying those files. See the resolv.conf man page for further information.

Routing

Knowing the IP address of the computer you want to reach is one thing; being able to reach that IP address is another. Even if you connect your computers on a LAN, to have full connectivity to the Internet there must be at least one node (that is, a computer or dedicated device) through which you can route network traffic that is destined for locations outside your LAN. That is the job of a router.

A router is a device that has interfaces connected to at least two networks and is able to route network traffic between the two networks. In my example of a small business that has a LAN that it wants to connect to the Internet, the router would have a connection and IP address on the LAN, as well as a connection and IP address to a network that provides access to the Internet.
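From the point of view of a computer on the LAN, the routing decision described above comes down to one question: is the destination on my own network, or must the packet go to the router? The sketch below models only that decision. The 10.0.0.0/24 network, the gateway address, and the function name next_hop are assumptions made for illustration.

```python
import ipaddress

# Hypothetical addressing for a small-business LAN.
LOCAL_NETWORK = ipaddress.ip_network("10.0.0.0/24")
DEFAULT_GATEWAY = ipaddress.ip_address("10.0.0.1")  # router's LAN-side address


def next_hop(destination):
    """Return the address a host on the LAN sends a packet to first."""
    dest = ipaddress.ip_address(destination)
    if dest in LOCAL_NETWORK:
        return dest          # deliver directly to the neighbor on the LAN
    return DEFAULT_GATEWAY   # everything else is handed to the router


if __name__ == "__main__":
    print(next_hop("10.0.0.12"))      # local destination
    print(next_hop("93.184.216.34"))  # outside the LAN
```

The router then repeats the same kind of decision on its other interface, which is how traffic leaves the LAN.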

A computer running Linux can act as a router between any two TCP/IP interfaces, for example, if the computer has two LAN cards or if it has a network interface card and a modem (for a dial-up connection to the Internet). Alternatively, you can purchase a dedicated router, such as Cisco ADSL routers, that can exclusively perform routing between your LAN and the Internet or network service provider.

Tip 

Unlike regular dial-up modems, xDSL routers or bridges have several different standards that are not all compatible. Before purchasing an xDSL modem, check with your ISP. If your ISP supports xDSL, it can tell you the exact models of xDSL modems you can use to get xDSL service.

Proxies

Instead of giving the computers on your LAN direct access to the Internet (as you do with routing), you can give them indirect access by setting up a proxy server. With a proxy server, you don't have to configure and secure every computer on the LAN for Internet access. When, for example, a client computer tries to access the Internet from a Web browser, the request goes to the proxy server. The proxy server then makes that request to the Internet. A proxy server can also be used to keep users on your local network from accessing undesirable Web sites.

With a proxy server, Internet access is fairly easy to set up and quite secure to use. Linux can be configured as a proxy server using several different projects, including Squid (as described later in this chapter).
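The caching behavior of a proxy can be illustrated with a toy model. This is not how Squid is implemented: the class CachingProxy and the function fake_origin are hypothetical, no network traffic is involved, and a real proxy would also honor expiry times and other rules about what may be cached.

```python
class CachingProxy:
    """Toy stand-in for a caching proxy server."""

    def __init__(self, fetch):
        self._fetch = fetch       # callable that contacts the origin server
        self._cache = {}          # URL -> previously fetched content
        self.origin_requests = 0  # how often the origin was contacted

    def get(self, url):
        """Return content for url, contacting the origin only on a miss."""
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_origin(url):
    """Hypothetical origin server that returns canned content."""
    return f"content of {url}"


if __name__ == "__main__":
    proxy = CachingProxy(fake_origin)
    proxy.get("http://www.handsonhistory.com/")
    proxy.get("http://www.handsonhistory.com/")  # served from the cache
    print(proxy.origin_requests)
```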




Fedora 6 and Red Hat Enterprise Linux Bible
ISBN: 047008278X
Year: 2007
Pages: 279
