General Configuration Steps
The configuration file used by Apache is /etc/httpd/conf/httpd.conf. As with most Linux applications, you must restart Apache before changes to this configuration file take effect.
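The edit-then-restart cycle can be sketched as follows. This is a minimal sketch, assuming a Fedora-style system where the apachectl and service commands are installed and the service is named httpd. The function name and the RUN variable are hypothetical conveniences, not part of Apache: setting RUN=echo previews the commands instead of running them.

```shell
#!/bin/sh
# Check httpd.conf for syntax errors, then restart Apache only if the
# check passes. RUN is a hypothetical prefix: leave it unset to execute
# the commands (as root), or set RUN=echo to print them instead.
restart_apache() {
    ${RUN:-} apachectl configtest && ${RUN:-} service httpd restart
}

# Example: preview the commands without touching the running server.
# RUN=echo restart_apache
```

Running the syntax check first means a typing mistake in httpd.conf is reported before the running server is stopped.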
Where to Put Your Web Pages
All the statements that define the features of each Web site are grouped together inside their own <VirtualHost> section, or container, in the httpd.conf file. The most commonly used statements, or directives, inside a <VirtualHost> container are:
ServerName: Defines the name of the Web site managed by the <VirtualHost> container. This is needed in named virtual hosting only, as I'll explain soon.
DocumentRoot: Defines the directory in which the Web pages for the site can be found.
By default, Apache searches the DocumentRoot directory for an index, or home, page named index.html. So for example, if you have a ServerName of www.my-web-site.org with a DocumentRoot directory of /home/www/site1/, Apache displays the contents of the file /home/www/site1/index.html when you enter http://www.my-web-site.org in your browser.
Some editors, such as Microsoft FrontPage, create files with an .htm extension, not .html. This isn't usually a problem if all your HTML files have hyperlinks pointing to files ending in .htm as FrontPage does. The problem occurs with Apache not recognizing the topmost index.htm page. The easiest solution is to create a symbolic link (known as a shortcut to Windows users) called index.html pointing to the file index.htm. This then enables you to edit or copy the file index.htm with index.html being updated automatically. You'll almost never have to worry about index.html and Apache again!
This example creates a symbolic link named index.html that points to index.htm in the /home/www/site1 directory:
[root@bigboy tmp]# cd /home/www/site1
[root@bigboy site1]# ln -s index.htm index.html
[root@bigboy site1]# ll index.*
-rw-rw-r-- 1 root root 48590 Jun 18 23:43 index.htm
lrwxrwxrwx 1 root root 9 Jun 21 18:05 index.html -> index.htm
[root@bigboy site1]#
The l at the very beginning of the index.html entry signifies a link and the -> indicates the link target.
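If a site has many subdirectories that each contain an index.htm page, the same link can be created in all of them at once. The following is a minimal sketch, not a script from this book. The function name link_index_pages is hypothetical, and it assumes you want the link only where no index.html exists yet.

```shell
#!/bin/sh
# Create an index.html symbolic link next to every index.htm file found
# under the directory given as $1. Directories that already contain an
# index.html (regular file or link) are left untouched.
link_index_pages() {
    find "$1" -type f -name index.htm | while IFS= read -r page; do
        dir=$(dirname "$page")
        if [ ! -e "$dir/index.html" ] && [ ! -L "$dir/index.html" ]; then
            # A relative target keeps the link valid if the site is moved.
            ln -s index.htm "$dir/index.html"
        fi
    done
}

# Example: link_index_pages /home/www/site1
```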
The Default File Location
By default, Apache expects to find all its Web page files in the /var/www/html/ directory, as defined by a generic DocumentRoot statement at the beginning of httpd.conf. The examples in this chapter use the /home/www directory to illustrate how you can place them in other locations successfully.
File Permissions and Apache
Apache will display Web page files as long as they are world readable. You have to make sure all the files and subdirectories in your DocumentRoot have the correct permissions.
It is a good idea to have the files owned by a nonprivileged user so that Web developers can update the files using FTP or SCP without requiring the root password.
To do this:
1. Create a user with a home directory of /home/www.
2. Recursively change the file ownership of the /home/www directory and all its subdirectories.
3. Change the permissions on the /home/www directory to 755, which allows all users, including Apache's httpd daemon, to read the files inside.
The code you need is:
[root@bigboy tmp]# useradd -g users www
[root@bigboy tmp]# chown -R www:users /home/www
[root@bigboy tmp]# chmod 755 /home/www
Now test for the new ownership with the ll command:
[root@bigboy tmp]# ll /home/www/site1/index.*
-rw-rw-r-- 1 www users 48590 Jun 25 23:43 index.htm
lrwxrwxrwx 1 www users 9 Jun 25 18:05 index.html -> index.htm
[root@bigboy tmp]#
Be sure to FTP or SCP new files to your Web server as this new user. This will make all the transferred files automatically have the correct ownership.
If you browse your Web site after configuring Apache and get a "403 Forbidden" permissions-related error on your screen, then your files or directories under your DocumentRoot most likely have incorrect permissions. Appendix II, "Codes, Scripts, and Configurations," has a short script that you can use to recursively set the file permissions in a directory to match those expected by Apache. You may also have to use the Directory directive to make Apache serve the pages once the file permissions have been correctly set. If you have your files in the default /var/www/html directory, then this second step becomes unnecessary.
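A recursive permissions fix can be sketched as follows. This is not the Appendix II script, only a minimal sketch of the same idea. It assumes that directories should be 755 and regular files 644, and the function name set_web_permissions is hypothetical.

```shell
#!/bin/sh
# Make everything under the directory given as $1 world readable:
# directories become 755 (readable and searchable by everyone) and
# regular files become 644 (readable by everyone, writable by the owner).
set_web_permissions() {
    find "$1" -type d -exec chmod 755 {} +
    find "$1" -type f -exec chmod 644 {} +
}

# Example: set_web_permissions /home/www/site1
```

Note that 644 removes the execute bit, so CGI scripts under the same tree would need their execute permission restored afterward.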
Security Contexts for Web Pages
Fedora Core 3 introduced the concept of security contexts as part of the Security Enhanced Linux (SELinux) definition. (See Appendix I, "Miscellaneous Linux Topics," for details.) A Web page may have the right permissions, but the Apache httpd daemon won't be able to read it unless you assign it the correct security context or daemon access permissions. Context-related configuration errors will give "403 Forbidden" browser messages, and in some cases, you will get the default Fedora Apache page where your expected Web page should be.
When a file is created, it inherits the security context of its parent directory. If you decide to place your Web pages in the default /var/www/ directory, then they will inherit the context of that directory and you should have very few problems.
The context of a file depends on the SELinux label it is given. The most important types of security label are listed in Table 20.1.
Table 20.1. SELinux Security Context File Labels
| Label | Description |
|---|---|
| httpd_sys_content_t | The type used by regular static Web pages with .html and .htm extensions. |
| httpd_sys_script_ro_t | Required for CGI scripts to read files and directories. |
| httpd_sys_script_ra_t | Same as the httpd_sys_script_ro_t type, but also allows appending data to files by the CGI script. |
| httpd_sys_script_rw_t | Files with this type may be changed by a CGI script in any way, including deletion. |
| httpd_sys_script_exec_t | The type required for the execution of CGI scripts. |
As expected, security contexts become important when Web pages need to be placed in directories that are not the Apache defaults. In this example, user root creates a directory /home/www/site1 in which the pages for a new Web site will be placed. Using the ls -Z command, you can see that the user_home_t security label has been assigned to the directory and the index.html page created in it. This label is not accessible by Apache.
[root@bigboy tmp]# mkdir /home/www/site1
[root@bigboy tmp]# ls -Z /home/www/
drwxr-xr-x root root root:object_r:user_home_t site1
[root@bigboy tmp]# touch /home/www/site1/index.html
[root@bigboy tmp]# ls -Z /home/www/site1/index.html
-rw-r--r-- root root root:object_r:user_home_t /home/www/site1/index.html
[root@bigboy tmp]#
Accessing the index.html file via a Web browser gets a "403 Forbidden" error on your screen, even though the permissions are correct. Viewing the /var/log/httpd/error_log file gives a "Permission denied" message, and the /var/log/messages file shows kernel audit errors.
[root@bigboy tmp]# tail /var/log/httpd/error_log
[Fri Dec 24 17:59:24 2004] [error] [client 216.10.119.250]
(13)Permission denied: access to / denied
[root@bigboy tmp]# tail /var/log/messages
Dec 24 17:59:24 bigboy kernel: audit(1103939964.444:0): avc: denied
{ getattr } for pid=2188 exe=/usr/sbin/httpd path=/home/www/site1
dev=hda5 ino=73659 scontext=system_u:system_r:httpd_t
tcontext=root:object_r:user_home_t tclass=dir
[root@bigboy tmp]#
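The part of the audit message that matters here is the tcontext field, which holds the label of the file or directory that httpd was denied access to. The following minimal sketch pulls the type out of such a line. The function name denied_type is hypothetical, and it assumes the message has been joined into a single line in the format shown above.

```shell
#!/bin/sh
# Print the SELinux type (the third colon-separated field of tcontext=)
# from one avc denial message passed as $1.
denied_type() {
    printf '%s\n' "$1" |
        sed -n 's/.*tcontext=\([^ ]*\).*/\1/p' |
        cut -d: -f3
}

# Example: denied_type "$(grep 'avc: denied' /var/log/messages | tail -1)"
```

A result such as user_home_t that does not appear in Table 20.1 suggests the target needs to be relabeled.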
SELinux security context labels can be modified using the chcon command. Recognizing the error, user root uses chcon with the -R (recursive) and -h (modify symbolic links) qualifiers to modify the label of the directory to httpd_sys_content_t with the -t qualifier.
[root@bigboy tmp]# chcon -R -h -t httpd_sys_content_t /home/www/site1
[root@bigboy tmp]# ls -Z /home/www/site1/
-rw-r--r-- root root root:object_r:httpd_sys_content_t index.html
[root@bigboy tmp]#
Browsing now works without errors. User root won't have to run the chcon command again for the directory, because new files created in the directory will inherit the SELinux security label of the parent directory. You can see this when the file /home/www/site1/test.txt is created:
[root@bigboy tmp]# touch /home/www/site1/test.txt
[root@bigboy tmp]# ls -Z /home/www/site1/
-rw-r--r-- root root root:object_r:httpd_sys_content_t index.html
-rw-r--r-- root root root:object_r:httpd_sys_content_t test.txt
[root@bigboy tmp]#
Security Contexts for CGI Scripts
You can use Apache to trigger the execution of programs called Common Gateway Interface (CGI) scripts. CGI scripts can be written in a variety of languages, including Perl and PHP, and can be used to do such things as generate new Web page output or update data files. A Web page's Submit button usually has a CGI script lurking somewhere beneath. By default, CGI scripts are placed in the /var/www/cgi-bin/ directory as defined by the ScriptAlias directive you'll find in the httpd.conf file, which I'll discuss in more detail later.
ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
In the default case, any URL with the string /cgi-bin/ will trigger Apache to search for an equivalent executable file in this directory. So, for example, the URL http://192.168.1.100/cgi-bin/test/test.cgi actually executes the script file /var/www/cgi-bin/test/test.cgi.
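The URL-to-file mapping can be sketched as a simple prefix substitution. This is an illustrative model only, not Apache's implementation. The function name resolve_script_alias is hypothetical, and its three arguments are the two ScriptAlias arguments followed by the path part of the requested URL.

```shell
#!/bin/sh
# Map a URL path to the script file that a ScriptAlias would select.
# $1: URL prefix (first ScriptAlias argument), e.g. /cgi-bin/
# $2: directory (second ScriptAlias argument), e.g. /var/www/cgi-bin/
# $3: path part of the requested URL
# Returns 1 and prints nothing when the URL is outside the alias.
resolve_script_alias() {
    url_prefix=$1
    fs_dir=$2
    url_path=$3
    case "$url_path" in
        "$url_prefix"*)
            rest=${url_path#"$url_prefix"}
            printf '%s%s\n' "$fs_dir" "$rest"
            ;;
        *)
            return 1
            ;;
    esac
}

# Example: resolve_script_alias /cgi-bin/ /var/www/cgi-bin/ /cgi-bin/test/test.cgi
```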
SELinux contexts have to be modified according to the values in Table 20.1 for a CGI script to be run in another directory or to access data files. In the example case, the Perl script test.cgi was created to display the word "Success" on the screen of your Web browser.
#!/usr/bin/perl
# CGI Script "test.cgi"
# A CGI script must send an HTTP header and a blank line before the page body.
print "Content-Type: text/html\n\n";
print qq(
<html>
<head>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html">
<title>Linux Home Networking</title>
</head>
<body>
<p align="center"><font size="7">Success!</font></p>
</body>
</html>
);
The ScriptAlias directive was set to point to /home/www/cgi-bin/ instead of /var/www/cgi-bin/.
ScriptAlias /cgi-bin/ "/home/www/cgi-bin/"
User root creates the /home/www/cgi-bin/ directory, changes the directory's security context label to httpd_sys_script_exec_t, creates the test subdirectory so that it inherits the label, and then creates the script /home/www/cgi-bin/test/test.cgi with the correct executable file permissions.
[root@bigboy tmp]# mkdir -p /home/www/cgi-bin
[root@bigboy tmp]# chcon -h -t httpd_sys_script_exec_t /home/www/cgi-bin/
[root@bigboy tmp]# mkdir /home/www/cgi-bin/test
[root@bigboy tmp]# ls -Z /home/www/cgi-bin
drwxr-xr-x root root root:object_r:httpd_sys_script_exec_t test
[root@bigboy tmp]# vi /home/www/cgi-bin/test/test.cgi
[root@bigboy tmp]# chmod o+x /home/www/cgi-bin/test/test.cgi
[root@bigboy tmp]#
Accessing the URL http://192.168.1.100/cgi-bin/test/test.cgi is successful. Problems occur when the same test.cgi file needs to be used by a second Web site housed on the same Web server. The file is copied to a directory /web/cgi-bin/site2/ governed by the ScriptAlias in the second Web site's <VirtualHost> container (explained later), but the security context label isn't copied along with it.
ScriptAlias /cgi-bin/ "/web/cgi-bin/site2/"
The file inherits the context of its new parent.
[root@bigboy tmp]# cp /home/www/cgi-bin/test/test.cgi /web/cgi-bin/site2/test.cgi
[root@bigboy tmp]# ls -Z /web/cgi-bin/site2/test.cgi
-rw-r--r-x root root root:object_r:tmp_t /web/cgi-bin/site2/test.cgi
[root@bigboy tmp]#
Permission denied and kernel audit errors occur once more; you can fix them only by changing the security context of the test.cgi file.
[root@bigboy tmp]# tail /var/log/httpd/error_log
[Fri Dec 24 18:36:08 2004] [error] [client 216.10.119.250]
(13)Permission denied: access to /cgi-bin/texcelon/test.cgi denied
[root@bigboy tmp]# tail /var/log/messages
Dec 24 18:36:08 bigboy kernel: audit(1103942168.549:0): avc: denied
{ getattr } for pid=2191 exe=/usr/sbin/httpd
path=/web/cgi-bin/site2/test.cgi dev=hda5 ino=77491
scontext=system_u:system_r:httpd_t tcontext=root:object_r:tmp_t
tclass=file
[root@bigboy tmp]#
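The relabeling step itself is not shown in the listing above. A minimal sketch, assuming the same chcon usage as earlier in this section and the httpd_sys_script_exec_t type from Table 20.1, might look like this. The function name and the RUN variable are hypothetical: setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Give the CGI script passed as $1 the label Apache needs to execute it,
# then list the file so the new label can be checked.
# RUN is a hypothetical prefix: unset to execute (as root), echo to preview.
relabel_cgi_script() {
    ${RUN:-} chcon -t httpd_sys_script_exec_t "$1" && ${RUN:-} ls -Z "$1"
}

# Example: relabel_cgi_script /web/cgi-bin/site2/test.cgi
```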
If you find security contexts too restrictive, you can turn them off system wide by editing your /etc/selinux/config file, modifying the SELINUX parameter to disabled. SELinux will be disabled after your next reboot.
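The edit to /etc/selinux/config can be sketched as a one-line substitution. This is a minimal sketch, assuming GNU sed (for the -i option) and a file that already contains a SELINUX= line. The function name set_selinux_mode is hypothetical, and the second argument exists only so the change can be tried on a copy of the file first.

```shell
#!/bin/sh
# Set the SELINUX parameter in an SELinux config file.
# $1: enforcing, permissive, or disabled
# $2: file to edit (defaults to /etc/selinux/config)
set_selinux_mode() {
    mode=$1
    config=${2:-/etc/selinux/config}
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown SELinux mode: $mode" >&2; return 1 ;;
    esac
    # Only the SELINUX= line is rewritten; SELINUXTYPE= is left alone.
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$config"
}

# Example (as root, followed by a reboot): set_selinux_mode disabled
```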
Named Virtual Hosting
You can make your Web server host more than one site per IP address by using Apache's named virtual hosting feature. You use the NameVirtualHost directive in the /etc/httpd/conf/httpd.conf file to tell Apache which IP addresses will participate in this feature.
The <VirtualHost> containers in the file then tell Apache where it should look for the Web pages used on each Web site. You must specify the IP address for which each <VirtualHost> container applies.
Named Virtual Hosting Example
Consider an example in which the server is configured to provide content on 97.158.253.26. In the code that follows, notice that within each <VirtualHost> container you specify the primary Web site domain name for that IP address with the ServerName directive. The DocumentRoot directive defines the directory that contains the index page for that site.
You can also list secondary domain names that will serve the same content as the primary ServerName using the ServerAlias directive.
Apache searches for a perfect match of NameVirtualHost, <VirtualHost>, and ServerName when making a decision as to which content to send to the remote user's Web browser. If there is no match, then Apache uses the first <VirtualHost> in the list that matches the target IP address of the request.
This is why the first <VirtualHost> statement contains an asterisk: to indicate it should be used for all other Web queries.
NameVirtualHost 97.158.253.26

<VirtualHost *>
    # Default directives (in other words, not site #1 or site #2)
</VirtualHost>

<VirtualHost 97.158.253.26>
    ServerName www.my-web-site.org
    # Directives for site #1
</VirtualHost>

<VirtualHost 97.158.253.26>
    ServerName www.another-web-site.org
    # Directives for site #2
</VirtualHost>
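The listing leaves out the DocumentRoot and ServerAlias directives described earlier. The following minimal sketch prints a complete container that includes them. The function name make_vhost and the secondary name my-web-site.org are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Print a <VirtualHost> container for one named virtual host.
# $1: IP address, $2: primary name, $3: DocumentRoot directory
# Any further arguments are listed as secondary names on a ServerAlias line.
make_vhost() {
    ip=$1
    name=$2
    docroot=$3
    shift 3
    printf '<VirtualHost %s>\n' "$ip"
    printf '    ServerName %s\n' "$name"
    if [ "$#" -gt 0 ]; then
        printf '    ServerAlias %s\n' "$*"
    fi
    printf '    DocumentRoot %s\n' "$docroot"
    printf '</VirtualHost>\n'
}

# Example:
# make_vhost 97.158.253.26 www.my-web-site.org /home/www/site1 my-web-site.org
```

The output is meant to be reviewed and pasted into httpd.conf by hand rather than appended automatically.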
Be careful with using the asterisk in other containers. A <VirtualHost> with a specific IP address always gets higher priority than a <VirtualHost> statement with an * intended to cover the same IP address, even if the ServerName directive doesn't match. To get consistent results, try to limit the use of your <VirtualHost *> statements to the beginning of the list to cover any other IP addresses your server may have.
You can also have multiple NameVirtualHost directives, each with a single IP address, in cases where your Web server has more than one IP address.
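The matching rule described in this section can be modeled in a few lines. This is a deliberately simplified sketch for a single IP address, not Apache's actual algorithm: it ignores ServerAlias, wildcards, ports, and the fact that host names are not case sensitive. The function name select_vhost is hypothetical.

```shell
#!/bin/sh
# Model of named virtual host selection for one IP address.
# $1: host name requested by the browser
# remaining arguments: ServerName values in the order their containers
# appear in httpd.conf
# Prints the ServerName of the container that would serve the request.
select_vhost() {
    host=$1
    shift
    fallback=$1    # the first container listed is used when nothing matches
    for name in "$@"; do
        if [ "$name" = "$host" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    printf '%s\n' "$fallback"
}

# Example:
# select_vhost www.another-web-site.org www.my-web-site.org www.another-web-site.org
```

The fallback behavior is why the order of the containers in httpd.conf matters.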
IP-Based Virtual Hosting
The other virtual hosting option is to have one IP address per Web site, which is also known as IP-based virtual hosting. In this case, you will not have a NameVirtualHost directive for the IP address, and you must have only a single <VirtualHost> container per IP address.
Also, because there is only one Web site per IP address, the ServerName directive isn't needed in each <VirtualHost> container, unlike in named virtual hosting.
IP Virtual Hosting Example: Single Wild Card
In this example, Apache listens on all interfaces, but gives the same content. Apache displays the content in the first <VirtualHost *> directive even if you add another right after it. Apache also seems to enforce the single <VirtualHost> container per IP address requirement by ignoring any ServerName directives you may use inside it.
<VirtualHost *>
DocumentRoot /home/www/site1
</VirtualHost>
IP Virtual Hosting Example: Wild Card and IP Addresses
In this example, Apache listens on all interfaces, but gives different content for addresses 97.158.253.26 and 97.158.253.27. Web surfers get the site1 content if they try to access the Web server on any of its other IP addresses.
<VirtualHost *>
DocumentRoot /home/www/site1
</VirtualHost>
<VirtualHost 97.158.253.26>
DocumentRoot /home/www/site2
</VirtualHost>
<VirtualHost 97.158.253.27>
DocumentRoot /home/www/site3
</VirtualHost>
Virtual Hosting and SSL
Because it makes configuration easier, system administrators commonly replace the IP address in the <VirtualHost> and NameVirtualHost directives with the * wildcard character to indicate all IP addresses.
If you installed Apache with support for secure HTTPS/SSL, which is used frequently in credit card and shopping cart Web pages, then wild cards won't work. The Apache SSL module demands at least one explicit <VirtualHost> directive for IP-based virtual hosting. When you use wild cards, Apache interprets it as an overlap of name-based and IP-based <VirtualHost> directives and gives error messages because it can't make up its mind about which method to use:
Starting httpd: [Sat Oct 12 21:21:49 2002] [error] VirtualHost
_default_:443 -- mixing * ports and non-* ports with a NameVirtualHost
address is not supported, proceeding with undefined results
If you try to load any Web page on your Web server, you'll also see the error:
Bad request!
Your browser (or proxy) sent a request that this server could not
understand.
If you think this is a server error, please contact the webmaster
The best solution to this problem is to use wild cards more sparingly. Don't use virtual hosting statements with wild cards except for the very first <VirtualHost> directive that defines the Web pages to be displayed when matches to the other <VirtualHost> directives cannot be found. Here is an example:
NameVirtualHost *

<VirtualHost *>
    # Directives for other sites
</VirtualHost>

<VirtualHost 97.158.253.28>
    # Directives for the site that also runs on SSL
</VirtualHost>