Beware of User Input

One of the most common ways to exploit CGI scripts and programs is to take advantage of scripts that accept user input without checking the data submitted. Controlling what users are able to submit will dramatically reduce your chances of being hacked through a CGI script. This includes not only limiting the ways data can be submitted through a form (by using drop-down lists, check boxes, and similar controls), but also coding your program to control the type of data passed to your application. That means validating input on character fields, such as limiting the number of characters to only what is needed. For example, a zip code field might be limited to five numeric characters.
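The zip code check mentioned above could be sketched as follows. This is a minimal illustration rather than a complete validator: the helper name is_valid_zip is hypothetical, and it assumes that only plain five-digit codes should be accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only for exactly five ASCII digits.
sub is_valid_zip {
    my ($zip) = @_;
    return 0 unless defined $zip;
    # \A and \z anchor the whole string, so extra characters (including a
    # trailing newline) cause the match to fail.
    return $zip =~ /\A[0-9]{5}\z/ ? 1 : 0;
}

for my $candidate ('90210', '9021', '90210; rm', 'abcde') {
    printf "%-14s %s\n", "'$candidate'",
        is_valid_zip($candidate) ? 'accepted' : 'rejected';
}
```

Anchoring the pattern at both ends is the important design choice here: an unanchored pattern would accept any string that merely contains five digits.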

Tools & Traps…

By avoiding common mistakes and following good practices when creating CGI scripts, you can write tighter code that helps protect your system from attack. Some of the problems we’ll discuss here involve controlling permissions, handling user input, and using error-handling code.

When creating CGI scripts, you will probably also create an interface for accessing your CGI program. In most cases, this will be a form that allows users to enter data on a Web page. When the user clicks Submit, the data is passed to the CGI program to be processed. However, while this is the common way to access CGI programs, it is important to realize that users may be able to access the script directly if they know where it resides on the server. This can be a problem if a client-side script is used in the Web page to validate data before it is sent. The GET method sends data to the server as part of the URL. If users type the URL into the address bar of their browser with any data they want, they can bypass any client-side scripting that’s used to validate data. Using the POST method makes it more difficult to pass arbitrary data to a CGI script. However, this can also be bypassed if the user creates his or her own Web page to call your CGI script, and then enters any data he or she wants. Because client-side scripts can be viewed and possibly manipulated by users, you should write code into the CGI program itself that validates the data it receives. Since the CGI script runs on the server, the user won’t be able to circumvent your data checking and pass improper data to the program.
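To make this concrete, the sketch below decodes a GET query string by hand and then validates a field on the server, regardless of what any client-side script may have checked. The quantity field and both helper names are hypothetical, and the parser is deliberately simplified (a repeated field name overwrites the earlier value).

```perl
use strict;
use warnings;

# Decode a raw query string (the part after "?" in a GET request) into a hash.
sub parse_query {
    my ($query) = @_;
    $query = '' unless defined $query;
    my %params;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX encodes one byte
        }
        $params{$name} = $value;
    }
    return %params;
}

# Server-side check for the hypothetical "quantity" field: 1 to 99.
sub quantity_is_valid {
    my ($quantity) = @_;
    return 0 unless defined $quantity && $quantity =~ /\A[0-9]{1,2}\z/;
    return $quantity >= 1 ? 1 : 0;
}

# In a real CGI program this string would come from $ENV{QUERY_STRING}.
for my $query ('quantity=3', 'quantity=-5', 'quantity=3%3Brm') {
    my %params = parse_query($query);
    print "$query: ",
        (quantity_is_valid($params{quantity}) ? 'accepted' : 'rejected'), "\n";
}
```

The second and third query strings are the kind of input a user could type straight into the address bar; the server-side check rejects them whether or not the form ever ran its own validation.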

You should never trust data being passed to your CGI program. This is particularly important to remember if you’re thinking of allowing users to enter the path to a file, or use hyperlinks to tell the CGI program to load a particular file. For example, let’s say you were going to add a Knowledge Base to your site, where users could open documents containing common issues with products your company sells. A Web page would allow users to open text files, which are then formatted using a CGI script. The argument passed to the CGI script would be the path to that file. If the page asked users to specify the text file to open by entering a path, they could conceivably open any file that the system is able to access, or enter the path into the URL in the address bar of their browser. If they entered the path and filename of a password file, then the CGI script would display the contents of that password file to the user. For example, if your CGI program automatically looked for documents in the /inet/docs directory, a user could enter the path “../../etc/passwd” in the URL. For this reason, you should control where your CGI program will look for documents, and control permissions on that directory. To prevent users from looking higher than this directory in the document structure, you should ensure that “..” expressions aren’t permitted in a path, and that proper permissions have been set on each directory to control access.
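One way to refuse parent-directory components before a path is ever used might look like the sketch below. The function name resolve_document is hypothetical, and the code assumes Unix-style paths and plain string checks only; it does not account for symbolic links, so directory permissions remain necessary as a second line of defense.

```perl
use strict;
use warnings;

my $DOC_ROOT = '/inet/docs';   # the document directory used in the example

# Returns the full path for a requested document, or undef if the request
# could lead outside $DOC_ROOT.
sub resolve_document {
    my ($requested) = @_;
    return unless defined $requested && length $requested;
    return if $requested =~ /\0/;                        # embedded NUL byte
    return if $requested =~ m{\A/};                      # absolute path
    return if $requested =~ m{(?:\A|/)\.\.(?:/|\z)};     # ".." path component
    return "$DOC_ROOT/$requested";
}

for my $request ('faq/printer.txt', '../../etc/passwd', '/etc/passwd') {
    my $path = resolve_document($request);
    print "$request => ", (defined $path ? $path : 'refused'), "\n";
}
```

The pattern only rejects ".." when it forms a whole path component, so an ordinary file name that happens to contain two dots in a row is still accepted.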

A similar problem with bad data being passed to the program occurs when additional characters are added to the name of a file that the CGI program is told to open or use. In a shell script, a semicolon (“;”) marks the end of a command. The shell treats whatever comes after the semicolon as a new command, which is then executed. If users are allowed to open a document by specifying its name, and that name is passed to the shell unchecked, it’s possible for them to enter a semicolon and then a second command. For example, if they were opening a document called help.txt, they could enter the following:

help.txt;rm -rf /

This would open the document called help.txt. The second command would then execute, attempting to recursively delete every file on the system that the script’s account has permission to remove, without asking for confirmation. From this, it should be clear that you need to control user input and limit what users can do when accessing a CGI script.
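A hedged sketch of one defense is shown below: accept only document names that match a strict pattern, and read the file with the three-argument form of open so that no shell is involved. The helper names and the rule that documents end in .txt are assumptions made for illustration.

```perl
use strict;
use warnings;

# Accept only simple names such as "help.txt": letters, digits, "_" and "-",
# followed by ".txt". Semicolons, spaces, slashes, and pipes never match.
sub is_safe_document_name {
    my ($name) = @_;
    return (defined $name && $name =~ /\A[A-Za-z0-9_-]+\.txt\z/) ? 1 : 0;
}

# Read a document without involving a shell. The three-argument form of
# open treats the path purely as a file name, never as a command.
sub read_document {
    my ($dir, $name) = @_;
    return unless is_safe_document_name($name);
    open(my $fh, '<', "$dir/$name") or return;
    local $/;                 # slurp mode: read the whole file at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

for my $name ('help.txt', 'help.txt;rm -rf /') {
    print "'$name': ",
        (is_safe_document_name($name) ? 'accepted' : 'rejected'), "\n";
}
```

Listing the characters that are allowed, rather than trying to list every dangerous one, means that a metacharacter nobody thought of is rejected by default.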

It is important to ensure that the form used to collect data from users is compatible with the CGI script. Mistakes happen, and you may simply enter the wrong name or value in a form, but there are situations where this is a more common problem. In larger organizations or businesses that provide Web services, more than one person may be responsible for different aspects of a Web site. A team of people may create the Web site, with one person creating graphics, another writing CGI scripts, and yet another writing HTML. When this happens, the form and the script can easily fall out of step. For this reason, it is important that you evaluate the CGI scripts and forms on your site to ensure that the two work correctly together.
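A simple way to catch a form and a script that have drifted apart is to compare the field names the script receives with the names it expects. The sketch below is illustrative only; the field names and the compare_fields helper are hypothetical.

```perl
use strict;
use warnings;

# Field names the script expects the form to send (hypothetical).
my @EXPECTED_FIELDS = qw(my_choice comments);

# Compare the submitted field names with the expected ones and report both
# missing and unexpected names.
sub compare_fields {
    my ($form, @expected) = @_;
    my %is_expected = map { $_ => 1 } @expected;
    my @missing     = grep { !exists $form->{$_} } @expected;
    my @unexpected  = grep { !$is_expected{$_} } sort keys %$form;
    return (\@missing, \@unexpected);
}

# A form whose author typed "mychoice" instead of "my_choice".
my %form_data = (mychoice => 'button_yes', comments => 'none');
my ($missing, $unexpected) = compare_fields(\%form_data, @EXPECTED_FIELDS);
print "Missing: @$missing\n";
print "Unexpected: @$unexpected\n";
```

Logging both lists during testing makes a misspelled field name visible immediately, instead of surfacing later as silently missing data.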

Checking code requires not only looking over the form to confirm that names and values are correct, but also implementing code in the CGI script that checks the data it receives. The CGI scripts you create shouldn’t assume that the data passed to them is correct. To illustrate this, let’s say we have a form for collecting user surveys. On the form, a question is asked: “Do you drink coffee?”

Below this, there are two radio buttons to control user input, which allow the user to answer “Yes” or “No.” In processing this question, you might write the following code in your script:

if ($form_Data{"my_choice"} eq "button_yes") {
    # Yes has been clicked
}
else {
    # No has been clicked
}

You would assume that the user would answer one or the other, so that if one radio button is clicked, the other isn’t. That is the mistake the preceding code makes: anything other than “Yes” is treated as “No.” If the user failed to select either radio button, then neither value would be sent. Another possibility is that both options arrive selected, or that some other value arrives entirely.

Depending on the code used, a number of situations could result, ranging from the survey data being skewed to crashing the program.

To deal with such problems, your code should analyze the data it is receiving and provide error-handling code to deal with problems. Error handling deals with improper or unexpected data that’s passed to the CGI script. It allows you to return messages informing the user that certain fields haven’t been filled out, or to ignore certain data. If we were to correct the previous code, and implement code that checks the data and provides a method for dealing with erroneous data, it might look like the following:

if ($form_Data{"my_choice"} eq "button_yes") {
    # Yes has been clicked
}
elsif ($form_Data{"my_choice"} eq "button_no") {
    # No has been clicked
}
else {
    # Error handling
}

In the preceding code, the data in my_choice is checked. If the Yes button is clicked, then the first section of code will execute. If the No button is clicked, then the second section of code will execute. If, however, my_choice is equal to neither of these values, then the error-handling code will execute. Because the code no longer makes assumptions about the data being passed to it, the CGI script has become more stable and secure.
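The error-handling branch could be filled in along the following lines. This is a sketch under stated assumptions: the read_coffee_answer helper and the wording of the messages are hypothetical, and the missing-field case is tested first so that an absent my_choice does not trigger an uninitialized-value warning.

```perl
use strict;
use warnings;

# Returns a pair: the recognized answer ("yes" or "no", undef on error) and
# an error message for the user (undef on success).
sub read_coffee_answer {
    my ($form) = @_;
    my $choice = $form->{"my_choice"};

    if (!defined $choice || $choice eq "") {
        return (undef, "Please answer the question: Do you drink coffee?");
    }
    elsif ($choice eq "button_yes") {
        return ("yes", undef);
    }
    elsif ($choice eq "button_no") {
        return ("no", undef);
    }
    else {
        return (undef, "The answer submitted was not recognized.");
    }
}

for my $form_Data ({ my_choice => "button_yes" }, {}, { my_choice => "maybe" }) {
    my ($answer, $error) = read_coffee_answer($form_Data);
    print defined $error ? "Error: $error\n" : "Answer: $answer\n";
}
```

Separating the missing-answer case from the unrecognized-answer case lets the script ask the user to complete the form in the first case while treating the second as a possible tampering attempt.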
