CGI Environmental Variables


Perlfect Solutions
March 08, 2006




http://www.perlfect.com

One of the methods the web server uses to pass information to a CGI script is environmental variables. These are created and assigned appropriate values within the environment that the server spawns for the CGI script. They can be accessed like any other environmental variable, for example with getenv() in C or $ENV{'VARIABLE_NAME'} in Perl. Many of them contain important information that most CGI programs need to take into account.
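As a minimal sketch of reading such a variable from Perl, the helper below returns a fallback value when the server did not set the variable. The name cgi_var is an assumption made for this illustration, not part of any module.

```perl
use strict;
use warnings;

# Return the named environment variable, or $default when it is not set.
# cgi_var is a hypothetical helper name used only for this example.
sub cgi_var {
    my ($name, $default) = @_;
    return defined $ENV{$name} ? $ENV{$name} : $default;
}

my $method = cgi_var('REQUEST_METHOD', '');
my $agent  = cgi_var('HTTP_USER_AGENT', 'unknown');
```

Supplying a default keeps the script free of "uninitialized value" warnings when a server omits a variable.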

This list highlights some of the most commonly used ones, along with a brief description and notes on possible uses for them. It is by no means a complete reference; many servers pass their own extra variables, or use different names for some of them, so check your server's documentation. The purpose of this list is only to suggest some common good uses for some of the server-passed information.


CONTENT_LENGTH

The length, in bytes, of the input stream that is being passed through standard input.

This is needed when a script processes input sent with the POST method, in order to read the correct number of bytes from standard input. Some servers end the input stream with EOF, but this is not guaranteed behaviour, so to be sure that you read the correct amount of input you can do something like

read(STDIN, $input, $ENV{CONTENT_LENGTH});
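A slightly more defensive variant is sketched below. It loops until the announced number of bytes has arrived, since a single read() may return fewer bytes than requested, and stops early if the client sent less than it announced. The helper name read_body is an assumption for this example.

```perl
use strict;
use warnings;

# Read up to $length bytes from the filehandle $fh.
# read_body is a hypothetical helper name.
sub read_body {
    my ($fh, $length) = @_;
    my $body = '';
    while (length($body) < $length) {
        # Four-argument read() appends at the given offset in $body.
        my $got = read($fh, $body, $length - length($body), length($body));
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # end of input before $length bytes arrived
    }
    return $body;
}

# Typical call inside a CGI script handling POST:
#   binmode STDIN;
#   my $input = read_body(\*STDIN, $ENV{CONTENT_LENGTH} || 0);
```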


DOCUMENT_ROOT

The directory against which all www document paths are resolved by the server.

Sometimes it is useful to know the server's document root, in order to compose absolute file paths when all the script is given as a parameter is the path of the file relative to the www directory. It is also good practice to have your script resolve paths in this way, both for security reasons and for portability. Another common use is to work out what the URL of a file will be if you only know its absolute path and the hostname (SERVER_NAME, described further down, gives you the hostname).
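Both directions can be sketched as follows. The helper names doc_path and url_path, and the /var/www root in the usage note, are assumptions for illustration. The ".." check is one simple way to keep a requested path inside the document root, not a complete defence.

```perl
use strict;
use warnings;

# Join a document-relative path onto the document root, refusing any
# path that tries to climb out of it with "..".
# doc_path and url_path are hypothetical helper names.
sub doc_path {
    my ($root, $relative) = @_;
    return if grep { $_ eq '..' } split m{/}, $relative;
    $root     =~ s{/+$}{};    # drop trailing slashes from the root
    $relative =~ s{^/+}{};    # drop leading slashes from the relative path
    return "$root/$relative";
}

# Reverse direction: turn an absolute file path under the document root
# into the URL path at which the server would serve it.
sub url_path {
    my ($root, $absolute) = @_;
    $root =~ s{/+$}{};
    return unless index($absolute, "$root/") == 0;
    return substr($absolute, length $root);
}
```

In a script the root would come from $ENV{DOCUMENT_ROOT}, for example doc_path($ENV{DOCUMENT_ROOT}, '/images/logo.gif'). Both helpers return nothing when the path falls outside the root.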


HTTP_REFERER

The URL of the page that referred the web client to the script (via a link or redirection). Typed URLs and bookmarks usually result in this variable being left blank.

In many cases a script may need to behave differently depending on the referer. For example, you may want to restrict your counter script to operate only if it is called from one of your own pages, to prevent someone from using it from another web page without your permission. The referer may even be the actual data that the script needs to process. Extending the example above, you might install your counter on many pages and have the script figure out from the referer which page generated the call, incrementing the appropriate count and keeping a separate count for each individual URL. A snippet for the referer blocking example could be:

die unless ($ENV{HTTP_REFERER} || '') =~ m{^http://(www\.)?\Q$mydomain\E/};
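The per-page counter idea might look like the sketch below. The %count hash stands in for whatever storage (file, DBM, database) a real counter would use, and count_hit is a hypothetical name. Keep in mind that the referer is supplied by the client and can be forged, so this is a convenience check rather than real access control.

```perl
use strict;
use warnings;

# In-memory stand-in for the counter's real storage (an assumption).
my %count;

# Increment and return the count for the referring page, or return
# nothing when the referer is missing or not on our own domain.
# count_hit is a hypothetical helper name.
sub count_hit {
    my ($referer, $mydomain) = @_;
    return unless defined $referer;
    # Capture the path only, ignoring any query string or fragment.
    return unless $referer =~ m{^https?://(?:www\.)?\Q$mydomain\E(/[^?#]*)};
    return ++$count{$1};
}
```

The \Q...\E pair makes the dots in the domain match literally, and the leading ^ stops a foreign URL that merely contains your domain from passing.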


HTTP_USER_AGENT

The name/version of the client issuing the request to the script.

As with referers, you might need to implement behaviours that vary with the client software used to call the script. A redirection script could use this information to point the client to a page optimized for a specific browser, or to block requests from specific clients, such as robots or clients known not to support features that the script's normal output relies on.
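A redirection script of this kind could choose its target as sketched below. The patterns and the page names (blocked.html, text.html, index.html) are invented for illustration, and since clients choose their own user agent string the result is only a hint.

```perl
use strict;
use warnings;

# Pick a page for the given user agent string.
# page_for_agent, the patterns and the page names are all hypothetical.
sub page_for_agent {
    my ($agent) = @_;
    $agent = '' unless defined $agent;
    return 'blocked.html' if $agent =~ /bot|crawler|spider/i;  # likely a robot
    return 'text.html'    if $agent =~ /^Lynx/;                # text-mode browser
    return 'index.html';                                       # everyone else
}

# In a script: my $page = page_for_agent($ENV{HTTP_USER_AGENT});
```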


PATH_INFO

The extra path information following the script's path in the URL.

A URL that refers to a script may contain additional information, commonly called 'extra path information'. This is appended to the URL of the script and begins with a slash. The server puts this information in the PATH_INFO variable, which can be used as a method to pass arguments to the script.
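For instance, with a hypothetical URL such as /cgi-bin/counter.cgi/news/2006 the server would set PATH_INFO to /news/2006. A minimal sketch of turning that into a list of arguments, with path_args as an assumed helper name:

```perl
use strict;
use warnings;

# Split extra path information into its slash-separated components.
# path_args is a hypothetical helper name.
sub path_args {
    my ($path_info) = @_;
    return () unless defined $path_info && length $path_info;
    $path_info =~ s{^/}{};    # drop the leading slash
    return split m{/}, $path_info;
}

# In a script: my ($section, $year) = path_args($ENV{PATH_INFO});
```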


PATH_TRANSLATED

The PATH_INFO mapped onto DOCUMENT_ROOT.

Usually PATH_INFO is used to pass a path argument to the script. For example, a counter might be passed the path to the file where counts should be stored. The server also maps the PATH_INFO variable onto the document root path and stores the result in PATH_TRANSLATED, which can be used directly as an absolute file path.


QUERY_STRING

Contains query information passed via the calling URL, following a question mark after the script location.

QUERY_STRING is the equivalent of the content passed through STDIN in POST, but for scripts called with the GET method. Query arguments are written in this variable in their URL-encoded form, just as they appear in the calling URL. You can process this string to extract useful parameters for the script.
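Processing the string by hand can be sketched as below. parse_query is a hypothetical name, and the sketch keeps only the last value of a repeated key. In practice a module such as CGI.pm handles this decoding for you.

```perl
use strict;
use warnings;

# Decode a URL-encoded query string into a hash of name => value.
# parse_query is a hypothetical helper name.
sub parse_query {
    my ($query) = @_;
    my %param;
    return %param unless defined $query;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                              # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX encodes one byte
        }
        $param{$key} = $value;    # a repeated key keeps only its last value
    }
    return %param;
}

# In a script: my %param = parse_query($ENV{QUERY_STRING});
```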


REMOTE_ADDR

The IP address from which the client is issuing the request.

This can be useful either for logging accesses to the script (for example, a voting script might log voters in a file by their IP address in order to prevent them from voting more than once) or to block or behave differently for particular IP addresses. This might be a requirement in a script that has to be restricted to your local network, and perhaps perform different tasks for each known host.
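Both uses are sketched below under simple assumptions: the local network is identified by a dotted prefix such as 192.168.1. (an example value), and the %voted hash stands in for the log file a real voting script would keep. The names is_local and may_vote are hypothetical.

```perl
use strict;
use warnings;

# True when $addr starts with the dotted prefix, e.g. "192.168.1.".
# The trailing dot matters: without it "192.168.1" would also match
# 192.168.10.x. is_local is a hypothetical helper name.
sub is_local {
    my ($addr, $prefix) = @_;
    return 0 unless defined $addr;
    return index($addr, $prefix) == 0 ? 1 : 0;
}

# In-memory stand-in for a vote log keyed by IP address (an assumption).
my %voted;

# Allow the first vote from each address and refuse later ones.
sub may_vote {
    my ($addr) = @_;
    return 0 if $voted{$addr}++;
    return 1;
}

# In a script: die "local access only\n" unless is_local($ENV{REMOTE_ADDR}, '192.168.1.');
```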


REMOTE_HOST

The name of the host from which the client issues the request.

Just like REMOTE_ADDR above, except that this is the hostname of the remote machine, if it is known via reverse lookup.


REQUEST_METHOD

The method used for the request (usually GET, POST or HEAD).

It is wise to have your script check this variable before doing anything. You can determine where the input will be (STDIN for POST, QUERY_STRING for GET) or choose to permit operation only under one of the two methods. Also, it is a good idea to exit with an explanatory error message if the script is called from the command-line accidentally, in which case the variable is not defined.
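The check described above can be sketched as a single function that returns the raw, still URL-encoded input. raw_input is a hypothetical name. It takes the environment as a hash reference and the input handle as a parameter, so the same code runs under a server or in a test.

```perl
use strict;
use warnings;

# Return the raw request input according to the request method.
# raw_input is a hypothetical helper name.
sub raw_input {
    my ($env, $stdin) = @_;
    my $method = $env->{REQUEST_METHOD};
    # No REQUEST_METHOD means the script was not started by a web server.
    die "This is a CGI script; run it through a web server.\n"
        unless defined $method;
    if ($method eq 'GET' || $method eq 'HEAD') {
        return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
    }
    if ($method eq 'POST') {
        my $input = '';
        read($stdin, $input, $env->{CONTENT_LENGTH} || 0);
        return $input;
    }
    die "Unsupported request method: $method\n";
}

# In a script: my $input = raw_input(\%ENV, \*STDIN);
```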


SCRIPT_NAME

The virtual path from which the script is executed.

This is very useful if your script outputs HTML code that contains calls to itself. Having the script determine its virtual path (and hence, along with SERVER_NAME and SERVER_PORT, its full URL) is much more portable than hard coding it in a configuration variable. Also, if you like to keep a log of all script accesses in some file, and want each script to report its name along with the calling parameters or time, it is very portable to use SCRIPT_NAME to print the path of the script.
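A self-referencing form might be generated as in the sketch below. The function name self_form and the form fields are invented for illustration.

```perl
use strict;
use warnings;

# Build an HTML form that posts back to the script itself.
# self_form and the field names are hypothetical.
sub self_form {
    my ($script_name) = @_;
    return qq{<form method="POST" action="$script_name">\n}
         . qq{<input type="text" name="comment">\n}
         . qq{<input type="submit" value="Send">\n}
         . qq{</form>\n};
}

# In a script: print self_form($ENV{SCRIPT_NAME});
```

Because the action comes from SCRIPT_NAME, the script keeps working if it is renamed or moved to another directory.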


SERVER_NAME

The web server's hostname or IP address.

Much like SCRIPT_NAME, this value can be used to create more portable scripts in case they need to assemble URLs on the local machine. In scripts that are made publicly accessible on a system with many virtual hosts, this makes it possible to behave differently depending on the virtual server that is calling the script.


SERVER_PORT

The web server's listening port.

Complements SERVER_NAME above in forming URLs to the local system. This is a commonly overlooked aspect, but your script will be more portable if you keep in mind that not all servers run on the default port, and those that do not need an explicit port reference in the server address part of the URL.
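Putting SERVER_NAME, SERVER_PORT and SCRIPT_NAME together, a script's own URL could be assembled as sketched below. The name self_url is hypothetical, and the sketch assumes plain http, for which the default port is 80.

```perl
use strict;
use warnings;

# Assemble the script's own URL from server-passed variables.
# self_url is a hypothetical helper name; plain http is assumed.
sub self_url {
    my ($env) = @_;
    my $url = "http://$env->{SERVER_NAME}";
    # Add the port only when it differs from the http default.
    $url .= ":$env->{SERVER_PORT}"
        if defined $env->{SERVER_PORT} && $env->{SERVER_PORT} != 80;
    return $url . $env->{SCRIPT_NAME};
}

# In a script: my $url = self_url(\%ENV);
```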

