How do forms really work?
07-09-2005
by John Saya
Have you ever entered your name on a web page? How about a credit card number? Maybe you clicked a box
that asked if you were Male or Female? If you have ever submitted any type of information while on a web
site, then you submitted a form.
When someone enters information into a form on a web page, it is usually sent to a script on the web
server to be processed in some way. The script receives the form data as a set of name-value pairs.
The names are what you defined in the INPUT, SELECT, or TEXTAREA tags in your form, and the values are
whatever the user typed in or selected. Users can also submit files with forms, but we won't cover that
here.
Let's say you have a form that contains this information:
Name: <INPUT NAME=name VALUE="John">
Gender: <INPUT NAME=gender VALUE="Male">
Age: <INPUT NAME=age VALUE="25">
When the above form is submitted, the name-value pairs are sent back to the web server as one long string,
which you need to parse. It's not very complicated, and there are plenty of existing routines to do it
for you. The long string is in one of these two formats:
"name=John&gender=Male&age=25"
"name=John;gender=Male;age=25"
We just split the line between the ampersands or semicolons, and then on the equal signs.
You should be aware that the original long string is URL-encoded, to allow for spaces, equal signs,
ampersands, and so forth in the user's input. This means that certain characters are
replaced with other characters or with escape sequences. For example, a space is converted to a plus
sign (+). So after we split the string into name-value pairs, we do two things to each name and value:
1. Convert all "+" characters to spaces
2. Convert each "%" followed by two hexadecimal digits back to the character it stands for. For example, "%3d" becomes "=".
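As a minimal sketch of these two steps, assuming hypothetical helpers named url_decode and parse_pairs
(they are not part of the scripts later in this article), note that the order matters: split the string
first, then decode each piece, so that an encoded "&" or "=" inside a value is not mistaken for a separator.

```perl
use strict;
use warnings;

# Hypothetical helper: undo URL-encoding on one name or one value.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                                # step 1: "+" becomes a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # step 2: "%3d" becomes "="
    return $text;
}

# Hypothetical helper: split first, decode second.
sub parse_pairs {
    my ($raw) = @_;
    my %fields;
    foreach my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name;            # skip empty pairs such as "a=1&&b=2"
        $value = '' unless defined $value;    # a name with no "=" gets an empty value
        $fields{ url_decode($name) } = url_decode($value);
    }
    return %fields;
}

my %fields = parse_pairs('name=John+Smith&note=a%3db%26c&age=25');
print "$_ = $fields{$_}\n" for sort keys %fields;
```

Here the value of "note" decodes to "a=b&c" even though it contains both separator characters.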
But where do you get the long string? That depends on the HTTP method the form was submitted with.
A form can be submitted in one of two ways: GET or POST. For GET submissions, the
long string is in the environment variable QUERY_STRING. For POST submissions, the script reads it from
STDIN, and the exact number of bytes to read is in the environment variable CONTENT_LENGTH.
To understand this concept a little more, you need to know more about how GET and POST work.
They are two different methods defined in HTTP that do very different things, but both happen to be
able to send form submissions to the server.
GET is how your web browser downloads most files, like web pages and pictures. It can also be used for most
form submissions, if there's not too much data. The limit varies depending on your web browser and web server.
Keep in mind that your web browser can cache GET responses too. So if you submit two identical GET
requests, the first will be sent to the web server, but the second may be displayed from your browser's
cache instead. This speeds up viewing web pages across the Internet, but it's not so good if you want to
log each request, store data, or otherwise take an action for each request. GET includes all of the form
information right in the URL. For example:
http://www.yourserver.com/cgi-bin/script.cgi?name1=value1&name2=value2
Everything after the question mark is sent back to the server for the script to read.
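For illustration, the reverse step (roughly what a browser does when it builds such a URL) can be sketched
as follows. The helper names url_encode and build_query and the sample field values are assumptions made
for this example, and the sketch assumes plain ASCII input; the script URL is the placeholder used above.

```perl
use strict;
use warnings;

# Hypothetical helper: URL-encode one name or value the way a form submission does.
sub url_encode {
    my ($text) = @_;
    # Replace every character that is not a letter, digit, "-", "_", "." or space
    # with "%" followed by its two-digit hexadecimal code.
    $text =~ s/([^A-Za-z0-9\-_. ])/sprintf("%%%02X", ord($1))/ge;
    $text =~ tr/ /+/;    # spaces become plus signs
    return $text;
}

# Hypothetical helper: build "name1=value1&name2=value2" from a list of name/value pairs.
sub build_query {
    my @pairs = @_;
    my @parts;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        push @parts, url_encode($name) . '=' . url_encode($value);
    }
    return join '&', @parts;
}

my $url = 'http://www.yourserver.com/cgi-bin/script.cgi?'
        . build_query(name => 'John Smith', formula => 'a=b&c');
print "$url\n";
```

Because the "=" and "&" inside the second value are escaped, the server can still split the string
unambiguously.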
The code below reads and displays all of the name-value pairs passed to the server when you call it
from your web browser. So, if this script were installed at the URL above, it would display this in your
web browser:
name1 = value1
name2 = value2
Keep in mind that this demonstrates the GET method only.
#!/usr/bin/perl
# parse_get.cgi
use strict;
use warnings;

# Hash that holds the decoded form fields
my %FORM;

parse_form();
print "Content-type: text/html\n\n";
foreach my $key (sort keys %FORM)
{
    print "$key = $FORM{$key}<br />\n";
}
exit;

sub parse_form {
    # An unset QUERY_STRING simply means there are no pairs
    my $query = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
    # First we split all name-value pairs
    foreach (split(/[&;]/, $query)) {
        # Now convert all + signs to spaces
        s/\+/ /g;
        # Split the name-value pairs between the = signs
        # Then assign to $name and $value
        my ($name, $value) = split('=', $_, 2);
        next unless defined $name;
        $value = '' unless defined $value;
        # Convert all %XX hexadecimal sequences back to characters
        $name  =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        # Assign the name-value pairs to the hash %FORM;
        # repeated names are joined with a "\0" character
        $FORM{$name} .= "\0" if defined($FORM{$name});
        $FORM{$name} .= $value;
    }
}
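Note that the parser above joins repeated field names (for example, several checkboxes sharing one NAME)
with a "\0" character rather than overwriting the earlier value. A small sketch of how a script might
unpack such a field follows; the helper name field_values and the sample hash contents are assumptions
for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every value stored for one field name.
# Assumes values for a repeated field were joined with "\0".
sub field_values {
    my ($form, $name) = @_;
    return () unless defined $form->{$name};
    return split /\0/, $form->{$name}, -1;   # -1 keeps trailing empty values
}

# Sample data as the parser would store it for "?color=red&color=blue&name=John"
my %FORM = (color => "red\0blue", name => 'John');

my @colors = field_values(\%FORM, 'color');
print scalar(@colors), " colors: @colors\n";
```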
POST is normally used to send a chunk of data to the server to be processed.
When an HTML form is submitted using POST, your form data is attached to the end of the POST request,
in its own object (specifically, in the message body). This is not as simple as using GET, but is more
versatile. For example, you can send entire files using POST. The data is also not subject to the URL
length limits that apply to GET, and you can count on your script being called every time the form is
submitted, because browsers normally do not answer a POST from their cache.
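To make the "message body" concrete, here is a rough sketch of the text a browser sends for a POST
submission. The host, path, and helper name build_post_request are placeholders chosen for this example,
and real browsers send more headers than shown.

```perl
use strict;
use warnings;

# Hypothetical helper: assemble the text of a minimal HTTP POST request.
# $body is expected to be URL-encoded already, e.g. "name=John&age=25".
sub build_post_request {
    my ($host, $path, $body) = @_;
    return join("\r\n",
        "POST $path HTTP/1.1",
        "Host: $host",
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: " . length($body),   # the server passes this on as CONTENT_LENGTH
        "",                                    # blank line ends the headers
        $body,                                 # the script reads this part from STDIN
    );
}

my $request = build_post_request('www.yourserver.com', '/cgi-bin/script.cgi',
                                 'name=John&gender=Male&age=25');
print $request, "\n";
```

The form data is the same long string used with GET; it simply travels after the headers instead of
inside the URL.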
The code below processes a GET or POST request that is sent to the server.
#!/usr/bin/perl
# parse_form.cgi
use strict;
use warnings;

# Hash that holds the decoded form fields
my %FORM;

parse_form();
print "Content-type: text/html\n\n";
foreach my $key (sort keys %FORM)
{
    print "$key = $FORM{$key}<br />\n";
}
exit;

sub parse_form {
    my @pairs;
    my $method = uc(defined $ENV{'REQUEST_METHOD'} ? $ENV{'REQUEST_METHOD'} : '');
    # If it's a GET request use the QUERY_STRING variable
    if ($method eq 'GET') {
        # Split the name-value pairs
        my $query = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
        @pairs = split(/[&;]/, $query);
    }
    # If it's a POST request read from STDIN and get the length
    # from the CONTENT_LENGTH environment variable
    elsif ($method eq 'POST') {
        # Get the input
        my $buffer = '';
        read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'} || 0);
        # Split the name-value pairs
        @pairs = split(/[&;]/, $buffer);
    }
    else {
        # If neither method is called, show an error message
        error('request_method');
    }
    foreach my $pair (@pairs) {
        # Split the name-value pairs and assign to $name and $value
        my ($name, $value) = split(/=/, $pair, 2);
        next unless defined $name;
        $value = '' unless defined $value;
        # Convert + signs to spaces and %XX hexadecimal sequences to characters
        $name  =~ tr/+/ /;
        $name  =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        # If they try to include server side includes, erase them, so they
        # aren't a security risk if the html gets returned. Another
        # security hole plugged up.
        $value =~ s/<!--(.|\n)*-->//g;
        # Remove HTML Tags
        $value =~ s/<([^>]|\n)*>//g;
        # Assign the name-value pairs to the hash %FORM;
        # repeated names are joined with ", "
        if ($FORM{$name} && ($value)) {
            $FORM{$name} = "$FORM{$name}, $value";
        }
        elsif ($value ne "") {
            $FORM{$name} = $value;
        }
    }
}

sub error {
    my ($msg) = @_;
    print "Content-Type: text/html\n\n";
    print "<CENTER><H2>$msg</H2></CENTER>\n";
    exit;
}
The above code can be used to display all of the variables of any form
you create on your web site.
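One caveat when echoing form values back into a page: stripping tags with a regular expression is easy to
get wrong, and a common alternative is to escape the HTML-special characters before printing. A minimal
sketch, assuming a hypothetical helper named html_escape that is not part of the scripts above and using
made-up sample values:

```perl
use strict;
use warnings;

# Hypothetical helper: make a string safe to print inside an HTML page.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Sample data standing in for a parsed form
my %FORM = (name => 'John <b>Smith</b>', motto => 'R&D "rocks"');
foreach my $key (sort keys %FORM) {
    print html_escape($key), ' = ', html_escape($FORM{$key}), "<br />\n";
}
```

With this approach the visitor's input is displayed exactly as typed instead of being silently shortened.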
With this knowledge, there are many more things you can do with your web site.
Here's a great tip: why not start asking for your visitors' e-mail addresses, so
you can keep in touch with them? You'll need a way to capture that information
from their web browsers, but you know how that's all done now.
You can always use existing software to design and manage your forms.
Take a look at FormPRO II or FormSender for that.