ASP, CGI And PHP Scripts And Record-Locking: What Every Webmaster Needs To Know
Shelley Lowery, Sunil Tanna June 30, 2006
Many of us install server-side (ASP, CGI or PHP) scripts on our web
sites, and many of these scripts store data on the server. However,
poorly designed scripts can experience performance problems and
sometimes even data corruption on busy (and not so busy) web sites.
If you're not a programmer, why should this matter to you?
Answer:
Even if you're just installing and using server-side scripts, you'll
want to make sure that the scripts that you choose don't randomly break
or corrupt your data.
First, here are some examples of the types of scripts that store data on
web servers. (Of course, many scripts in each of these, and other,
categories are well-designed, and run perfectly well even on very busy
web sites.)
1. Follow-up autoresponders typically store the list of subscribers to
the autoresponder, as well as where each subscriber is in the sequence
of messages. Examples of autoresponder scripts:
http://www.scriptcavern.com/scr_email_auto.php
2. Classified ad
scripts store (at least) a list of all the classified ads placed by
visitors. Examples of this type of script:
http://www.scriptcavern.com/scr_classified.php
3. Free for all
links scripts store a list of all links posted by visitors. See some
example scripts listed at: http://www.scriptcavern.com/scr_ffa.php
4. Top site scripts usually store a list of the members of the top site
as well as information about the number of "votes" that each has
received. For examples of this type of script, see
http://www.scriptcavern.com/scr_topsite.php
So what kind of scripts have problems? And what sort of problems am I talking about?
Well, the principal problems all relate to what happens when bits of
data from multiple users need to be stored or updated at the same time.
Some scripts handle these situations well, but others don't...
DATA CORRUPTION
Here's a common data corruption problem that can occur with many scripts:
1. When some bit of data needs to be updated, a copy of the server-side script starts running, and then starts updating it.
2. If another user comes along and does an update before the first copy
of the script has finished, a second copy of the script starts running
at the same time.
3. There are a number of ways things can now go wrong, for example:
(a) What if the first copy of the script reads in the data, then the
second copy reads the same data, then the first copy updates the data,
then the second copy updates the data? Answer: any changes made by the
first copy of the script can get lost.
(b) What if the first and second copies of the script are both adding
multiple bits of new data to the store at the same time? For example,
imagine each needs to store the headline, description and the name of
the person posting a classified ad. Well, what can happen (with some
scripts) is that the two classified ads get intermingled, so you might
get (for example) HEADLINE-1, DESCRIPTION-1, HEADLINE-2, PERSON-1,
DESCRIPTION-2, PERSON-2. Or worse yet, you might get bits of each part
of each classified ad, mixed with the bits of the other. This type of
thing is usually really bad news, as your data may consequently become
unusable from that point on.
Does
this sound too unlikely a problem to worry about? Don't bank on it...
even if it happens only 1 time in 1,000, or 1 in 10,000, eventually it
will happen: You need a solution.
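To make case (a) concrete, here is a minimal Python sketch. It assumes a hypothetical vote counter for a top site script, with a plain dictionary standing in for the data file, and it plays the two copies of the script by hand in the unlucky order described above. The names store, read_votes and write_votes are made up for this illustration.

```python
# Hypothetical vote counter: a dict stands in for the data file on the server.
store = {"votes": 10}

def read_votes():
    """Read the current count, as a script would read it from its file."""
    return store["votes"]

def write_votes(value):
    """Write a new count back, replacing whatever was there."""
    store["votes"] = value

# Two copies of the script, interleaved exactly as in case (a).
first_copy = read_votes()      # first copy reads 10
second_copy = read_votes()     # second copy reads the same 10
write_votes(first_copy + 1)    # first copy writes 11
write_votes(second_copy + 1)   # second copy also writes 11

print(store["votes"])          # prints 11, although two votes were cast
```

Neither copy did anything wrong on its own; the vote is lost purely because of the order in which the reads and writes happened.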
So the real question is: is it possible for programmers to create
scripts without these kinds of problems? Fortunately the answer is yes,
and there are a number of ways that programmers can address them:
1. They can store each bit of
data in a separate file. This isn't necessarily a total solution by
itself (in particular, a script which just does this could still have
problems if multiple copies of a script update the same file at the
same time), but it does make data corruption less likely, and if
corruption does occur, at least it won't corrupt the entire data store
in one go.
2. They can use file-locking. This means that if one
copy of a script is working with a file, another copy of the script is
prevented from working on that file, until the first copy has finished.
File-locking works if done correctly, but programming it into a script
needs to be done very carefully and precisely, for every single
possible case... even a tiny bug or omission can allow the possibility
of data-corruption in through the backdoor!
3. They can use a database (such as MySQL) to store the data. Provided
the data is properly structured in the database, the database handles
the locking automatically. And, as the programmer doesn't have to write
their own special locking routines, the possibility of bugs and
omissions is much reduced.
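As an illustration of option 2, here is a minimal Python sketch of file-locking around a read-update-write cycle. It assumes a Unix-like server (the standard fcntl module is not available on Windows) and a hypothetical counter file holding a single number; add_vote is a made-up name for this example, not part of any particular script.

```python
import fcntl
import os

def add_vote(path):
    """Add one vote to the counter stored in `path`, under an exclusive lock."""
    # Open for reading and writing, creating the file if it is missing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Wait here until no other copy of the script holds the lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            text = f.read().strip()
            votes = int(text) if text else 0
            votes += 1
            f.seek(0)
            f.truncate()
            f.write(str(votes))
            f.flush()  # push the new value out before the lock is released
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return votes
```

Note that the read and the write both happen while the lock is held; locking only the write would still allow lost updates. This kind of lock is also advisory, so it only protects the file if every script that touches it takes the lock in the same way, which is exactly the sort of omission warned about above.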
PERFORMANCE PROBLEMS
Of course, avoiding data corruption should be the paramount
consideration in choosing a script, but is there anything else we need
to be concerned about?
Answer: Performance
Of course, all webmasters are aiming to build busy, high-traffic web sites... but will your scripts be able to handle the load?
Go
back and re-read the paragraph on file-locking. Now think about what
would happen if all the classified ads on your classified page were
stored in a single file (or all the links on your top site, or all the
subscribers to your autoresponder, etc.).
What would happen?
Answer:
Because each update can only be performed after the previous update has
been completely finished, your site may be slow, or even unable to
handle all your users' requests.
So what's the solution?
There are two options that programmers can use:
1. They can use lots of small files and file-lock each individually
(for example, one per classified, one per top site listing, etc.). Of
course, this needs to be handled very carefully...
2. They can
use a database (like MySQL), as databases allow any one individual
record ("row") to be updated, even when another is also being updated.
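As a rough sketch of option 2, the Python example below keeps top site votes in a database table and lets the database do the update in a single statement. It uses the sqlite3 module from the Python standard library purely as a stand-in so the example runs without a database server; the table name sites and the function names are made up for this illustration. One caveat: SQLite locks the whole database file while writing, so this shows the "let the database handle the locking" idea rather than the row-by-row concurrency that a server database such as MySQL can offer.

```python
import sqlite3

def open_store(path):
    """Open the database at `path` and make sure the votes table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sites ("
        "name TEXT PRIMARY KEY, votes INTEGER NOT NULL)"
    )
    conn.commit()
    return conn

def add_site(conn, name):
    """Register a site with zero votes; do nothing if it already exists."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "INSERT OR IGNORE INTO sites (name, votes) VALUES (?, 0)", (name,)
        )

def add_vote(conn, name):
    """Add one vote; the database does the read and the write in one step."""
    with conn:
        conn.execute(
            "UPDATE sites SET votes = votes + 1 WHERE name = ?", (name,)
        )

def get_votes(conn, name):
    """Return the vote count for `name`, or None if the site is unknown."""
    row = conn.execute(
        "SELECT votes FROM sites WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

The important detail is that the script never reads the count, adds one, and writes it back itself. It asks the database to do "votes = votes + 1", so there is no gap in which another copy of the script can slip in.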
IN CONCLUSION
Now, let's summarise:
1. Scripts that store data in files need to use file-locking to avoid
data-corruption, and they also need to break the data into separately
updateable chunks to avoid performance problems on busy web sites.
2. Scripts that store data in databases (like MySQL), provided of course
that they have been properly coded, are usually less likely to suffer
from data-corruption or performance problems.
And one additional point:
3. Even the best script is not immune to hard-disk hardware failures,
your web host being struck by lightning, and all the other snafus that
can happen. So, do take regular back-ups of any data that you can't
afford to lose!
In short, even if you're not a script programmer, you
need to be aware of data storage issues. In future, when considering a
script for your web site, don't be afraid to ask some hard questions
about how it stores data and how well it handles multiple users.