Back to Programming Articles

[Part 3] Learning the Basics of Coding: Best Practices
by Pandemikk 19 Jun 2012

The third part of my four part article will focus on better coding: Improving the efficiency and logic of your existing code. This skill literally makes the difference between your novice vb.org hack and a full-fledged modification that people will pay for.

The Basics
I'm going to fully assume you have read or already have an understanding of [Part 1] Learning the Basics of Coding and [Part 2] Learning the Basics of Coding: Database Interaction. I'm going to assume you have knowledge of types, variables, constants, objects, resources, classes, methods, properties, table structures, etc.,. If you don't, you're probably going to get lost.

PHP
PHP is a fast language, capable of performing countless operators in microseconds. However, when poorly coded, something that should have executed in .006 seconds may end up taking .06 seconds. Don't think that's a big difference? Then leave the article - because that's slower by a factor of 10.

loops
Knowing which type of loop to use may seem trivial, but it's like using the wrong type of screwdriver or wrench. Sure, it'll get the job done, but it's going to take longer and waste more energy.
Here are some general rules when trying to figure out what type of loop to use:

for is for when you know how many times you want to execute a loop.
PHP Code:
for ($i 0$i 10; ++$i)
{
    echo 
$i;



while
is when you aren't sure how many times, but you want to loop until a condition is met.
PHP Code:
while ($row $db->fetch_array($mysql_result))
{
    
var_dump($row);

Don't know what var_dump is? Google it. It's very useful.

foreach is generally just for looping through an array.
PHP Code:
foreach ($array AS $key => $val)
{
    echo 
$key' - '$val;

Multiple parameters > string concatenation.

Though all loops have their differences, they all have their similarities too:

Do not use count() or its alias sizeof() in any loop expression.
PHP Code:
$array array_fill(0100'value'); # Creates an array
for ($i 0$i count($array); ++$i)
{
    echo 
$i;

Why are you calling count() at the beginning of each iteration?

PHP Code:
$array array_fill(0100'value');
$count count($array);
for (
$i 0$i $count; ++$i)
{
    echo 
$i;

count() is called once instead of 100 times. I just saved you some processing time.

Now you may have noticed I am consistently pre-incrementing instead of post-incrementing my $i. Why? It's about 10% faster. Pre-increment where possible!

Let's move on to another common loop problem. Can you spot it?
PHP Code:
for ($i 1$i 10; ++$i)
{
    for (
$j 1$j 10; ++$j)
    {
        
$troll_var = ($i 10) + ($j 10);
    }

The problem is the loop-invariant code. A loop-invariant is code that's true outside of the loop and, subsequently, true in each loop iteration. What does this mean? It means you're executing code multiple times when it only needs to be executed once!

PHP Code:
for ($i 1$i 10; ++$i)
{
    
$a $i 10;
    for (
$j 1$j <= 10; ++$j)
    {
        
$troll_var $a + ($j 10);
    }

By moving $i * 10, a value which will not change within the second loop, we are saving processing time. Why calculate ($i * 10) ten times when you can calculate it once? It's like not having memory and asked to do the same equation over and over. You just answered it, so why are you trying to find out the answer again?

functions
Calling user-defined functions is expensive. Whenever possible, don't be afraid to inline a function.

PHP Code:
function add_one($i)
{
    ++
$i;
    return 
$i;
}
   
$j 0;
for (
$i 0$i 1000; ++$i)
{
    
$j add_one($j);

You're calling a function a thousand times. Remember when I said count() inside a for was bad? This is even worse! Not just because it's being called 1000 times but because a user-defined function is much slower than a built-in function!

PHP Code:
$j 0;
for (
$i 0$i 1000; ++$i)
{
    ++
$j;

This will be exponentially faster.

Now I feel obligated to mention that you shouldn't inline all user-defined functions you possibly can. This is just silly. You should only do this in obvious cases where the function is either hardly used, or is so short it just doesn't make sense. Functions centralize your code and make life easier. Don't throw away a cup because you can drink from the tap! But DO get rid of that old cup your great-aunt gave you that no one ever uses yet it takes up the room of 3 normal cups. Secondly, don't get cocky and try to inline built-in functions, either, because that will be counter-productive. Built-in functions use optimized C code and you're not going to beat that with PHP.

Another problem is sometimes you don't need to use a function at all! You can use a language construct which are faster than functions!
PHP Code:
$foo true;
unset(
$foo);
   
if (
is_null($foo))
{
    echo 
'Foo is null';

While this code is correct and gets the job done, one must wonder why a function is being used?

PHP Code:
$foo true;
unset(
$foo);
   
if (!isset(
$foo))
{
    echo 
'Foo is null';

There we go. Drop the function and use a language construct instead. Not only is it faster but, in my opinion, it looks nicer, too.

use the right function
Don't use preg_split instead of explode when you don't need regular expressions. This goes for any function that supports regex. If you aren't using regex then use an alternative. e.g: str_replace instead of preg_replace.

freeing memory
This should be obvious, but it's something that isn't done as often as it should (sometimes I find code with arrays that should have been unset a long time ago, but that's not even the bad part - the bad part is it's my code.).

PHP Code:
$a = array();

$array array_fill(01000'value');
foreach (
$array AS $key => $val)
{
    
$a[$key] = $val;
}

// pretend there's hella more code here 
I've already got all the data from $array into $a so it's no longer needed. Why keep $array around?

PHP Code:
$unset($array); 
There we go. Memory freed!

initialize!
This is such an underestimated coding practice. Whenever you are using a variable or an array element you should always be sure it exists before doing anything besides assigning a value to it! Why? Keep reading!

PHP Code:
/*****
* Bad code
* @initalize - (bool) false
*/
// incrementing an undefined variable is much slower than incrementing a defined one
++$i;
   
// unnecessary else
if ($var == true)
{
    
$bool true;
}
else
{
    
$bool false;
}
   
/*****
* Good code
* @initalize - (bool) true
*/
if (!isset($i))
{
    
$i 0;
}
++
$i;
   
$var $bool false;
if (
$var == true)
{
    
$bool true;

It should also be mentioned that initializing your variables can play an importance in security as well. More in [Part 4]. Furthermore, attempting to do something to an undefined variable can also result in: Notice: Undefined variable: $variable_name with the right php.ini configuration (you shouldn't be outputting errors on a live site though).

logic
One thing novice coders often do is making a mess of their logic (I do that a lot, too, actually). Examples include evaluating the same expression multiple times in a script, within an if condition, redundant code, etc. There's a popular principle called "DRY"; it stands for don't repeat yourself.

PHP Code:
$i $j 5;
if ((
$i == AND $j == 5) OR ($i == AND $j == 5))
{
    echo 
true;

$j was checked twice. Why? Poor logic. Not only is poor logic a waste of processing time, but it's also harder to read and takes longer to type. There's not one benefit to checking $j twice so break a bad habit and rewrite that code.

PHP Code:
$i $j 5;
if (
$j == AND ($i == OR $i == 5))
{
    echo 
true;

Much better! Cleaner. Faster. Logicaler™.

Let's take it up a notch:
PHP Code:
if ($is_admin == true)
{
    
$access true;
}
else
{              
    if (
$blacklisted == true)
    {
        
$access false;
    }
    else if (
$has_access == true)
    {
        
$access true;
    }

How can I write this code better? I want to give admins access, prevent blacklisted access unless they're an admin, and give access to people that have access unless they're blacklisted.

Let's look at a few things first. We have $access = true twice! Certainly we can refactor the code to remove this redundancy? Secondly, it seems we still have not initialized $access! If a user meets none of those conditions then there's a possibility he may get unauthorized access!

PHP Code:
if ($access !== false)
{
    
// code here

Because NULL is not identical to false, this person has gained access to a place he shouldn't have!

Thirdly, look at the nesting. For such simple checks do we really need double nested code? Certainly not! So how would you rewrite this code better?

PHP Code:
$access false# Initialize
if ($is_admin === true OR ($has_access === true AND $blacklisted === false))
{
    
$access true;

That's how I would do it. Satisfies all the requirements and solves all the problems. I've taken it a step further and used identical operators because it's faster and more fool-proof.

misconceptions
I've taken the time to find out many optimization tips for PHP and have found that a few of the more popular ones are absolutely false. I'm going to include them in here because part of writing good code actually involves knowing what good code is and is not.

single quotes vs. double quotes
There's no noticeable difference in performance between using single and double quotes. Single quotes is indeed a bit faster but the percentage is < .01%. If you want to shave a nano-second in parsing time then by all means convert all double quotes to single quotes. Personally, I'm not going to. As a side note: I use single quotes whenever possible but if double quotes make my life easier I will not hesitate to use them.

PHP Code:
$do_this 'If it makes sense.';
$do_this "I have a $var I want parsed.";
$me 'I do this ' $out ' of habit not ' $out ' of micro-optimizing OCD'
get rid of comments and whitespaces
Don't. You'll waste far more time pressing the backspace button then the amount of processing time saved will ever come close to. In fact, I encourage consistent indentation, spacing and commenting. Keep your code clean, neat, and understandable.

PHP Code:
if($true==true){echo 'i just saved 15 femtoseconds by switching to annoyingly hard to read coding practices';} 
obfuscation
The practice of rewriting variables, constants, functions, classes, etc,. to use as little characters as possible. Similar to the above: Don't. "Keep your code clean, neat, and understandable."

general tips
Contrary to the above, I've also taken the time to confirm many common optimization tips.
  • Declare methods static when possible. Improves speed by a factor of 4.
  • Use echo and take advantage of its multiple parameters. In other words, use a comma instead of a period.
  • $row[’id’] is 7 times faster than $row[id]. Copypasta because everyone's read this one.
  • Never use short tags. Always use <?php and ?
  • For PHP > 5, unset() is faster and uses less memory than type casting null.
MySQL
Going to assume we're in a vB enviornment and that $db is an object, yes? Also going to assume you’ve read [Part 2] Learning the Basics of Coding: Database Interaction, yes? Good!

query_first
Remember that this doesn't add LIMIT 1. This means that unless querying only one result your query will be inefficient. Why? Imagine you want to buy a bottle of honey. You drive to the database (store), and then you go to the table (aisle), and then you SELECT honey. But wait, you didn't limit yourself to one bottle of honey; you grabbed ALL the bottles of honey. So now you're walking to the check-out with your arm full of bottles of honey and proceed to buy them. Then you drive back home with your bounty of honey and while putting away your groceries you throw out all those bottles of honey and put one inside your cabinet. Understand? If you're going to get more than one result add LIMIT 1!

PHP Code:
$derp $db->query_first("SELECT dateline FROM " TABLE_PREFIX "post WHERE threadid = 67 ORDER BY dateline DESC "); 
WRONG! What if this thread had tens of thousands of posts? Your server is going to hang! Hope you aren't using bluehost (zing!).

PHP Code:
$great_success $db->query_first("SELECT dateline FROM " TABLE_PREFIX "post WHERE threadid = 67 ORDER BY dateline DESC LIMIT 1"); 
Fast!

But to be terribly honest, if you want to select the last post in a thread you should be doing this:
PHP Code:
$derp $db->query_first("SELECT lastpost FROM " TABLE_PREFIX "thread WHERE threadid = 67 "); 
This eliminates the need to sort through the posts and since `threadid` is the primary key in table thread it's going to be extremely fast and efficient.

avoid filesort
Sometimes it's unavoidable, but most of the times it is (avoidable).

Not sure what I mean? Enable debug mode and look in the extra column. It may take you awhile, but you're bound to see "using filesort" or "using temporary".

using temporary
The temporary table where rows are being put in has gotten so big that it needs to be sorted on a disk.

using filesort
A sort that isn't being performed on indexes. Yes, it is horribly named.

To avoid these, in 99% of the cases, you need to either fix your query or fix your table structure. You should never be sorting on non-indexes unless the amount of rows in question is a relatively small amount: say < 1000. This is because with so relatively few rows it's going to be faster sorting 1 by 1 then hitting the indexes anyway.

PHP Code:
$filesort $db->query_read("SELECT threadid FROM " TABLE_PREFIX "thread ORDER BY replycount DESC LIMIT 50"); 
MySQL is going to have to go through ALL of those threads you know...

PHP Code:
$index $db->query_read("SELECT threadid FROM " TABLE_PREFIX "thread ORDER BY dateline DESC LIMIT 50"); 
You're hitting the index so you're good.

normalization
This is, truthfully, deserving of an article of its own. I may someday decide to make one, but, for now, I'll touch the basics of this VERY IMPORTANT practice.

What is normalization?

redundancy
Take, for example, the following table structure:
STUDENT (student_id, name, birthday, gender, class)

This is horrible. Multiple classes = multiple nearly identical rows. We want each table to be responsible for one set of data. Let’s rework this a bit.

STUDENT (student_id, name, birthday, gender)
CLASS (name, student_id)

This is better, but still terrible. What if the name changes? What if we want to add more data about the class?

We need three tables:
STUDENT (student_id, name, birthday, gender)
CLASS (id, name)
CLASS_STUDENT (class_id, student_id)

num_rows
Like in part 2, I mentioned that you should always check to make sure you have a mysql resource before doing anything to it. This is the ONLY way I know of to do this without causing a warning, notice, fatal error, blackhole, etc.

UPDATE: It seems the fetch_array function actually suppresses errors so num_rows is not TECHNICALLY needed to avoid that nasty error being shown. In fact, I now think it is more efficient to avoid num_rows in cases except where you wish to error out early. Reasoning? It's unnecessary overhead without any difference in actual results. If we weren't within a vB environment I'd fully recommend it but since we are there's no reason to call a function to avoid an error that would be suppressed anyway!

PHP Code:
while($result $db->fetch_array($results))
{
    
//  right

This is perfectly acceptable.

PHP Code:
if ($db->num_rows($results))
{
    while(
$result $db->fetch_array($results))
    {
        
//  less right
    
}

Contrary to what I thought, this isn't necessary to avoid a nasty error.

PHP Code:
if (!$db->num_rows($results))
{
    
// error out
}

while(
$result $db->fetch_array($results))
{
    
//  run code

Why error out? To avoid running code that has no purpose. Running code that has no purpose is a waste of resources and may even lead to unexpected results.

[Legacy info]
Now why would you want to use num_rows before fetch_array? Besides the error message, let's dig a bit deeper into these functions.

http://php.net/manual/en/function.mysql-num-rows.php
As you can see, num_rows either returns the number of rows in the result set or false on failure.

Now what about fetch_array?
http://php.net/manual/en/function.mysql-fetch-array.php
It returns an array. There is no false on failure. If you use this function on a non-result set you will get: Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in path/to/script.php on line x

It should be noted that part of being a good coder is to avoid notices, warnings, fatal errors, database errors, etc.
[/Legacy Info]

Welp! That’s about it. I probably should dive into MySQL more but nothing else comes to mind at this ungodly hour. Let me know what you think! Hopefully a few of the more infamous coders around here read this and soak it in.

vblts.ru supports vBulletin®, 2022-2024