
PHPCompatibility is ready for PHP 7.0

It took a while, but I finally finished writing the changes that add PHP 7.0 support to PHPCompatibility.

Thanks to financial support from the cool people at WPEngine, the complete set of sniffs for PHP 7.0 is now available on GitHub through the usual link.

As always, tests include both forward and backward compatibility. It is advisable to run PHPCompatibility on PHP 7.0, as some sniffs cannot be run on older versions. You can use the ‘--runtime-set testVersion 5.x’ parameter (replacing the x with the number of your choosing) to test your code for older versions.

The sniff includes support for all of these changes :

  • Deprecated functionality :
    • PHP4 style constructors
    • password_hash salt option
  • Older version check (using --runtime-set testVersion)
    • scalar type declaration
    • return type declaration
    • null coalescing operator
    • spaceship operator
    • constant arrays in define
    • anonymous classes
    • unserialize filter variable
    • IntlChar class
    • Group use declaration
    • intdiv function
    • session_start options
    • preg_replace_callback_array function
    • random_bytes and random_int functions
  • Backward incompatibilities
    • Empty list assignments
    • global keyword with variable variables no longer allowed
    • Function parenthesis warning
    • Negative bitshifts
    • Removed functions call_user_method, call_user_method_array, mcrypt_generic_end, mcrypt_ecb, mcrypt_cbc, mcrypt_cfb, mcrypt_ofb, datefmt_set_timezone_id, IntlDateFormatter::setTimeZoneID, set_magic_quotes_runtime, magic_quotes_runtime, set_socket_blocking, imagepsbbox, imagepsencodefont, imagepsextendfont, imagepsfreefont, imagepsloadfont, imagepsslantfont, imagepstext
    • Removed INI directives always_populate_raw_post_data, asp_tags and xsl.security_prefs
    • New objects assigned by reference removed
    • New reserved keywords bool, int, float, string, NULL, TRUE, FALSE, resource, object, mixed and numeric
    • Functions with multiple parameters with same name not allowed
    • Switch statements with multiple defaults not allowed
    • $HTTP_RAW_POST_DATA removed
    • mktime and gmmktime no longer support is_dst parameter
    • preg_replace no longer supports the /e modifier
  • Several new built-in functions, classes, interfaces and exceptions
  • Many new global constants
  • Many removed extensions
  • Loosening reserved word restrictions (in some places)

Busy busy + updates

My blog has been very quiet for about 10 months, so I thought I’d write a quick update on what I’ve been doing :
– Work has been extremely busy, with lots of new projects and lots of existing projects needing changes
– Not really helping my workload is my health, which has been slightly troubling since I got Pfeiffer’s disease (glandular fever) last year. Although I’m mostly rid of it, it’s one of those annoying illnesses that tends to leave some after-effects even once your body has beaten it.
– I’ve been speaking at a lot of conferences in Europe and the US (check the conferences page on the right). All this travel hasn’t exactly been good for recovering from the above either 😉
– When I find the time, I try to put on my dancing shoes and have a good time dancing the stress away

So what’s coming up :
– More work, lots more actually. We’re about to launch a new IPv6 service and are working on exciting projects for several clients
– More conferences. In fact, I have 6 more planned right now. Looking forward to them though, as they’re in some of my favourite places (like San Francisco !)
– Planning on finally finishing some of the open source projects I’ve been working on, as well as starting a few additional ones
– Planning on working on some other projects, based on ideas I had years ago. Maybe I’ll finally get around to building them ?

And finally, I intend to write some more blog posts about various topics. Some of those posts are already partially finished. So expect more activity here 😉

Sporza.be without football (for Belgian users)

For the English readers of my blog : this post is for Belgian users who want to use Sporza.be (a big Belgian sports site) without all the soccer content, since that usually fills over 80% of the site. The instructions below were originally written in Dutch.

Sporza without all the football… not ideal for football fans, but they have sites like Voetbalkrant to get their football news. So there is no reason at all why 80% of Sporza.be has to be filled with football. But since you can’t change that yourself, here is how to view Sporza.be without football.

For Firefox users

  • Download Greasemonkey here. You may need to restart Firefox.
  • Click here and choose Install/Installeren
  • Browse to Sporza

For Chrome users

  • Click here and choose Continue/Verder
  • Browse to Sporza

For Internet Explorer users

  • Go here
  • Download and install Chrome
  • Follow the instructions above

How a bad favicon.ico can cause a lot of trouble

Favicon.ico is a nice thing, but it can cause a whole lot of trouble when missing or not used properly…

 

What’s favicon.ico ?
[Image : favicon on google.com]

Favicon.ico is a Microsoft-invented icon that shows the logo for the website in the browser’s address bar and next to the site name in the browser’s bookmarks. It was first supported in Internet Explorer 5 in 1999 and has since been adopted by all browsers.

Since tabbed browsing was introduced, it’s used as the icon for the tabs as well.

 

So where is the file ?
A browser will, by default, look for it in the site’s root directory. So for http://www.google.com, that’s http://www.google.com/favicon.ico
However, its location can also be specified within the XHTML (of each page) by using one of the following :

(Last 3 not supported in Internet Explorer)
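
The default lookup (no link tag in the page) can be sketched in a few lines. This is a minimal illustration only : the function name default_favicon_url is made up for this example, and real browsers add their own caching rules on top.

```python
from urllib.parse import urljoin


def default_favicon_url(page_url):
    """Return the root-level favicon.ico URL for the site serving page_url."""
    # The leading slash makes urljoin drop the page's own path and query,
    # keeping only the scheme and host.
    return urljoin(page_url, "/favicon.ico")


print(default_favicon_url("http://www.google.com/search?q=favicon"))
# prints http://www.google.com/favicon.ico
```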

 

The not-so-catastrophic problems

There are a number of problems associated with favicon.ico – the not-so-catastrophic ones are :

  • Some favicon.ico files are located on a different URL and use redirects. This means the browser has to make multiple requests to get to the right location. It also means your server gets multiple hits.
    Example : www.wordpress.com/favicon.ico redirects to wordpress.com/favicon.ico, which redirects to en.wordpress.com/favicon.ico, which redirects to www.gravatar.com/blavatar/4e21d703d81809d215ceaabbf07efbc6?s=16&d=http://s2.wp.com/i/favicon.ico, which finally serves the icon – that’s 4 connections and 4 requests for an icon file
  • Some sites don’t send the correct MIME type when sending the icon. The acceptable MIME types are image/x-icon, image/vnd.microsoft.icon, image/png and image/gif. However some just send application/octet-stream or even text/plain. Most browsers seem to have no problem with this, because they use the extension to attempt to parse the type, but it goes against best practices.
    Examples :
    – wordpress.org and thepiratebay.org send an application/octet-stream header
    – ups.com sends a text/plain content-type header, but sends an icon file along – very bad practice !
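
If you want to check your own site for the MIME type problem, something along these lines should do. It is only a sketch : the set of accepted types is the one listed above, while the function names and the fetch helper are assumptions made for this example (the helper needs network access and is not called here). Note that urlopen follows redirects silently, so comparing the final URL with the requested one only tells you that a redirect happened, not how many.

```python
from urllib.request import urlopen

# The MIME types named above as acceptable for an icon.
ACCEPTED_ICON_TYPES = {
    "image/x-icon",
    "image/vnd.microsoft.icon",
    "image/png",
    "image/gif",
}


def is_acceptable_icon_type(content_type_header):
    """Check a raw Content-Type header value against the accepted icon types."""
    # Drop parameters such as "; charset=..." and compare case-insensitively.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_ICON_TYPES


def fetch_icon_report(url):
    """Hypothetical helper: fetch url and report final URL, type and verdict."""
    with urlopen(url, timeout=10) as response:
        content_type = response.headers.get("Content-Type", "")
        # geturl() differs from url when at least one redirect was followed.
        return response.geturl(), content_type, is_acceptable_icon_type(content_type)
```

Calling is_acceptable_icon_type("application/octet-stream") or is_acceptable_icon_type("text/plain") returns False, which is exactly the situation described for the sites above.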

 

The really bad ones

  • Some sites use real icon files, but they’re extremely large, although there’s really no good reason for it.
    Examples :
    – The biggest icon file for sites in the Alexa top 20,000 belongs to www.marketingsherpa.com, which serves a 554KByte file… Based on the fact they get about 2.7M pageviews per month (Alexa estimate), we can guesstimate they’ll be sending out quite a few GBytes (50 ? 100 ?) of data every month !
    – Flickr.com (Alexa #33) sends a 90KByte .ico file (still over 1100 times larger than the smallest possible icon)
    – WordPress.com has an 11KByte .ico file
  • By far the most common problem is a missing favicon.ico file. Although that might not seem like a big problem, it can actually cause massive issues on a high-traffic site.
    Imagine this : if you get 10 pageviews/sec on your server (which is not that much) and your favicon.ico file doesn’t exist, your server will generate a 404 error for every first request. Luckily, browsers such as Firefox 3+ keep a list of which favicons are missing and don’t re-request them, but not all browsers follow this behaviour, meaning if those 404 pages aren’t cached, the icon is requested again on every pageview.
  • Let’s make it worse : if you’re using a framework like Zend Framework and you’re redirecting all requests to your framework bootstrap, you might be sending all 404 errors to the bootstrap, so you can show a fancy “We’re sorry, that page doesn’t exist” page or even a “Did you mean …” page where you do a search query for potential matches. So what happens when favicon.ico doesn’t exist and hits that search on every request to your site ? Exactly : you get 2 pageviews for every real pageview… Each of them launches your entire framework bootstrap and, if you’re doing the search thing, a search on your backend DB… ouch !
    Example :
    [Image : go.com favicon 404]
    go.com sends a 22KByte page with “Oops! We’re sorry, but we’re having technical problems.” – luckily most subdomains (such as disney.go.com) do have a favicon.ico – otherwise the 46th largest website in the world would have had quite a bit of extra traffic and load because of a missing file
  • Some sites use png or gif files, often the site’s main logo. Although using png or gif is supported by most browsers and in fact using png will produce the smallest possible icon files (see below), it’s not supported by Internet Explorer. Also, using your company’s main logo image file isn’t the right thing to do, since those files are usually quite large, which means the browser needs to resize the image to a 16x16px or 32x32px image. This doesn’t just use processing power, but it also means the image being sent is a lot larger than required.
  • Some sites will use all of the XHTML link tags, causing the browser to download the icon multiple times, especially when each tag refers to a different location (e.g. on a CDN).
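
One way around the missing-favicon problem is to answer the request before it ever reaches the framework. The sketch below shows the idea as a Python WSGI middleware rather than in Zend Framework, so treat it as an illustration of the technique only : the name favicon_short_circuit, its parameters and the one-day cache lifetime are all assumptions made for this example.

```python
def favicon_short_circuit(app, icon_bytes=b"", max_age=86400):
    """Wrap a WSGI app so /favicon.ico never reaches the framework bootstrap."""

    def middleware(environ, start_response):
        if environ.get("PATH_INFO") == "/favicon.ico":
            if icon_bytes:
                # An icon was supplied: serve it directly.
                status = "200 OK"
                content_type = "image/x-icon"
            else:
                # No icon: send a tiny 404 instead of a fancy error page.
                status = "404 Not Found"
                content_type = "text/plain"
            headers = [
                ("Content-Type", content_type),
                ("Content-Length", str(len(icon_bytes))),
                # Assumed lifetime: let clients and proxies keep the answer a day.
                ("Cache-Control", "public, max-age=%d" % max_age),
            ]
            start_response(status, headers)
            return [icon_bytes]
        # Anything else goes to the real application as usual.
        return app(environ, start_response)

    return middleware
```

Wrapping the application once (application = favicon_short_circuit(application)) is enough. The same effect can usually be had in the web server configuration, which is cheaper still, since the request then never reaches the application at all.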

 

Who’s doing it wrong ?
To give you some idea of other big sites doing it wrong :

  • hp.com returns an application/octet-stream
  • aws.amazon.com uses the link tag implementation, but uses a malformed URL
  • citibank.com (and citi.com and many other Citibank domains) displays a 404 page, adding 15KByte. And since they’re using quite a few subdomains, the icon is requested a lot of times. (Note : online.citibank.com does have an icon, so why not copy it to the other subdomains ?)
  • apc.com (the UPS brand) shows a 404

 

Some are doing it right

  • facebook.com : 152 bytes with 0 redirects
  • yahoo.com : 318 bytes with 0 redirects
  • ibm.com : 318 bytes with 0 redirects

 

Some of the big shots can do better

It’s actually remarkable to see that sites like Google, Live, Twitter, LinkedIn, AOL, Adobe and Myspace (to name just a few) send out a 1150 byte icon.
Given that Google has tried everything to slim down its main page (including removing </body> and </html> tags), it’s odd they didn’t save the 239 bytes by creating a PNG file and providing that PNG to all non-IE clients (multiply it by 100 million or so hits/day and you get a nice 23 GBytes or so per day…).
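
The traffic estimates in this post are simple multiplications, so they are easy to sanity-check. The snippet below does just that, under two loud assumptions : decimal units (1 KByte = 1000 bytes) and every single hit actually downloading the icon, which is a worst case because browsers normally cache it. The helper name transfer_bytes is made up for this example.

```python
def transfer_bytes(bytes_per_request, requests):
    """Total bytes sent if every request downloads the icon (no caching)."""
    return bytes_per_request * requests


# Google: 239 bytes saved per icon, roughly 100 million hits per day.
google_saved_per_day = transfer_bytes(239, 100_000_000)
print(f"{google_saved_per_day / 1e9:.1f} GB saved per day")
# prints 23.9 GB saved per day

# marketingsherpa.com: 554 KByte icon, about 2.7M pageviews per month.
sherpa_worst_case = transfer_bytes(554_000, 2_700_000)
print(f"{sherpa_worst_case / 1e12:.2f} TB per month (worst case)")
# prints 1.50 TB per month (worst case)
```

The real figures will be far below the worst case, since only a fraction of pageviews trigger a fresh icon download.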

 

A word of advice

It’s quite simple actually : use a favicon.ico file on all your subdomains. If you don’t have a properly created icon for your site, put an empty icon or even just a plain empty 0 bytes file (be careful though : not all browsers like this, and some will request it over and over again).
In case you’re looking for a small blank icon file, I created a 79 bytes favicon.ico file (actually a PNG, so it won’t work on Internet Explorer) : here you go – they don’t get smaller than this !

In case you want an IE-compatible one : smallest favicon.ico

Presenting “Caching and Tuning Fun for High Scalability” at phpBenelux

On Jan 28-29 the second phpBenelux conference will be held in Edegem, a small town near Antwerp, Belgium. phpBenelux is the largest PHP user group for the Benelux (Belgium-Netherlands-Luxembourg) area.

I’ve been given the opportunity to give a 3-hour tutorial on the 28th. Here’s a quick rundown of what I will be diving into during that tutorial :
– General introduction into caching (what it is, what types of caching techniques there are, etc.)
– Some common caching implementations (the good, the bad and the ugly)
– How to keep your site scalable by caching
– How to bring down your site (how NOT to cache !) – i.e. commonly used techniques that lead to disaster
– Caching outside the box
– Spread, don’t duplicate
– Scalable alternatives to LAMMP (no, not a typo !)
– Applying Varnish
– Backend done, let’s go frontend !
– Tuning quickies
– Finally, a live demo with all of the described techniques applied while performing a stress-test – things might (will) most likely go horribly wrong here 😉

I’m looking forward to presenting this tutorial. If you want to be there, visit the phpBenelux Conference site or order your ticket while you still can !

See you all on the 28th !