The Ultimate Guide To Split Testing


Well, the last post I did on split testing things went over the heads of a lot of people, so I thought I’d take some time to go back and revisit the topic of split testing, covering what it is, the various versions, and how you should be using it.

If you’d rather just get the takeaway points and the files for this post, scroll down to the bottom.

Contents

  1. What Is It?
  2. How to Conduct a Split Test
  3. Choosing Your Test Sets
  4. Using Orthogonal Arrays
  5. Gathering Your Data and Validating the Results
  6. Analysing the Data
  7. Takeaway Points and Downloads

What Is It?

A split test is where you have two or more versions of the same thing, and you test them against each other to see which one produces the better outcome. There are three basic kinds of split test, known as AB, ABA and multivariate. In an AB test, you have just two versions, one called A, which is your original, and one called B, which is the refined version. In an ABA test, you actually test your original twice, hence the two A’s, which will give you an expected variance. This means you can better understand the level of variance you’d expect to see on your B version.

In a multivariate test, you pick a number of parameters, which could be things like PPC advert headline, copy and URL, or possible layouts for a form, or anything else with multiple parameters. You then test a number of levels for each of these things, so you might have three variations of each headline, copy and URL in the PPC example, or in the case of the form, you might have three variations of a form layout.

Whatever kind of split test you end up performing, the important thing is to make sure you know your interrogative statement (what do you want to find out), your data set reliability (are the people testing an indicative sample of the traffic you’re going to get in the future) and your result accuracy (how much variance you’d expect to see in your result if you kept on adding data).

How To Conduct a Split Test

Whether you’re conducting an AB or multivariate test, the methodology is pretty similar. As such, you should be able to take the following framework and apply it to pretty much any test you’ll ever run.

The first thing to do is define the question. What is it you want to know? What are you trying to find? It might be the optimum layout of a page, the best PPC ad for a campaign, or how to make the ultimate scrambled egg (I’m not joking either - you can apply this stuff to anything).

Essentially, whatever the question is, you’ll be looking to answer a “which” question. For instance,

  • Which headline will convert best?
  • Which form is best for converting signups?
  • Which offer will incentivise the best?
  • Which copy will read, reassure and reinforce best?

Choosing Your Test Sets

The process from this point is very simple. Once you’ve identified your “which”, you need to create your variations. Now, if you’re doing a simple AB test, that means simply taking your control, modifying it so you’ve got a refined element, and then throwing traffic at it. The process is pretty much the same if you’re doing an ABA test too. If you’re doing a multivariate test however, it’s slightly more complex.

The problem comes in how you choose the combinations of the parameters and levels you’re going to test. This happens because, when you’re doing multivariate tests, you can end up very quickly with more options than it’s feasible to study. For instance, if you had three parameters, each with five levels (not an uncommon set), you’d have 125 potential variations to test (5 × 5 × 5). Even worse, if you had seven parameters, each with five levels (the biggest I’ve ever done), you could construct 78,125 variations. To get around this, we employ Taguchi orthogonal arrays.
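As a rough illustration of how quickly the combinations pile up, here is a minimal Python sketch. It assumes the PPC example from earlier (three variations each of headline, copy and URL); the labels such as "H1" are placeholders, not real ad text.

```python
from itertools import product
from math import prod


def full_factorial_size(levels_per_parameter):
    """Number of variations when every level of every parameter is combined."""
    return prod(levels_per_parameter)


# Hypothetical PPC advert: three parameters, three levels each.
parameters = {
    "headline": ["H1", "H2", "H3"],
    "copy": ["C1", "C2", "C3"],
    "url": ["U1", "U2", "U3"],
}

# Every possible advert is one pick from each parameter.
all_ads = list(product(*parameters.values()))

print(len(all_ads))
print(full_factorial_size([len(v) for v in parameters.values()]))
```

The count is simply the number of levels multiplied together once per parameter, which is why adding a parameter hurts far more than adding a level.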

Using Orthogonal Arrays

Bear with me, because honestly what we’re about to do isn’t as scary as it looks… The way this works is that we pick the array with the right number of levels, and then take as many columns as we have parameters. So if you’ve got five parameters, you need the first 5 columns of the array. For example, if we start with the following array:

1111111
1112222
1221122
1222211
2121212
2122121
2211221
2212112

We could test up to seven parameters, each with two levels (because it has seven columns, and every number is a 1 or a 2). If we now pick the first four columns…

1111
1112
1221
1222
2121
2122
2211
2212

We would be able to test a representative sample of all the possible variations. So instead of running 16 tests (four parameters with two levels each), we only run eight. Similarly, if we want to test three parameters, each with four levels, we would start with the array below:

11111
12222
13333
14444
21234
22143
23412
24321
31342
32431
33124
34213
41423
42314
43241
44132

And then pick the first three columns…

111
122
133
144
212
221
234
243
313
324
331
342
414
423
432
441

And then test these 16 combinations, instead of the 64 we could potentially construct. I won’t go into the math behind how you construct these arrays, as it’s frankly mind-bogglingly dull. But suffice to say, you can’t just pick random variations. So please stick to the arrays you’ll find in the zip file at the end of this.
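To see why random picks won’t do, here is a small Python sketch that takes the first few columns of the two-level array above and checks the property that makes it useful: for every pair of columns, each combination of levels turns up the same number of times. The function names are my own for illustration, not something from the zip file.

```python
from collections import Counter
from itertools import combinations

# The seven-column, two-level array shown earlier in this section.
L8 = ["1111111", "1112222", "1221122", "1222211",
      "2121212", "2122121", "2211221", "2212112"]


def first_columns(array, n_parameters):
    """Keep only as many columns as there are parameters to test."""
    return [row[:n_parameters] for row in array]


def is_pairwise_balanced(rows):
    """True if, for every pair of columns, every level pair appears equally often."""
    for i, j in combinations(range(len(rows[0])), 2):
        counts = Counter((row[i], row[j]) for row in rows)
        levels_i = {row[i] for row in rows}
        levels_j = {row[j] for row in rows}
        all_pairs_present = len(counts) == len(levels_i) * len(levels_j)
        equal_counts = len(set(counts.values())) == 1
        if not (all_pairs_present and equal_counts):
            return False
    return True


plan = first_columns(L8, 4)
print(plan)
print(is_pairwise_balanced(plan))

# Eight rows picked by hand rather than from an array.
hand_picked = ["1111", "1112", "1121", "1122", "1211", "1212", "1221", "2222"]
print(is_pairwise_balanced(hand_picked))
```

The hand-picked set fails the check because level 2 of the first parameter only ever appears alongside level 2 of everything else, so you could never tell their effects apart.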

Gathering Your Data and Validating the Results

As a general rule, a sample is statistically valid when it will result in variation of no more than 5% when the sample size is increased. This is where ABA tests really come into their own, as you’ve got a running tally in the form of your second A test, that shows you how accurate your data is, so when the two samples get to being consistently within 5% of each other, you know you’re done. If however you’re running a standard AB or multivariate test, simply graph your results, and when the line trends out to less than a 5% wobble when you compare 20% of the results against another 20% of them, you’re done.
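The ABA stopping rule can be sketched in a few lines of Python. This is only one reading of the rule of thumb above: it assumes “within 5%” means a relative difference between the two A conversion rates, and that “consistently” means the last three checkpoints in a row. Both of those are my assumptions, so adjust them to taste.

```python
def within_tolerance(rate_a1, rate_a2, tolerance=0.05):
    """True when the two control rates differ by at most `tolerance`,
    measured relative to the larger of the two (an assumption)."""
    larger = max(rate_a1, rate_a2)
    if larger == 0:
        return True  # both rates are zero, so they agree
    return abs(rate_a1 - rate_a2) / larger <= tolerance


def aba_has_settled(checkpoints, tolerance=0.05, run_length=3):
    """checkpoints is a list of (rate_a1, rate_a2) pairs taken as data builds up.
    The test counts as settled once the last `run_length` pairs all agree."""
    if len(checkpoints) < run_length:
        return False
    recent = checkpoints[-run_length:]
    return all(within_tolerance(a1, a2, tolerance) for a1, a2 in recent)


# Made-up conversion rates for the two A versions at five points in time.
checkpoints = [
    (0.030, 0.045),
    (0.036, 0.041),
    (0.039, 0.040),
    (0.040, 0.0405),
    (0.0402, 0.0398),
]

print(aba_has_settled(checkpoints[:3]))
print(aba_has_settled(checkpoints))
```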

Validation also tends to be fairly simple. You want to check for any extraneous or instrumentation-based effects on your data. Extraneous effects include things like news events that might skew your data to include the wrong kind of people, online and offline mentions that send odd traffic, or anything else that might get people outside of your intended sample into the mix. Instrumentation effects include any problems in the sandbox area that can alter results, such as a problem with analytics implementation, or changing analytics services halfway through the test.

Analysing the Data

When you’ve finished the test and collected the data, the only thing left to do is to work out which version performed best. Now, in the case of AB and ABA tests, that’s pretty simple; you just take whichever one worked best, and use that.

However, the multivariate tests make things a bit more complicated. Here’s what you do…

When you’ve got your data, take the lines from your array, and number them sequentially. So if we use the first array we had earlier:

1111
1112
1221
1222
2121
2122
2211
2212

We’d call the first row 1, the second 2 and so on. This gives us 8 numbers. Next to each one, write down the conversion rate. This will give you something like this:

1 3.9%
2 4.7%
3 2.1%
4 3.3%
5 5.5%
6 4.8%
7 2.5%
8 6.2%

Now we’re going to create a table. Write down your parameters along the top, and the test numbers down the side. So in our example, we’d have a table with four columns and eight rows. Use the table to calculate the average result for each level of each parameter, by adding up the results of every test in which that level appears, and dividing by the number of times it appears. For instance:

Add all the results of Parameter 1, Level 1, and divide by 4. Parameter 1, Level 1 appears in tests 1, 2, 3 and 4. The total of these is 14%. Divide this by 4 and we get 3.5%.

Now add all the results of Parameter 2, Level 1, and divide by 4. Parameter 2, Level 1 appears in tests 1, 2, 5 and 6. The total of these is 18.9%. Divide this by 4 and we get 4.73%.

Keep doing this until you’ve gone through all the results, and you’ll be left with the best performing level for each parameter. Stick them together, and that’s your perfect advert.
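If you’d rather not do the averaging by hand, the steps above can be sketched in Python. This uses the eight-row array and the example conversion rates from this section; the function names are mine, and it is a sketch of the averaging method rather than a full statistical analysis.

```python
from collections import defaultdict

# The eight tests (first four columns of the array) and their conversion rates in %.
ARRAY = ["1111", "1112", "1221", "1222", "2121", "2122", "2211", "2212"]
RATES = [3.9, 4.7, 2.1, 3.3, 5.5, 4.8, 2.5, 6.2]


def level_averages(array, rates):
    """Return one dict per parameter, mapping each level to its mean rate."""
    averages = []
    for position in range(len(array[0])):
        by_level = defaultdict(list)
        for row, rate in zip(array, rates):
            by_level[row[position]].append(rate)
        averages.append({level: sum(values) / len(values)
                         for level, values in by_level.items()})
    return averages


def best_levels(averages):
    """Stick together the best performing level of each parameter."""
    return "".join(max(per_level, key=per_level.get) for per_level in averages)


averages = level_averages(ARRAY, RATES)
for number, per_level in enumerate(averages, start=1):
    print(number, {level: round(mean, 2) for level, mean in per_level.items()})

winner = best_levels(averages)
print(winner)
```

With these example figures the winning combination is not one of the eight rows that were actually tested, which is normal for this method. It’s worth running that combination as a confirmation test before you commit to it.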

Takeaway Points and Downloads

  • It doesn’t matter what method you use to test. It only matters that you do.
  • A result is only as good as the data that went into creating it.
  • Multivariate tests may be sexy, but they take much more time. Don’t assume they’re always the way forward.
  • Don’t rush into anything. Make sure you do the legwork first, and get everything set up properly. A ruined test wastes time and money.

Download the Taguchi Orthogonal Arrays
Download the article as a pdf

31 Tips for Copywriting Awesomeness


Copywriting isn’t a science, it’s an art. That said, there are certain rules you can follow that will help you write better. Here are 31 tips and pointers to get you started…

  1. Always Use A Headline. Never, ever write a piece of sales copy without a headline. Ever.
  2. Always Use A Subhead. The headline will grab attention. Make sure it sticks with a good subhead.
  3. Break It Down. A sales letter should be divided into parts. Give each part its own subhead.
  4. Keep The Flow. Once you’ve got the person’s attention, keep it by getting to the point, and staying on topic in your body copy.
  5. Be Positive. Don’t show the customer their problems as they are now, show them how their life will be after they’ve bought what you’re selling.
  6. Be Rested. Don’t write when you’re tired; you won’t create your best work if you can’t string two sentences together.
  7. Be Happy. Don’t write when you’re depressed, or in shock, or hungry, or in anything other than a good state of mind.
  8. Be Personal. You’re selling to masses, but you’re writing to the person. Use words that will bring them in. Make it feel like you’re actually talking to them.
  9. Be Real. Don’t talk like a college professor. Use conversational tone, to make them feel at ease.
  10. Be A Service. Don’t ask them to buy the product, sell them on the reasons why they need your product or service.
  11. Be Interesting. Make sure what you’re writing is going to be interesting to the reader. There’s no copy too long, just copy that’s too dull.
  12. Be Passionate. Only write copy for a product you really care about. If you don’t care about your subject, it’ll come across, and it won’t sell.
  13. Be Specific. A specific stated point can only be right or wrong, and as such will give people a greater sense of faith in what you’re saying.
  14. Guide People. Tell them exactly what they need to do. Don’t assume intelligence on behalf of the reader.
  15. Eat Well. Don’t try and write on an empty stomach, and don’t try and write after eating junk food. Your brain needs a good source of energy to write.
  16. Use Correct Fonts. People who read newspapers and magazines will be more accustomed to Times New Roman fonts. People who read mostly web-based text will be more used to Arial fonts.
  17. Write To The Right People. Make sure your writing is aimed at the right kind of person. Don’t write PhD level copy to children.
  18. Don’t Be Distracted. If you’re interrupted, or distracted then you’re going to lose your train of thought, and your writing won’t flow.
  19. Long Copy Sells Better. Time and time again, it’s shown that to sell, you should use long copy sales letters.
  20. Short Copy Leads Better. If you want to get people’s contact details, or are giving away a free report or free package, then short copy will lead better.
  21. Use Curiosity Correctly. Curiosity is a good way to keep people interested, but don’t leave them curious. As soon as you’ve made them think, give them the answer.
  22. Write Good Bullets. Each bullet should tell the whole of its point. Don’t make people wonder what your bullets are on about.
  23. Know When Not To Use Pictures. Yes, pictures can be good for grabbing attention, and for selling, but use them sparingly. Make sure they’re relevant, and necessary.
  24. Have A Good Picture Of Yourself. At the start of your document, or near the top of the page, have a good photo of yourself. It makes the letter more personal, and helps people to relate to you more easily.
  25. Give People Feel Good Factor. If people get a good impression of your product, and they feel it’ll make them happier in some way, the sale is already made.
  26. Give Proof. Any time you make a claim, show it’s true. Give people real examples so they can see it in action.
  27. Test Your Copy. Find a young person, and ten people in the field you’re writing to, and get them to read it. Find out from the young person if the writing is too complex. Find out from the industry people if it’s consistently relevant, and whether they would buy.
  28. Tell Them To Buy. Yes, you need to do this. Tell them to click the button. If you don’t, they won’t.
  29. Sell Happy Feelings. People need to associate your product with pleasure; make sure they do so.
  30. Make The Price Obvious. Most people read the headline and subhead, and then scroll down to try and find the price and an order button. Make it easy for them. Put it in big letters, bold, and red. Help them buy.
  31. Proofread. For pity’s sake, have a look through and spell and grammar check everything, or it’ll come back to kick you in the hoonads!

How to teach your Girlfriend programming


Before we begin, there are a few things that must be kept in mind.

  • I’ve not got a Girlfriend
  • These tips should work for men too and can be applied to nearly any subject
  • This is not meant to be sexist in any way, if it comes across as such then that was not the intention
  • Your girlfriend wants to learn
  • I’ve got a face like a bucket of smashed crabs… (okay, maybe that isn’t relevant)

Preparation

Unlike mortals like myself, when you cook a meal for 30 you can do so with no preparation. However, when teaching someone how to do something new, it helps to have some idea of what you are going to do. You do not need an extensive mission plan, but a rough idea of the topics you’ll cover, the order in which you will cover them and maybe the type of alcoholic beverage you will use to scrub all memory of me from your mind.

If you are not sure of the topics that you need to cover, google tutorials for your chosen language(s) and look at the chapter titles. You may also find that one of the early topics on a site holds several sub topics that you may want to teach separately, much like I used to eat the blue smarties before everybody else.

Relevant and visible progress

No doubt you’ve got 5 different qualifications, one of which is for being totally cool. At some point while taking your degree in theoretical maths you might have thought “this is a total waste of time, I’ll never use this”. Fascinating though the structure of a for loop is, its use will probably not be readily apparent.

When explaining something, try to make it so that she can see the effects of what she does both clearly and quickly. Using a for loop to count numbers is visible but it’s about as relevant as the baking foil in my sock drawer. Likewise, using a for loop to perform a bubble sort will take a long time to implement and not focus on the for loop.
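By way of illustration only, here’s the sort of first for loop exercise I have in mind, sketched in Python (use whatever language you’re actually teaching). The shopping list and prices are made up. The point is that every pass through the loop prints something she can see, and the result is something a person might actually want.

```python
# A made-up shopping list: (item, price) pairs.
shopping = [("coffee", 3.20), ("bread", 1.50), ("cheese", 4.75)]


def receipt_lines(items):
    """Build one line of text per item, keeping a running total as we go."""
    lines = []
    total = 0.0
    for name, price in items:
        total += price
        lines.append(f"{name:<8} {price:>5.2f}   running total {total:>5.2f}")
    return lines


for line in receipt_lines(shopping):
    print(line)
```

Adding a fourth item to the list and re-running it shows straight away what the loop is doing, without a sorting algorithm in sight.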

She types, not you

Yes you can type 31.4 words per minute with your nose and 200 normally, and you can lift man-sized weights with just your eyebrows. Your girlfriend can’t. You tell her to type in a simple little bit of stuff and it takes her AGES! But how will she get faster? Well, how did you get faster? You typed a lot and thus it stands to reason that given time she will also type faster.

But there’s more to it than that. Giving her 100% control of the computer allows her to know that it is not you achieving results but her. I know that when there’s some small thing and you just want it done quickly it’s so easy to “borrow” the keyboard for just a second. Do not. It will send a message that she cannot do something; it’s much better to let her know that she can do anything you can, even if she needs a little instruction at the moment.

You are not somehow more intelligent

You may be able to judge the exact size of spanner required for any given nut. You may be able to change the oil in your car blindfolded with one hand tied behind your back. Neither of those makes you smarter than someone else, even if they think the best way to remove a nut is with a spoon. What you need to keep in mind is that you probably don’t have a clue what the different types of pedicure are, or possibly even what a pedicure is.

So when you explain to her what an object is and she seems confused, remember that she thinks differently to you. Try to explain it in a different way; this will be easier to do with practice, and knowing the person well will help you in this.

You know more than her

You know what an “object” is, you know why an error on line 24 may mean you forgot a semi-colon on line 23. You may not however know the difference between two pairs of apparently identical shoes. You know more than your girlfriend about whatever you are teaching, but not about everything.

Try not to skip over things that you take for granted; it may be obvious to you what curly braces, semi-colons and doctypes are, but not to her. By the same token, don’t go too far in the other direction and patronise her. When you skip over such a thing (as you almost undoubtedly will), apologise and explain it.

Jargon is bad

XML, HTML, Ajax, CSS, Server-side, WoA and SrmzA, you know them all (except the last one which I made up). You know what they mean and probably what they stand for. You might even have made a little poster of them to put next to your poster of the periodic table (my PT poster has cool pictures). When you tell your girlfriend that you know all about Ajax she’ll ask why you never clean up after yourself if you know all about it.

Teaching someone something means you transfer information, and you cannot do that by using terms that she does not know. And don’t assume she’ll remember them if you give her a list at the start; she’ll have much more to take on board. Let’s face it, jargon makes you feel cool, but does it really accomplish as much as understanding for loops and image tags?

Praise achievement

While you clearly learnt everything you know by figuring it out for yourself while harvesting crops for starving children, everybody else had to learn bit by bit. There are more methods of teaching than there are tooth fragments of my defeated enemies on the necklace around my neck. The two popular ones seem to be punishing mistakes and praising success. Punishing mistakes is good fun but you are expected to use cliches such as “you have failed me for the last time” and build doomsday devices.

I much prefer praising success. Don’t overdo it, or you’ll seem as pathetic as the lackeys of those that punish failure. I’m assuming that you want her to be confident in the knowledge you are trying to impart, and the best way to instill confidence is to praise. Make sure to praise sincerely; if she can’t understand something, then telling her she is really clever will make you look slightly more clueless than me.

I hope that these tips are useful to you and that you are successful in your endeavors. Please do leave comments on what you think I am wrong about; depending on how far past its use-by date my lunch was, you may well be right.

Who Else Wants Kick-Ass AdWords Quality Score?


Having a great quality score can lower your costs per click (and by proxy, increase ROI), lower your bounce rate and increase conversions. How so? Because making changes to boost your quality score will generally mean making your site better. However, it’s not a particularly easy thing to do.

What is Quality Score?

Google defines quality score as:

“…the basis for measuring the quality and relevance of your ads and determining your minimum CPC bid for Google and the search network. To encourage relevant and successful ads within AdWords, our system defines a Quality Score to set your keyword status, minimum CPC bid, and ad rank for the ad auction”.

…and it gives you a good one based on:

  • The ad’s historical CTR on Google, the display URL and its “relatedness” to the keyword
  • The ad’s previous performance on the relevant (and related/similar) site(s)
  • Keyword relevance relative to the ads in its ad group
  • Landing page quality
  • How good you are (measured on the CTR of all the ads and keywords in your account)
  • Historical CTR of the ads in the ad group
  • Landing page load times

Which is all very well and good, but that doesn’t really tell you what you need to have a good one. So to save you thousands of hours testing and experimenting to find what works best, here’s our guide to getting a quality score that kicks ass.

Improving Your Quality Score: Building a Better Campaign

I’ll be doing a full post on how to build an awesome campaign structure when setting up from scratch in the near future, but for now we’re only interested in the bits that affect quality score. With that in mind, here’s how you do it:

  1. Keep your keyword groups focused. I’ve seen countless examples where people will create a group called “products” or something similar, and then lump in every keyword known to man. Instead, keep things tight. Have a group for blue widgets, and the 15-20 keywords that relate to them, one for red widgets with the keywords for that product, one for white widgets and so on. Get laser-targeted with what you’re bidding on.

  2. Keep your ad copy focused. Again, people are far too quick to rely on DKI (Dynamic Keyword Insertion) rather than actually doing things properly. DKI is fine, but make sure you’re using it where appropriate, or you can end up with some serious gaffes. Instead, actually write some proper ad copy, making sure that whatever the main keyword focus for that group is appears in the title, description and URL.

  3. Be a “glass half empty” kinda person. Or to put it another way, be negative. Use your analytics to see which terms your ads are showing for that aren’t producing the goods. Negative keywords will help you filter those out. Common ones to put in are “free”, “sample”, “try”, “test” and other such terms.

  4. Test the keyword matching options. Rather than phrase-matching all the time, mix it up a bit. Try using exact and broad match to see what that does for you. Exact (provided you’ve got a relevant keyword) will generally give the best quality score, but at the expense of getting traffic from variations. Test to see what gives the best ROI.

Improving Your Quality Score: Crafting Better Landing Pages

Again, I’ll be doing a full post on what you want to be doing to create killer landing pages soon, but this will just be a short piece on sorting out the quality score elements.

  1. Title, description, URL and robots. Firstly, set up a folder for all your landing pages to sit in. Then, in your robots.txt file, add the following lines:

    User-agent: *
    Disallow: /ppc-landing-pages/

    User-agent: AdsBot-Google
    Allow: /ppc-landing-pages/

    That asks every well-behaved crawler other than the AdWords quality score bot to stay out of those pages. That way, you won’t lose quality score, but you also won’t risk having pages that look like over-optimised spam that could get you penalised. Now make sure every landing page has a name that relates to the keywords in the ad group that targets it, and also make sure the title tag is targeted to that group. Don’t go overboard, but make sure it’s there. Finally, set the meta description as whatever the best performing piece of ad copy is.

  2. Target your ad copy. Use the Site-Related Keywords Tool to make sure the copy on your page is in keeping with what Google thinks it should be. That way, you won’t be in for any nasty surprises later on, and have to re-do all your copy.

  3. Code properly. Now that page load time is a factor, you’re going to want to stay away from dynamically generated pages. Instead, have your CMS cache every landing page that a person generates, and output it as a real HTML file somewhere. That way, you can create lots of pages dynamically in a short amount of time, but have proper HTML there when the bot comes along, with no SSIs or dynamic scripts running to slow things down. If you want to go completely bonkers with this, it’s also worth learning how browsers actually load pages, so you can lay out your pages in a more spider-friendly fashion.

  4. Split test your landing pages. Use the orthogonal array spreadsheet tool to conduct large scale multivariate split tests on your landing pages. Refine them over time, and you’ll see that make a difference too.
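Before relying on the robots.txt rules from the first step, it’s worth a quick sanity check that they do what you expect. Here is a minimal sketch using Python’s standard urllib.robotparser module. The landing page path is a hypothetical example, and this only tells you how a standards-following parser reads the file, not how any particular crawler behaves in practice.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ppc-landing-pages/

User-agent: AdsBot-Google
Allow: /ppc-landing-pages/
"""

# Hypothetical landing page used purely for the check.
LANDING_PAGE = "/ppc-landing-pages/blue-widgets.html"


def can_crawl(user_agent, path, robots_txt=ROBOTS_TXT):
    """Ask the standard-library parser whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(can_crawl("AdsBot-Google", LANDING_PAGE))
print(can_crawl("Googlebot", LANDING_PAGE))
print(can_crawl("Googlebot", "/index.html"))
```

If the first line prints False or the second prints True, something is off in the file, most likely a typo in the folder name or a missing blank line between the two groups.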

Abracadabra

It’s incredible what a difference putting all this into play can make. I’ve seen CPCs drop to 10% of what they were pre-optimisation. I’ve seen conversion rates increase 500% as a result of putting this stuff into practice. This is real, serious PPC optimisation, not just something Google put in to piss you off. So take the bull by the horns and sort out your campaigns today.

If you think we’ve missed anything, or you want something explained further, let us know in the comments below.

If you found this article useful, please take a moment to vote it up on Sphinn, Reddit or StumbleUpon

Advanced PPC - Multivariate Testing Through Applying Pareto to Orthogonal Arrays


Much has been written on this subject… OK, I’m lying. Almost no-one has written anything on this. Which makes it all the more bizarre, when you consider just how powerful these arrays are as a tool for multivariate tests. If you just want the xls files so you can get on and play, you can get them here. If you also want to know how they work, what they’re doing and how to make the most of them, read on.

PAM-VAR Testing

The Pareto principle states that, for many systems, roughly 80% of the outcomes generated will come from 20% of the population of variables. Or to put it another way, 80% of what you get comes from 20% of what you do.

This allows us to logically conclude that a representative sample of any specific population will allow us to estimate the results for the rest of it. In medicine, this is performing a biopsy, in mechanical engineering it’s called Taguchi testing, and in web based evaluation, I’ve termed it PAM-VAR testing (Pareto Analysis of Multi-Variate Array Results).

The key to this method is determining what constitutes a representative sample. For example, if we wanted a sample of numbers from 1 to 100, we wouldn’t pick 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. Instead, we’d pick values spread across the whole range, so we’d go for a set more like 3, 15, 24, 37, 45, 56, 67, 73, 86 and 95.

But how do we do that, when we’ve got a PPC campaign to test, with 5 variations of headline, body copy and URL? The simple answer is that we borrow from our friends in engineering.

Taguchi Arrays

Turns out some clever bugger called Genichi Taguchi had come across this problem some time before me, and invented something called the Taguchi orthogonal array. Taguchi’s brilliant line of thinking ran something like this:

“If we can make the results of a set of tests mimic the most extreme variations that we’d expect to get, we can estimate the top and bottom percentage of results by sampling a small number of the total possible potential variations.” Basically, Taguchi figured that if taking a small lump of someone’s liver could tell you about the rest of it, you could take a small sample of potential figures for variables in an equation, and use them to estimate the outcomes of all the others.

What Arrays Do We Use?

Personally, I like symmetry, so the arrays I use work on matched numbers of variables. That means that if I’m testing three headlines, I’ll also test three variants of body copy and three URLs. It’s perfectly possible to test a set of two, three and three, or three, four and five, or any other set you can come up with. However, the arrays are slightly harder to construct. Nevertheless, if you want to generate spreadsheets or applications to calculate these, it’s certainly doable, and you could reverse engineer it from the xls files available here, if combined with the arrays from FreeQuality.org.

Application in PPC Multivariate Tests

So, imagine we’ve got a keyword group in a PPC campaign we’ve set up. We’ve got five different versions of the title, copy and URL, and we want to test to find the optimum version for that set of keywords. If we wanted to test every single combination, that would mean running 125 different adverts. Now imagine if you wanted to do this across 5 sets of keyword groups. Or even worse, 3 groups of 5 sets of keywords. All of a sudden, you’re having to create and test 625 ads in the first instance, and 1,875 in the second. That’s simply not practical.

Fortunately, by applying the tools we’ve provided, you can trim this to a fifth of the normal figures. Obviously this is hugely beneficial, as it makes these kinds of large-scale tests practical, as you’re running 125 and 375 permutations, instead of the figures shown above. Whilst these are obviously still large numbers, they’re far smaller and more manageable.

Application in Other Online Multivariate Tests

Now, imagine if you were to run this same test across a web design, or a long copy sales letter, where you’ve got four, five or more variables. All of a sudden, you can be looking at huge values. For instance, a multivariate web design test with five variables, and four permutations of each would take just 16 tests using our system. That’s against a normal size of 1,024. Or if you wanted to test 5 variables with 5 options each, that would give 25 samples for testing, instead of the normal 3,125.
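To make the comparison concrete, here is a short Python sketch that looks up a suitable array and compares its size with the full set of combinations. The lookup table lists a few commonly published Taguchi arrays (levels, maximum number of columns, number of runs); it is my own shorthand, not the contents of the xls files, and it only covers tests where every variable has the same number of options.

```python
# name: (levels per column, maximum columns, number of runs)
STANDARD_ARRAYS = {
    "L8": (2, 7, 8),
    "L9": (3, 4, 9),
    "L16": (4, 5, 16),
    "L25": (5, 6, 25),
}


def pick_array(n_variables, n_options):
    """Return (name, runs) of the first listed array that fits, or None."""
    for name, (levels, columns, runs) in STANDARD_ARRAYS.items():
        if levels == n_options and n_variables <= columns:
            return name, runs
    return None


def compare(n_variables, n_options):
    """Return (full factorial size, array name, array runs)."""
    full = n_options ** n_variables
    match = pick_array(n_variables, n_options)
    if match is None:
        return full, None, None
    name, runs = match
    return full, name, runs


print(compare(5, 4))
print(compare(5, 5))
print(compare(3, 5))
```

The last line is the PPC case from the previous section: three variables with five options each comes out at 25 runs against 125, which is where the “a fifth” figure comes from.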

Again, you could work out how to construct the test array from the xls files available here, combined with the Taguchi arrays from FreeQuality. If however you’re completely lazy, we’ll be coming back to this next week, to show you how to do that. And we’ll probably build an online version of this in the future to make it even easier.

Talk To Us

If you’ve found this useful, please leave feedback in the comments below.