Click Fraud Fun

Tuesday, 31 July 2012

Limited Run and Facebook

So it's all over the news that Limited Run and Facebook are squaring off about click fraud. No big surprise: you can't be a large advertising platform and NOT eventually make the news. So let's do a bit of analysis on it. Of course, we can only speak for the traffic we've captured through our beta testing program; we can't speak for Facebook or Limited Run.

Javascript vs. No Javascript

We were also interested in the javascript vs. no javascript end of things on our system. We determined that 1.4% of all traffic didn't have javascript enabled, and that 1.5% of all pageviews came in with javascript disabled. That's significantly different from what Limited Run was seeing. We all know that javascript and cookies are required for a large majority of the websites folks use these days, and fraudsters know that if you don't look like the vast majority... well, you're not a good fraudster.

Secondly, if I were to write a good Facebook fraud engine that was bot style, it would be in pure javascript. Facebook is not a traditional linking system; many of the things that tell a fraudster where and how to click are buried inside the DOM (Document Object Model). The traditional approach of scraping a web page and clicking on links would be less effective, and no matter how shabby Facebook's click fraud detection was, it would stick out like a sore thumb.

Javascript vs. Conversions

So we dug a little deeper. We ran the numbers on the conversion rate for javascript disabled visitors. It turned out to be 2.2%. Not great, but not 0. We did however find a number of paid clicks coming from these types of visitors, so we dug a little further, and this is where it got interesting.
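If you want to run the same kind of numbers on your own logs, here is a rough sketch of the arithmetic. It is not our production code: the `Visit` record, its field names and the sample data are all made up for illustration, and it assumes you already have one row per visitor with a javascript flag and a conversion flag.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One visitor session. Every field name here is hypothetical."""
    visitor_id: str
    pageviews: int
    js_enabled: bool
    converted: bool


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal; 0.0 when there is nothing to divide by."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def no_js_report(visits: list[Visit]) -> dict[str, float]:
    """Share of visitors and pageviews without javascript, plus their conversion rate."""
    no_js = [v for v in visits if not v.js_enabled]
    return {
        "no_js_visitor_pct": percent(len(no_js), len(visits)),
        "no_js_pageview_pct": percent(
            sum(v.pageviews for v in no_js),
            sum(v.pageviews for v in visits),
        ),
        "no_js_conversion_pct": percent(
            sum(v.converted for v in no_js), len(no_js)
        ),
    }


# Tiny invented sample, only to show the shape of the output.
sample = [
    Visit("a", 3, True, True),
    Visit("b", 1, True, False),
    Visit("c", 2, False, False),
    Visit("d", 4, False, True),
]
report = no_js_report(sample)
```

The point of keeping the three percentages side by side is that a small slice of visitors can still hide a very different conversion picture.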

The ClickOptics Test

We then took all of the traffic that didn't have javascript enabled and ran it through our own fancy shmancy scoring algorithm. Here's a high level breakdown, which oddly enough seems VERY close to the numbers Limited Run was seeing.

Good Traffic (you made more than you spent)        - 21.98%
Not So Bad (you spent a little you made a little)  - 2.99%
Bad (it cost you money, you made none)             - 75.01%

Interesting! So you can see that if you break down the javascript disabled traffic on its own, you start seeing some quality issues, but this can be misleading when that traffic only makes up a fraction of your total. Could this be the issue? Are we crazy? Let us know in the comments.
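To make the three buckets concrete, here is a deliberately simplified sketch. It is NOT our scoring algorithm; it only assumes you know, per paid click, what you spent and what revenue you can attribute to it, and it uses the bucket descriptions from the table as the rules. The function names and the sample figures are invented.

```python
from collections import Counter

BUCKETS = ("Good Traffic", "Not So Bad", "Bad")


def bucket(spend: float, revenue: float) -> str:
    """Assumed rules: made more than spent / made something / made nothing."""
    if revenue > spend:
        return "Good Traffic"
    if revenue > 0:
        return "Not So Bad"
    return "Bad"


def breakdown(clicks: list[tuple[float, float]]) -> dict[str, float]:
    """Percentage of clicks per bucket, given (spend, revenue) pairs."""
    counts = Counter(bucket(spend, revenue) for spend, revenue in clicks)
    total = len(clicks)
    return {
        name: round(100.0 * counts[name] / total, 2) if total else 0.0
        for name in BUCKETS
    }


# Invented (spend, revenue) pairs for four paid clicks.
sample_clicks = [(1.00, 5.00), (1.00, 0.50), (1.00, 0.0), (1.00, 0.0)]
result = breakdown(sample_clicks)
```

Run this once on all paid traffic and once on the javascript disabled slice only, and you can see whether the small slice is dragging a disproportionate share of the "Bad" bucket.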

Closing Thoughts

In our opinion, Facebook advertising is a tool to increase your social engagement (duh). It's not there to service people looking to actually buy products and services. Sure, you can get followers, likes and the rest of it, but that could translate to 0 visitors to your sites and 0 conversions. This is simply because your target market is not on Facebook for commercial reasons. Use Facebook as a part of your strategy, but if you're going to be advertising on it you'd better have a killer Facebook page, with some very compelling reasons to have people go ANOTHER click deeper to your website.

Sunday, 29 July 2012

Metrics - Acronyms for All

Interesting post on conversion rates and visitor engagement over on Inc.com. What I take away from this post is: know your site.

No one or two metrics (or acronyms) work for every single site in the same way. We keep beating this drum: know your site, know your site, know your site.

If you know your site, then you're going to boil down metrics, analytics, click fraud rates, conversions and the rest of it into your own formula that works, is repeatable and lends itself well to continuous experimentation.

Saturday, 28 July 2012

Make Your PPC Expensive to Attack

I came across a pretty good writeup, from 7 years ago. It nails some really great points.

The main takeaway I grabbed from this, and from the single comment that was posted, is that it is key to make your PPC campaign expensive to attack. What does this mean exactly?

1. Be a Sniper

Use a tool (we're a bit biased here of course) to keep an eye on your traffic, visitor behaviour, social stickiness (we'll delve into this in a future post), and low quality paid traffic. Start excluding known bad actors, and increase the bids on your socially sticky keyword sets. Your competitors will have no choice but to replicate you (fail) or outbid you (triple fail). Don't try to do this without a tool that pares the data down into a manageable format and amount; reams of data do not equate to actionable data.

2. Forget the 1%

We've already written that 80% of malware makes it past anti-virus companies. You're never going to have a silver bullet for traffic quality and click fraud prevention, and there are always going to be attackers that are extremely skilled. Forget about them. Like now. If they're good enough to sneak past you day after day undetected, accept it. And move on. Make it so that it's only the top 1% (see point 1 above on how) that can do this to your online presence.

3. Stop Tracking Conversions from Multiple Sources

I never understood why people track conversions from AdWords, Google Analytics, and nine other tracking scripts. And then they try to compare each of them (which never jibe) to how many conversions they actually had, and fairly soon they have no frickin' clue what's going on. Use one very good conversion tracking system, and stick to it. Finding it may take some trial and error, but stop trying to do it from multiple locations; you're just confusing yourself, and your traffic quality program suffers. Conversions are the ultimate end goal of your online presence, so make sure your conversion tracking is bang on.

Click fraud is an arms race, one that involves many players: you the advertiser, Google the provider, the fraudster and potentially a whole whack load of infected machines running bots. It's key to find a defensive program that requires minutes a day to implement, is multi-layered (computer security folks call it "defence in depth") and effective. Let us know in the comments if you've found techniques that work for you, or whether you think we're full of it.

Tuesday, 24 July 2012

The Google Conspiracy

I recently had a conversation with a fellow webmaster and former AdWords advertiser. When I asked why he had stopped advertising, he gave me a rough rundown on how he wasn't getting results, and that he started to see a drop in conversions vs. money spent.

It was at this point in the conversation that the inevitable happened:

He figured it was down to either Google themselves generating clicks that looked to be from anywhere in the world, or known bad traffic that Google just ignored.

*eye roll*

I always wondered why people had these harebrained ideas about Google, and how they don't care if you get ripped off. It just never made sense.

There is an easy parallel to draw to the antivirus industry. Sure, they want to do just a good enough job to make you think they care, but not such a good job that they catch 100% of the malware, because why would you buy updates in the future? ZDNet reported that 80% of malware makes it past most AVs, but what if we saw click fraud rates that high? It would definitely make the bullshit Click Fraud Index look like a Strawberry Shortcake party on your AdWords account, wouldn't it!

Then the next inevitable argument line is how Google has little transparency on how they handle this problem, so how do you REALLY know. But again, neither does the AV industry. We can only assume (or you can in theory reverse engineer their software, which you can't with Google) that they are doing everything in their power to protect your corporate and personal networks. But they fail, and they fail big.

Yet everyone keeps buying it.

Hopefully we can start to be the ones to put an end to this madness. Drop the Google conspiracy, and start paying attention to your website. Guard your PPC like you guard your network.

Friday, 20 July 2012

More Mobile Madness

If you didn't read our previous post here, then check it out for some background information.

We've also subsequently added Verizon and T-Mobile to the list of telcos that are allowing private information to flow past their proxies.

In these carriers' cases, they are pushing out the MSISDN header, which gives away the end user's phone number. Now it's possible that the users' mobiles are giving this away; however, we definitely feel the carriers should make an effort to protect the users that are coming through their proxies.

Here's an example request from T-Mobile:

Accept-Language: en-US
x-wap-profile: http://wap.samsungmobile.com/uaprof/SGH-T959V.xml
User-Agent: Mozilla/5.0 (Linux; U; Android 2.2.1; en-us; SGH-T959V Build/FROYO) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
Accept: */*
Accept-Charset: utf-8, iso-8859-1, utf-16, *;q=0.7
X-Nokia-MSISDN: 1XXXXXX0549
X-Nokia-sgsnipaddress: XXX.155.174.198
MSISDN: XXXXXX0549
X-Via: Harmony proxy
Connection: keep-alive

We found about 50 instances where folks' numbers were being disclosed in this fashion, and they were exclusively from Verizon and T-Mobile.
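If you want to go header hunting on your own traffic, a check along these lines is enough to get started. Treat it as a sketch: the list of header names is just the ones quoted in these posts (other proxies may well use different ones), the function name is made up, and it assumes you already have each request's headers as a plain name/value mapping.

```python
import re

# Header names quoted in these posts; an assumption, not an exhaustive list.
LEAKY_HEADERS = {"msisdn", "x-nokia-msisdn", "x-up-calling-line-id"}

# Four or more digits in a row is our (crude) hint that a number is present.
DIGIT_RUN = re.compile(r"\d{4,}")


def find_number_leaks(headers: dict[str, str]) -> dict[str, str]:
    """Return the suspicious headers, keyed by lower-cased header name."""
    leaks = {}
    for name, value in headers.items():
        if name.lower() in LEAKY_HEADERS and DIGIT_RUN.search(value):
            leaks[name.lower()] = value
    return leaks


# The masked T-Mobile request from above, trimmed to a few headers.
tmobile_request = {
    "Accept-Language": "en-US",
    "X-Nokia-MSISDN": "1XXXXXX0549",
    "X-Nokia-sgsnipaddress": "XXX.155.174.198",
    "MSISDN": "XXXXXX0549",
    "X-Via": "Harmony proxy",
}
leaks = find_number_leaks(tmobile_request)
```

Header names are compared in lower case because HTTP header names are case-insensitive, and proxies are not consistent about capitalisation.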

Why Hasn't Anyone Said Anything?

Obviously analytics companies and click fraud companies who are doing deep analysis must have seen this before. We are a new startup, with very small (comparatively) traffic numbers. If we picked this up after looking at some very small samples from our beta testing customers' sites, we would be surprised if no one had noticed it before.

But here is the tough part of the situation. A good portion of analytics and click fraud detection is being able to isolate a visitor on the other end of the HTTP stream. It is advantageous to have uniquely identifying information flowing in because it makes the job of the click fraud detection company a lot easier. If only we could see each user's phone number! Hopefully no one is heavily relying on this type of information to create visitor profiles, or your algorithm is going to be severely bitched after all the mobile companies (and their users) get wind of it.

This is precisely why we all need to do a better job of how we approach this problem, and why we feel our approach is different. Let's just hope these mobile carriers, and any others who are paying attention, fix these issues and stay on the lookout for them in the future.

Monday, 16 July 2012

Head(er) Hunter - Beware the Misconfigured Proxy

As part of our beta testing program, we spend vast amounts of time manually poring over the traffic that flows into our tracking server. We're always looking for interesting traffic anomalies, unique signatures and in general we just find it neat to do technical deep dives on our client sites.

In a header hunting session about a week ago, we found some interesting requests. Here is one of them with the incriminating stuff left out, and the other incriminating stuff snipped:

GET /[SNIPPED] HTTP/1.1
Host: [SNIPPED]
Accept: */*
Accept-Charset: *
Accept-Language: en-US
Cache-Control: no-cache
Pragma: no-cache
Referer: [SNIPPED]
User-Agent: Mozilla/5.0 (Linux; U; Android 2.3.5; en-us; ZTE-Z990G Build/GRJ22) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
Via: 1.1 [PROXY WE DON'T WANT TO SHOW]
x-up-calling-line-id: 1[MOBILE_NUMBER]4510

So this is interesting: a mobile customer on this particular carrier actually has their mobile number disclosed to our service (and to every other site they browse to through this proxy). This should sound familiar if you follow mobile tech, as The Register wrote about this exact issue regarding O2, a UK-based carrier that was in trouble for leaking mobile customers' numbers. Read it here.
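If you end up trapping headers like this yourself, it is worth masking the number before it lands in a log file or a blog post. A minimal sketch, assuming you are happy to keep only the last four digits; the function name and the sample number (a fictional 555 number) are ours, and it is not the exact masking used for the examples in these posts.

```python
def mask_number(value: str, keep: int = 4) -> str:
    """Replace every digit except the last `keep` with 'X', leaving other characters alone."""
    digits_total = sum(ch.isdigit() for ch in value)
    to_mask = max(digits_total - keep, 0)
    masked = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            masked.append("X")
            to_mask -= 1
        else:
            masked.append(ch)
    return "".join(masked)


# Fictional number, used only to show the behaviour.
plain = mask_number("15555550123")
formatted = mask_number("+1 (555) 555-0123")
```

Note that a value with four digits or fewer comes back unchanged, so you may want to reject or fully blank anything that short rather than trust it.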

This is bad. And in this case, we disclosed this to the carrier and their security team has begun an investigation. We'll update when we hear back about the issue. This is not a new issue either, there was a presentation by Collin Mulliner (PDF) that described some of the data leaks that were occurring....four years ago! We however have a new twist on this, read on.

Hey Proxy, I Just Met You and This Is Crazy

All Carly Rae Jepsen references aside, we continued our header hunting session to see what other tidbits we could find. We soon came across some more badness, although less obvious. These were for MetroPCS and Cricket customers in the US:

Cricket Example:

GET / HTTP/1.1
Proxy-Authorization: Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbw==
User-Agent: Cricket-A210/1.0 UP.Browser/6.3.0.7 (GUI) MMP/2.0
Accept-Charset: utf-8, US-ASCII, ISO-8859-1
Accept-Language: en; q=1.0, es; q=0.5
x-wap-profile: "wapuaprof.wap.mycricket.com"
Referer: [SNIPPED]
Accept: application/vnd.oma.dd+xml, application/vnd.phonecom.mmc-xml, application/vnd.wap.wmlc;type=4365, application/vnd.wap.wmlscriptc, application/vnd.wap.xhtml+xml, application/xhtml+xml;profile="http://www.wapforum.org/xhtml", multipart/mixed, multipart/related, text/css, text/html, text/plain, text/vnd.wap.wml;type=4365, audio/mid, audio/midi, audio/x-mid, audio/x-midi, audio/qcelp, audio/vnd.qcelp, audio/mp3, audio/mpeg, audio/pmd, audio/x-pmd, audio/vnd.pmd, application/x-pmd, application/vnd.pmd, audio/evrc, application/vnd.oma.drm.message, image/bmp, image/gif, image/jpeg, image/jpg, image/png, image/vnd.wap.wbmp, image/x-up-wpng, text/css
Host: [SNIPPED]
Cache-Control: no-cache, max-age=43200
Connection: keep-alive

MetroPCS Example:

GET / HTTP/1.1
Proxy-Authorization: Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbyBBZ2FpbiE=
User-Agent: sam-r380 UP.Browser/6.2.3.8 (GUI) MMP/2.0
Accept-Charset: iso-8859-1
Accept-Language: en, es
x-wap-profile: "http://uaprof.metropcs.net/UAProf/sam-r380.xml"
Referer: [SNIPPED]
Accept: application/octet-stream, application/vnd.oma.drm.content, application/vnd.oma.drm.message, application/vnd.oma.drm.rights+wbxml, application/vnd.oma.drm.rights+xml, application/vnd.phonecom.mmc-xml, application/vnd.wap.wmlc;type=4365, application/vnd.wap.wmlscriptc, application/vnd.wap.xhtml+xml, application/xhtml+xml;profile="http://www.wapforum.org/xhtml", image/bmp, image/gif, image/jpeg, image/png, image/vnd.wap.wbmp, image/x-up-wpng, multipart/mixed, multipart/related, text/css, text/html, text/plain, text/vnd.wap.wml;type=4365, application/x-smaf, application/vnd.smaf, audio/mid, application/vnd.wap.co, application/vnd.wap.si, application/vnd.wap.sia, application/vnd.wap.sl, application/vnd.oma.dd+xml, application/vnd.oma.drm.message, application/vnd.oma.drm.rights+xml, image/bmp, image/gif, image/png, image/jpeg, image/vnd.wap.wbmp, image/x-up-wpng, text/vnd.wap.wml, text/plain, text/html, text/css
Host: [SNIPPED]
Cache-Control: max-age=43200
Connection: keep-alive

Now this doesn't look all that bad, right? Except if you look a bit closer at the Proxy-Authorization header. This header is designed so that clients (i.e. your mobile phone) can send their credentials to a proxy, and then have that proxy send the same credentials to the next proxy in the chain. So a fancy ASCII diagram would look like this:

Mobile Phone -> Credentials -> Mobile Proxy -> Credentials -> HTTP Proxy -> ClickOptics.com

Now there is nothing wrong with this, and this is how large networks can segment proxy traffic. Typically, however, the next proxy in the chain has to send a Proxy-Authenticate header to request those credentials, and depending on the proxy configuration and its ACLs (Access Control Lists), the originating proxy will either send or not send them. In our case, we weren't acting as a proxy, nor did we send out a Proxy-Authenticate header; these misconfigured proxies just volunteered the information on their own. So let's break down these headers from the two requests above:

Cricket:

Proxy-Authorization: Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbw==

If we base64 decode the encoded part of the string above we get: "ClickOptics Says Hello". However, if we take the real value from the requests we trapped, we get:

[mobilenumber]@mycricket.com:[password]

MetroPCS:

Proxy-Authorization: Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbyBBZ2FpbiE=

Again in this case you'll get: "ClickOptics Says Hello Again!". However, in the real request we get:

[mobilenumber]@aaah.mymetropcs.com:[password]
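For anyone who wants to check the decoding themselves, here is a small sketch using Python's standard `base64` module. The `decode_basic` helper is ours, written for this post; it assumes the usual Basic scheme layout of `username:password` behind the base64, and returns no password when there is no colon (which is the case for our placeholder strings).

```python
import base64
from typing import Optional


def decode_basic(header_value: str) -> tuple[str, Optional[str]]:
    """Split a 'Basic <base64>' header value into (username, password).

    The password is None when the decoded text contains no colon.
    """
    scheme, _, encoded = header_value.strip().partition(" ")
    if scheme.lower() != "basic" or not encoded.strip():
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(encoded.strip()).decode("utf-8", errors="replace")
    user, sep, password = decoded.partition(":")
    return user, (password if sep else None)


# The two placeholder values from the requests above.
cricket = decode_basic("Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbw==")
metropcs = decode_basic("Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbyBBZ2FpbiE=")
```

The takeaway is that base64 is an encoding, not encryption: anyone who receives the header can read the credentials.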

Implications for Click Fraud

So as always we look at how this applies to click fraud. Now because none of us wanted to break any laws, we didn't actually attempt to change any proxy settings on our phones (and we actually don't use these carriers). But under the assumption that these proxies are simply obeying a username/password style authentication scheme, it could be possible that a fraudster could harvest these credentials. Then by writing a simple Android/iPhone application, they could feed this list of credentials into the app and begin executing click fraud that would appear to be coming from multiple mobile clients.

This wouldn't prevent pure IP tracking from picking this up, assuming that the proxy they exit from is the same. But we all know that you need more than just IP tracking to determine a visitor's behaviour. Also, depending on how the carrier tracks user usage, this might make it difficult for the carrier to trace back to who exactly was doing any of this nefarious business.

The other attack vector could be where a fraudster harvests these mobile numbers and stores them. Then by doing some simple math they can determine which AdSense ads would yield a higher return than the cost of sending an SMS. Then they can use a service like Twilio or any other SMS service to send out SMS messages with shortened URLs that appear to be coming from their provider ("You've Won XYZ from Carrier ABC Click Here!"). Because SMS messages have a direct cost to send (as opposed to email SPAM), folks would be pretty likely to click on them. Using a simple redirect, the unwitting mobile customer would be executing the click fraud. The redirection service could also track which mobile numbers were clicking on URLs and then re-market the SPAM to them. The unfortunate thing is that in pay-as-you-go situations, these messages will actually chew into the mobile user's budget.

Considering you can easily determine what type of phone a remote user has, sending out "Upgrade Your XYZ Phone Now" SMS messages would seem to be a pretty good attack vector as well.

Disclosure

Now we did try reaching out to MetroPCS and Cricket in a variety of ways, but to no avail. So we felt it was in the best interest of their customers to know that they need to enable Wi-Fi as often as possible to avoid using these proxies until their carriers have fixed the problem. The third carrier did respond, and is actively investigating, so we'll leave them to it.
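We don't know how any of these carriers' proxies are built, so take the following purely as a sketch of the idea: before a proxy forwards a request to an origin server, it drops the headers that were only ever meant for the proxy hop. The hop-by-hop list comes from the HTTP/1.1 spec (RFC 2616, section 13.5.1); the subscriber header list is simply the names we saw in our own traffic, and the function name is made up.

```python
# Hop-by-hop headers listed in RFC 2616 section 13.5.1 (both spellings of Trailer kept).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
}

# Subscriber-identifying headers seen in our traffic; an assumption, not a full list.
SUBSCRIBER_HEADERS = {"msisdn", "x-nokia-msisdn", "x-up-calling-line-id"}


def sanitize_outbound(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers that is safe to forward to an origin server."""
    # Any header named inside the Connection header is hop-by-hop as well.
    connection = next(
        (v for k, v in headers.items() if k.lower() == "connection"), ""
    )
    named = {tok.strip().lower() for tok in connection.split(",") if tok.strip()}
    blocked = HOP_BY_HOP | SUBSCRIBER_HEADERS | named
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


inbound = {
    "Host": "example.com",
    "User-Agent": "Cricket-A210/1.0 UP.Browser/6.3.0.7 (GUI) MMP/2.0",
    "Proxy-Authorization": "Basic Q2xpY2tPcHRpY3MgU2F5cyBIZWxsbw==",
    "x-up-calling-line-id": "15555550123",
    "Connection": "keep-alive",
}
outbound = sanitize_outbound(inbound)
```

A proxy that legitimately relays credentials to another proxy in the same chain would need an exception for that hop, which is exactly the kind of configuration detail that seems to have gone wrong here.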

Hope you enjoyed our analysis, this is one of our first posts on some real technical behind the scenes stuff, and we hope to do more in the future!

Thursday, 12 July 2012

Opinions are Like ..... Everybody's Got One

Just took a peruse of a PPC Hero post here and ran into an older Inc.com posting here. Both are struggling to tackle the problem of traffic quality vs. business quality. Both are great posts that you should read.

What I take away from both is that everyone's opinions of how things should be tracked, and how metrics should be applied are different. Interestingly enough however, if you start to combine a lot of measures that both articles used, you can see that you can build a fairly good picture of how to score your traffic.

That's right, you, the owner. After all, you have an opinion too.