A couple of years back I had the opportunity to have an email conversation with a person working for Bluecoat's webfilter team.
I had raised a question about why all their checks were done from an IP range registered to them (truthfully, it was a range belonging to Cerberian, a webfilter company acquired by BC).
Their answer was not satisfying: they told me they had not seen any reason to, or attempts to, block their checks based on their IP range. This was around the time of the Storm botnet, and my experience at that time was clearly different.
I'm by no means a professional malware "hunter", but I do basic malware analysis quite regularly, and I was forced quite early on to start using multiple user agents and/or IP addresses to be able to download a sample. So the use of a static IP range was strange, to say the least.
That was a couple of years ago, and since then I had not really given any thought to how the different commercial webfilters and online scanners (the ones various AV vendors and security researchers make public, many of which I use quite regularly) actually work: how they solve the challenge of malicious sites only serving content to specific user agents, geographic locations, etc.
The bottom line is that nowadays you often only get one shot at retrieving content from a malicious site from a single IP (OK, not every time, but too often to be ignored), and you must also be using the right components (OS, web browser, plugins, etc.) to be able to "access" the malware.
So I decided to check how a handful of different online resources did when they verified the content of a site. I set up a basic web site on a server I had at my disposal.
The next step was to scan the site from the different resources (14 in total) and then verify the results using the web logs.
I did multiple scans with the same solution to see if I would get different results. The results revealed that it would not take all that much work to assemble a working blacklist that prevents these solutions from looking at the content of my site. Against webfilters this could also be used in a less malicious way, by site owners who want to display false content to them and thus circumvent their category blocking.
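To give an idea of what the log review looks like in practice, here is a minimal sketch of the kind of script that groups requests by source IP, user agent and referrer. It assumes a standard Apache/nginx "combined" log format and a hypothetical access.log path; the test itself did not depend on any particular tooling.

```python
# Minimal sketch: summarize who fetched the test site, grouped by
# (IP, user agent, referrer). Assumes the common "combined" log format
# and a hypothetical access.log path.
import re
from collections import Counter

# combined format: ip - - [time] "request" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def summarize(path="access.log"):
    seen = Counter()
    for line in open(path, encoding="utf-8", errors="replace"):
        m = LOG_RE.match(line)
        if m:
            seen[(m.group("ip"), m.group("agent"), m.group("referer"))] += 1
    for (ip, agent, referer), hits in seen.most_common():
        print(f"{hits:4d}  {ip:15s}  UA={agent!r}  Referer={referer!r}")

if __name__ == "__main__":
    summarize()
```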
Results:
* Most used IP ranges registered to the company in question
* The solutions that were using Google infrastructure had appIDs that made them easy to fingerprint.
* Some had the service name in their user agent
* Most only used one user agent variant.
The problem, as I see it based on these results, is that it is quite trivial to build an effective blacklist to prevent these companies from looking at your content. I did not see any evidence of a really efficient analysis using multiple IPs and multiple user agents to maximize the likelihood of being served malware.
Might this be a case of me wishing that these services were something they aren't? Not sure, but quite a lot of people (myself included) look to these services either to protect us (webfilters) or to give us a verdict on whether a site is OK or not.
My initial thought was that I might be missing something (it would not have been the first time...).
So, as a proof of concept, I modified the site to show different content to different visitors (applying the blacklist I had created); this is what I would have done if I were trying to increase the lifespan of a malicious site.
The longer it takes for me to get detected, the more installs I get.
And with the exception of bonus visitors from VirusTotal "partners", I was able to apply the blacklist successfully.
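To illustrate how little the proof of concept actually requires, the check boils down to something like the sketch below. The network ranges and user-agent markers here are hypothetical placeholders, not the actual blacklist used in the test.

```python
# Minimal sketch of the kind of check the proof of concept relies on:
# match a visitor against collected fingerprints. All values below are
# illustrative placeholders.
import ipaddress

BLOCKED_NETS = [ipaddress.ip_network(n) for n in (
    "192.0.2.0/24",      # placeholder: "scanner vendor A" range
    "198.51.100.0/24",   # placeholder: "scanner vendor B" range
)]
BLOCKED_AGENT_MARKERS = (
    "appid:",            # App Engine appIDs gave some services away
    "examplescanner",    # placeholder: service name embedded in the UA
)

def is_scanner(ip: str, user_agent: str) -> bool:
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETS):
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in BLOCKED_AGENT_MARKERS)

# A site applying the blacklist would simply branch on is_scanner() and
# show matching visitors a harmless placeholder page instead of the real
# content.
```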
Note: If you think this is me giving malware distributors a nice manual on how to evade detection, then you are "flattering" me. This is hardly something "they" have not thought of before; in fact, here is a nice write-up of an exploit kit that does exactly this: http://t.co/KRfPGQHh.
Tested services:
- vscan.novirusthanks.org
  - IP: static
  - User agent: "-"
  - Note: Seems specialized in scanning a single file rather than a whole "site".
- vurldissect.co.uk
  - IP: The user can choose between a number of locations from which the scan can be performed, which is great, but in the end it only delays the blacklist creation as they are static.
  - User agent:
    Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)
    Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
  - Note: Even though you choose one location, there will also be requests from "the main location" followed by requests from the location you chose. Some requests will also include "http://vurldissect.co.uk" in the referrer field.
- Comodo
  - IP: static (registered to Comodo)
  - User agent:
- onlinelinkscan.com
  - IP: static
  - User agent: "-"
- virustotal.com
  - IP: somewhat static (registered to Google)
  - User agent:
  - Note: The appID is a telltale sign =). As a bonus, if you check your "malicious" site with VirusTotal you attract the attention of others, for example Panda AV (Panda Security), who also like to scan from their own IP range. But there are others who use dial-ups and changing user agents, so that was positive. I even got one visitor who seemed to be using SUSE, which is novel (pun intended), and the person/script behind those requests did a full scan, which was admirable! (The results should be a bit limited due to the "lack" of Linux malware, but it would catch an injected iframe or similar.)
- zulu.zscaler.com/ (zscaler.com)
  - IP: multiple (one registered to zscaler.com), but the IP addresses are static
  - User agent:
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
- www.unmaskparasites.com
  - IP: uses Google services
  - User agent:
  - Note: The appID is yet again a giveaway.
- Wepawet
  - IP: a random address from a /24 registered to the university that hosts the solution
  - User agent:
  - Note: You are able to set not only the referer but also other headers.
- URLQUERY.net
  - IP: static and registered to a security company
  - User agent:
  - Note: URLquery allows you to set the referer and change the user agent.
- Sucuri
  - IP: multiple IPs (two). IP number one is the most used (it points to the scanning service); IP number two could be a manual verification attempt.
  - User agent:
    or, via Google, "googlebot"
  - Note: The referrer field includes the name of the scanning service used.
- aceinsight.websense.com (Websense)
  - IP: multiple IPs (one more frequently used than the other; the hits from IP number two could come from a manual verification)
  - User agent:
- Bluecoat
  - IP: static (a static net) registered to Bluecoat
  - User agent:
- gred.jp
  - IP: static
  - User agent:
  - Note: Has an option to scan all links, which is great.
- Trend Micro
  - IP: multiple IP addresses from a /16 registered to Trend Micro
  - User agent:
- Misc
  - The people at Bitdefender do site checks from IPs registered to their company.
  - The same goes for AVG.
So what can be done?
As a provider of services:
* Balance your checks between multiple IP addresses. Get yourself a "large" number of cheap connections (ones with dynamic IP addresses would be preferred), VPN services, Tor. The bottom line is to make creating a blacklist too hard, costly or ineffective.
* Be as realistic as possible with your user agent settings, as the purpose is to mimic regular user visits. Mix and shake well! (A rough sketch of what this could look like follows after this list.)
As a user:
* (Try to) Download the malware yourself and then use services like Wepawet or VirusTotal to learn more.
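For the provider-side advice, a fetch that varies both the user agent and the egress point on every request could look something like the sketch below. It assumes the widely used requests package; the user-agent strings and proxy pool are illustrative assumptions, not a list any particular service uses.

```python
# Minimal sketch: rotate the user agent (and, where available, the egress
# point) per fetch so no single static fingerprint identifies the scanner.
# All strings and addresses below are illustrative placeholders.
import random
import requests

USER_AGENTS = [
    # A real deployment would keep this list large and current.
    "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]

# Hypothetical pool of egress points (VPN endpoints, dynamic-IP links, etc.);
# None means "go out directly".
PROXIES = [
    None,
    {"http": "http://203.0.113.10:8080", "https": "http://203.0.113.10:8080"},
]

def fetch(url: str) -> requests.Response:
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers,
                        proxies=random.choice(PROXIES), timeout=30)

if __name__ == "__main__":
    resp = fetch("http://example.com/")
    print(resp.status_code, len(resp.content))
```

The point is not the specific library, but that each request should look like yet another ordinary visitor rather than the same scanner knocking twice.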
Update 1: I was rightfully informed that Wepawet allows you to set the referer and headers. URLquery also allows you to set the referer and change the user agent (thanks @c_APT_ure for pointing this out).
/Micke