How to Block the SEMALT crawler

What is SEMALT?

If you haven’t seen it already in your referral traffic, trust me, it’s there. SEMAlt bills itself as “a professional webmaster analytics tool that opens the door to new opportunities for the market monitoring, yours and your competitors’ positions tracking and comprehensible analytics business information.”

What is it really? You’ll get various opinions as you search for answers across Google (this post from Nabble is the most comprehensive and researched I’ve found) – but the overall consensus is that it’s spam – more specifically, they are employing malware to crawl the web and spam server logs, potentially ruining your Google Analytics data. This referral spam is used by SEMAlt to drive traffic to their website, apparently to get users to sign up for their paid monthly service.

If you go to your Google Analytics and look at Acquisition > All Referrals, you will see SEMAlt referral traffic there. If you don’t, you probably will soon. In the image below – you’ll see the referral traffic data from a brand new website. SEMAlt is over 75% of the referral traffic to the website – with a 100% bounce rate. This is definitely screwing with our data!
SEMAlt Referral Traffic on a Brand New Website
Bottom line: It’s bad news and you should take steps to block it from your website.

How can you block SEMALT?

There are several ways to block the SEMALT crawler, but the two that I’ve tested and can vouch for are applying Google Analytics filters and editing the .htaccess file for each site. Do not try to use the tool provided by SEMAlt; I’ve never seen it work.

How to Block SEMAlt Via Google Analytics

Blocking SEMAlt via Google Analytics doesn’t really “block” the crawler – this method simply filters the crawler out of your analytics data. It will clear all the future referral traffic from your view, giving you a clearer picture of your referral traffic (and overall traffic).

Steps to Block SEMAlt via Google Analytics

  1. Log in to your Google Analytics account.
  2. Click on ‘Admin’ along the menu at the top of the screen.
  3. Click on the Views on the far right – create a new view, labeled “SEMAlt Exclusion”
  4. Click ‘Filters’ which is located in the far right-hand column. If you cannot see Filters this means you don’t have administrative access rights and that’s a different issue.
  5. Click ‘New Filter’
  6. Make sure that ‘Exclude’ is selected and ‘semalt.com’ is entered into the Filter Pattern field. The filter will also block all subdomains of SEMAlt, such as 34.semalt.com, as well as the main domain.

Click Save and you are done! You have now excluded SEMAlt from your referral traffic data. Once your filter is live, keep in mind that it only filters data from this point forward. It does not retroactively filter out past visits. This is why we created a new view with this filter first: it lets you confirm that the filter is valid before applying it to your other views.

How to Block SEMAlt via the .htaccess file

This is the method I use most, as it blocks any and all requests referred from the SEMAlt domain from accessing your site. It removes that referral traffic from your future analytics without the need for the filter outlined above. I prefer it because it prevents the spammers from accessing your site at all, versus simply stripping the data out of the analytics.
To do this you add the following code to your .htaccess file for your site:

Add this to your .htaccess file

# block visitors referred from semalt.com
RewriteEngine on
RewriteCond %{HTTP_REFERER} semalt\.com [NC]
RewriteRule .* - [F]
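
A slightly more defensive variant of the rule above is sketched here. It assumes your host allows mod_rewrite directives in .htaccess. The IfModule wrapper and the stricter referrer pattern are my additions for illustration, not part of the original snippet. The wrapper means a server without mod_rewrite skips the block instead of returning a 500 error, and the pattern only matches when semalt.com (or one of its subdomains) is the host portion of the referrer.

```apache
# Sketch: guarded version of the semalt.com referrer block (assumes mod_rewrite may be absent)
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Match semalt.com or any subdomain as the referring host, case-insensitively
    RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?semalt\.com [NC]
    # Leave the URL unchanged (-) and answer with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Keep in mind that if mod_rewrite is not loaded, this version silently does nothing, so check that requests carrying a semalt.com referrer really receive a 403 after you upload the file.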

If you have a WordPress site, there are plugins that will allow you to edit the .htaccess file in the Theme editor – otherwise, you will most likely need to access your .htaccess file via FTP.
Once the code is added to your .htaccess file & uploaded, you are all done. Check your analytics moving forward and you’ll see a decrease in the referral traffic & overall bounce rate.

Aftermath of Blocking the SEMAlt Crawler

SEM Alt Crawler - Bounce Rate Improvement

In this example, one day after blocking the SEMAlt crawler via .htaccess, we saw huge improvements in the data. The site bounce rate went from an average of 80% when SEMAlt was hitting the site to an average of 10%! That’s a drop of 70 percentage points (an 87.5% reduction in the bounce rate), all thanks to removing one pesky crawler from the data.


Nick Lindauer
Nick is the vice president, client services and operations at Forthea. He’s been working in internet marketing since 2002, when – ironically – he answered an ad in the newspaper. When he’s not at work, he’s off spending time with his family, working on his house, building furniture, cooking on his two Big Green Eggs & brewing hot sauce.

5 Comments

  1. Boris 2 years ago

    Hmm, I used the htaccess part of the script and now my site is down….

    • Nick Lindauer (Author) 2 years ago

      Probably a typo in the htaccess file – delete the file via FTP and start over – your site will come right back

  2. Alex 2 years ago

    Hey Nick,

    Read your article, I had the same thoughts as you about blocking the results with a filter. I mean the way GA works it is supposed to capture the data, and process it against any filters/rules/views before reporting it. For some reason I was still seeing the spammers’ referral traffic. I also tried the .htaccess trick and it worked for a bit but not entirely. I then did some more digging and think the spammers are just using a script to randomly send HTTP requests to random users’ UA tracking IDs. There are a few blogs out on the web reporting no instances of spammers showing up in the logs. I wrote up a blog post similar to yours and then found more info. alexzerbach.com/how-to-remove-darodar-spam/. This means they are sending referral traffic but not actually visiting your site, thus tricking GA. Their intentions? Probably just to get you to Google their name (as they redirect most of their traffic). What do you think of all of that?

    • Nick Lindauer (Author) 2 years ago

      That’s exactly what they are doing, they never hit the site – but they are trying to get you to go to theirs in order to sell you a tool. They continue to change domains/referrals and I’m doing an updated post soon to cover that

  3. Mitchell Krog 5 months ago

    Instead of .htaccess try this approach of mine. It uses one centralised .conf file which is loaded into memory by Apache once and therefore does not place load onto Apache by having multiple .htaccess files all over the place. This is also easier to maintain and I update it almost daily with new bad referers and user-agents found in my logs across 3 web servers and 27 sites.

    https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker

    I also have a fully extensive SEMALT block in my conf file.
