I’m doing a bit of research – well, a lot of research really – about various bots and how to keep the bad ones out of my site.
Apart from stealing content from the site, bots can have a serious and adverse effect on bandwidth, running it through the roof at times, which in turn can get costly. The blog pages already have spambot protection, but there are other types of bot that can cause problems as well.
So I want to try and implement as much bot protection as I can without affecting genuine visitors to the site. Fortunately there appears to be a plethora of info and advice out there; I just need to trawl through it to find effective protection for the site. Even if I end up having to pay for a piece of software, it could be valuable in the long run.
Most decent robots will read and obey the robots.txt file. Only the bad ones ignore its instructions outright, though even some legit bots will try to circumvent it. The ‘.htaccess’ file is different: bots never read it, the web server enforces it, so rules in there can’t simply be ignored. So I need to bone up on what to put in that txt file, and also find out what I can safely implement via the ‘.htaccess’ file.
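To get a feel for what robots.txt rules actually do, here is a minimal sketch using Python's standard `urllib.robotparser`, which is roughly the check a well-behaved bot performs before fetching a page. The rules, the bot names (`BadBot`, `GoodBot`) and the `example.com` URLs are all made up for illustration; they are not taken from my site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("BadBot", "https://example.com/blog/"))               # False
    print(allowed("GoodBot", "https://example.com/blog/"))              # True
    print(allowed("GoodBot", "https://example.com/private/notes.html")) # False
```

Bear in mind this only models a bot that chooses to obey the file. Nothing in robots.txt stops a bad bot from fetching the page anyway.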
Anyway, I’ve found some very useful sites, copied the various ways of combating bots, and saved them as text files. I also saved a whole web page on the subject. I’ll try to read and digest them at a later date, tomorrow if I can.
One in particular does have a very concise way of trapping bots, so I think I’ll have a look at that first because it makes fun of the bots and gives them a lot of work to do, while not affecting the server.
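I haven't worked through that site's method yet, so the following is not it. It is just a sketch of one common kind of bot trap as I understand the general idea: a path that robots.txt disallows and that no genuine visitor would ever see, so anything requesting it gets flagged and refused from then on. The `BotTrap` class, the `/bot-trap/` path and the IP addresses are all hypothetical.

```python
TRAP_PATH = "/bot-trap/"  # hypothetical path, disallowed in robots.txt


class BotTrap:
    """Flag any client that requests the trap path and refuse it afterwards."""

    def __init__(self) -> None:
        self.flagged: set[str] = set()

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status code this request should receive."""
        if path.startswith(TRAP_PATH):
            # Only a bot ignoring robots.txt should ever end up here.
            self.flagged.add(ip)
        return 403 if ip in self.flagged else 200


if __name__ == "__main__":
    trap = BotTrap()
    print(trap.handle("203.0.113.5", "/blog/"))           # 200
    print(trap.handle("203.0.113.5", "/bot-trap/page1"))  # 403
    print(trap.handle("203.0.113.5", "/blog/"))           # 403
```

In practice the flagging would have to live somewhere the web server can act on it, and blocking by IP alone risks catching genuine visitors behind a shared address.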
Gonna be a long trawl…….







