Ban Spiders by User Agent
What this mod does
With this mod you can enter user agents to watch or ban. You can also receive emails, or have an output.txt file created and updated with the time and date of visits. It doesn't just have to be spiders: you can watch, log or ban any user agent!
How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.
What is a UserAgent?
https://en.wikipedia.org/wiki/User_agent
Understanding a UserAgent string
http://user-agent-string.info/parse
Genuine User Getting Blocked?
This mod can ONLY block those user agents that you have entered in the list. First, get your user to go to http://whatsmyuseragent.com/ (via his phone) and find out what his user agent is. Then go to http://www.botsvsbrowsers.com/SimulateUserAgent.asp, paste his UA string in, and test it to see whether you get denied.
If you are denied, something in his user agent string is in your list, so it's not the mod's fault: it's banning what you asked it to.
Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp
FAQ
Quote by ForceHSS
It won't add prefixes, as they are added when the forum loads. Your actual URL stays the same; a prefix is never added to it. Have you ever seen a URL like this: http://www.mysite.com/showthread?t=[solved]12345 ???
Quote by ozzy47
You added your site monitoring service as a bad bot? Bad move! Remember, we're sending them a 301, which is a permanent redirect. If you don't see them back in a week, check with them; you may need to ask for your URL to be crawled again.
Quote by GreyGhost
Right, firstly, thanks, they're now doing great. Your "Track Guest Visits" mod will ALWAYS show the spiders, but your native vBulletin WOL will not. The TGV mod picks them up because they are actually accessing your site (so that mod is doing its job and recording them), but my mod prevents them from having their request completed, i.e. a direct request for a URL is a forum access, but they are redirected permanently before the thread loads (so my mod is ALSO doing its job).
Hope that clears things up for you all. @GreyGhost I'll PM you details of the beta.
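
To picture what "redirected permanently before the thread loads" means, here is a rough Python sketch. It is not the mod's code; the function name respond, the sample ban list and the redirect target are all made up for illustration, and case-insensitive matching is an assumption.

```python
from typing import Dict, Iterable, Tuple


def respond(user_agent: str, ban_list: Iterable[str],
            redirect_to: str) -> Tuple[int, Dict[str, str]]:
    """Decide the HTTP status and headers before any page is rendered.

    Hypothetical helper: the request still reaches the site (so a visit
    logger can record it), but a banned user agent only gets a 301.
    """
    ua = user_agent.lower()  # assumption: matching ignores case
    banned = any(e.strip() and e.strip().lower() in ua for e in ban_list)
    if banned:
        # Permanent redirect: the thread itself is never served.
        return 301, {"Location": redirect_to}
    return 200, {}


sample_list = ["Baidu", "Yandex"]          # illustrative entries only
target = "http://www.example.com/"         # placeholder redirect target
print(respond("Mozilla/5.0 (compatible; Baiduspider/2.0)", sample_list, target))
print(respond("Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0",
              sample_list, target))
```

The first call returns a 301 with a Location header and the second a plain 200, which is why a monitoring service added to the list stops seeing your pages.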
What's a bot?
https://en.wikipedia.org/wiki/Spambot
How do I ban a bot?
Blocking spiders is all about personal choice. Do a little research and find out whether you want to cater for that country and whether they add value to your site! When Deepnet Explorer is visiting, go to Who's Online; at the bottom there's a dropdown box for "Show Useragent?". Select Yes, then check out their user agent. You can enter any or all of the UA string, so if they actually do have Deepnet in the UA, just enter that on its own line in the list.
As a side note, you don't need the full user agent string anymore to ban them; you can now enter any part of the string.
E.g. bai will result in Baidu being banned, as will any user agent string containing "bai". Entering Mozilla will result in every user agent string containing it being banned. So entering the bot name rather than the full user agent string will do: enter Baidu for that spider. Don't enter Ya as something to ban, as Yahoo will be banned just as Yandex will.
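
The "any part of the string" rule can be tried out with a few lines of Python. This is only a sketch of plain substring matching, not the mod's own code: the helper name matching_entries is hypothetical, and ignoring upper/lower case is an assumption. The three user agent strings are samples of what these crawlers commonly send.

```python
from typing import List


def matching_entries(user_agent: str, ban_list: List[str]) -> List[str]:
    """Return every ban-list entry found anywhere in the user agent."""
    ua = user_agent.lower()  # assumption: matching ignores case
    return [e for e in ban_list if e.strip() and e.strip().lower() in ua]


ban_list = ["bai", "Ya"]  # one entry per line in the mod's list

baidu = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
yahoo = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
yandex = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

for ua in (baidu, yahoo, yandex):
    # Shows which entries would get this visitor banned.
    print(matching_entries(ua, ban_list), "<-", ua)
```

The same helper is handy when a genuine user gets blocked: paste in his UA string and it lists the entries responsible.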
Where's output.txt located?
The output.txt file is generated as bots found in your list attempt to call a forum or thread. There's no time lag, and the file should be created straight away. If you have no CMS then the file should be available at http://www.mysite.com/output.txt; if your forum is in a folder, then something like http://www.mysite.com/forum/output.txt
Any issues, post back and I'll deal with them for you.
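
For a rough idea of what this kind of logging amounts to, here is a minimal Python sketch that appends one line per blocked visit. The line layout, the function name log_visit and the IP column are assumptions for illustration; the mod's real output.txt format may differ.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def log_visit(log_path: Path, user_agent: str, ip: str, when: datetime) -> str:
    """Append one 'date time | ip | user agent' line and return it."""
    line = f"{when:%Y-%m-%d %H:%M:%S} | {ip} | {user_agent}\n"
    # Mode "a" creates the file on the first hit and appends afterwards,
    # which is why the file shows up as soon as a listed bot calls.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


# Demo in a throwaway folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "output.txt"
    log_visit(path, "Baiduspider/2.0", "203.0.113.7", datetime.now())
    log_visit(path, "YandexBot/3.0", "203.0.113.8", datetime.now())
    print(path.read_text(encoding="utf-8"), end="")
```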
Bad bot lists
Try these:
http://www.forumpostersunion.com/index.php?t=1644
http://www.vbseo.com/f34/how-create-vbulletin-bot-scraper-trap-47378/index4.html
But to be honest it's nothing that a little googling or binging won't resolve.
Try this list, it works well for me:
Baidu almaden Anarchie ASPSeek attach autoemailspider BackWeb Bandit BatchFTP BlackWidow Bot\mailto:email Buddy bumblebee CherryPicker ChinaClaw CICC Collector Copier Copyscape Crescent DIIbot DISCo DISCo\Pump dotbot Download\Demon Download\Wonder Downloader Drip DSurf15a eCatch EasyDL/2.99 EirGrabber EmailCollector EmailSiphon EmailWolf Express\WebPictures ExtractorPro EyeNetIE FileHound FlashGet FrontPage GetRight GetSmart GetWeb! gigabaz Go\!Zilla Go!Zilla Go-Ahead-Got-It gotit Grabber GrabNet Grafula grub-client HMView HTTrack httpdown .*httrack.* ia_archiver Image\Stripper Image\Sucker Indy*Library Indy\Library InterGET InternetLinkagent Internet\Ninja InternetSeer.com Iria JBH*agent JetCar JOC\Web\Spider JustView larbin LeechFTP LexiBot lftp Link*Sleuth likse //Link LinkWalker Mag-Net Magnet Mass\Downloader Memo Microsoft.URL MIDown\tool Mirror Mister\PiX Mozilla.*Indy Mozilla.*NEWT Mozilla*MSIECrawler MS\FrontPage* MSFrontPage MSIECrawler MSProxy Navroad NearSite NetAnts NetMechanic NetSpider Net\Vampire NetZIP NICErsPRO Ninja Nutch Octopus Offline\Explorer Offline\Navigator Openfind PageGrabber Papa\Foto pavuk pcBrowser Ping PingALink Pockey psbot Pump QRVA RealDownload Reaper Recorder ReGet Scooter Seeker Siphon sitecheck.internetseer.com SiteSnagger SlySearch SmartDownload Snake sogou Soso SpaceBison Spinn3r sproose Stripper Sucker SuperBot SuperHTTP Surfbot Szukacz tAkeOut Teleport\Pro URLSpiderPro Vacuum VoidEYE vBSEO Web\Image\Collector Web\Sucker WebAuto [Ww]eb[Bb]andit webcollage WebCopier Web\Downloader WebEMailExtrac.* WebFetch WebGo\IS WebHook WebLeacher WebMiner WebMirror WebReaper WebSauger Website Website\eXtractor Website\Quester Webster WebStripper WebWhacker WebZIP Wget Whacker Widow WWWOFFLE x-Tractor Xaldon\WebSpider Xenu Yandex Yeti YOUDAOBOT Zeus.*Webster Zeus
Quote by meaters
And with only the addition of, per line:
MSIE 1
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
You end 99.9% of all spam bot registration attempts and cut garbage traffic even further. Here's my entire ban list for this Mod:
baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
FunWebProducts
MSIE 1
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
w3m
VB4.x Version of Ban Spiders By User Agent
Tested on vB3.7.x and vB3.8.x, but should work on any version.
____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues
...and beta testers
ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyGhost
If you use this please mark as INSTALLED
History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete, Ver 3.0.0 uploaded
29th October Minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.2 uploaded, due to rogue process from other mod
18th December 2014 Version 3.1.3 uploaded, due to previous one being VB4 mistakenly uploaded
The Bad Bots list is now included in the product.
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").
Support will now only be given to those who have this mod marked as INSTALLED
Download
product-ban_spider.xml (30.5 KB, 148 downloads)