Automatic Thread Tagger
This modification is in the archives.
Description

When a user submits a new thread, this modification automatically takes keywords from the thread title and uses them as tags. Automatic Thread Tagger can either propose tags to the user via AJAX for the new thread, or assign tags after the new thread has been saved. It can also add the translated thread prefix to the tags. Additionally, you can tag existing threads via maintenance or scheduled tasks.

This modification is a successor to the terminated Automatic Thread Tagger by MrEyes: Automatic Thread Tagger (Project Terminated)

As an example, if a user submits a thread with a title of:
"Fish Food for Cats!"
The thread will be automatically tagged with:
- Fish
- Food
- Cats

If the user also submits an actual tag of "Fish", it will not be duplicated. Any rules you have set up for tagging will be respected. If you choose to do so, this product will also automatically tag threads created by incoming RSS feeds.

Demo

I cannot show you the process of creation, but here are lists of tags generated by Auto Thread Tagger:
http://www.insideearth.net/tags.php?langid=5
http://www.insidesupcom.de/tags.php?langid=1

Automatic Tagging of existing threads

You can tag existing threads via maintenance or a scheduled task (cron). These tags are created with a special flag so they can be easily identified and deleted. Manually assigned tags are not touched. Maintenance also works if Automatic Tagging is disabled in the settings, which is useful if you want to test some settings. Tags created this way take the creation date of the thread and the userid of its creator. The process can be automated by running a scheduled job once a night.

Please keep in mind that tags proposed via AJAX are not flagged as auto-tagged, so they cannot be identified as such and cannot be deleted automatically. If you want to retain the auto-tagged flag, disable AJAX and enable tagging after the thread has been saved.
Alternatively, you can disable tagging on save as well and let the nightly scheduled job tag new threads.

Installation / Upgrade

1. Upload all files from "upload" to your server, preserving the directory structure
2. Import "product-auto_thread_tagger110.xml" as a product; overwrite if it is already installed
3. Check settings
4. Run maintenance / Auto Tag Threads to tag existing threads (needed if you want to use the cron)

After installation the modification is disabled by default. This allows you to experiment with the configuration before switching it on.

Troubleshooting

If you report a bug, please post the thread title that triggered it. Without it I cannot test the problem and improve the language parsers.

* If no threads are tagged, check the following:
- Is the modification enabled? Is the action you are testing enabled? (vBulletin tagging, whole auto thread tagger system, AJAX, new threads)
- Are the words you are using badwords or filtered out?
* Cron/Scheduled Task is not tagging all threads.
- The cron is limited to 500 threads per run (you can change this via settings) to avoid heavy impact on the server. Make sure you run the maintenance auto tagger first to tag old threads. You can check the scheduled tasks log to see if it is running correctly. Important: If a thread title does not meet the minimum requirements to be included in tags (e.g. one-word thread titles, words that are too short), it will stay in this queue forever.
* I'm using Polish, Arabic, Turkish, etc. and the tagger is not working like it should.
- If not already replaced, replace the filter replacement '&'=>'and' with ' & '=>'and' (a space before and after &)

Todo

What comes next? You decide. Tell me what you are missing and I'll see if it can be integrated.

Why thread title and not thread text?

Parsing the thread text for tags is an extremely unlikely addition, as it would require some fairly heavy processing to ensure the quality of tags.

What are Stopwords?
Stopwords are words that are filtered out before tags are processed. The user Hostboard on vBulletin.org posted some resources regarding this:
Quote by Videx
https://en.wikipedia.org/wiki/Stop_words
http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words
http://www.webconfs.com/stop-words.php

While Google ignores certain words, I am not certain that other search engines might ignore the same words thus putting you at a disadvantage. Also the list changes so you could be at a disadvantage. If I was going to use this sort of list I would only use the one Google publishes.

More info:
http://www.seobythesea.com/?p=1109
http://searchengineland.com/080118-083645.php

More threads on stop words:
http://www.webmasterworld.com/forum89/1484.htm
http://www.webmasterworld.com/forum89/1158.htm

Welcome to the new can of worms...
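
To make the role of stopwords in title tagging more concrete, here is a minimal Python sketch of the behaviour described in this post. It is not the modification's actual code (the product itself is a vBulletin add-on). The function name `propose_tags`, the stopword list, and the minimum word length of 3 are assumptions chosen for illustration only. The real settings and parsers may differ.

```python
import re

# Hypothetical values for illustration; the product's real stopword
# lists and minimum word length are configured in its settings.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}
MIN_WORD_LENGTH = 3


def propose_tags(title, user_tags=(), stopwords=STOPWORDS,
                 min_length=MIN_WORD_LENGTH):
    """Return candidate tags taken from a thread title.

    Words that are stopwords, shorter than min_length, or already
    present in user_tags (compared case-insensitively) are skipped.
    """
    # \w+ matches runs of letters, digits and underscores, so
    # punctuation such as "!" is dropped from the title.
    words = re.findall(r"\w+", title)

    # Start with the tags the user entered so they are not duplicated.
    seen = {tag.casefold() for tag in user_tags}

    tags = []
    for word in words:
        key = word.casefold()
        if len(word) < min_length or key in stopwords or key in seen:
            continue
        seen.add(key)
        tags.append(word)
    return tags


print(propose_tags("Fish Food for Cats!"))
# ['Fish', 'Food', 'Cats']
print(propose_tags("Fish Food for Cats!", user_tags=["Fish"]))
# ['Food', 'Cats']
print(propose_tags("Hi"))
# []
```

The last call shows why a thread whose title consists only of very short words never receives automatic tags: every word is filtered out, so nothing is left to assign.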
History

1.2.0, 9th August 2008
- Fixed error with missing threadids
- Fixed error with AJAX and prefix
- Fixed error with tags not being indexed via cron
- Added Polish, Spanish, English stopwords
- Compatible with vBulletin 3.8

Download
This modification is archived; downloads are still allowed.