Unfortunately, I find myself in a position where continued development of this modification is not possible, so as of today (29/05/08) this project is terminated. If you are a coder, feel free to take this project on and take it from beta to gold.
You will be pleased to learn that this modification has been taken over by Phalynx.
Automatic Thread Tagger v1.0
Beta 3
THIS IS A BETA MODIFICATION
Please read the entire post before diving in
If you report a bug, please post the thread title that caused it. Without this I cannot test it and improve the language parsers.
When a user submits a new thread this product will automatically take keywords from the thread title and use these as tags.
As an example, if a user submits a thread with a title of:
The thread will be automatically tagged with:
If the user also submits an actual tag of "Fish", it will not be duplicated. Additionally, this product hooks in before vBulletin processes tags, so any rules you have set up for tags will be respected. It also checks whether you have tagging enabled; if not, the product does nothing.
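To give a feel for the idea, here is a minimal sketch of title-based tagging with duplicate checking. It is not the modification's actual code: the function name `sketchAutoTags`, the made-up title and the tiny stop word list are all assumptions for illustration only.

```php
<?php
// Hypothetical sketch only; the real product has far more rules than this.
function sketchAutoTags(string $title, string $userTags, array $stopWords): string
{
    $result = [];
    $seen = [];

    // User-submitted tags go in first, so their spelling wins on a duplicate.
    foreach (explode(',', $userTags) as $tag) {
        $tag = trim($tag);
        $key = strtolower($tag);
        if ($tag !== '' && !isset($seen[$key])) {
            $seen[$key] = true;
            $result[] = $tag;
        }
    }

    // Title words follow; stop words and already-seen tags are skipped.
    $words = preg_split('/\W+/', $title, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $key = strtolower($word);
        if (!in_array($key, $stopWords, true) && !isset($seen[$key])) {
            $seen[$key] = true;
            $result[] = $key;
        }
    }

    return implode(',', $result);
}

// Made-up title; the user also typed the tag "Fish".
echo sketchAutoTags('My fish tank is leaking', 'Fish', ['my', 'is']), "\n";
// prints "Fish,tank,leaking"
```

Note how "fish" from the title is dropped because the user already supplied "Fish".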
If you choose to do so this product will also automatically tag threads created by incoming RSS feeds.
And if all that wasn't enough there is a whole load of different configuration options.
Installation
If you have a previous beta version of this modification installed, it MUST be uninstalled before you proceed to install this version.
- Download the modification zip file to your local machine and extract the contents.
- Upload the functions_autotagger.php to your includes folder (the same directory as your config.php)
- Go to "ACP -> Plugins & Products -> Manage Products".
- Click "Add/Import Product".
- In the "Import Product" section, click the browse button and select the downloaded XML file (product-auto_thread_tagger.xml).
- Click "Import"
After installation the modification is disabled by default. This allows you to play around with the configuration before switching it on.
Configuration
Configuration for this product can be accessed via:
ACP -> vBulletin Options -> Automatic Thread Tagger
Each configuration entry has a description against it, so I won't rehash it all here. However the following configuration items are worth mentioning:
Disable Auto Tag if Tagged?
If this option is enabled and a user submits tags with their post then the title keywords will not be added.
Use Smart Quotes?
If this option is enabled (default) then the auto tagger will treat quoted terms as a single tag. For example if a user submits a title of:
This is a great technical website
The tags will be:
However if they submit a title of:
This is a great "technical website"
The tags will be:
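As a rough illustration of how quoted terms could be kept together, here is a small sketch. The function name `sketchSplitWithQuotes` is hypothetical and this is not the product's real parser. It also leaves out stop word filtering, so the real tags for a title would be a shorter list than what this prints.

```php
<?php
// Hypothetical sketch: split a title into words, keeping "quoted terms" whole.
function sketchSplitWithQuotes(string $title): array
{
    // Match either a "quoted phrase" (captured without the quotes) or a bare word.
    preg_match_all('/"([^"]+)"|(\w+)/', $title, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        // Group 1 is filled for quoted phrases, group 2 for single words.
        $token = trim(strtolower($m[1] !== '' ? $m[1] : $m[2]));
        if ($token !== '') {
            $tokens[] = $token;
        }
    }
    return $tokens;
}

print_r(sketchSplitWithQuotes('great "technical website"'));
// Array ( [0] => great [1] => technical website )
print_r(sketchSplitWithQuotes('great technical website'));
// Array ( [0] => great [1] => technical [2] => website )
```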
Filter / Replacement Characters
This is the heart of the modification. As we are dealing with human input, it can be rather complicated for a machine to handle and parse meaningful keywords. This configuration area allows you to configure filters and replacements for incoming thread titles. For example, you might not want & in your tags; through the options here you can filter it out or replace it with "and".
In addition to this you can also use filters/replacements to extend tags. So if you run a motorbike website you might want to extend the term "GSXR" to "Suzuki GSXR" or change it completely to "Fast Bike".
Filters/replacements do not change titles; they only affect the resulting auto tags.
This configuration area is pre-populated with a series of default rules. These rules are based on my experience during development and should cover most things. However, as mentioned, this is a beta modification and we are dealing with human input, so with your help and input this list will be improved.
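If it helps to picture what a filter/replacement rule does, here is a minimal sketch. The function name `sketchApplyRules` and the rule format (a simple search => replace array) are assumptions of mine for the example, not how the product stores its rules. The rules are applied to the potential tags, never to the title itself.

```php
<?php
// Hypothetical sketch: apply filter/replacement rules to potential tags.
function sketchApplyRules(array $potentialTags, array $rules): array
{
    $result = [];
    foreach ($potentialTags as $tag) {
        // strtr() replaces every rule key found in the tag in a single pass,
        // so the output of one rule is never re-processed by another rule.
        $tag = trim(strtr($tag, $rules));
        if ($tag !== '') {
            $result[] = $tag; // tags filtered down to nothing are dropped
        }
    }
    return $result;
}

$rules = [
    '&'    => 'and',          // replace a character
    '!'    => '',             // filter a character out entirely
    'GSXR' => 'Suzuki GSXR',  // extend a term
];
print_r(sketchApplyRules(['GSXR', '&', 'wow!'], $rules));
// Array ( [0] => Suzuki GSXR [1] => and [2] => wow )
```

Bear in mind that `strtr()` is case sensitive, so in this sketch "gsxr" would not be extended.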
Composite Tags
With composite tags you can define a collection of words to use as a single tag. For example, if you specify a composite tag of "brown dog" and the user submits a title of "I love my brown dog", the tags will be "love, brown dog". Without this composite tag the tags would be "love, brown, dog".
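One way composite tags could work is sketched below, using the "brown dog" example. The function name `sketchCompositeTags` and the placeholder trick are assumptions for illustration and not the product's real code. The sketch also does a plain substring match, so it would treat "brown dogs" as containing the composite too.

```php
<?php
// Hypothetical sketch: keep composite phrases together while splitting a title.
function sketchCompositeTags(string $title, array $composites, array $stopWords): array
{
    $glue = "\x1F"; // unit separator: a placeholder unlikely to appear in a title
    $text = strtolower($title);

    // Temporarily glue each composite phrase into a single token.
    foreach ($composites as $phrase) {
        $phrase = strtolower($phrase);
        $text = str_replace($phrase, str_replace(' ', $glue, $phrase), $text);
    }

    $tags = [];
    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $token) {
        // Turn the placeholder back into a space, then drop stop words.
        $tag = str_replace($glue, ' ', $token);
        if (!in_array($tag, $stopWords, true)) {
            $tags[] = $tag;
        }
    }
    return $tags;
}

$stopWords = ['i', 'my']; // tiny stand-in for the real stop list
echo implode(', ', sketchCompositeTags('I love my brown dog', ['brown dog'], $stopWords)), "\n";
// prints "love, brown dog"
echo implode(', ', sketchCompositeTags('I love my brown dog', [], $stopWords)), "\n";
// prints "love, brown, dog"
```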
RSS Feed Additional Tags
If you use this modification to tag RSS created threads, you can also specify additional tags to add for each RSS feed. So for example if you have a feed from BBC News, in addition to the title based tags, you could also tag each thread with "General News".
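A small sketch of the per-feed idea follows. The function name `sketchAddFeedTags`, the feed ID 7 and the array format are made up for the example; the product keeps its own settings in the ACP.

```php
<?php
// Hypothetical sketch: append the extra tags configured for one RSS feed.
function sketchAddFeedTags(string $titleTags, int $feedId, array $extraTagsByFeed): string
{
    // Feeds without an entry simply get no additional tags.
    $extra = $extraTagsByFeed[$feedId] ?? '';

    $all = [];
    foreach (explode(',', $titleTags . ',' . $extra) as $tag) {
        $tag = trim($tag);
        if ($tag !== '' && !in_array($tag, $all, true)) {
            $all[] = $tag;
        }
    }
    return implode(',', $all);
}

// Assume feed 7 is the BBC News feed.
echo sketchAddFeedTags('election,results', 7, [7 => 'General News']), "\n";
// prints "election,results,General News"
```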
In summary the other configuration values allow you to:
- Globally enable or disable the product (default is off).
- Exclude forums by id
- Exclude usergroups by id
- Exclude users by id
- Exclude words from the search words stop list (default is on). This excludes words like can, by, do, etc. (open includes/searchwords.php for a full list). I would strongly recommend not disabling this; however, if you do, you can then use the following:
- Define your own list of exclude words.
- Exclude RSS feeds by feed ID.
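To show how the enable switch and the exclusion lists fit together, here is a rough sketch. The option names (`enabled`, `exclude_forums` and so on) and the comma-separated ID format are assumptions for the example, not the product's real setting names.

```php
<?php
// Hypothetical sketch: decide whether a new thread should be auto tagged.

// Parse a comma-separated option such as "3, 12,40" into a list of integers.
function sketchParseIdList(string $csv): array
{
    $ids = [];
    foreach (explode(',', $csv) as $part) {
        $part = trim($part);
        if ($part !== '') {
            $ids[] = (int) $part;
        }
    }
    return $ids;
}

function sketchShouldAutoTag(array $options, int $forumId, int $userGroupId, int $userId): bool
{
    // The product is off by default, so a missing switch means "do nothing".
    if (empty($options['enabled'])) {
        return false;
    }
    return !in_array($forumId, sketchParseIdList($options['exclude_forums'] ?? ''), true)
        && !in_array($userGroupId, sketchParseIdList($options['exclude_usergroups'] ?? ''), true)
        && !in_array($userId, sketchParseIdList($options['exclude_users'] ?? ''), true);
}

$options = ['enabled' => true, 'exclude_forums' => '3, 12', 'exclude_users' => '1'];
var_dump(sketchShouldAutoTag($options, 5, 2, 42)); // bool(true)
var_dump(sketchShouldAutoTag($options, 12, 2, 42)); // bool(false), forum 12 is excluded
```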
Using this to tag RSS feeds
Unfortunately vBulletin does not include (that I can see) a hook that is called when an RSS thread is created. This means that it is not possible to hook into the code and perform auto tagging. All is not lost, though; it does, however, require an edit to one of the default vBulletin files.
This is what you need to do:
- Open includes/cron/rssposter.php
- Find the following line:
Code:
$itemdata->set('ipaddress', '');
- On a new line immediately after this paste in:
Code:
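// Load the auto tagger functions uploaded during installation.
require_once(DIR . '/includes/functions_autotagger.php');
// Build the tags from the feed item's title and store them on the new thread.
$autotags = GetAutoTags($itemdata->fetch_field('title'), "", true, $itemdata->fetch_field('forumid'), $feed['userid'], $item['rssfeedid']);
$itemdata->set('taglist', $autotags);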
- Save the file and upload it to your server.
Once this is done, and you have enabled RSS tagging in the configuration, RSS threads will be tagged.
I will find a solution to this as I really don't like the idea of having to edit files in this fashion. So if anybody can suggest a better way of doing this please speak up as it will mean I can get the solution out quicker.
Other Information
At the moment this product is most definitely in beta testing. Once I and the people using it are happy with the product, I may consider releasing a hack that will tag all existing threads. Releasing this right now would probably cause more trouble than it is worth.
Parsing the thread text for tags is an extremely unlikely addition as this would require some fairly heavy processing to ensure quality of tags.
At the moment I am primarily concentrating on the parsing of human language and refining the filters/replacements. Adding Ajax support (i.e. auto-populating the tag box as users enter words into the thread title box) is on the todo list; however, this won't happen until we are happy that the parsing system is as good as it can be.
One thing the product does not do which you may want to consider is change the "Separate tags using a comma." phrase used on the new thread post screen. You might want to consider changing this to something like "Separate tags using a comma, do not include words from your title". This depends on your personal preference and how you have the product configured.
Disclaimers
Obviously the input this product uses comes from human beings, therefore it is almost impossible to code for every single possible scenario. As such there is always a possibility that a specifically crafted submission may cause problems. I strongly suggest running this through some tests before committing to installing this on a live forum.
Ahh yes, one final thing - this is the first time I have created a product XML file. I have tested and installed this and all seems fine, but it would be great if somebody "in the know" could cast an eye over it for obvious errors/issues.
Actually, one more one final thing, if you install it please click install
Hold on, one more one more one final thing, if you really like this mod then why not nominate it for MOTM
Release History
1.0 Beta 1
First Release
1.0 Beta 2
- Added filters/replacements
- Added RSS support
- Added RSS inclusion/exclusion support
- Added composite tags
- Added ability to tag all RSS threads with specific tags (per RSS id)
1.0 Beta 3
Bug fix release
- Fixes issues with escaped characters
- Removed option to process filters on full title rather than potential tags.