Forum:Multiple mini-article returns; smarter searching?

So I was thinking about how I most often use search tools like Google, and I thought of a way that Wikia might be able to improve the quality of its search results. When someone searches on a specific phrase (for example, "Blacksburg VA"), a mini article is either found, or the searcher is prompted to add one.

But sometimes what I'm looking for is multi-related... so I may be looking for "Chinese Food", and I might want to find it in "Blacksburg, VA". As it stands, there would need to be a mini article specifically for "Chinese Food in Blacksburg, VA" for a searcher to obtain user-submitted data on the subject.

What if, however, the search was smart enough to draw associations? The term "Chinese Food" is in the search, so a high-traffic mini-article of the same name would be a valid return. And "Blacksburg, VA" would also be a high-traffic mini-article, so that should also be returned. ... But suppose there is a mini-article titled "Blacksburg, VA, Cuisine, Oriental" -- it's not a perfect word-match, but it's absolutely a relevant result!

Now, if that last article doesn't exist, the system should be smart enough to say "if 'chinese food' and 'blacksburg va' are both high-traffic articles, I should check to see if there is an article which combines these or other similar words and return that to the user as well".

Basically, I'm proposing that the "search" tool also crawl the content and strength of mini-articles and return multiple hits for the user to sort through. And of course, a relevance rating system is applicable here also; this way, the system learns whether or not it presumed the correct association between "Chinese food" and "Chinese cuisine" or "oriental cuisine". To you and me (humans), the connection is obvious. A tool like Wikia has to learn that, and I think...

  • returning multiple mini-articles
  • letting users rate them
  • building connections between the articles

is a step in the right direction.

--Vtmemo 21:39, 10 January 2008 (UTC)
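
As a rough illustration of the proposal above, here is a minimal Python sketch of a search that returns several mini-articles instead of one exact title match. Everything in it is an assumption made for illustration: the `SYNONYMS` table is hand-written (in the proposal these associations would be learned from user ratings), and `tokens`, `score`, and `search` are hypothetical names, not part of any Wikia tool.

```python
import re

# Hypothetical synonym groups. In the proposal these links would be
# learned from user ratings rather than written by hand.
SYNONYMS = {
    "chinese": {"chinese", "oriental"},
    "food": {"food", "cuisine"},
}

def tokens(text):
    """Lowercase a query or title and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query, title):
    """Count the query words that the title covers, directly or via a synonym."""
    title_words = tokens(title)
    return sum(1 for word in tokens(query)
               if SYNONYMS.get(word, {word}) & title_words)

def search(query, titles):
    """Return every title that covers at least one query word, best match first."""
    scored = [(score(query, title), title) for title in titles]
    hits = [(s, title) for s, title in scored if s > 0]
    # Highest score first; ties are broken alphabetically by title.
    return [title for s, title in sorted(hits, key=lambda h: (-h[0], h[1]))]

titles = [
    "Chinese Food",
    "Blacksburg, VA",
    "Blacksburg, VA, Cuisine, Oriental",
    "Area of a triangle",
]
results = search("Chinese Food Blacksburg VA", titles)
```

With these sample titles, the combined "Blacksburg, VA, Cuisine, Oriental" article ranks first because it covers all four query words, and the two single-topic articles follow. A real system would also need the traffic and rating signals mentioned in the post, which this sketch leaves out.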

This could be very nice. However, I think it makes sense to have a single main mini-article related to each set of search terms, as this encourages people to contribute the results of their searching by making it very easy to create a mini-article, i.e., requiring no thought or knowledge. If one were to add the feature to search mini-articles, it would have to be done so as to

  • Not interfere with the displaying and editing of the one main mini-article for each search.
  • Not interfere with the displaying of the search results.

Also, is it likely that other mini-articles will be relevant? Mini-articles, in my mind at least, are intended to be short and very specific to the search terms. So the mini-article for "Chinese Food in Blacksburg, VA" will be completely different from the results for "Chinese Food" or "Blacksburg, VA", and it's unlikely that they will be relevant in any way.

Though, on the other hand, I see that identifying "Chinese Food in Blacksburg, VA", "Oriental Food in Blacksburg, VA", "Thai Food in Blacksburg, VA", "Blacksburg, VA Chinese Restaurants" etc. would be very useful - and there is a potential problem with some searches with identical intent having very different search terms. E.g., "Removing packages with apt-get", "apt-get removing packages", "how do I remove packages with apt-get", "packages removing apt", "Packages removing apt-get ubuntu". I suppose it all depends on whether the articles presented were relevant. (Taw 3:15, 11 January 2008 (UTC))

Taw - my hypothetical probably wasn't the best choice for an example. Perhaps someone searching on your terms would do well to receive results tailored for "apt-get", the act of "removing packages", and "apt-get ubuntu"... if those topics existed in three mini-articles, they would all be pretty relevant returns.
At any rate, I completely agree with the "synonymous search terms" idea. And I think this is probably going to come down to how well the rating system works. However, since we can add categories to mini-articles, it could be left to the users to manually tag mini-articles with keywords. As it stands, the search only finds an exact match in the title of the mini article, but humans can specify keywords pretty easily. Kind of like YouTube, but... better. *chuckle*
What do you think about giving users the ability to manually draw connections between articles of similar content or intent?
--Vtmemo 15:49, 11 January 2008 (UTC)
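
To make the manual-tagging idea concrete, here is a small Python sketch of a keyword index. It is only an illustration under stated assumptions: the `TagIndex` class, its method names, and the sample tags are all hypothetical, and nothing here reflects how Wikia actually stores categories.

```python
from collections import defaultdict

class TagIndex:
    """Hypothetical index from user-entered keywords to mini-article titles."""

    def __init__(self):
        self._by_keyword = defaultdict(set)

    def tag(self, title, *keywords):
        """Record that a user tagged the mini-article with these keywords."""
        for keyword in keywords:
            self._by_keyword[keyword.lower()].add(title)

    def lookup(self, query):
        """Return the titles tagged with any word of the query, sorted by title."""
        found = set()
        for word in query.lower().split():
            found |= self._by_keyword.get(word, set())
        return sorted(found)

index = TagIndex()
index.tag("Removing packages with apt-get", "apt-get", "ubuntu", "uninstall")
index.tag("apt-get ubuntu", "apt-get", "ubuntu")

by_tag = index.lookup("uninstall software")
```

Here `by_tag` contains "Removing packages with apt-get" even though neither word of the query appears in that title, which is the gap that exact title matching leaves open.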

My point was more that just because a mini-article is on a similar topic doesn't mean that it is relevant for the search term. So for example, if you type in 'chinese food' you get information which is in no way relevant to your query, because you aren't at all interested in finding out what chinese food is, only where you can get it near Blacksburg. But it might be that your system could be intelligent enough to make sure the only suggestions provided were actually similar.

Having human-specified connections between similar articles would definitely be useful when searching. One option might be to allow the user to specify a set of equivalent search terms when they edit a mini-article. So when editing the article for 'area of a triangle' someone could manually enter the equivalent terms 'area triangle', 'triangle's area', etc. How would this work together with an automated system? Also, do you think that the relationships of "similar search terms" and "search terms with identical intent" should be distinguished, or should you just have shades of grey? Also, do you think that synonymous search terms should have a single mini-article?

My first point was that it's important to have a single mini-article for each query so that users can enter the results that they find without having to think - if another mini-article replaced the missing one, this might be less likely to happen. (Taw 20:44, 11 January 2008 (UTC))
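
One possible shape for the "equivalent search terms" option, sketched in Python. This assumes the answer to the last question is yes, i.e. that synonymous terms share a single mini-article; the `MiniArticles` class and its methods are hypothetical names used only for illustration.

```python
class MiniArticles:
    """Hypothetical store in which equivalent search terms share one mini-article."""

    def __init__(self):
        self._canonical = {}  # normalized search term -> canonical term
        self._text = {}       # canonical term -> mini-article text

    @staticmethod
    def _norm(term):
        """Lowercase the term and collapse runs of whitespace."""
        return " ".join(term.lower().split())

    def save(self, term, text, equivalents=()):
        """Store a mini-article and the equivalent terms its editor entered."""
        key = self._norm(term)
        self._text[key] = text
        for alias in (term, *equivalents):
            self._canonical[self._norm(alias)] = key

    def find(self, query):
        """Return the mini-article for the query, or None if none exists yet."""
        key = self._canonical.get(self._norm(query))
        return None if key is None else self._text[key]

store = MiniArticles()
store.save(
    "area of a triangle",
    "Half the base times the height.",
    equivalents=["area triangle", "triangle's area"],
)
found = store.find("Triangle's  Area")
missing = store.find("triangle")
```

`found` is the shared article text, while `missing` is `None`. Returning `None` for an unlisted query keeps the behaviour described in the first point: a query with no mini-article of its own still prompts the searcher to create one rather than silently showing a different article.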

I see your point when it comes to multiple mini-articles for otherwise irrelevant topics (e.g., "history of chinese food" wouldn't be a good return).
For your second point, there's the question of disambiguation; what if I search on "ford" - am I looking for Ford Motor Company? Gerald Ford? Ford theater? Fording a river? Here's a case where multiple mini-articles would be a benefit, and Wikia could simply allow users to rate the strength of returns and thereby minimize erroneous results. If 80% of searchers said "ford motor co.", 19% said "Gerald Ford", and 1% said "fording a river", the last result could be omitted, and the former two could be ordered respectively according to their rated strength.
It's the data-equivalent of Google asking "did you mean to say... ______?"
--Vtmemo 20:58, 11 January 2008 (UTC)
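
The rating idea can be sketched in a few lines of Python. The `rank_by_votes` function and its 5% cutoff (`min_share`) are assumptions chosen for illustration; the post only says that rarely chosen results "could be omitted" and does not name a threshold.

```python
def rank_by_votes(votes, min_share=0.05):
    """Order meanings by their share of searcher votes and drop the rare ones.

    votes maps each candidate mini-article to the number of searchers who
    picked it. min_share is a hypothetical cutoff, not a value from the thread.
    """
    total = sum(votes.values())
    if total == 0:
        return []
    shares = {meaning: count / total for meaning, count in votes.items()}
    kept = [meaning for meaning, share in shares.items() if share >= min_share]
    # Largest share of votes first.
    return sorted(kept, key=lambda meaning: -shares[meaning])

ford_votes = {"Ford Motor Co.": 80, "Gerald Ford": 19, "Fording a river": 1}
ranking = rank_by_votes(ford_votes)
```

With the 80% / 19% / 1% split from the example, `ranking` keeps "Ford Motor Co." and "Gerald Ford" in that order and omits "Fording a river".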

This seems quite cool. It's a kind of automated disambiguation - using more complete search terms for the more exact meaning. How would this play with other forms of disambiguation - like disambiguation in mini-articles? (Taw 21:52, 11 January 2008 (UTC))

The mini-articles would simply have to be more specifically-named, and the disambiguation would be automatic. So, if a user searched on a term that was echoed in multiple mini-articles, the multiple returns would act as disambiguation. The articles themselves would have to be made more specific, and disambiguation would not rest in the article itself, but in the search tool (this would also serve to centralize the process of searching to one, simple user input - "I want to know more about ________.")
I'm not saying that we have to have a full disambiguation page for every subject -- which Wikipedia doesn't anyway -- but if disambiguation needed to take place, it could occur automatically at the user end, and in a more sophisticated way than just a list of possible hits.
--Vtmemo 22:01, 11 January 2008 (UTC)

Retrieved from "http://search.wikia.com/wiki/Forum:Multiple_mini-article_returns%3B_smarter_searching%3F"

This page was last modified 22:01, 11 January 2008. GFDL