Posts Tagged ‘web’

Semantic Web Features in Real-life

September 15, 2008

There is a lot of discussion about the semantic web and how it will change the way we use the internet. At GroupSwim, we have been considering how to integrate semantics into our feature set. Although we have not promoted it aggressively, we have already integrated some semantic features and have prototyped others for future releases. That said, we are always concerned about overwhelming users with incomprehensible features, so we try to hide the complexity.

For us, semantics means that users can associate meaning with items such as discussions, files, wiki pages and tags to improve search and comprehension of information. As an example, GroupSwim currently has a feature called “tag training”. The name somewhat undersells the potential of the feature, which allows a GroupSwim site to create an ontology of concepts. For example, if you have a site concerned with collaboration and information in the mobile telephony sector, you can relate different terms to each other: “Sony-Ericsson” is a “company”, “Sony-Ericsson” is a “mobile-manufacturer”, “Nokia” is a “mobile-manufacturer”, “Nokia-N95” is a “mobile-phone”, “mobile” is a synonym for “mobile-phone”, “Motorola” is a “mobile-manufacturer”, “mobile-producer” is a synonym for “mobile-manufacturer”, “mobile-company” is a synonym for “mobile-producer”, and so on.
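To make the idea concrete, here is a minimal Python sketch of how such “is a” and synonym relations could be stored and queried. It is only an illustration, not GroupSwim's actual implementation: the class `TagOntology` and its method names are assumptions made for this example, and the tags are the ones listed above.

```python
from collections import defaultdict


class TagOntology:
    """Toy store for 'is a' and synonym relations between tags (hypothetical)."""

    def __init__(self):
        self.parents = defaultdict(set)   # tag -> broader concepts it "is a"
        self.synonyms = defaultdict(set)  # tag -> directly declared synonyms

    def add_is_a(self, tag, concept):
        self.parents[tag].add(concept)

    def add_synonym(self, a, b):
        # Synonymy is stored in both directions.
        self.synonyms[a].add(b)
        self.synonyms[b].add(a)

    def synonym_set(self, tag):
        """All tags reachable through synonym links, including the tag itself."""
        seen, stack = {tag}, [tag]
        while stack:
            for other in self.synonyms[stack.pop()]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        return seen

    def concepts_of(self, tag):
        """Broader concepts of a tag, plus the synonyms of those concepts."""
        result = set()
        for name in self.synonym_set(tag):
            for parent in self.parents[name]:
                result |= self.synonym_set(parent)
        return result


ontology = TagOntology()
ontology.add_is_a("Sony-Ericsson", "company")
ontology.add_is_a("Sony-Ericsson", "mobile-manufacturer")
ontology.add_is_a("Nokia", "mobile-manufacturer")
ontology.add_is_a("Motorola", "mobile-manufacturer")
ontology.add_is_a("Nokia-N95", "mobile-phone")
ontology.add_synonym("mobile", "mobile-phone")
ontology.add_synonym("mobile-producer", "mobile-manufacturer")
ontology.add_synonym("mobile-company", "mobile-producer")

print(sorted(ontology.concepts_of("Nokia")))
```

Because synonyms are followed transitively, “Nokia” ends up related to “mobile-company” even though nobody stated that relation directly.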

We realize that building a full, world-encompassing ontology is nearly impossible, so our focus is to enable a site to create an ontology for its specific topic area. This ontology then helps the users of the site find information. Let’s say you are looking for posts about mobile companies. You happen to know about Nokia, so you search for that. Obviously you will hit all information tagged with “Nokia”. Thanks to the aforementioned ontology, GroupSwim can propose widening the search to “mobile-manufacturer” and “mobile-producer”. GroupSwim can also infer that “Sony-Ericsson” is a mobile company. So by building up semantic ontologies, GroupSwim can greatly enhance the user’s ability to find related information. Below is a screen from GroupSwim showing parts of the ontology we have built up for the GroupSwim internal development collaboration site. It enhances search and discovery by relating terms that would otherwise appear unrelated.

Simple Ontology for GroupSwim site


These definitions enable search for any information related to competitors. If you search for competition, you will find information about competing companies even if they are not tagged or annotated with the term “competitor” or “competition”. GroupSwim can use the ontology to automatically find the related information.
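The competitor example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of how GroupSwim works internally: the company names “AcmeCollab” and “ExampleWiki”, the content items, and the functions `expand_query` and `search` are all hypothetical, and the expansion follows only one level of “is a” relations.

```python
# Hypothetical ontology: tag -> broader concepts ("is a" relations).
IS_A = {
    "AcmeCollab": {"competitor"},
    "ExampleWiki": {"competitor"},
    "Nokia": {"mobile-manufacturer"},
}

# Hypothetical synonym groups.
SYNONYMS = [{"competitor", "competition"}]

# Hypothetical content items and the tags users gave them.
ITEMS = {
    "Pricing notes": {"AcmeCollab", "pricing"},
    "Feature comparison": {"ExampleWiki"},
    "Roadmap": {"planning"},
    "Market overview": {"competition"},
}


def expand_query(term):
    """Return the term, its synonyms, and every tag that 'is a' one of them."""
    terms = {term}
    for group in SYNONYMS:
        if term in group:
            terms |= group
    narrower = {tag for tag, concepts in IS_A.items() if concepts & terms}
    return terms | narrower


def search(term):
    """Titles of items carrying the term or any tag related to it."""
    wanted = expand_query(term)
    return sorted(title for title, tags in ITEMS.items() if tags & wanted)


print(search("competition"))
# -> ['Feature comparison', 'Market overview', 'Pricing notes']
```

Note that “Pricing notes” and “Feature comparison” are found even though neither carries the tag “competitor” or “competition”; the ontology supplies the link.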

Going forward, we have plenty of ideas on how to incorporate more semantic knowledge into GroupSwim. So as not to reveal too many of our trade secrets, I will just give you a flavor of some simple things we are prototyping while also releasing major new functions. One such function is integration with external semantic databases, which allows users to make associations between terms and elements in GroupSwim and semantic definitions outside of GroupSwim. As an example, let’s assume you type in the tag “Volvo”. By retrieving semantic information, we can ask the user to clarify whether they mean the company “Volvo Cars”, a specific Volvo car, or perhaps the Latin word “Volvo”. Such an association will clearly help users search and find information in GroupSwim.

Semantics for Volvo
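As a rough illustration of the disambiguation step, the Python sketch below looks a tag up in a table of candidate meanings and records the one the user picks. Everything here is an assumption made for the example: the `EXTERNAL_SENSES` table merely stands in for an external semantic database (no real service or API is called), and the `ext:` identifiers and function names are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    identifier: str   # hypothetical identifier in the external source
    label: str
    description: str


# Stand-in for an external semantic database (hypothetical data).
EXTERNAL_SENSES = {
    "volvo": [
        Sense("ext:volvo-cars", "Volvo Cars", "company"),
        Sense("ext:volvo-car-model", "Volvo car", "a specific car model"),
        Sense("ext:volvo-latin", "volvo", "Latin word"),
    ],
}


def candidate_senses(tag):
    """Meanings the external source knows for this tag (case-insensitive)."""
    return EXTERNAL_SENSES.get(tag.strip().lower(), [])


def needs_clarification(tag):
    """Only ask the user when more than one meaning is possible."""
    return len(candidate_senses(tag)) > 1


def link_tag(tag, chosen_index):
    """Associate the tag with the meaning the user selected."""
    sense = candidate_senses(tag)[chosen_index]
    return {"tag": tag, "meaning": sense.identifier}


if needs_clarification("Volvo"):
    for number, sense in enumerate(candidate_senses("Volvo")):
        print(number, sense.label, "-", sense.description)
    print(link_tag("Volvo", 0))
```

Storing an identifier rather than the label means that two items tagged “Volvo” with different meanings can later be told apart in search.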


Looking forward to hearing how you think semantics can help us improve your GroupSwim experience.

User value and complexity in community search.

March 14, 2008

Search is a continuously evolving field of innovation. Within the last 10 years, mainstream search has improved significantly through the introduction of statistical methods and collective intelligence. Google is of course the main example of how successful an improved search service can be. Improved collective intelligence functions, such as those introduced by Amazon, show how much users appreciate getting relevant and related information automatically. As the amount of information available on the web increases, so does the importance of search and filtering. This development shows in the number of new search companies popping up and starting to apply advanced techniques. The approaches vary widely. Some companies, such as Qihoo, mine communities for questions and answers. Other services, such as Quintura, focus on cognitive models and visually driven search functions. Yet others, such as Hakia and PowerSet, focus on natural language analysis. In addition, new services such as Twine and Freebase attempt to add semantic metadata so that applications can intelligently find semantically related data. These companies all hope to be the next revolutionary search function, providing a user experience that will overshadow existing search services.

I have always been a proponent of choosing techniques and technologies based on the problem at hand. I remember, as a student, meeting researchers who insisted on developing everything in Prolog or Scheme regardless of the problem, and object fanatics who would not accept any tool other than Eiffel or Smalltalk. In my mind, the best solutions are built by combining techniques so that the best aspects of each one come through. I think most developers would agree with me here, but it can be very difficult to build such systems unless you have a good understanding of the problems you are trying to solve. The trick is to delay selecting the specific tool or technology until you know how you want to approach the problem.

In GroupSwim, we use a toolbox including proximity search, natural language processing, tagging and semantic web components to implement search functions. Each of them is used to solve specific aspects of our search problem, and we often combine them. Tag search is obviously a very popular search method in GroupSwim. To facilitate and improve tag search, we automatically perform natural language analysis on the data that is not tagged. We then auto-tag the data so that it will be included in tag search. The quality of these generated tags depends on several things: the language analysis itself, techniques from text summarization, the use of synonyms and hypernyms, and community-specific ontologies. We use the same techniques to do real-time natural language analysis and suggest appropriate tags, which simplifies the task for users. But it does not stop there. We use semantic web and linguistic information to find and suggest related information, helping users narrow or widen their searches along semantically meaningful dimensions. Communities in GroupSwim are able to create their own ontologies for that purpose, and even if they do not, we apply universal ontologies to help users.
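The auto-tagging idea can be illustrated with a deliberately simplified Python sketch. It is not the pipeline described above: real natural language analysis and text summarization are replaced here by a plain word lookup, and the vocabulary, the hypernym table and the function `suggest_tags` are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical community vocabulary: surface word -> canonical tag.
TAG_VOCABULARY = {
    "mobile": "mobile-phone",
    "phone": "mobile-phone",
    "phones": "mobile-phone",
    "nokia": "Nokia",
    "motorola": "Motorola",
}

# Hypothetical hypernyms: tag -> broader tag.
HYPERNYMS = {
    "Nokia": "mobile-manufacturer",
    "Motorola": "mobile-manufacturer",
}


def suggest_tags(text, limit=3):
    """Suggest up to `limit` tags for untagged text, most frequent first."""
    words = re.findall(r"[a-z0-9-]+", text.lower())
    counts = Counter()
    for word in words:
        tag = TAG_VOCABULARY.get(word)
        if tag is None:
            continue  # word is not part of the community vocabulary
        counts[tag] += 1
        broader = HYPERNYMS.get(tag)
        if broader:
            counts[broader] += 1  # a mention also counts for the broader tag
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _ in ranked[:limit]]


post = "Nokia and Motorola both announced new phones. The Nokia phone ships first."
print(suggest_tags(post))
# -> ['mobile-manufacturer', 'Nokia', 'mobile-phone']
```

The broader tag “mobile-manufacturer” ranks first although the word never appears in the post, which is the kind of suggestion that synonyms and hypernyms make possible.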

GroupSwim is also different in how we apply search in very problem-specific ways. Our focus is not on the general search problem. Rather, our objective is to help communities within organizations and companies find information related specifically to their organizational and business needs. This narrower scope makes it easier for us to apply natural language techniques and semantic web technologies, and we believe it will let us provide a superior search experience.

The figure below positions GroupSwim’s search function with respect to other available search functions. To do this, we identify three dimensions along which we characterize the functions. The first dimension is the degree of semantic awareness that the search functions have. We simplified the diagram so that this dimension has basic text search at one extreme and artificial intelligence based search at the other. The second dimension concerns whether the search is intended to address the universal search problem at one extreme or one specific problem at the other. The third dimension concerns the domain specificity of the search function. A search engine such as Yahoo is very general and applies to any domain. One could easily imagine searches that are very specific to a problem domain such as medicine, using domain knowledge to improve the search. GroupSwim enables communities to continuously add domain data to their community and thereby improve the search over time as the domain knowledge evolves.

http://img266.imageshack.us/img266/7060/semanticsfocussx8.gif

In summary, GroupSwim offers a balanced approach to search technologies, focusing on solving specific business-related search problems in a superior way. We also enable communities and the system itself to improve and evolve search by building up and leveraging semantic web data created within or outside of GroupSwim. We believe this is the best way to introduce next-generation web search and discovery techniques for our customers.