The website http://www.gokera.ch lists events in the French-speaking part of Switzerland. Its purpose is to recommend events to users, taking into account the ratings given to other events. In this project, we consider only event ratings and use a hybrid recommendation system that combines two approaches: user similarity (collaborative filtering) and the decomposition of an item into features (content-based filtering). When no data is available for a user, an effective recommendation is impossible. For this reason, we experimented with adding a social dimension based on Facebook. This social dimension gave us information about the user that relates to the features found during the event analysis, which made it possible to recommend events even when the rating history is empty. Overall, the system works, but it has to be tuned before it can run in a production environment, especially the text analysis.
Crawling event websites to gather more information
Events available on Facebook often lacked a sufficiently detailed description. One solution to this problem was to visit all the pages of the website specified for each event. We created a dedicated crawler for this task, which analyses each page and builds indexes that are later used as features.
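The crawling step described above can be sketched as follows. This is a minimal illustration and not the project's actual crawler: the names `PageParser` and `crawl_site`, the `fetch` callable, and the `max_pages` limit are all assumptions introduced here. The sketch walks the pages of a single host breadth-first and builds an inverted index (term to set of URLs) that could later serve as a source of features.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import re


class PageParser(HTMLParser):
    """Collects text nodes and outgoing links from one HTML page.

    Simplification: text inside <script> or <style> is not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def crawl_site(start_url, fetch, max_pages=50):
    """Breadth-first crawl restricted to the host of `start_url`.

    `fetch` is a hypothetical callable that maps a URL to an HTML string,
    or to None when the page is unavailable.
    Returns an inverted index: term -> set of URLs containing that term.
    """
    host = urlparse(start_url).netloc
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited += 1
        parser = PageParser()
        parser.feed(html)
        # Index every lower-cased word found on the page.
        text = " ".join(parser.text_parts).lower()
        for term in re.findall(r"\w+", text):
            index[term].add(url)
        # Follow only links that stay on the same host.
        for href in parser.links:
            target = urljoin(url, href).split("#")[0]
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return index
```

Passing `fetch` as a parameter keeps the sketch independent of any HTTP library: a dictionary's `get` method is enough for experimenting, and a real downloader can be plugged in later.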
Hybrid recommender system
The idea of collaborative filtering is to find other people with similar tastes in order to suggest to the user products that might interest them. One weakness is that the user has to rate many items before relevant neighbours can be found. Another problem occurs when a new item becomes available: as long as nobody has rated it, it will never be recommended. Content-based filtering focuses on the features of items, which makes it possible to define product similarity. One problem with this method is over-specialization. To combine the advantages of both, we propose a hybrid recommender system that combines the user's product ratings, the features related to those products, and the tastes of similar users.
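To make the combination concrete, here is a minimal sketch of one possible hybrid scoring scheme. It is not the system built in the project: the function names, the cosine similarity, the linear blend with weight `alpha`, and the 5-point rating scale are assumptions chosen for illustration. Ratings are stored as `user -> {event: rating}` and features as `event -> {feature: weight}`.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def collaborative_score(user, event, ratings):
    """Similarity-weighted average of other users' ratings for `event`."""
    own = ratings.get(user, {})
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or event not in other_ratings:
            continue
        sim = cosine(own, other_ratings)
        num += sim * other_ratings[event]
        den += abs(sim)
    return num / den if den else 0.0


def content_score(user, event, ratings, features, max_rating=5.0):
    """Match the event's features against a profile built from rated events."""
    profile = {}
    for rated, rating in ratings.get(user, {}).items():
        for feature, weight in features[rated].items():
            profile[feature] = profile.get(feature, 0.0) + rating * weight
    # Scale the similarity to the (assumed) rating range.
    return max_rating * cosine(profile, features[event])


def hybrid_score(user, event, ratings, features, alpha=0.5):
    """Linear blend: alpha weights the collaborative part."""
    collab = collaborative_score(user, event, ratings)
    content = content_score(user, event, ratings, features)
    return alpha * collab + (1 - alpha) * content
```

In this sketch, an event that nobody has rated yet gets a collaborative score of zero but can still receive a non-zero content score, which is how the blend softens the new-item problem mentioned above.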
Natural language processing
Analysing event descriptions on Facebook, as well as entire dedicated websites, is far from trivial. The project relied on lexicons, language detection, website analysis, n-grams, tokenization, lemmatization, feature extraction, and a vector space model (tf-idf), among other techniques.
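Part of this pipeline can be sketched with the standard library alone. The example below covers only tokenization, n-grams, and tf-idf weighting; language detection and lemmatization are left out because they would require external resources. The function names and the exact tf-idf variant (relative term frequency multiplied by the logarithm of the inverse document frequency) are assumptions made for this sketch, not necessarily the formulation used in the project.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams, joined with a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, max_n=2):
    """Return one sparse tf-idf vector (dict) per document.

    Terms are all n-grams for n = 1 .. max_n.
    """
    term_lists = []
    for text in documents:
        tokens = tokenize(text)
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        term_lists.append(terms)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(documents)

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

With this variant, a term that appears in every document receives a weight of zero, so only terms that distinguish one event description from the others carry weight in the resulting vectors.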