Tuesday, November 17, 2009

PAWS meeting - Nov 10, 2009

In today's meeting there were two presentations. In the first, Rosta introduced the theory behind people's participation in organizations, which comes mainly from the organizational behavior literature. She then presented a two-part study of socialization tactics on several Wikipedia projects, focused on the participation of contributors (not viewers). The first part observed how users behave; the second measured the impact of socialization techniques, especially on newcomers. One interesting result was that personalized messages elicited more participation from newcomers than standardized messages did.

In the second presentation, Denis discussed the paper "The Effect of Correlation Coefficients on Communities of Recommenders", written by Lathia, Hailes and Capra from University College London. The authors compare different similarity measures, showing their distributions (using MovieLens as the dataset) and comparing their accuracy (MAE) and coverage results.
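
To make that comparison concrete, here is a minimal sketch (my own illustration, not the paper's code) that computes the distribution of two common coefficients, Pearson correlation and cosine similarity, over all user pairs of a small synthetic ratings matrix standing in for MovieLens:

```python
import numpy as np

rng = np.random.default_rng(1)
# Sparse synthetic ratings matrix (users x items), 0 = unrated;
# a toy stand-in for MovieLens, NOT the paper's actual data.
R = rng.integers(1, 6, size=(50, 40)).astype(float)
R[rng.random(R.shape) < 0.7] = 0.0

def pearson(u, v):
    """Pearson correlation over co-rated items."""
    m = (u > 0) & (v > 0)
    if m.sum() < 2:
        return np.nan
    a, b = u[m] - u[m].mean(), v[m] - v[m].mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / d if d > 0 else np.nan

def cosine(u, v):
    """Cosine similarity over co-rated items."""
    m = (u > 0) & (v > 0)
    if m.sum() < 2:
        return np.nan
    d = np.sqrt((u[m] ** 2).sum() * (v[m] ** 2).sum())
    return (u[m] * v[m]).sum() / d if d > 0 else np.nan

# Summarize each coefficient's distribution over all user pairs.
stats = {}
for name, f in (("pearson", pearson), ("cosine", cosine)):
    vals = np.array([f(R[i], R[j])
                     for i in range(len(R)) for j in range(i + 1, len(R))])
    vals = vals[~np.isnan(vals)]
    stats[name] = vals.mean()
    print(f"{name}: mean={vals.mean():.2f} min={vals.min():.2f} max={vals.max():.2f}")
```

Even on toy data the two distributions look very different: cosine over strictly positive ratings clusters near 1, while Pearson spreads across [-1, 1]. This kind of distributional contrast is what the paper examines on MovieLens.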

They show an interesting result: the choice of similarity coefficient does not have a significant impact on the accuracy metrics compared to the neighborhood size. The experiments show that, in some cases, using a random similarity measure between users can yield better item-prediction accuracy when a large number of users is included in the neighborhood. At the end of the paper there is a discussion of these unexpected results. Among the three explanations they consider (criticism of accuracy metrics, data sparsity, and the use of user-based similarity), they highlight that their results fail to support user-based similarity as a way to capture an important factor in providing recommendations. In fact, different similarity metrics show different rankings and distributions, but none of them consistently outperforms the others.
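
The random-similarity finding can be reproduced in miniature. The sketch below is again my own illustration on synthetic data, assuming a standard mean-centered user-based kNN predictor (not necessarily the paper's exact setup); it compares leave-one-out MAE for Pearson and for a purely random coefficient at two neighborhood sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Small synthetic ratings matrix (users x items), 0 = unrated.
R = rng.integers(1, 6, size=(30, 20)).astype(float)
R[rng.random(R.shape) < 0.6] = 0.0

def pearson_sim(u, v):
    """Pearson correlation over co-rated items."""
    m = (u > 0) & (v > 0)
    if m.sum() < 2:
        return 0.0
    a, b = u[m] - u[m].mean(), v[m] - v[m].mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / d) if d > 0 else 0.0

def random_sim(u, v):
    """A random coefficient, mimicking the paper's control condition."""
    return rng.random()

def predict(R, sim, user, item, k):
    """Mean-centered user-based kNN prediction with neighborhood size k."""
    rated = R[user] > 0
    if not rated.any():
        return None
    base = R[user][rated].mean()
    nbrs = sorted(((sim(R[user], R[v]), v) for v in range(len(R))
                   if v != user and R[v, item] > 0), reverse=True)[:k]
    den = sum(abs(s) for s, _ in nbrs)
    if den == 0:
        return base
    num = sum(s * (R[v, item] - R[v][R[v] > 0].mean()) for s, v in nbrs)
    return base + num / den

def mae(R, sim, k):
    """Leave-one-out MAE over all observed ratings."""
    errs = []
    for u, i in zip(*np.nonzero(R)):
        true, R[u, i] = R[u, i], 0.0       # hide the rating
        p = predict(R, sim, u, i, k)
        R[u, i] = true                     # restore it
        if p is not None:
            errs.append(abs(true - p))
    return sum(errs) / len(errs)

results = {}
for k in (5, 25):
    for name, sim in (("pearson", pearson_sim), ("random", random_sim)):
        results[(name, k)] = mae(R, sim, k)
        print(f"k={k:2d} {name}: MAE={results[(name, k)]:.3f}")
```

The point to look at is how much (or how little) swapping the coefficient moves MAE relative to changing k, which is exactly the comparison the paper runs at full MovieLens scale.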

One open question is: should this result be extrapolated to other datasets? Other questions: What kind of similarity measure could better capture the concept of "word-of-mouth"? Do we really know what the ratings mean and how to use them to provide better recommendations?
