Parallelizing ListNet Training using Spark

As training datasets for machine learning algorithms grow, the scalability of learning has become increasingly important for continued improvements in ranking accuracy. As part of a class project in my Concepts of Information Retrieval course, I built a parallel implementation of ListNet, a Learning to Rank algorithm, and demonstrated how the training times of machine learning algorithms can be reduced using the parallelism of distributed cluster computing systems like Spark. I will be presenting a poster based on this work at SIGIR 2012.
[report | code]
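To give a feel for why ListNet training parallelizes well, here is a minimal sketch in plain Python. It is not the project's code and does not use Spark. It assumes the top-one cross-entropy form of the ListNet loss with a linear scoring function, and every function name in it is hypothetical. The gradient is a sum over queries, so each partition can compute a partial sum independently (the "map" stage) and the partial sums can then be added together (the "reduce" stage), which is the same shape a cluster job would take.

```python
import math
from functools import reduce


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def query_gradient(w, docs, labels):
    """Gradient of the top-one cross-entropy loss for a single query.

    Assumes a linear model: score(x) = w . x (an illustrative choice).
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in docs]
    p_model = softmax(scores)   # distribution induced by the model scores
    p_target = softmax(labels)  # distribution induced by the relevance labels
    grad = [0.0] * len(w)
    for x, pm, pt in zip(docs, p_model, p_target):
        for k, xk in enumerate(x):
            grad[k] += (pm - pt) * xk
    return grad


def add_vectors(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [ai + bi for ai, bi in zip(a, b)]


def partition_gradient(w, queries):
    """Sum of per-query gradients over one partition of the data."""
    grads = (query_gradient(w, docs, labels) for docs, labels in queries)
    return reduce(add_vectors, grads, [0.0] * len(w))


def parallel_step(w, partitions, lr):
    """One gradient-descent step; partitions are processed independently."""
    partials = map(lambda part: partition_gradient(w, part), partitions)  # map stage
    total = reduce(add_vectors, partials, [0.0] * len(w))                 # reduce stage
    return [wi - lr * gi for wi, gi in zip(w, total)]
```

Here `partitions` is a list of partitions, each a list of `(docs, labels)` pairs, where `docs` is a list of feature vectors and `labels` is the matching list of relevance scores. Because the partial gradients are simply added, the way the queries are split across partitions changes only where the work happens, not the resulting update.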

Twitter Sentiment Analysis

With the advent of Web 2.0, there has been an upsurge in the amount of user-generated content. Sentiment analysis attempts to identify the viewpoint expressed in a text span. As part of a group project in my Data Intensive Computing for Text Analysis course, we used machine learning techniques to perform sentiment analysis on a Twitter dataset and assess public approval of President Obama's job performance. We found that semi-supervised learning approaches performed best. We also learned that tweets may be a good indicator of immediate reactions to events such as policy decisions, but they are a weaker reflection of the President's long-term job approval.
[report | code]

Interaction Design for an eNotice Board

Physical cork boards have traditionally been a good way to advertise. However, they come with certain problems, such as managing postings, space constraints, and the lack of a stable store for information that a user wants to collect. Even with these issues, cork boards offer some unique attributes, such as locality-based advertising and a social point of interaction. As part of a group project in my Human Computer Interaction course, we tried to model the metaphor of a cork board in a digital interface that captures the physical aspect of cork boards and combines it with the power of a digital electronic display board. The project involved designing interactions through which users could pull content from and push content to the board.
[report | code]