29 June 2007

Interim objectives

I know I'm a little late setting these, but now it's kinda settled. So, here's what I ought to do by July 9th:
  • implement retrieval of favicon.ico and use it as a contact image for the feed
  • implement HTML rendering of the feed body (this should be paired with a GUI plugin for more flexible configuration, but I doubt I'll have the time to finish it by the first deadline)
  • squash any bugs I find along the way (it would be nice if I could get rid of most of the exceptions raised on different RSS feeds; news.google.com was just one of them).
Wish me luck or „ca va chier” as Emil puts it :))
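
For the favicon objective, the first step is working out where the icon lives. A minimal sketch, assuming the icon sits at the conventional /favicon.ico path on the feed's host (sites that declare their icon elsewhere are not handled); the class and method names are made up for illustration and are not part of RSS4SC:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of SIP Communicator or RSS4SC.
public class FaviconLocator {

    /**
     * Derives the conventional favicon location for a feed URL by keeping
     * the scheme, host and port and replacing the path with /favicon.ico.
     * Assumption: the site serves its icon from that default location.
     */
    public static URI faviconFor(String feedUrl) {
        URI feed = URI.create(feedUrl);
        if (feed.getScheme() == null || feed.getHost() == null) {
            throw new IllegalArgumentException("Not an absolute URL: " + feedUrl);
        }
        try {
            // Query string and fragment are dropped on purpose.
            return new URI(feed.getScheme(), null, feed.getHost(),
                    feed.getPort(), "/favicon.ico", null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Downloading the image and handing it to the contact list is a separate step and is left out here.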

25 June 2007

Google politics

One of the well-known problems in the current implementation of RSS support in SIP Communicator was the inability to retrieve feeds from news.google.com (and you must admit, news.google.com is quite a source of news ;) ).

A brief look at the exception returned by the plugin points out the problem. The server is sending an HTTP 403 (Forbidden) response code instead of HTTP 200 (OK). This indicates that, for some reason still unknown to me, the news.google.com server totally dislikes our client. Having found that out, I started searching for the "guilty" party.

After examining a few Wireshark dumps, I had a clear idea of what the HTTP request we were sending looked like. I tried it once more by manually connecting to the news.google.com server on port 80 using telnet and typing the request (not very complicated, as HTTP is, as you know, a plain text protocol). The result was the same HTTP 403. I then tried the simplest "equivalent" HTTP request:

GET /?output=rss HTTP/1.0
Host: news.google.com
Connection: close


To my surprise, I received a nice XML file along with the necessary headers. So I started changing and adding headers, trying to get as close as possible to the request SIP Communicator was issuing while still getting a working response. After about a dozen tries, I found out that the User-Agent SIP Communicator was sending along with the HTTP request (that is, Java/1.6.0 on my machine) is apparently on the blacklist of the news.google.com servers. I verified this assumption by changing the User-Agent for lynx and wget. As soon as the original User-Agent was replaced by a more exotic value (be it a totally dummy User-Agent like 'Kikiriki' or a well-known User-Agent like 'Mozilla/4.0'), things worked like a charm.
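
The telnet experiment can be repeated from code. What follows is only a sketch of that kind of probe, not what I actually ran: the class and method names are made up, and nothing here says what the server answers today. It builds the same plain-text request, optionally with a User-Agent header, and reads back the status line:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical probe for comparing responses to different User-Agent values.
public class RawHttpProbe {

    /** Builds a minimal HTTP/1.0 request; pass null to omit User-Agent. */
    public static String buildRequest(String host, String path, String userAgent) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.0\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        if (userAgent != null) {
            sb.append("User-Agent: ").append(userAgent).append("\r\n");
        }
        // The empty line after the last header ends the request.
        sb.append("Connection: close\r\n\r\n");
        return sb.toString();
    }

    /** Sends the request over a plain socket and returns the status line. */
    public static String statusLine(String host, int port, String request)
            throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return in.readLine();
        }
    }
}
```

Calling statusLine("news.google.com", 80, buildRequest("news.google.com", "/?output=rss", "Java/1.6.0")) and then again with another User-Agent value would show whether the two status lines differ.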
For now I've fixed that by adding a

    System.setProperty("http.agent",
        System.getProperty("sip-communicator.version"));


in the start() method of the RssActivator.
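
One thing to watch with this fix: System.setProperty throws a NullPointerException when the value is null, so it only works if sip-communicator.version is actually set. A slightly more defensive sketch follows; the FeedUserAgent class, the "SIP-Communicator" product string and the method names are assumptions of mine, not existing code. It also shows the per-connection alternative, which leaves the JVM-wide setting alone:

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper; the names below are illustrative only.
public class FeedUserAgent {

    // Assumed product token, used when no version is available.
    static final String PRODUCT = "SIP-Communicator";

    /** Builds a User-Agent value, falling back if the version is missing. */
    public static String resolveAgent(String version) {
        if (version == null || version.trim().isEmpty()) {
            return PRODUCT;
        }
        return PRODUCT + "/" + version.trim();
    }

    /** JVM-wide variant: same idea as the fix in start(), minus the null risk. */
    public static void installGlobally() {
        System.setProperty("http.agent",
                resolveAgent(System.getProperty("sip-communicator.version")));
    }

    /** Per-connection variant: sets the header on a single request only. */
    public static URLConnection openWithAgent(String url, String agent)
            throws IOException {
        // openConnection() does not touch the network yet.
        URLConnection connection = URI.create(url).toURL().openConnection();
        connection.setRequestProperty("User-Agent", agent);
        return connection;
    }
}
```

The global property is the smaller change, which is why it fits in the activator; the per-connection version would have to go wherever the feed connection is opened.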

I still have some problems, but I don't know whether they are caused by my pretty lame internet connection or by other bugs in the code.
Right now I'm trying to figure out whether there are any other bugs to catch and how to implement RSS contact icons. But more on this later on today :)

05 June 2007

Happy hacking!

Finally the second semester is over, and the summer exam session is about to begin (that means I have the first exam tomorrow - Digital Computers 2). I've made it through the pending homework and projects, at least for now.
Emil has been very supportive and now I can draw an outline of things that need to be done by the interim report on 9th July. So, here goes (in a preliminary order, not necessarily the final one):
  • thoroughly go through the developer documentation. Getting familiar with unit-testing and JUnit in particular is a must
  • have a working copy of the latest CVS sources for SIP and RSS4SC. I've done it once, I can do it again. I'll work with Eclipse under Linux. Last time I checked, the Eclipse how-to was a little out-of-date, but I think things got better with the beginning of SoC.
  • go through the current implementation of RSS4SC to get an idea of how things are done (reinventing the wheel is bad)
  • write unit tests for testing the RSS plugin for SIP Communicator
  • complete the current RSS4SC implementation with some other features that I mentioned in my application (for instance, using a site's favicon as a contact image for an RSS feed)
  • bug-fixing (one annoying bug found at the moment is the inability to read the news.google.com feeds).
And in between all this, I should also get my exams going :)