I am a PubMed junkie. My searches often return hundreds of articles, and I tag and save the interesting results both locally and on Connotea. But every time I perform a new search and start looking through the articles, I would give a lot to know which of them I have already tagged and saved, and what those tags were. I would also like to sort through the articles automatically and find relationships between them (are they related, are they from the same group, etc… if they are from the same group, perhaps the most recent article from that group is of most interest).
I knew nothing about web programming or scripting languages (Perl/Python/PHP/JavaScript/CSS), but none of the existing solutions seemed very good, so I decided to start learning how to do this myself. Searching PubMed from within an existing bibliography program (EndNote, JabRef or Bibus) seems ideal in principle, but in practice it is extremely clumsy, and it is not easy to follow links to the actual journal articles or to PubMed’s “related articles”. Fixing this in EndNote is virtually impossible since its source code is not available, but I imagine that it would be possible to incorporate this functionality into Bibus or JabRef. I didn’t want to learn Java (JabRef is written in Java), and I decided I didn’t want to figure out right now how Bibus (written in Python) is set up, so I looked instead for a simpler solution that didn’t necessarily provide database integration.
First, I thought I could do it with Greasemonkey (Firefox) or User JavaScript (Opera, my preferred browser). But I needed a way to store the list of records I had already looked at, along with their tags, and it wasn’t obvious how I could read the data in or save it out, or whether JavaScript would be fast enough. OK, one could come up with some ugly hack, like opening a page in the browser containing a long list of “already-looked-at” articles from a local file and then having JavaScript read the records from that page, but this sounds terrible. I also found out that JavaScript loaded from a local file can indeed read from and write to the local disk (see the TiddlyWiki project and the discussion I initiated on the tiddlywiki-dev mailing list), but there were still several concerns about speed and browser compatibility. Besides, the whole solution appeared rather limiting.
So I finally settled on running a tiny web server on my computer (Abyss is working very well) and installed PHP, since it was supposed to be very easy to learn (it was!). Now I am routing all my PubMed search requests through a little PHP script on my computer. At this point, I have figured out how to get the results from PubMed (100 articles returned by default) in full XML format and dump them into my browser window. Next I have to figure out how to use XPath to filter the XML and then XSLT to display only the interesting fields in the browser. Processing the articles further for interesting relationships with the articles already in my local library and with each other, adding “add to local library” and “add to Connotea” buttons, and many other things are on my to-do list…
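The XPath filtering step might look roughly like the sketch below. It is written in Python rather than PHP purely for illustration, and it uses the limited XPath subset in the standard library's `xml.etree.ElementTree`. The element names (`PubmedArticle`, `MedlineCitation`, `PMID`, `ArticleTitle`, `AuthorList`) follow the shape of PubMed's XML output, but the sample document, the PMIDs, and the `extract_fields` helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample in the shape of PubMed's XML output.
# Real records carry many more fields; the PMIDs and titles here are invented.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>11111111</PMID>
      <Article>
        <Journal><Title>Example Journal A</Title></Journal>
        <ArticleTitle>First example title</ArticleTitle>
        <AuthorList>
          <Author><LastName>Smith</LastName><Initials>J</Initials></Author>
          <Author><LastName>Jones</LastName><Initials>A</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>22222222</PMID>
      <Article>
        <Journal><Title>Example Journal B</Title></Journal>
        <ArticleTitle>Second example title</ArticleTitle>
        <AuthorList>
          <Author><LastName>Smith</LastName><Initials>J</Initials></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_fields(xml_text):
    """Keep only the interesting fields of each article (illustrative helper)."""
    root = ET.fromstring(xml_text)
    records = []
    # ".//PubmedArticle" selects every article element below the root.
    for article in root.findall(".//PubmedArticle"):
        citation = article.find("MedlineCitation")
        records.append({
            "pmid": citation.findtext("PMID"),
            "title": citation.findtext("Article/ArticleTitle"),
            "journal": citation.findtext("Article/Journal/Title"),
            "authors": [
                author.findtext("LastName")
                for author in citation.findall("Article/AuthorList/Author")
            ],
        })
    return records


records = extract_fields(SAMPLE_XML)
for record in records:
    print(record["pmid"], record["title"], ", ".join(record["authors"]))
```

Once the records are plain dictionaries like this, comparing author lists between articles (for example, to guess whether two papers come from the same group) becomes an ordinary list operation rather than an XML problem.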
P.S. It turns out that Bibus (written in Python, which I like) can run on a SQLite database, and PHP interacts very well with SQLite. So it may be possible to use the same database for my local references and for searching. An alternative would be to use Wikindx or Aigaion, both of which are PHP/MySQL-based systems, in which case the framework I am setting up could eventually be merged into these packages. Bibus has nice integration with OpenOffice and Word, though… but the way I see it, only a small set of references is needed while writing a paper, and EndNote is good enough for this. For large-scale database handling, EndNote scares me, given how often it tells me to repair its database files even when they contain fewer than 10 documents. Maybe it is something I am doing wrong, but I am still scared 🙂
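To make the SQLite idea concrete, here is a minimal sketch of looking up which search results are already tagged. It uses Python's built-in `sqlite3` module for illustration. The `tagged` table and the helpers `open_library`, `add_tags` and `known_tags` are hypothetical and are not the schema Bibus actually uses; a real integration would have to be adapted to whatever tables Bibus defines.

```python
import sqlite3


def open_library(path=":memory:"):
    """Open (or create) a toy library database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tagged ("
        "pmid TEXT NOT NULL, tag TEXT NOT NULL, PRIMARY KEY (pmid, tag))"
    )
    return conn


def add_tags(conn, pmid, tags):
    """Record tags for one article; repeated (pmid, tag) pairs are ignored."""
    conn.executemany(
        "INSERT OR IGNORE INTO tagged (pmid, tag) VALUES (?, ?)",
        [(pmid, tag) for tag in tags],
    )
    conn.commit()


def known_tags(conn, pmids):
    """Return {pmid: [tags]} for those search results already in the library."""
    pmids = list(pmids)
    if not pmids:
        return {}
    # One "?" placeholder per PMID keeps the query parameterised.
    placeholders = ",".join("?" * len(pmids))
    rows = conn.execute(
        f"SELECT pmid, tag FROM tagged WHERE pmid IN ({placeholders}) "
        "ORDER BY pmid, tag",
        pmids,
    )
    found = {}
    for pmid, tag in rows:
        found.setdefault(pmid, []).append(tag)
    return found


conn = open_library()
add_tags(conn, "11111111", ["review", "to-read"])  # invented PMID and tags
result = known_tags(conn, ["11111111", "22222222"])
print(result)
```

With a default search returning 100 articles, a single `IN (...)` query like this should be cheap, and any PMID missing from the result is simply one that has not been saved yet.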