Triple Buttons with Firefox 3.0 Beta 5

Just upgraded to Firefox 3.0 beta 5… do you think I have enough back/forward buttons? Revert!

I’ve been noodling on this feature for a while: how can I find “more links like this one” in LinkRiver? Putting on my machine learning hat, I contemplated link-to-link co-visitation schemes, semantic indexing, various clustering algorithms… but all approaches were too data-heavy, at least for now. There had to be an easier way…
LinkRiver has supported full-text search of links (by river and stream) for a while now. The link title and host (e.g. www.techcrunch.com) are both part of the index. Could the full-text search engine help out here? Let’s try it out.
One popular link today was a story on news.com about the possibility of eBay selling Skype to Google. What if I send the link host and title to the search engine? Are the results relevant?
Try it yourself: Click to see similar links
In most cases this works really well…
Twobile-Twitter for Windows Mobile
FriendFeed Has Search
But sometimes, the results are not so great:
TechMeme Leaderboard: Six Months In
Options - one thing I may do, depending on feedback, is stop including the link host as a part of the search query. Play around (click similar, then re-run the search after removing the link host from the search box) and let me know what you think.
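Here is a minimal sketch of the idea, assuming the search engine accepts a plain keyword query string. The helper name similar_query, the include_host flag, and the example URL path are all made up for illustration; this is not LinkRiver’s actual code.

```ruby
require 'uri'

# Build a "similar links" search query from a link's title and, optionally,
# its host. Hypothetical helper for illustration only.
def similar_query(url, title, include_host: true)
  terms = [title.strip]
  terms << URI.parse(url).host if include_host
  terms.compact.join(' ')
end

# Query with the host included, then with the host dropped.
puts similar_query('http://www.techcrunch.com/some-story', 'FriendFeed Has Search')
puts similar_query('http://www.techcrunch.com/some-story', 'FriendFeed Has Search', include_host: false)
```

Dropping the host is then a one-flag change, which makes it easy to compare the two kinds of results side by side.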
All lifestream and link-sharing aggregators use an RSS/ATOM parser to help power their service.
I built LinkRiver using Ruby on Rails and would have preferred to use a parser built in Ruby. However, Mark Pilgrim’s Universal Feed Parser is rock-solid and very well tested, so I use UFP for feed parsing. LinkRiver controls UFP via a memcached-based message queue. Some UFP-Python glue posts new shared links via a simple HTTP API.
A while back RSSMeme’s Benjamin Golub tweeted that he also uses UFP, so I thought I’d ask around to see what some of the other aggregators are using.
Bret Taylor from FriendFeed told me they use UFP as a fall-back but rely primarily on a custom parser that uses much less memory.
ReadBurner developer Alexander Marktl replied to say that he uses MagicParser, a commercial parser for PHP.
After testing a bunch of options and finding none that worked, Tumblr’s Marco Arment wrote his own parser for PHP “with regular DOM functions”.
Google’s Chris Wetherell has blogged about the history of Google Reader and mentioned that UFP was involved, at least in the early stages.
Any others?
Updated: See comments — Gabe Rivera from Techmeme built his own in Perl.

One feature I’ve missed since abandoning NetCaptor for Firefox a few years ago was the ability to open new tabs next to the current tab instead of at the end of my tab stack. I spent an hour white-boarding this with Firefox dev Ben Goodger, and I gave up trying to do this myself after finding the Firefox tab-ordering code to be a spaghetti-mess of independent arrays.
I don’t remember how I stumbled on Tabs Open Relative… but all is well in my tabbed browsing world again — as if some annoying background music is gone. Ahhhhh.
Why is this feature so important? Context. When you open new tabs, they tend to be related to the current tab. If I’m searching Google for digital camera reviews and open the top five links as separate tabs, those tabs should be close to the “starter” tab, not lost at the end.
This happens to me all the time. I’m in super-productive mode and I run across an article or blog post that is interesting but entirely outside the context of what I’m doing. I need to stay on task - no tangents allowed.
I’ve tried a few things… a ‘To Read’ folder in my browser’s bookmarks or tagging links ‘toread’ on del.icio.us, but these methods were either too disruptive or difficult to manage.
I tried out InstaPaper the other day and loved it - one-click and a link is saved for later. It worked great, but it didn’t help me if I found something to ‘later’ when in Google Reader. Still too much friction.
Inspired by InstaPaper, I added a ‘Save for Later’ feature to LinkRiver.
Links you mark ‘Later’ show up under your ‘Later’ tab in LinkRiver. These links are private and not shared with your followers unless you choose that explicitly.

There are three ways to add links to your ‘Later’ stream.
First - there is a new one-click bookmarklet you can add to your browser toolbar. One-click — boom — you’ve saved the link for later without leaving the page you are on. Look for these in your sidebar after logging in to LR.

Second - links inside LinkRiver now have a ‘later’ option in addition to the ‘share’ option that’s been there for a while. Again - one click and it’s saved for later.
Third - this one is probably the most powerful of them all - you can import an external feed into your ‘Later’ stream.

I set up LinkRiver to import my Google Reader shared items into my main stream and my starred items into my Later stream. This works beautifully, especially when using Google Reader on my iPhone. Just click ‘share’ in GR to share on LinkRiver, or ‘star’ to save it for later. Sweet GTD goodness!
Google Reader creator Chris Wetherell is writing a great series on the birth of Google Reader. In the latest, Chris mentions LinkRiver and others when he talks about “services aggregating shared items”.
Gotta say I’m honored. That’s kind of like UCLA basketball coach Ben Howland mentioning me, a church-league pee-wee basketball coach, in a post-game news conference.

I saw this on the wall at Claire’s preschool today. They had asked the kids to name things that are big - elephant, train (2x), giant, big big truck, and t-rex all made the list. What comes to Claire’s mind when you ask her to name something big? Dad. I love it!
I wear flip flops 365 days a year and my Reefs need replacing after 18 months of near-constant use. I ordered some Tevas from Zappos.com yesterday at 2 PM. I paid for two day shipping, but because I missed the “1 PM ship the same day cutoff”, I didn’t expect to see them until Thursday.
Guess whose doorbell just rang? UPS just dropped off my new flips less than 20 hours after I ordered them. That’s ridiculous! Do they always ship this fast? I may never buy shoes in a store again, especially because most stores don’t carry my size 14/15 anyway.
Updated: I added support for APML (attention profile markup language) for your attention data. Mine is here.
Above is a screenshot of a new feature I’ve been playing around with on LinkRiver: attention data. Clicking on a user’s “attention” tab will show the top sites and the top keywords from links shared by that user. Click through above to see my attention data. I’m interested in Ruby, Barack Obama, MySQL, Nginx, Twitter, the iPhone, etc. My friend Chris, a chemist for biofuel startup PrimaFuel, has different interests: energy, solar, and Barack Obama. What do you think?
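For the curious, tallying “top sites” and “top keywords” can be sketched in a few lines of Ruby. This is only an illustration under assumptions: the shape of the links array, the stop-word list, and the method names top_n and attention are all hypothetical, and the real feature may weigh things quite differently.

```ruby
require 'uri'

# Words too common to say anything about a user's interests (illustrative list).
STOP_WORDS = %w[a an and the of to in for on is with has].freeze

# Count how often each item appears and return the n most frequent as
# [item, count] pairs, most frequent first (ties broken alphabetically).
def top_n(items, n)
  counts = Hash.new(0)
  items.each { |item| counts[item] += 1 }
  counts.sort_by { |item, count| [-count, item] }.first(n)
end

# links is an array of hashes with :url and :title keys (a hypothetical shape).
def attention(links, n = 5)
  hosts = links.map { |l| URI.parse(l[:url]).host }
  words = links.flat_map { |l| l[:title].downcase.scan(/[a-z0-9]+/) } - STOP_WORDS
  { sites: top_n(hosts, n), keywords: top_n(words, n) }
end

# Made-up sample data.
links = [
  { url: 'http://twitter.com/a', title: 'Twitter adds search' },
  { url: 'http://twitter.com/b', title: 'Ruby client for Twitter' },
  { url: 'http://www.techcrunch.com/c', title: 'Ruby on the iPhone' }
]
p attention(links, 2)
```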
LinkRiver displays favicons next to most links to help users recognize link targets. Those favicons are served separately from the main LinkRiver server. This post describes some of the design decisions and approaches I took when building the FI server.
My most important requirement for the favicon server (FI) was that it be loosely coupled to the LinkRiver (LR) server and reusable for other applications. The LR server could link to a favicon for *any* page without worrying about whether the icon exists on the FI server. If the FI server already had the icon, great - it would serve it up. If not, it would send back a default icon. This requirement ruled out Amazon’s S3 service because it won’t allow you to return a default image/page in response to “404 Not Found” errors.
When LR wants to display the favicon for a site like Twitter, it generates a URL like this:
http://favicons.linkriver.com/f1/25/twitter.com.ico
LR knows how to “map” host names to the directory structure (f1/25 in the example above). Keeping icons in a two-tiered directory system like this makes it easier to manage the large number of cached files (it’s bad to have zillions of files in one directory). It also serves as a minor obstacle to others hotlinking to these favicons.
Behind the scenes it would work like this. A fast/lightweight web server like lighttpd or Nginx would sit in front of all requests to serve already-cached static files. When an uncached icon is requested, the FI server queues it up for later download. I have a lightweight non-persistent message queuing class built on memcached and Ruby that would be perfect for this. All the FI server has to do is push the request values onto memcached and then tell the web server to send back the default icon.
LinkRiver is written in Ruby on Rails using Nginx as a load balancer and static page server with Mongrel as the Ruby app server. I love working in Ruby, but for this app, Rails would have been overkill. I wasn’t familiar with ways to run Ruby using a faster/lighter server so I dusted off my trusty/rusty PHP skills. Remember: the only thing PHP had to do was push request values to memcached and tell the web server to return the default icon. Something like this:
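One common way to build such a mapping is to hash the host name and use the leading hex digits as directory names. The sketch below assumes MD5 and a hypothetical favicon_path helper; the post does not say how LinkRiver actually derives “f1/25”, so the output here will not necessarily match the real URL.

```ruby
require 'digest/md5'

# Map a host name to a two-level cache path such as "ab/cd/example.com.ico".
# Using the first four hex digits of an MD5 digest is an assumption; the post
# does not say how LinkRiver derives its directory names.
def favicon_path(host)
  name = host.downcase
  digest = Digest::MD5.hexdigest(name)
  File.join(digest[0, 2], digest[2, 2], "#{name}.ico")
end

puts favicon_path('twitter.com')
```

Two hex digits per level gives 256 directories at each tier, 65,536 leaf directories in total, so even millions of icons spread out to a manageable number of files per directory.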
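To give a feel for how a queue can sit on top of a key/value cache, here is a rough sketch. It is not my actual queuing class: SketchQueue and FakeStore are hypothetical names, and FakeStore is an in-memory stand-in for a memcached client so the example runs without a server. It also assumes a single consumer, since the pop shown here is not safe against two readers racing each other.

```ruby
# In-memory stand-in for a memcached client so the sketch runs without a
# server. A real client would expose similar get/set/incr operations.
class FakeStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end

  # Add 1 to the counter at key and return the new value (missing keys start at 0).
  def incr(key)
    @data[key] = @data.fetch(key, 0) + 1
  end
end

# Hypothetical queue: items live under numbered keys, and two counters track
# the last slot written (tail) and the last slot read (head).
class SketchQueue
  def initialize(store, namespace)
    @store = store
    @ns = namespace
  end

  def push(value)
    slot = @store.incr("#{@ns}:tail")
    @store.set("#{@ns}:#{slot}", value)
  end

  # Return the oldest unread item, or nil when the queue is empty.
  def pop
    head = @store.get("#{@ns}:head") || 0
    tail = @store.get("#{@ns}:tail") || 0
    return nil if head >= tail
    slot = @store.incr("#{@ns}:head")
    @store.get("#{@ns}:#{slot}")
  end
end

queue = SketchQueue.new(FakeStore.new, 'favicons')
queue.push('twitter.com')
queue.push('www.techcrunch.com')
puts queue.pop
puts queue.pop
p queue.pop
```

Against a real memcached server the counters would likely need to be created before the first increment, and any item can be evicted before it is read. That is the “non-persistent” part: losing a favicon request now and then is fine, because the next page view simply queues it again.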
<?php
header('HTTP/1.1 200 OK');
header('Content-Type: image/x-icon');
// Ask lighttpd to send the default icon itself instead of streaming it via PHP
header('X-LIGHTTPD-Send-File: /path/to/default.png');
//
// A few more lines to push the request
// values onto memcached
//
?>
The X-LIGHTTPD-Send-File header tells lighttpd to return a static file to the browser, which is much faster than having PHP do it. I banged this out in about an hour and it worked great.
My PHP+Lighttpd version of the FI server worked just fine but I didn’t like supporting both Nginx and lighttpd. I also prefer coding in Ruby whenever possible. Was there a lightweight way to run Ruby on a web server? That’s where Thin and Rack come in.
Thin is a wicked-fast Ruby web server that’s perfect for what I was trying to do — run a fairly simple Ruby script on a web server. Thin is the web server itself - Rack is an interface that defines how Ruby interacts with the server.
Thin runs a Rack config file that looks something like this:
require 'favicon'
require 'mcqueue'

q = MCQueue.new(QUEUE_SERVER, QUEUE_NAMESPACE)

map '/' do
  run FaviconAdapter.new(q)
end
For all requests that make it to Thin (remember - all cached icons are served by Nginx directly and never reach Thin), Thin creates an instance of my FaviconAdapter class and “runs” it, which means it will call the FaviconAdapter’s “call” method and pass in information about the request. Our call method parses out some request information (the hostname for the favicon), pushes it to memcached, and returns an HTTP status code, headers and body, just like the PHP version.
require 'rubygems'
require 'thin'

# Headers returned for every uncached icon; Nginx serves the file named here.
DEFAULT_HEADERS = {
  'Content-Type' => 'image/x-icon',
  'X-Accel-Redirect' => '/protected/default.png'
}

class FaviconAdapter
  def initialize(queue)
    @queue = queue
  end

  def call(env)
    req = Rack::Request.new(env)
    #
    # A couple of lines removed to parse the request and
    # push it to memcached...
    #
    [200, DEFAULT_HEADERS, ['']]
  end
end
The X-Accel-Redirect does the same thing for Nginx that the X-LIGHTTPD-Send-File header does for lighttpd: it tells the web server to return the file directly instead of streaming it through our Ruby or PHP code.
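For readers who want something they can run, here is a stripped-down sketch of what the request handling could look like. It is an assumption, not the code I removed above: SketchFaviconApp is a hypothetical class, it pulls the host out of the file name in the request path, and it reads PATH_INFO straight from the Rack env hash so it works without the rack or thin gems installed.

```ruby
# Hypothetical Rack-style app. It reads PATH_INFO straight from the env hash
# so the sketch runs without the rack or thin gems installed.
class SketchFaviconApp
  HEADERS = {
    'Content-Type' => 'image/x-icon',
    'X-Accel-Redirect' => '/protected/default.png'
  }.freeze

  # queue is any object that responds to push (an Array works for testing).
  def initialize(queue)
    @queue = queue
  end

  def call(env)
    # "/f1/25/twitter.com.ico" -> "twitter.com"
    host = File.basename(env['PATH_INFO'].to_s, '.ico')
    @queue.push(host) unless host.empty? || host == '/'
    # Empty body: the web server fills in the file named by X-Accel-Redirect.
    [200, HEADERS, ['']]
  end
end

queue = []
app = SketchFaviconApp.new(queue)
status, _headers, _body = app.call('PATH_INFO' => '/f1/25/twitter.com.ico')
puts status
puts queue.inspect
```

Because a Rack app is just an object with a call method that takes a hash and returns a status, headers and body, it can be exercised like this with a plain Array standing in for the queue.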
The new Ruby FI version has been solid and stable like the PHP version before it. The new version should scale better too: in my tests, Nginx handles high load better and serves static files at the same high speed as lighttpd. My Ruby code is outperforming the PHP code by about 30%, but that’s not quite a fair comparison. The Ruby version caches its connection to memcached while the PHP version must reconnect for each request.
That’s all for now.