booklamp posts on CNET

BookLamp is a 'Pandora for books'

BookLamp is a project that the people at Amazon.com would be idiots to pass up buying.

It's a machine learning tool designed to go through books and not only analyze how they're written, but also group together novels that share similar structures and styles. The hope is to help people discover books they may like based on novels they've already read, or on the kind of reading experience they're going for. Internet radio recommendation service Pandora does something similar, combining a thumbs-up and thumbs-down system with listening history.

Because BookLamp's system uses machine learning, it skips the three major aspects of each book that humans usually tally: story line and plot, the characters, and writing style. Instead, it figures out bits of these three items by using written cues and quantifiers like word density, pacing, action, character dialogue (as noted by quotations), and level of description. The system also blends in one- to five-star ratings from Amazon.com.
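To make the idea of "written cues" a bit more concrete, here is a minimal Python sketch of how a scene might be turned into numbers. BookLamp hasn't published its actual metrics, so the function name `scene_features` and the three measures it returns (word count, average sentence length, and the share of words inside quotation marks) are my own stand-ins, not the project's method.

```python
import re

WORD = r"[A-Za-z']+"

def scene_features(text: str) -> dict:
    """Toy stand-ins for cues like density and dialogue; not BookLamp's metrics."""
    words = re.findall(WORD, text)
    # Split on sentence-ending punctuation, keeping only chunks that contain letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if re.search(r"[A-Za-z]", s)]
    # Dialogue cue: words that sit between straight double quotation marks.
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(re.findall(WORD, q)) for q in quoted)
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / (len(sentences) or 1),
        "dialogue_ratio": dialogue_words / (len(words) or 1),
    }

sample = 'The gate creaked. "Run," she said. "Now!"'
features = scene_features(sample)
print(features)
```

A real system would need to handle curly quotes, nested dialogue, and much more, but even crude counts like these give each scene a comparable numeric profile.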

So far, the database has 179 books, but it is tracking more than 700,000 data points over 30,000 scenes from those titles. If it were to scale to track more works, the results for related items should, in theory, become even more precise. In its current state, users can pick one of the titles and get recommendations for similar ones, or view graphs of what the system has recorded for its pacing, density, and other characteristics.
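How might "similar titles" fall out of those data points? One common approach, and purely my assumption about what could work here, is to boil each book down to a short vector of scores and compare vectors with cosine similarity. The titles and numbers below are made up for illustration, and `most_similar` is a hypothetical helper, not part of BookLamp.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(title, profiles, k=2):
    """Return the k titles whose profiles are closest to the given title's."""
    target = profiles[title]
    scored = [(cosine(target, vec), other)
              for other, vec in profiles.items() if other != title]
    return [name for _, name in sorted(scored, reverse=True)[:k]]

# Invented profiles: (pacing, density, dialogue) scores per book.
profiles = {
    "Book A": [0.8, 0.3, 0.6],
    "Book B": [0.7, 0.4, 0.5],
    "Book C": [0.1, 0.9, 0.2],
}

print(most_similar("Book A", profiles, k=1))
```

With more books in the database, each title has more neighbors to be compared against, which is one reason recommendations could sharpen as the collection grows.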

One of the coolest features, and the one I think is the killer app, is the pacing analysis. It goes through a book and figures out where the pace speeds up or slows down.

In the video demo (embedded after the break), creator Aaron Stanton picks Michael Crichton's Jurassic Park as an example, and demonstrates that BookLamp was smart enough to detect when the pace ramps up, including the page on which that change occurs. I could see this being a great way to check whether you're wasting your time on a read that's off to an incredibly slow start and potentially going nowhere. Instead of giving up, you could simply give the chart a quick look.
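For the curious, spotting the page where the pace ramps up can be sketched in a few lines, assuming you already have a pace score per page. The scores, the `pace_ramp_page` name, and the "1.5 times the recent average" rule are all my own illustrative choices; the demo doesn't say how BookLamp actually does it.

```python
def pace_ramp_page(pace_by_page, window=3, jump=1.5):
    """Return the first 1-based page whose pace is at least `jump` times
    the average of the previous `window` pages, or None if there is none."""
    for i in range(window, len(pace_by_page)):
        baseline = sum(pace_by_page[i - window:i]) / window
        if baseline > 0 and pace_by_page[i] >= jump * baseline:
            return i + 1  # convert 0-based index to a page number
    return None

# Invented per-page pace scores: a slow opening, then the action kicks in.
pace = [2, 2, 3, 2, 2, 6, 7, 6, 7]
print(pace_ramp_page(pace))
```

On a book that never picks up, the function returns `None`, which is roughly the "potentially going nowhere" signal a quick look at the chart would give you.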

The project has been around since 2003 and continues to build up its database. There's a sign-up form to request a work to be added. You can also play around with the browsing and stats tool by registering. Be sure to hit the read more button to check out the video walk-through.

[via Digg]
