Tag Archives: Perl

Homebrewing Knowledge-Base from HBD Archives?

Uh-oh!

Started thinking again. This time about a way to repurpose messages on the HomeBrew Digest into a kind of database of brewing knowledge. I can just see it. It’d be ah-some!

Does anybody know how to transform email messages from well-structured digests into database entries? Seems to me that it should be a trivial task, especially for someone well-versed in Perl and/or PHP. But what do I know?

That venerable HBD mailing-list contains a wealth of information about pretty much every single dimension of beer homebrewing. For a large number of reasons, content from the HBD.org site turns up quite often in Web searches for brewing terms.

One issue with the HBD, though, is that it’s a bit hard to search. There used to be a custom-built search feature on the site but we now need to rely on Google and AltaVista. This wouldn’t be too much of an issue if not for the fact that those engines search complete digests instead of individual messages. So the co-occurrence of two terms in the same digest can be due to two messages on completely different subjects.
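To make that search problem concrete, here is a minimal sketch in Python (not the Perl or PHP mentioned above). The sample messages and the function names `digest_matches` and `messages_matching` are made up for illustration; nothing here reflects how Google or the old HBD search actually work.

```python
def contains_all(text, terms):
    """True when every search term occurs somewhere in the text (case-insensitive)."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def digest_matches(messages, terms):
    """Search the way a whole-digest index does: all messages as one document."""
    return contains_all("\n".join(messages), terms)


def messages_matching(messages, terms):
    """Search message by message and return the positions of the real hits."""
    return [i for i, message in enumerate(messages) if contains_all(message, terms)]


# Hypothetical digest holding two unrelated messages.
sample_messages = [
    "Decoction mashing gave me a noticeably darker wort.",
    "Any tips on batch sparging with a picnic cooler?",
]
```

With this sample, `digest_matches(sample_messages, ["decoction", "sparging"])` reports a hit even though no single message mentions both terms, while `messages_matching` returns an empty list for the same query.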

Another issue with the HBD (as with many other mailing-lists) is the relatively high redundancy in message content. Some topics come up cyclically on the mailing-list and, though some kind souls have been gracious enough to respond to the same queries over and over again, the mailing-list often looks like an outlet for FAQs. Among HBD “perennials” (or cyclical topics) are discussions of the effects of HSA (hot-side aeration), decoction mashing, and batch sparging, to name but a few technical issues.

Unfortunately, it looks like the HBD might need to be retired at some point in the not-so-distant future, if only for lack of sponsorship. Also, Pat Babcock, the digest’s “janitor,” recently asked for mirror space and announced the retrieval of some of the older digests (from the late 1980s).

Of course, there are lots of other brewing resources out there. So many, in fact, that it can be overwhelming to the newbie brewer. One impact of having so much information so easily available about homebrewing (and commercial brewing, for that matter) is a “democratization of beer knowledge.” Contrary to brewing guilds of medieval times, brew groups are open and free. Yet a side-effect of this is that there isn’t a centralized authority to prevent disinformation. Also, because the accumulated knowledge is difficult to peruse, people tend to “reinvent the wheel.”

In Internet terms, the HBD is the closest equivalent to a historical source. Few other mailing-lists have been running continuously since 1986.

Luckily, all the digests since October 1988 are available as HTML files. And the digest format has remained almost unchanged since that time.
All of the content is in plain ASCII. Messages never exceed a certain length. IIRC, line length is also controlled. And HTML was officially not admitted. Apparently, some messages did contain a bit of HTML code, but that shouldn’t be an issue.

Here’s what I imagine could be done:

  1. “Burst” out digests into individual messages (with each message containing digest information)
  2. Put all the individual messages (350MB worth) into a Content Management System
  3. Host the archived messages in the form of a knowledge-base
  4. Process those entries for things like absolute links and line breaks
  5. Collect messages in threads
  6. Add relevant del.icio.us-like tags and slashdot- or digg-like ratings
  7. Use this knowledge-base for wiki-like collaborative editing
  8. Assess some key issues to be taken up by brewing communities
  9. Add to the brewing knowledge-base
  10. Build profiles for major contributors and major groups
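Steps 1, 2 and 5 of that list could be sketched roughly as follows. This is Python rather than the Perl or PHP I had in mind, and it rests on assumptions I haven't checked against the real archives: that messages are divided by a line of exactly thirty hyphens (as in RFC 1153-style digests) and that each message opens with ordinary `From:` and `Subject:` headers. The names `burst_digest`, `thread_key`, `store_messages` and the sample digest are all hypothetical, and SQLite merely stands in for whatever a Content Management System would use.

```python
import re
import sqlite3
from email import message_from_string

# Assumption: messages are divided by a line of exactly thirty hyphens.
DIVIDER = "-" * 30
SEPARATOR = re.compile(r"^-{30}[ \t]*$", re.MULTILINE)


def burst_digest(digest_text, digest_id):
    """Split one digest into per-message records that remember their digest."""
    messages = []
    for chunk in SEPARATOR.split(digest_text):
        parsed = message_from_string(chunk.strip())
        if parsed["Subject"] is None:
            continue  # preamble or trailer: no Subject header, so not a message
        messages.append({
            "digest_id": digest_id,
            "position": len(messages) + 1,
            "sender": parsed["From"] or "",
            "subject": parsed["Subject"],
            "body": parsed.get_payload().strip(),
        })
    return messages


def thread_key(subject):
    """Crude threading key: drop leading 'Re:' prefixes and ignore case."""
    return re.sub(r"^(re:\s*)+", "", subject.strip(), flags=re.IGNORECASE).lower()


def store_messages(conn, messages):
    """Write burst messages into a single SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(digest_id TEXT, position INTEGER, sender TEXT, subject TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES "
        "(:digest_id, :position, :sender, :subject, :body)",
        messages,
    )
    conn.commit()


# Made-up digest in the assumed layout, for trying things out.
SAMPLE_DIGEST = ("\n" + DIVIDER + "\n\n").join([
    "Hypothetical Digest #1\n",
    "From: alice@example.com\nSubject: Batch sparging\n\nHow long should I rest?\n",
    "From: bob@example.com\nSubject: Re: Batch sparging\n\nAbout ten minutes.\n",
    "End of Hypothetical Digest #1\n",
])
```

Calling `burst_digest(SAMPLE_DIGEST, "sample-1")` yields two records, and both subjects share the same `thread_key`, which is about the simplest way to collect messages in threads. Real digests would surely need more care (quoted dividers inside message bodies, subject lines edited mid-thread), so this is a starting point rather than a working importer.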

Because I couldn’t help it, I started writing down some potential tags I might use to label messages on the HBD. It could be part “folksonomy,” part taxonomy. For one thing, it’d be useful to distinguish messages based on “type” (general queries about a brewing technique vs. recipe posted after a competition) since many of the same terms and tags would be found in radically different messages.

Ethnography and Technographics

This one certainly made the rounds among observers of online activities, but I only just got the link through a comment by Martin Lessard, the insight-savvy YulBlogger and “Internet culture” describer.

The Groundswell (Incorporating Charlene Li’s Blog): Forrester’s new Social Technographics report

Many companies approach social computing as a list of technologies to be deployed as needed – a blog here, a podcast there – to achieve a marketing goal. But a more coherent approach is to start with your target audience and determine what kind of relationship you want to build with them, based on what they are ready for.

Sounds obvious, doesn’t it? I get from it the same reaction as from effective ethnography. Not really a “Eureka!” moment. More of a “Doh!” moment, when you suddenly realise what was really happening around you.

This ethnography-like insight is even more obvious in the report itself (a review copy of which I got through email, thanks to Forrester’s excellent policy for content use). In that report, Li et al. define different user types in a manner not incompatible with our tendency to classify, in ethnography as in cultural life. Like ethnography, the report is showing the relationships between those different profiles (instead of stereotyping or “profiling”).

Sure, the proportion of creators is an important factor for Old School market research. But what’s more important is that different people adopt different behaviours in different contexts. Obvious, but important.

The report talks about age and gender differences, provides evidence for the changes in the Internet ecology, and manages to treat Internet users as human beings. “All in fifteen pages or less!”

Again, this report isn’t groundbreaking. But it can be really useful as a representation of cultural patterns for technological adoption (MS Word document). (As it turns out, this issue came up in an exam I gave today… Wish I could share the textbook page on early-adopters in cultural change.)

There are other blog posts about this report, including some advice for marketers:

Companies seeking to engage customers with these new tools need to understand where their audiences are with this categorisation and then create bespoke programmes for them.

As per Larry Wall’s ethnographic training, diagonal thinking. “There’s More Than One Way to Do It.”