
"To Be Verified": Trivia and Critical Thinking

A friend posted a link to the following list of factoids on his Facebook profile: Useless facts, Weird Information, humor. It contains intriguing statements about biology, language, inventions, etc.

Similar lists abound, often containing the same tidbits.

Several neat pieces of trivial information. Not exactly “useless.” But gratuitous and irrelevant. The type of thing you may wish to plug in a conversation. Especially at the proverbial “cocktail party.” This is, after all, an appropriate context for attention economy. But these lists are also useful as preparation for game shows and barroom competitions. The stuff of erudition.

One of my first reflexes, when I see such lists of trivia online, is to look for ways to evaluate their accuracy. This is partly due to my training in folkloristics, as “netlore” is a prolific medium for verbal folklore (folk beliefs, rumors, urban legends, myths, and jokes). My reflex is also, I think, a common reaction among academics. After all, the detective work of critical thinking is pretty much our “bread and butter.” Sure, we can become bothersome with this. “Don’t be a bore, it’s just trivia.” But many of us may react from a fear of such “trivial” thinking preventing more careful consideration.

An obvious place to start verifying these tidbits is Snopes. In fact, they do debunk several of the statements made in those lists. For instance, the one about an alleged Donald Duck “ban” in Finland found in the list my friend shared through Facebook. Unfortunately, however, many factoids are absent from Snopes, despite that site’s extensive database.

These specific trivia lists are quite interesting. They include some statements which are easy to verify. For instance, the product of two numbers. (However, many calculators are insufficiently precise for the specific example used in those factoid lists.) The ease with which one can verify the accuracy of some statements brings an air of legitimacy to the list in which those easily verified statements are included. The apparent truth-value of those statements is such that a complete list can be perceived as being on unshakable foundations. For full effectiveness, the easily verified statements should not be common knowledge. “Did you know? Two plus two equals four.”
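For the curious, here is a minimal sketch of the calculator problem. The numbers are an assumption on my part: I'm using the commonly circulated 111,111,111 × 111,111,111 as a stand-in, since the lists themselves aren't quoted here. Python integers are exact, while floating-point numbers (roughly what a limited-precision calculator works with) cannot hold every digit of a product this large.

```python
# Assumed example: the often-repeated 111,111,111 x 111,111,111 factoid.
factor = 111_111_111

# Python integers have arbitrary precision, so this product is exact.
exact = factor * factor

# Doubles carry about 15-17 significant decimal digits, so the same
# product computed in floating point loses its last digit.
approximate = int(float(factor) * float(factor))

print("exact:      ", exact)
print("float-based:", approximate)
print("match:      ", exact == approximate)
```

The exact product is the palindromic 12,345,678,987,654,321, which is precisely the kind of result that looks impressive and is easy to check, provided the tool used has enough digits.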

Other statements appear to be based on hypothesis. The plausibility of such statements may be relatively difficult to assess for anyone not familiar with research in that specific field. For instance, the statement about typical life expectancy of currently living humans compared to individual longevity. At first sight, it does seem plausible that today’s extreme longevity would only benefit extremely few individuals in the future. Yet my guess is that those who do research on aging may rebut the statement that “Only one person in two billion will live to be 116 or older.” Because assessing such statements requires special training, their effect is a weaker version of the legitimizing effect of easily verifiable statements.

Some of the most difficult statements to assess are the ones which contain quantifiers, especially those for uniqueness. There may, in fact, be “only one” fish which can blink with both eyes. And it seems possible that the English language may include only one word ending in “-mt” (or, to avoid pedantic disclaimers, “only one common word”). To verify these claims, one would need to have access to an exhaustive catalog of fish species or English words. While the dream of “the Web as encyclopedia” may hinge on such claims of exhaustivity, there is a type of “black swan effect” related to the common fallacy about lack of evidence being considered sufficient evidence of lack.

I just noticed, while writing this post, a Google Answers page which not only evaluates the accuracy of several statements found in those trivia lists but also mentions ease of verifiability as a matter of interest. Critical thinking is active in many parts of the online world.

An obvious feature of those factoid lists, found online or in dead-tree print, is the lack of context. Even when those lists are concerned with a single topic (say, snails or sleep), they provide inadequate context for the information they contain. I’m using the term “context” rather loosely as it covers both the text’s internal relationships (the “immediate context,” if you will) and the broader references to the world at large. Without going into details about philosophy of language, these approaches clearly inform my perspective.

A typical academic, especially an English-speaking one, might put the context issue this way: “citation needed.” After all, the Wikipedia approach to truth is close to current academic practice (especially in English-speaking North America) with peer-review replacing audits. Even journalists are trained to cite sources, though they rarely help others apply critical thinking to those sources. In some ways, sources are conceived as the most efficient way to assess accuracy.

My own approach isn’t that far from the citation-happy one. Like most other academics, I’ve learned the value of an appropriate citation. Where I “beg to differ” is on the perceived “weight” of a citation as support. Through an awkward quirk of academic writing, some citation practices amount to fallacious appeal to authority. I’m probably overreacting about this but I’ve heard enough academics make statements equating citations with evidence that I tend to be wary of what I perceive to be excessive referencing. In fact, some of my most link-laden posts could be perceived as attempts to poke fun at citation-happy writing styles. One may even notice my extensive use of Wikipedia links. These are sometimes meant as inside jokes (to my own sorry self). Same thing with many of my blogging tags/categories, actually. Yes, blogging can be playful.

The broad concept is that, regardless of a source’s authority, critical thinking should be applied as much as possible. No more, no less.

Less Than 30 Minutes

Nice!

At 20:27 (EST) on Saturday, November 17, 2007, I post a blog entry on the archaic/rare French term «queruleuse» (one equivalent of “querulous”). At 20:54 (EST) of the same day, Google is already linking my main blog page as the first page containing the term “queruleuse” and as the fourth page containing the term “querulente.” At that point in time, the only other result for “queruleuse” was to a Google Book. Interestingly enough, a search in Google Books directly lists other Google Books containing that term, including different versions of the same passage. These other books do not currently show up on the main Google search for that term. And blogs containing links to this blog are now (over two hours after my «queruleuse» post) showing above the Google Book in search results.

Now, there’s nothing very extraordinary, here. The term «queruleuse» is probably not the proper version of the term. In fact, «querulente» seems a bit more common. Also, “querulous” and “querulent” both exist in English, and their definitions seem fairly similar to the concept to which «queruleuse» was supposed to refer. So, no magic, here.

But I do find it very interesting that it takes Google less than half an hour to update its database to show my main page as the first result for a term which exists in its own Google Books database.

I guess the reason I find it so interesting is that I have thought a bit about SEO, Search Engine Optimization. I usually don’t care about such issues but a couple of things made me think about Google’s PageRank specifically.

One was that someone recently left a comment on this very blog (my main blog, among several), asking how long it took me to get a PageRank of 5. I don’t know the answer but it seems to me that my PageRank hasn’t varied since pretty much the beginning. I don’t use the Google Toolbar in my main browser so I don’t really know. But when I did look at the PR indicator on this blog, it seemed to be pretty much always at the midway point and I assumed it was just normal. What’s funny is that, after I attended a couple of Yulblog meetings more than a year ago, someone mentioned my PageRank, trying to interpret why it was so high. I checked that Yulblogger’s blog recently and it has a PR of 6, IIRC. Maybe even 7. (Pretty much an A-List blogger, IMHO.)

The other thing which made me think about PageRank is a discussion about it on a recent episode of the This Week in Tech (TWiT) “netcast” (or “podcast,” as everybody else would call it). On that episode, Chaos Manor author Jerry Pournelle mused about PageRank and its inability to provide a true measure of just about anything. Though most people would agree that PageRank is a less than ideal measure for popularity, influence, or even relevance, Pournelle’s point was made more strongly than “consensus opinion among bloggers.” I tend to agree with Pournelle. 😉

Of course, some people probably think that I’m a sore loser and that the reason I make claims about the irrelevance of PageRank is that I’d like to get higher in a blogosphere’s hierarchy. But, honestly, I had no idea that PR5 might be a decent rank until this commenter asked me about it. Even when the aforementioned Yulblogger talked about it, I didn’t understand that it was supposed to be a rather significant number. I just thought this blogger was teasing (despite not being a teaser).

Answering the commenter’s question as to when my PR reached 5, I talked about the rarity of my name. Basically, I can always rely on my name being available on almost any service. Things might change if a distant cousin gets really famous really soon, of course… ;-) In fact, I’m wondering if talking about this on my blog might push someone to use my name for some service just to tease/annoy me. I guess there could even be more serious consequences. But, in the meantime, I’m having fun with my name’s rarity. And I’m assuming this rarity is a factor in my PageRank.

Problem is, this isn’t my only blog with my name in the domain. One of the others is on Google’s very own Blogger platform. So I’m guessing other factors contribute to this (my main) blog’s PageRank.

One factor is likely to be my absurdly long list of categories. Reason for this long list is that I was originally using them as tags, linked to Technorati tags. Actually, I recently shortened this list significantly by transforming many categories into tags. It’s funny that the PageRank-interested commenter replied to this very same post about categories and tags since I was then positing that the modification to my categories list would decrease the number of visits to this blog. Though it’s hard for me to assess an actual causal link, I do get significantly fewer visits since that time. And I probably do get a few more comments than before (which is exactly what I wanted). AFAICT, WordPress.com tags still work as Technorati tags so I have no idea how the change could have had an impact. Come to think of it, the impact probably is spurious.

A related factor is my absurdly long blogroll. I don’t “do it on purpose,” I just add pretty much any blog I come across. In fact, I’ve been adding most blogs authored by MyBlogLog visitors to this blog (those you see on the right, here). Kind of as a courtesy to them for having visited my blog. And I do the same thing with blogs managed by people who comment on this blog. I even do it with blogs by pretty much any Yulblogger I’ve come across, somehow. All of this is meant as a way to collect links to a wide diversity of blogs, using arbitrary selection criteria. Just because I can.

Actually, early on (before I grokked the concept of what a blogroll was really supposed to be), I started using the “Link This” bookmarklet to collect links whether they were to actual blogs or simply main pages. I wasn’t really using any Social Networking Service (SNS) at that point in time (though I had used some SNS several years prior) and I was thinking of these lists of people pretty much the same way many now conceive of SNS. Nowadays, I use Facebook as my main SNS (though I have accounts on other SNS, including MySpace). So this use of links/blogrolls has been superseded by actual SNS.

What has not been superseded and may in fact be another factor for my PageRank is the fact that I tend to keep links of much of the stuff I read. After looking at a wide variety of “social bookmarking systems,” I recently settled on Spurl (my Spurl RSS). And it’s not really that Spurl is my “favourite social bookmarking system evah.” But Spurl is the one system which fits the most in (or least disrupts) my workflow right now. In fact, I keep thinking about “social bookmarking systems” and I have lots of ideas about the ideal one. I know I’ll be posting some of these ideas someday, but many of these ideas are a bit hard to describe in writing.

At any rate, my tendency to keep links on just about anything I read might contribute to my PageRank as Google’s PageRank does measure the number of outgoing links. On the other hand, the fact that I put my Spurl feed on my main page probably doesn’t have much of an impact on my PageRank since I started doing this a while after I started this blog and I’m pretty sure my PageRank remained the same. (I’m pretty sure Google search only looks at the actual blog entries, not the complete blog site. But you never know…)
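For what it's worth, here is a minimal sketch of the textbook PageRank idea, not Google's actual production ranking (which is not public). The toy link graph, the page names, the damping factor of 0.85, and the `pagerank` function itself are all illustrative assumptions. In this simplified formulation, a page's score comes from the pages linking to it, and its own outgoing links only determine how its score is split among the pages it points to.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping each page to its score (scores sum to 1).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: "blog" links out a lot but nobody links back to it.
toy_web = {
    "blog": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:5s} {score:.3f}")
```

In this toy graph, the link-happy "blog" page ends up with the lowest score while "hub," which receives links from three pages, ends up near the top. So, at least under this simplified model, a long blogroll or a habit of keeping links would not by itself raise a blog's rank.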

Now, another tendency I have may also be a factor. I tend to link to my own blog entries. Yeah, I know, many bloggers see this as self-serving and lame. But I do it as a matter of convenience and “thought management.” It helps me situate some of my “streams of thought” and I like the idea of backtracking my blog entries. Actually, it’s all part of a series of habits I developed after I started blogging, 2.5 years ago. And since I basically blog for fun, I don’t really care if people think my habits are lame.

Sheesh! All this for a silly integer about which I tend not to think. But I do enjoy thinking about what brings people to specific blogs. I don’t see blog statistics on any of my other blogs and I get few enough comments or trackbacks to not get much data on other factors. So it’s not like I can use my blogs as a basis for a quantitative study of “blog influence” or “search engine relevance.”

One dimension which would be interesting to explore, in relation to PageRank, is the network of citations in academic texts. We all know that Brin and Page got their PageRank idea from the academic world and the academic world is currently looking at PageRank-like measures of “citation impact” (“CitationRank” would be a cool name). I tend to care very little about the quantitative evaluation of even “citation impact” in academia, but I really am intrigued by the network analysis of citations between academic references. One fun thing there is that there seems to be a high clustering coefficient among academic papers in some research fields. In some cases, the coefficient itself could reveal something interesting but the very concept of “academic small worlds” may be important to consider. Especially since these “worlds” might integrate as apparently-coherent (and consistent) worldviews.

Groupthink, anyone? 😉
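To make the clustering coefficient mentioned above a bit more concrete, here is a minimal sketch. The four "papers" and their citation ties are made up for illustration, and I'm assuming citations are treated as undirected ties (paper A cites paper B, or vice versa), which is a simplification. The local clustering coefficient of a paper is the fraction of pairs among its neighbours which are themselves connected.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of pairs of `node`'s neighbours that are connected to each other.

    `graph` maps each node to the set of its neighbours (undirected ties).
    """
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to close a triangle
    linked_pairs = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * linked_pairs / (k * (k - 1))


def average_clustering(graph):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(graph, node) for node in graph) / len(graph)


# Hypothetical citation ties: A, B and C all cite one another; D is tied only to A.
citations = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

for paper in sorted(citations):
    print(paper, round(local_clustering(citations, paper), 3))
print("average:", round(average_clustering(citations), 3))
```

A field in which most papers look like B and C here (everything they are tied to is also tied together) would have a coefficient close to 1, which is one way to put a number on an "academic small world."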