Wednesday, February 03, 2010

Could Google's focus on real-time search screw up finding older news?

I recently wrote a piece for Poynter.org on Google's new focus on real-time search--which, if you're looking for the most up-to-date news, esp. on an emergency, is probably a good thing. But today, as I looked for articles on the 2008 comments controversy at the Hartford Courant (which had Mayor Eddie Perez standing on the steps of the Courant giving the publisher hell) I found zip, zero, nada....

I was looking for that info so that I could put it in a piece I was working on about Engadget closing off comments due to incivility (which it had also done in 2005). I figured it'd be good to reference both the 2008 Courant controversy and the 2006 WaPo controversy that had newspaper commenters all up in arms about their First Amendment rights (which they really don't have when they foul-mouth any publication's comments section).

But if I can't find the articles on the previous incidents, then I can't really have a coherent article now, can I? All I'm really doing then is parroting back information on what Engadget's doing without giving any context to a discussion about how it might be different for newspapers, and how newspapers haven't had as easy a time of handling their comment boards as Engadget has...

I think I tried every single permutation of search terms I could think of, and I got all sorts of unrelated junk. Seriously unrelated junk. Junk about Tiger Woods--and for the life of me I couldn't figure out how that got into a search on "hartfor courant turns off comments."

Maybe it was the "turned off" that did it? Who knows.

I tried the search in quotes, and Google found nothing. I tried it without quotes, and came up with a whole bunch more junk on what the new Miss America said about turning off TV, and something about Obama saying something about Democrats turning off TV news...

But my search had nothing to do with TV. It had to do with comments.

Even when I tried "news forum comment controversy" I got nothing.

Oh, but I did get something about a guy being shot by a cop. I have no idea what that had to do with my search, but I will say it's a timely article.

Even Wikipedia was no help: it had links only to the most recent controversies at the Courant.

It's bad enough that the collective memory of people who use the Internet is fairly short: lord knows so many don't even get that there have been huge debates going on here regarding "online civility" for around 7 or 8 years now. The issue seems to go nowhere because no one seems to recall the old conversations and what those conversations were built on....

When we are unable to access information from the past, we lose context. If we lose context, we eventually will repeat the same stupid stuff from the past. Even for the Internet, as it is evolving mores and codes and such, there needs to be context. Nothing exists in a vacuum. Not even choices to close off comments sections.

7 comments:

Bill Dusty said...

Here's an Urban Compass story on the Courant comment thingy... http://urbancompass.net/?p=1490

- you even commented on it! ;-)

Anonymous said...

Tish, I would suggest you use aafter.com to find the most relevant news for your work. It would be better if you type *** (3 stars) and paste the article you have written in the search box. It will get you the best and most closely related content for your article. Or, you can paste any piece of an article after typing *** (3 stars) in the search box to get relevant guesses. You may also visit its How to Use link to find things that would help you solve your problem.

Tish Grier said...

@Bill--I remember that post! And yes, I thought of going over to Urban Compass and seeing if I could find it. UC has excellent SEO and comes up just as high as the Courant in searches on Hartford stuff.

@Anon--will take a look at aafter.com to see how it works. Thanks for the tip!

jpo said...

A few points on this:

- As always, search results are highly dependent on the input parameters, and sometimes that's a hit-or-miss proposition. FWIW, my first attempt at Google was "eddie perez" courant comments, and this turns up at least three relevant hits on the first page of results. Other searches do even better, like "eddie perez" courant racial comments, which even turns up a hit for your original blog post on the story in 2007. Similar searches, though, like the ones you did, can produce nothing.

- In this particular case, searching for "comments" is a tough task, because of the ubiquity of this term on the web (just about every one of the web's nine kajillion blog posts contains the word "comment" or "comments"). Of course, search engines can and should (and do) weight terms based on their frequency and location of occurrence, but searching for things like "web" and "blog" is inherently going to generate a lot of results.

- I'm not buying the speculation that this is related to the growth of real-time search. I think it's more just a characteristic of the always-difficult task of finding the right needle in an ever-increasing haystack. RTS just makes the haystack a little bit bigger, but ultimately that's a good thing.
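
To make the point about weighting terms by frequency a bit more concrete, here is a rough sketch of inverse document frequency (IDF), one common textbook weighting scheme. It is only an illustration and not a description of how Google actually ranks results; the four-line corpus and the `idf` function name are made up for the example.

```python
import math

# Hypothetical mini-corpus standing in for web pages.
# Every "page" contains the word "comments", as most blog posts do.
DOCS = [
    "hartford courant comments controversy eddie perez",
    "miss america comments on turning off tv",
    "tiger woods story draws reader comments",
    "obama comments on tv news",
]

def idf(term, docs):
    """Inverse document frequency: the rarer the term, the higher the weight."""
    containing = sum(1 for doc in docs if term in doc.split())
    if containing == 0:
        return 0.0  # term never appears, so it cannot help rank anything
    return math.log(len(docs) / containing)

for term in ("comments", "courant", "perez"):
    print(term, round(idf(term, DOCS), 3))
```

In this toy corpus "comments" appears on every page, so its weight drops to zero, while "courant" and "perez" each appear on one page and carry all the weight. That is roughly why adding a distinctive name to the query helps more than adding another common word.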

Tish Grier said...

That's interesting, as I tried eddie perez courant comments and got nothing--with and without quotes. Perhaps it was that the quotes weren't in the right spot?

I'll def. agree about the ever-increasing haystack--and yes, it's not a bad thing. That being the case, filtering will be more important than ever.

Still, I'm always concerned about the collective short memory of the web. I always think of those now-you-see-'em-now-you-don't pics of certain individuals with Mao and Stalin. If it gets too hard to find the originals, will people give up?

Wendell said...

Was it always like this?

My small public library has old records of local newspaper stories, but no searchable index other than date. Some other public records are around... On the other hand, each year more old books go into the dustbin - lost for good I assume.

What of all the daily news film and video for the past 60 years... Much of it must be lost by now as well?

Interesting to consider what the life-expectancy of a post or local news story is today compared to times past.

Tish Grier said...

At the Software and Information Industry Assoc. meeting a couple of weeks ago, I was talking with two librarians who were discussing the challenges of indexing when so many different libraries keep their own indexing systems...

I've also been doing some checking into content-reselling sites, mostly for newspapers, that are taking over the task of indexing--but at a price. It's definitely scary how we could, possibly, lose a lot of important connections to the past because the information either isn't digitized, is digitized and available for a price, or isn't cross-indexed on the hard-copy/local level. Truly a conundrum.