Google at SMUG

My previous post was getting a bit long, so here, in its own entry, is the scoop on Google Scholar and SFX.

Roy Tennant gave a presentation called “Is Metasearch Dead?” His answer is no, and he highlighted some of the weak points of Google Scholar: coverage and timeliness are two of the big issues, as are the lack of metadata and of sophisticated search features. Nevertheless, Tennant is very glad to see Google working with libraries.

Anurag Acharya (AKA Mr. Google Scholar), the principal engineer of GS, spoke later in the day. According to Acharya, GS’s goal is to be a single place for all scholarly material in all research areas, sources, and languages. Some of the products/platforms covered are: “all major publishers except Elsevier and ACS,” HighWire, Allen Press, MetaPress, Atypon, Ingenta, Muse, and public A&I databases such as PubMed. He admitted that there is a timeliness issue that they are working to address.

In addition to the usual OpenURL setup, Google requires libraries to provide a holdings file so that it can add more prominent links for items that a library has in full text. This has been a hot topic of discussion among librarians since Google added the OpenURL capability. Acharya clarified what holdings information Google requires for sites that want OpenURL enabled. Only coverage information is sent to Google, with no indication of which vendor the title is licensed from. For example, Google would see that we have coverage of Callaloo from 2002 to the present, but they would not know whether that coverage was through ProQuest, Muse, etc. The holdings file is automatically produced by SFX for libraries that opt in, and Google crawls it weekly. Google hopes to crawl this file more frequently in the future.
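To make the "coverage only" point concrete, here is a rough Python sketch of the idea. It is not the actual file format SFX produces or Google crawls (that wasn't described in the talk); the class names, fields, and functions are all made up for illustration, and the Muse start year in the usage note is invented. The only thing it shows is that the provider is dropped before anything is shared.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LicensedCoverage:
    """What the library knows internally (hypothetical structure)."""
    title: str
    provider: str              # e.g. "ProQuest" or "Muse"; never shared
    start_year: int
    end_year: Optional[int]    # None means "to the present"


@dataclass(frozen=True)
class HoldingsEntry:
    """What would be shared: title and coverage years, no provider."""
    title: str
    start_year: int
    end_year: Optional[int]


def to_holdings(records: Iterable[LicensedCoverage]) -> List[HoldingsEntry]:
    """Drop the provider and keep one coverage range per title.

    Simplification: ranges for the same title are merged from the earliest
    start to the latest end, ignoring any gaps between them.
    """
    merged = {}
    for rec in records:
        prev = merged.get(rec.title)
        if prev is None:
            merged[rec.title] = HoldingsEntry(rec.title, rec.start_year, rec.end_year)
            continue
        start = min(prev.start_year, rec.start_year)
        if prev.end_year is None or rec.end_year is None:
            end = None  # at least one source runs to the present
        else:
            end = max(prev.end_year, rec.end_year)
        merged[rec.title] = HoldingsEntry(rec.title, start, end)
    return sorted(merged.values(), key=lambda e: e.title)


def has_full_text(holdings: Iterable[HoldingsEntry], title: str, year: int) -> bool:
    """True if the shared holdings show coverage of `title` for `year`."""
    for entry in holdings:
        if entry.title != title or year < entry.start_year:
            continue
        if entry.end_year is None or year <= entry.end_year:
            return True
    return False
```

Feeding in Callaloo from ProQuest (2002 to present) and from Muse (say, 2004 to present) would yield a single entry, Callaloo 2002 to present, with nothing to show which vendor supplies it.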

Acharya said that during beta testing, Google tried OpenURL with both one link and two links, and saw a dramatic increase in sustained usage when both links were included. He believes the added prominence of the full-text link is not enough to explain the sustained usage over time.

Acharya pointed out two ways in which GS departs from Google’s normal MO: 1) it provides links to content it has not crawled (because the content isn’t freely accessible), and 2) it allows libraries control over how the OpenURL links are labeled on the page. He hopes that just as Google has moved outside its comfort zone, libraries will be willing to move outside theirs and try Google’s way of providing links.

Acharya believes Google will support GS for the “foreseeable future” without “monetization.” He believes the most likely way Google would try to make money is through ads.

John Regazzi’s presentation, “The Battle for Mindshare,” was much cited at the conference as a reason libraries should seriously consider working with Google. In a survey Regazzi conducted, scientists named Google, Yahoo!, and PubMed as the top three online scientific search resources they used or were aware of, while librarians named ScienceDirect, Web of Science (WOS), and Medline. Slide twenty shows the sobering numbers.