Wednesday, September 30, 2009

Unit 5 Readings

Witten
2.2
I like Witten's emphasis on presenting an image of stability and continuity to the user, despite the impermanence and fluidity of digital documents. I hadn't previously considered the comparison between different editions of a physical work and different modifications of a digital document. The discussion on authority control is not new information, but I wonder if it is more or less relevant/necessary in the context of digital libraries with sophisticated search functions. The LCSH information is also not new, though it's a good overview of basic cataloging practices. It is interesting to contrast the linear arrangement of books on shelves with the endless rearrangement possible in a digital collection.
5.4-5.7
Good overview of MARC and DC. BibTeX was new to me; I like that it seems compatible with XML. Refer seems, on the surface, a little more comprehensible to me, but maybe I'm just drawn to the alphabetical ordering of categories. The section on TIFF was enlightening; I had no idea the format carried so much metadata. I'd like to explore that capability further. MPEG-7 seems extremely useful for digital libraries; the projected capabilities mentioned are pretty amazing, although the technical stuff was a little over my head.
Automatic extraction of metadata makes me a little nervous, perhaps because I am a firm believer in authority control. Many of the methods discussed seemed iffy, though key phrase assignment/extraction seems useful. With the volume of information out there it's useful to have some automated methods, but there will always be a need for manual assignment of metadata. Phrase hierarchies actually seem pretty interesting, though all the talk about complex algorithms is beyond me.

Gilliland
The table on typology of data standards is a really useful way of thinking about this stuff (e.g. LCSH vs. AACR, MARC fields vs. MARC21). Tables 2 and 3 are useful as well. She makes an important point that metadata for digital objects needs to exist independent of the current storage/retrieval system in order to survive migration.

Weibel
It's encouraging to read the reflections of someone who was involved with Dublin Core from its very first development. I laughed a little when he mentioned that "participants spent an hour of scarce plenary time talking about Type before realizing that the librarians and the computer scientists had been talking about completely different concepts" -- but the important thing is that they produced a functional standard in the end. Weibel questions the possible future effect of "folksonomies" on the metadata landscape, and I am both interested and discomfited by the notion. I liked his anecdote about the China-Mongolia railroad gauges.

Wednesday, September 23, 2009

Week Three Muddiest Point

I was really interested in the discussion of different file types and their different features and uses. I'm wondering if there's a good, clear, concise resource online somewhere that outlines all of them? I'd love to have a quick comparison chart or something to refer to when choosing which to use.

Assignment 2: Flickr

I chose to scan the front pages of various national newspapers from the 2008 election. They can be found here.

Friday, September 18, 2009

Week Two Muddiest Point

I could still use some practical examples/demonstrations of interoperability between separate digital libraries to understand how that functions.

Week Three Readings

Lesk
It's interesting to read an author's musings about how to retain control of the layout of a document while still allowing for flexibility of viewing options, when said musings are being displayed as a PDF. Likewise the discussion of different scanning options when reading a scanned page. The discussion of how OCR technology has progressed is encouraging, and it seems like the ease and cost-effectiveness of converting texts to sophisticated digital formats with high functionality will only increase with time. Pitt subscribes to a historical newspaper collection (can't recall the name) that solves beautifully the display problems mentioned in 3.4. I didn't realize that CMU had such a large book-scanning project underway. Lesk reiterates the crucial point, however, that most people still prefer reading on paper to reading from a screen.

Arms
This chapter deals with a lot of the same issues I picked up in Lesk. I appreciate the in-depth discussion of Unicode, which I know is a vital standard for displaying diverse languages but which I don't know much about (e.g. UTF-8 encoding). Ditto the explanation of DTDs and SGML. The section on XML is extremely helpful, as it's a concept I've had trouble grasping in the past.
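To make the UTF-8 point a bit more concrete, here is a minimal Python sketch (my own illustration, not something taken from Arms). It shows that UTF-8 stores a character in one to four bytes depending on its code point. The helper name utf8_bytes and the sample characters are assumptions I picked for the example.

```python
# Illustrative sketch: UTF-8 is a variable-length encoding of Unicode.
# The helper name and the sample characters are hypothetical choices.

def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a single character."""
    return ch.encode("utf-8")

# Characters written as escapes so the file itself stays plain ASCII.
samples = {
    "Latin capital A": "\u0041",
    "Latin e with acute": "\u00e9",
    "Cyrillic ya": "\u044f",
    "CJK ideograph (middle)": "\u4e2d",
    "Grinning face emoji": "\U0001f600",
}

for name, ch in samples.items():
    encoded = utf8_bytes(ch)
    # Print the code point, the byte count, and the bytes in hex.
    print(f"U+{ord(ch):04X} {name}: {len(encoded)} byte(s) -> {encoded.hex(' ')}")
```

Plain ASCII letters still take a single byte, which, as I understand it, is a large part of why UTF-8 coexists so easily with older English-only text.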

Lynch
Identifier systems are a critically important aspect of digital libraries--and really any digital/networked collection--that I haven't given much thought to. I think laypeople (like myself) tend to think that you have a URL or filename and that's all you need, but Lynch identifies several contexts and usages that require a different approach. Also, though I've seen them before (I believe through the Government Printing Office), I didn't realize that PURLs were an OCLC creation.

Tuesday, September 8, 2009

Week Two Readings

Suleman and Fox
Since I don't have a background in computer science, some of this was a little hard for me to follow, but it seems to be advocating an open, simple, and customizable protocol for use in DLs, which seems like a great thing. Interoperability and extensibility are going to be increasingly desirable to DL user populations, and it seems like that's what this system is attempting to provide.

Arms, Blanchi, Overly ("Architecture")
There's a lot of discussion of client services in this article, and I think I need clarification about what they are and how they function--I'm not quite grasping it at this point. I like the point about how the organization of information should not be biased by expectations about how users will approach the material. I've also never really considered some of the levels of complexity discussed here--like illustrations within a text being created as separate digital objects, or a meta-object constituting various resolutions of a single photo.

Payette, Blanchi, Lagoze, Overly ("Interoperability")
To rephrase for my own reference, the key to extensibility is clean separation of object structure, extensible interfaces, and mechanisms that implement extended functionality. The concept of Disseminator Types seems to make a lot of sense for adding new functionality to an object. My problem with this article, however, is that I could use a clearer illustration of what the authors mean by "interoperable."

Saturday, September 5, 2009

Week One Readings

Candela et al.
The authors make a legitimate point about the "terminological imprecision" in the literature. Their differentiation between Digital Libraries, Digital Library Systems, and Digital Library Management Systems is a useful one; so too is their discussion of the interaction between the four categories of actors (end-users, designers, system administrators, and application developers).

Borgman
Borgman highlights the problematic nature of the term "digital library," which "obscures the complex relationship between electronic information collections and libraries as institutions." I think this is an important distinction/relationship to keep in mind, and efforts to create digital libraries should strive to bridge the gap between them. I also agree with Borgman that librarians tend to--or, I think, should--"take a broad view of the concept of a library." It will ultimately make them more useful and more relevant to a wider range of users. Borgman also traces back to one of the earliest definitions of a digital (or "electronic") library in 1992, which included the key elements of services, architecture, content, enabling technologies, and users. I think it is still useful and applicable to consider all those features when talking about DLs.

Paepcke et al.
Digital libraries being a conduit for funds that libraries normally wouldn't have access to is a point I hadn't considered before, though I wonder if this is the case in practice. This is a good discussion of the tensions between the library science and computer science fields, and the way that relationship has been impacted by technological developments (e.g. the advent of the Internet). I appreciate the insistence that "the core function of librarianship remains" in a world where people ask, "Aren't libraries kind of irrelevant since Google?" (Someone really said that to me. I was speechless, but I guess I should have a ready answer for the next time...)

Wednesday, September 2, 2009

Week One Muddiest Point

First post for the class, and my muddiest point is a question that may not have an answer. What is the functional difference between a digital library and a digital collection? Is there a difference, or are they two terms describing essentially the same thing? If a digital collection is one aspect of a larger physical library, can it be considered a digital library? I guess the first assignment is intended to address this ambiguity.