Physical libraries in a digital world
November 22nd, 2011 by davidw
I’m at the final meeting of a Harvard course on the future of libraries, led by John Palfrey and Jeffrey Schnapp. They have three guests in to talk about physical library space.
NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.
David Lamberth lays out an idea as a provocation. He begins by pointing out that until the beginning of the 20th century, a library was not a place but only a collection of books. He gives a quick history of Harvard Library. After the library burned down in 1764, the libraries lived in fear of fire, until electric lights came in. The replacement library (Gore Hall) was built out of stone because brick structures need wood on the inside. But stone structures are dank, and many books had to be re-bound every 30 years. Once it filled up, 25-30 of Harvard libraries derived from the search for fireproof buildings, which helps explain the large distribution of libraries across campus. They also developed more than 40 different classification systems. At the beginning of the 20th C, Harvard’s collection was just over one million. Now it adds up to around 18M. [David’s presentation was not choppy, the way this paraphrase is.]
In the 1980s, there was continuing debate about what to do about the need for space. The big issue was open or closed stacks. The faculty wanted the books on site so they could be browsed. But stack space is expensive and you tend to outgrow it faster than you think. So, it was decided not to build any more stack space. There already was an offsite repository (New England Book Depository), but it was decided to build a high density storage facility to remove the non-active parts of the collection to a cheaper, off-site space: The Harvard Depository (HD).
Now more than 40% of the physical collections are at HD. The Faculty of Arts and Sciences started out hostile to the idea, but “soon became converted.” The notion faculty had of browsing the shelves was based on a fantasy: Harvard had never had all the books on a subject on a shelf in a single facility. E.g., search on “Shakespeare” in the Harvard library system: 18,000 hits. Widener Library is where you’d expect to find Shakespeare books. But 8,000 of the volumes aren’t in Widener. Of Widener’s 10K Shakespeare volumes, 4,500 are in HD. So, 25% of what you meant to browse is there. “Shelf browsing is a waste of time” if you’re trying to do thorough research. It’s a little better in the smaller libraries, but the future is not in shelf browsing. Open and closed stacks isn’t the question any more. “It’s just not possible any longer to do shelf browsing, unless we develop tools for browsing in a non-physical fashion.” E.g., catalog browsers, and ShelfLife (with StackView).
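The Shakespeare figures can be checked with a little arithmetic. The Python sketch below assumes that each catalog hit corresponds to one volume, which the talk does not say. Under that assumption, the 4,500 Widener volumes in HD are 25% of the 18,000 hits, and the volumes actually on Widener's shelves are roughly 31%. Which of these the quoted 25% refers to is not clear from the paraphrase.

```python
# Figures as quoted in the talk; treating catalog hits as volumes is an assumption.
TOTAL_HITS = 18_000       # "Shakespeare" hits in the library system
NOT_IN_WIDENER = 8_000    # volumes held somewhere other than Widener
WIDENER_IN_HD = 4_500     # Widener volumes stored off site at HD


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


widener_total = TOTAL_HITS - NOT_IN_WIDENER        # Widener's Shakespeare volumes
widener_on_shelf = widener_total - WIDENER_IN_HD   # browsable in Widener's stacks

print("Widener volumes in HD, % of all hits:", percent(WIDENER_IN_HD, TOTAL_HITS))
print("Widener volumes on shelf, % of all hits:", percent(widener_on_shelf, TOTAL_HITS))
print("Widener volumes on shelf, % of Widener's own:", percent(widener_on_shelf, widener_total))
```

Either way, well under half of the Shakespeare hits can be browsed on a single set of shelves, which is the speaker's point.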
There’s nobody in the stacks any more. “It’s like the zombies have come and cleared people out.” People have new alternatives, and new habits. “But we have real challenges making sure they do as thorough research as possible, and that we leverage our collection.” About 12M of the 18M items are barcoded.
A task force saw that within 40 years, over 70% of the physical collection will be off site. HD was not designed to hold the part of the collection most people want to use. So, what can we do that will give us pedagogical and intellectual benefit, and realize the incredible resource that our collection is?
Let me present one idea, says David. The Library Task Force said emphatically that Harvard’s collection should be seen as one collection. It makes sense intellectually and financially. But that idea is in contention with the 56 physical libraries at Harvard. Also, most of our collection doesn’t circulate. Only some of it is digitally browsable, and some of that won’t change for a long long long time. E.g., our Arabic journals in Widener aren’t indexed, don’t publish cumulative indexes, and are very hard to index. Thus scholars need to be able to pull them off the shelves. Likewise for big collections of manuscripts that haven’t even been sorted yet.
One idea would be to say: Let’s treat physical libraries as one place as well. Think of them as contiguous, even though they’re not. What if bar-coded books stayed in the library you returned them to? Not shelved by a taxonomy. Random access via the digital, and it tells you where the work is. And build perfect shelves for the works that need to be physically organized. Let’s build perfect Shakespeare shelves. Put them in one building. The other less-used works will be findable, but not browsable. This would require investing in better findability systems, but it would let us get past the arbitrariness of classification systems. Already David will usually go to Amazon to decide if he wants a book rather than take the 5 mins to walk to the library. By focusing on perfect shelves for what is most important to be browsable, resources would be freed up. This might make more space in the physical libraries, so “we could think about what the people in those buildings want to be doing,” so people would come in because there’s more going on. (David notes that this model will not go over well with many of his colleagues.)
53% of library space at Harvard is stack space. The other 47% is split between patron space and staff space. About 20-25% is staff space. Comparatively, Harvard is lower on patron space size than typical. The HD is holding half the collection in 20% of the space. It’s 4x as expensive to store a work on a stack on campus as off.
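As a rough check on how these numbers fit together, here is a small Python sketch. It assumes that "20% of the space" means 20% of the combined on-campus and HD storage space, and that cost scales with the space each item takes up. Neither assumption is stated in the talk, and the function names are made up for illustration.

```python
def density_ratio(offsite_collection_share: float, offsite_space_share: float) -> float:
    """How many times more items per unit of space the off-site store holds.

    Both arguments are fractions of the combined total (an assumed reading).
    """
    offsite_density = offsite_collection_share / offsite_space_share
    campus_density = (1 - offsite_collection_share) / (1 - offsite_space_share)
    return offsite_density / campus_density


def patron_space_range(non_stack_pct: int, staff_low_pct: int, staff_high_pct: int) -> tuple[int, int]:
    """Patron space is what remains of non-stack space after staff space."""
    return (non_stack_pct - staff_high_pct, non_stack_pct - staff_low_pct)


ratio = density_ratio(0.5, 0.2)   # half the collection in 20% of the space
print(f"HD holds about {ratio:.1f}x as many items per unit of space")
print("Patron space, % of library space (low, high):", patron_space_range(47, 20, 25))
```

Under that reading the density ratio comes out at about 4, which lines up with the 4x cost figure, and patron space works out to roughly 22-27% of library space.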
David responds to a question: The perfect shelves should be dynamic, not permanent. That will better serve the evolution of research. There are independent variables: Classification and shelf location. We certainly need classification, but it may not need to map to shelf locations. Widener has bibliographic lists and shelf lists. Barcodes give us more freedom; we don’t have to constantly return works to fixed locations.
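To make the point about independent variables concrete, here is a minimal Python sketch of a catalog that stores classification and location separately. Every name in it (`Item`, `Catalog`, `return_to`, the sample barcodes, call numbers and locations) is hypothetical. It illustrates the idea only and does not describe any system Harvard runs.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    barcode: str
    title: str
    call_number: str            # classification: describes the work, not where it sits
    location: str = "unknown"   # where the item physically is right now


@dataclass
class Catalog:
    items: dict[str, Item] = field(default_factory=dict)

    def add(self, item: Item) -> None:
        self.items[item.barcode] = item

    def return_to(self, barcode: str, library: str) -> None:
        """Record a return: the item stays wherever it was handed in."""
        self.items[barcode].location = library

    def locate(self, barcode: str) -> str:
        """Random access by barcode: where is this item now?"""
        return self.items[barcode].location

    def browse(self, call_prefix: str) -> list[Item]:
        """Virtual shelf: items in classification order, wherever they are."""
        matches = [i for i in self.items.values() if i.call_number.startswith(call_prefix)]
        return sorted(matches, key=lambda i: i.call_number)


# Sample data, invented for illustration.
catalog = Catalog()
catalog.add(Item("B001", "Hamlet", "PR2807 .A2", location="Widener"))
catalog.add(Item("B002", "King Lear", "PR2819 .A2", location="HD"))
catalog.return_to("B001", "Lamont")   # returned elsewhere; call number untouched

print(catalog.locate("B001"))
print([i.title for i in catalog.browse("PR28")])
```

The design choice is that a return only updates `location`, so browsing by classification keeps working no matter where the physical copies end up.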
Mike Barker: Students already build their own perfect shelves with carrels.
Q: What’s the case for ownership and retention if we’re only addressing temporal faculty needs?
A lot of the collecting in the first half of the 20th C was driven by faculty requests. Not now. The question of retention and purchase splits on the basis of how uncommon the piece of info is. If it’s being sold by Amazon, I don’t think it really matters if we retain it, because of the number of copies and the archival steps already in place. The more rare the work, the more we should think about purchase and retention. But under a third of the stack space on campus has ideal environmental conditions. We shouldn’t put works we buy into those circumstances unless they’re being used.
Q: At the Law Library, we’re trying to spread it out so that not everyone is buying the same stuff. E.g., we buy Peruvian materials because other libraries aren’t. And many law books are not available digitally, so we buy them … but we only buy one copy.
Yes, you’re making an assessment. In the Divinity library, Mike looked at the duplication rate. It was 53%. That is, 53% of our works are duplicated in other Harvard libraries.
Mike: How much do we spend on classification? To create call numbers? We annually spend about 1.5-2M on it, plus another million shelving it. So, $3M-3.5M total. (Mike warns that this is a “very squishy” number.) We circulate about 700,000 items a year. The total operating budget of the Library is about $152M. (He derived the classification figure by asking catalogers how long it takes to classify an item without one, divided into salary.)
David: Scanning in tables of contents, indexes, etc., lets people find things without having to anticipate what they’re going to be interested in.
Q: Where does serendipity fall in this? What about when you don’t know what you’re looking for?
David: I agree completely. My dissertation depended on a book that no one had checked out since 1910. I found it on the stacks. But it’s not on the shelves now. Suppose I could ask a research librarian to bring me two shelves worth of stuff because I’m beginning to explore some area.
Q: What you’re suggesting won’t work so well for students. How would not having stacks affect students?
David: I’m being provocative but concrete. The status quo is not delivering what we think it does, and it hasn’t for the past three decades.
Q: [jeff goldenson] Public librarians tell us that the recently returned trucks are the most interesting place to go. We don’t really have the ability to see what’s moving in the Harvard system. Yes, there are privacy concerns, but just showing what books have been returned would be great.
Q: [palfrey] How much does the rise of the digital affect this idea? Also, you’ve said that the storage cost of a digital object may be more than that of physical objects. How does that affect this idea?
David: Copyright law is the big If. It’s not going away. But what kind of access do you have to digital objects that you own? That’s a huge variable. I’ve premised much of what I’ve said on the working notion that we will continue to build physical collections. We don’t know how much it will cost to keep a physical object for a long time. And computer scientists all say that digital objects are not durable. My working notion here is that the parts that are really crucial are the metadata pieces, which are more easily re-buildable if you have the physical objects. We’re not going to buy physical objects for all the digital items, so the selection principle goes back to how grey or black the items are. It depends on whether we get past the engineering question about digital durability — which depends a lot on electromagnetism as a storage medium, which may be a flash in the pan. We’re moving incrementally.
Q: [me] If we can identify the high value works that go on perfect shelves, why not just skip the physical shelves and increase the amount of metadata so that people can browse them looking for the sort of info they get from going to the physical shelf?
A: David: Money. We can’t spend too much on the present at the expense of the next century or two. There’s a threshold where you’d say that it’s worth digitizing them to the degree you’d need to replace physical inspection entirely. It’s a considered judgment, which we make, for example, when we decide to digitize exhibitions. You’d want to look at the opportunity costs.
David suggests that maybe the Divinity library (he’s in the Phil Dept.) should remove some stacks to make space for in-stack work and discussion areas. (He stresses that he’s just thinking out loud.)
Matthew Sheehy, who runs HD, says they’re thinking about how to keep books 500 years. They spend $300K/year on electricity to create the right environment. They’ve invested in redundancy. But the walls of the HD will only last 100 years. [Nov. 25: I may have gotten the following wrong:] He thinks it costs about $1/year to store a book, not the usual figure of $0.45.
Jeffrey Schnapp: We’re building a library test kitchen. We’re interested in building physical shelves that have digital lives as well.
[Nov. 25: Changed Philosophy school to Divinity, in order to make it correct. Switched the remark about the cost of physical vs. digital in the interest of truth.]
I would vote ‘yes’ to “Perfect-Shelf” in “Find-ability Systems” whether we have enough resources and metrics to develop the “perfect shelves” under the following conditions:
I. SERVICE CAPABILITY READY FOR CRITICAL MASS & CONTEXT-AWARENESS in WEB-SCALE MANAGEMENT:
Ensure researchers are armed, context-sensitively, with the critical mass of info needed for every stage of their research, namely:
1a) Research question formulation ->
1b) Research design and methodology ->
1c) Research processes, e.g. data collection, coding and analysis ->
1d) Research output and outcome measurements, e.g. compilation in terms of bibliographic lists, shelf-list in study carrels, papers, publishing in terms of integrated publishing platform, sharing in terms of professional and social networking, configurability in terms of reuse and remote deployment of research processes and artifacts, impact factor analysis, etc.
It’s not a static lib guide, but a dynamic one where updates are sent into users’ experience context-sensitively like what has been done with mobile services delivered via mobile application stores.
As mobile services become the way of a researcher’s digital life, mobile learning and mobile resources from libraries make the “perfect shelves” digitally possible.
From the library collection development and management perspective, such services are called intentional discovery services with a critical mass of info. From library supporters’ and users’ perspective, that is an “Info Spotlight,” where bar-coded items with RFID tags are pulled dynamically from the “Perfect Shelf,” which is:
1) Developed by library collection managers;
2) Organized in a way of classification, clustering, or fine-grained topics, and meta-data conformed to resource sharing standards as defined by libraries, museums and cultural heritage communities in the capacity of catalogers, indexers, content analysts, info architects, or domain experts, automated means, or others;
3) Refined by the following:
3a) the activities of communities and readers imposed upon library resources, e.g. CIRC stats, search logs, comments, social proximity measures, question points, etc.;
3b) author’s prestige and popularity;
3c) publication impact factor measurement; and
3d) reference look-ups, etc.
II. LIBRARY AS PLACE & SPACE, BUYERS, GATEWAYS TO RESOURCES, D-LIB INFO PROCESSING INFRASTRUCTURE, ARCHIVES, ETC.
To meet the financial and intellectual access challenges, libraries have to be creative about how to build the “Perfect Shelves” with existing resources and new blood from multiple disciplines in collaboration with internal and external partners, e.g. campus IT, instructional technology, supply chains, bibliographic utilities, aggregating services, national libraries, etc.
It’s time to link all these processes into automated means for library materials processing and organization. It’s not the time to reduce the cost of cataloging, but to add new roles and processes to traditional cataloging functions and get ready for universal bibliographic control in the digital age, e.g.
a) scanning book covers, TOCs, indexes, etc. based on pre- and post-resource selection and processing criteria, contractual agreement, etc.;
b) enhancing authority control to obtain measurement metrics of named entities, subjects, linking logic, rules, and relations;
c) defining data architecture, meta-data, linking logic, relationship and rules for building WORK/EXPRESSIONS/MANIFESTATIONS/ITEMS/HOLDINGS/CATEGORIES/LINKS (WEMI-HCL);
d) mining lib search logs, CIRC stats, etc. and augmenting additional data for baseline and comparison from other sources for usage analysis, etc.;
f) supporting info processing flows for production-level info analysis, graphing, maps, feature extraction, bundling or packaging, etc.;
g) packaging and repackaging info resources sensitive to the following:
g1) the needs, wants, interests and activities of individual users and groups, their use, and usage;
g2) context-awareness, freshness, and robustness in batch and real-time processing modes;
g3) user-centered design with C.I.A. and 121 e-agent;
g4) compliance requirements, e.g. contractual agreement, and other info security requirement;
h) ensuring data quality control using automatic means for taxonomy building and maintenance, usability assurance, etc.;
i) automating ownership and retention policy through user data, acquisition data, holdings and item data, and usage data.
The very fact that most item-level data for CIRC and patrons in ILS systems are unstructured, temporary, not sharable, etc. should be reconsidered.
It’s also the time for the profession to look into what and how to meter our contributions to the learning community of a library. With librarians’ expertise in collection development and management, metadata services and cataloging, and user services, etc., we will mobilize the lib resources, workforce, and whatever is needed for the digital age in the 21st century.
As for librarianship in the digital age, my question is “do we want to administer or be administered in the supportive role of a researcher’s digital life?”
In summary, the “Perfect-Shelf” is one stone for two birds – a) management for the intentional discovery of library resources with critical mass of information; b) info spotlight for lib resources promotion, community awareness, researcher’s digital life enrichment, etc. I haven’t checked into the perfect Shakespeare’s Shelf done by e-Book reader yet. However, I would imagine there is a lot for us to leverage when D-Lib info infrastructure is in place.
Sorry for the long comment on your inspiring blog and wonderful book!!! It’s hard to be a pioneer like you.
As artists, we have to live and breathe above the details and challenges facing New England Book Repository, Harvard Depository, and Widener Library, yet still be able to paint the vision of their strategic directions and growth potential. Anyhow, bravo for your “Findability Systems” in web-scale management!!! Please find my humble share listed in the comment section of your blog.
Sincerely yours,
Amanda Xu