Ask any archivist what question they receive the most, and the answer will invariably be some variation of: “when are you going to digitize all of this?” I field that question from people every week, and the lengthy answer is always a hybrid in which I attempt to temper expectations, teach something about digitizing material, and explain a timetable for a facility the size of the Vedder Library.
What it all comes down to is this: no, we aren’t going to digitize everything. Were I to suddenly kidnap an IT team and barricade myself with them in the Vedder for the next fifteen years, we might conceivably make some progress. But even with a locked-door, digitization-only workflow it would take a literal lifetime to get it all online, and even then the project would not be “done.”
Digitization is far more than taking a picture of something or scanning it and slapping it on a website. Done correctly, digitization is actually a process that involves preparing materials so they can be scanned, scanning items to make digital objects, creating descriptive information for the digital objects and originals, storing the original items so they can be readily retrieved on request (and for re-digitization later if necessary), uploading the digital object to a content management system, backing up that content management system, running checks and doing maintenance on that system to maintain the integrity of the digital materials, and publishing those materials to a digital library so that researchers can understand the context and relationship of collections of digitized material. Heaven forbid you need to update software.
Digitizing a collection takes much longer than simply processing a collection, so for paper materials at the Vedder I’ve instead set the goal of improving our public catalogs by making descriptive finding aids available online. For a larger collection this can take months, but digitizing the material would take years, so we end up with a much larger inventory of useful material by writing dozens of finding aids for different collections in the same amount of time it would take me to digitize one of them. The information in these finding aids can also serve as the foundation of description for digitized materials down the road, so perhaps it is best to call this a “measure twice, cut once” sort of approach.
Our photo collections are a different story. You may not be terribly surprised to hear that people like looking at pictures. This, and the fact that photographs convey far more information than one can catalog in a concise way, means that in a reference setting it’s really best to just have people look at an image for themselves. Photo collections also get handled more frequently, are in higher demand, and are generally more compelling to a wider audience than manuscript material, meaning that digitizing them and making them available online is generally a far higher priority than it is with collections of papers.
To that end, I had the pleasure last week of meeting with a small team at Marist College that is part of this year’s Computer Science capstone project. Every year, seniors in the School of Computer Science work on a capstone project in which they get practical experience working with client organizations in the Hudson Valley to develop and improve IT solutions for various needs. The Vedder Research Library has worked with two previous capstone groups, and this year the goal is to conduct research and lay the groundwork for a content management system which will eventually make the photographic collections of the Vedder accessible online. We don’t have a clear timetable yet, but perhaps this gives you a better picture of why my answer is always so long when folks ask about digitizing stuff.
Questions and comments can be directed to Jon via email@example.com.