Guerilla ERM: Lessons Learned from Some Time in the Trenches

Cal Lee
University of Michigan
School of Information

[Appears in Ohio Archivist, Spring 2001, 3-7.]

ERM stands for electronic records management, an area of increasing importance to records managers and archivists. A significant portion of our activities are conducted through the use of computers: business transactions, government services, political activism, informal correspondence, entertainment, and many others. Given well-recognized issues like technological obsolescence and potential mismanagement of computer files, we must make a concerted effort to ensure that these materials are preserved. We need to act now in the ways that we can, rather than waiting for better solutions to come along.

A Note on the Title

The title and theme of this article were inspired by a piece written by Jakob Nielsen entitled "Guerrilla HCI: Using Discount Usability Engineering to Penetrate the Intimidation Barrier." Nielsen is a prominent figure in the world of human-computer interaction (HCI), who advocates a very pragmatic approach to designing usable web sites. HCI (occasionally also called CHI) is, not surprisingly, the study of how people interact with computers and how computer systems might be better designed to interact with people. With its strong ties to cognitive psychology, the HCI literature tends to emphasize detailed theoretical models, extensive empirical data collection and rigorous statistical analysis. Such an approach can be quite daunting to someone with a limited budget who just wants to create a web site that is reasonably easy to use.

Nielsen's concern is that "insisting on using only the best methods may result in having no methods used at all." The result is many web sites put together with no feedback from actual users, which inevitably leads to poor design. A better approach, says Nielsen, is to practice "discount usability engineering." Sitting down with three or four potential users and asking them to "think aloud" as they try to navigate around your site, for example, can reveal valuable insights about what needs to be fixed. The results might not meet the standards of a research scientist, but they are profoundly better than no results at all.

Likewise, I contend that the archives profession could benefit greatly from more "guerilla ERM." We must take action now, given the reality we face. I see this need as much more urgent than the one Nielsen poses. Unlike a badly designed web site, which can still be used given enough effort, electronic records that have been mismanaged will often be lost forever.

Some Background

The lessons I describe below are based on my experiences assisting numerous organizations in their efforts to manage, preserve and provide access to digital materials. Though each of those experiences has obviously contributed to my current perspective, the majority of my specific observations are based on work as Electronic Records Project Archivist at the Kansas State Historical Society (KSHS) from May 1999 to August 2000. The project was funded by a grant from the National Historical Publications and Records Commission (NHPRC) to conduct applied research. Under a previous NHPRC grant, Margaret Hedstrom, Associate Professor at the University of Michigan School of Information, had consulted Kansas on electronic records strategies and put together an initial draft of the Kansas Electronic Records Management Guidelines. Our work under the second grant was an attempt to apply the concepts contained in the guidelines.

In this capacity, I had the opportunity to:

I was certainly not alone in any of these efforts. The director of the project was Pat Michaelis and other members of the team were Linda Barnickel, Cynthia Laframboise, Matt Veatch and Jason Wesco. Without such a great group of people, the project could not have been so successful. The list of all other individuals outside of the KSHS who contributed to the project in some way would be too long to provide here.

Lessons Learned

I present the following as lessons rather than principles, guidelines or even best practices. They reflect only my own observations, though I believe they convey some important insights for our profession.

Research does matter.

Like other archivists, I have been witness to numerous written and spoken debates about the degree to which research contributes to real archival practice. This is an extremely important issue to raise, since research completely disconnected from practice has little point. Scrutiny by others, researchers and practitioners alike, is a vital component for research to have relevance to our profession. Of course, many individuals serve as both researchers and practitioners, which is another important means to provide interchange between theory and practice.

This debate within the archival profession, however, is often characterized by two moves that I see as counterproductive:

We do have answers.

Some archivists are fond of saying, "All we can do is raise the issues. We don't have any answers." I believe such a claim again reflects unrealistic assumptions, this time about what constitutes an answer. If the desire is for a method that will allow all electronic records to be stored on one type of computer for the rest of time without any technological difficulties, then indeed we do not have this answer and there are very good reasons to assume that we never will. If instead the desire is for methods, procedures and systems that can be used to facilitate the long-term preservation of authentic electronic records, then the world is full of answers. If we tell everyone who asks that we have no answers to electronic records questions, then we are effectively telling them two things:

YMMV (Your mileage may vary).

In Internet lingo, "YMMV" is a common caveat to bold statements. It can be taken as shorthand for something like "This has been the case, in my experience. But your situation may be a bit different." If we view the electronic records literature as a source of guidance from which to sample, as appropriate to our own social and technological contexts, then it can prove extremely valuable. If we assume an implicit YMMV in all conclusions, then we can view them with both professional engagement and healthy skepticism.

I think the Functional Requirements for Evidence in Recordkeeping project at the University of Pittsburgh is a good example. The details of that work (functional requirements, production rules and metadata specifications) are too often approached as monolithic checklists for good recordkeeping, while losing sight of the "literary warrant" concept that is so essential to their interpretation. In my own work, I have received a great deal of conceptual guidance from the Pitt project documents. This does not mean, however, that the story ends there.

Other examples include the Reference Model for an Open Archival Information System (OAIS), the Recordkeeping Metadata Standard for Commonwealth Agencies from Australia, the recently completed CURL Exemplars for Digital ARchiveS (CEDARS) project and the work between the U.S. National Archives and Records Administration (NARA) and the San Diego Supercomputer Center (SDSC). The documents associated with these projects can seem overwhelming at first. They are much more palatable if you sample from them, rather than trying to swallow them whole. If you can identify parts that are helpful to your institution's electronic records efforts and flag others as seemingly inappropriate, then you will be much better off than if you had failed to look at them at all.

Everyone can do applied research.

Research is about taking educated guesses, testing them in some way and then documenting the results. Just as Nielsen argues that small-scale user testing is important to designing usable web sites, so too is small-scale applied research important for meeting the needs of our stakeholders. Taking the time to document what we've learned can also save us and other archivists from the proverbial reinvention of the wheel. I must admit that this is a lesson I struggle with myself. It often feels much more rewarding to do, than to document what was done. As archivists, however, it's hard to deny the importance of documenting our activities for purposes of organizational (and professional) learning and memory.

As mentioned above, the Kansas electronic records project was an example of applied research. We were able to apply many of the components from our guidelines, and we learned numerous valuable lessons along the way (some of which are reflected in this article). The NHPRC has provided funding for numerous important research projects over the years, including the current Joint Electronic Records Repository (JERRI) work in Ohio. I am confident that JERRI will prove to be an extremely valuable source of guidance for other states in confronting the issue of electronic records custody.

Also worth mentioning is the Trustworthy Information Systems (TIS) project in Minnesota. For two reasons, this NHPRC-funded project serves as an excellent example of the previous point about sampling from existing requirements to meet one's needs. First, the TIS Handbook draws from a rich variety of previous work and existing federal policy, and then incorporates numerous details about particular legal requirements within the state of Minnesota. Second, in the words of the Handbook itself, it "provides a thorough, effective, and practical set of tools to craft procedures based on the specific and unique needs and information requirements of your government agency." As Robert Horton, State Archivist of Minnesota, has explained to me, trustworthiness is more a matter of "family resemblance" than strict definition. That is, recordkeeping systems that meet more of the criteria will be more trustworthy, but the correct balance for a given set of records to be considered trustworthy will vary by circumstances. (The concept of family resemblance comes from Ludwig Wittgenstein's philosophy of language, by the way, and is often a helpful way to think about thorny archival questions such as "What is a record?".) The project team also worked through the TIS criteria in a number of agencies to see how they applied in each case.

Not all electronic records efforts will have the benefit of NHPRC funding, and available funds at your home institution may be quite limited, but this does not preclude any sort of experimentation. (See "It will only break if you don't play with it" below.)

Resources are limited, meaning is expensive.

This point is closely related to those provided above. Archivists have known for a long time that capturing the context of records is not an exact science. Arrangement and description, appraisal and reference services all influence what meaning will be made of records for which an archives is responsible. We must make compromises on what is said and what is left unsaid. All records have numerous layers of meaning, which we attempt to manage through their content, context and structure.

With electronic records, technological dependencies make these issues even more apparent. The layers of meaning are manifested in traditional recordkeeping systems in ways that are often relatively implicit and change only gradually over time. Electronic recordkeeping systems, however, must be explicit about which components will be preserved and how they will be reflected. In order to manage the complexity of technological components necessary to turn some charges on a physical medium into the meaningful records we desire, these components are broken into various layers of abstraction.

"Computer science is largely a matter of abstraction: identifying a wide range of applications that include some overlapping functionality, and then working to abstract out that shared functionality into a distinct service layer (or module, or language, or whatever). That new service layer then becomes a platform on top of which many other functionalities can be built that had previously been impractical or even unimagined. How does this activity of abstraction work as a practical matter? It's technical work, of course, but it's also social work. It is unlikely that any one computer scientist will be an expert in every one of the important applications areas that may benefit from the abstract service. So collaboration will be required."


- Phil Agre, Red Rock Eater, March 25, 2000

Ask for help.

Lest we get turned off by the technical implications of the first few sentences of Agre's quote, it is important to remember the punch line. Even computer scientists aren't experts in everything related to computers. In order to tackle technical issues, they break them into parts. Different people specialize in different parts of the problem, and if they run into an issue with which they're not familiar, they ask someone who might know.

Studies have demonstrated that most information seekers tend to look for answers close to home before venturing out into the rest of the information universe. If you're encountering some issue related to electronic records, it's often very helpful to find out how your peers have been addressing it. This interaction need not be face-to-face. Electronic mailing lists, Usenet newsgroups, online forums and even good old telephone calls and snail mail can assist in this effort.

Look for help.

Of course, interpersonal contact isn't the only way to gain useful guidance. Sources of information that can help us address electronic records often take the form of documents. In the majority of cases, these documents are available through the Internet, though there are also some that will require a trip to a major research library.

It is important to realize that traditional sources of guidance, such as library and archives journals, are just the tip of the iceberg. Electronic records touch on so many other areas (laws, regulations, organizational practices, document management systems, file formats, network file sharing, electronic commerce, etc.) that it's best to remain open to numerous avenues of information. I maintain a topical directory of Electronic Recordkeeping Resources (a more actively updated version of the directory I created in Kansas) that points to some that I have found to be relevant.

Conferences and coursework are also important options. Several members of the KSHS staff attended the Cohasset Managing Electronic Records (MER) conference in 1995, which informed them of many important issues and greatly contributed to their decision to pursue their first NHPRC grant. When I started my work in Kansas, I had recently completed a Master's degree from the University of Michigan School of Information (SI), which provided me with many relevant skills and concepts. There are also other educational options for those unable to attend conferences or take on more formal coursework. The NHPRC has been placing increasing emphasis on new avenues for archival education, as reflected by projects such as "Educating Archivists and their Constituencies," which is being managed by Minnesota and is focusing heavily on metadata and the eXtensible Markup Language (XML).

You get extra points for copying off your neighbor.

Most issues each of us confronts are also being confronted by others. Whenever possible, we should borrow their ideas and bend the ideas to fit our own situations. This will save time and effort and establish connections with our peers in other institutions. The KSHS has lived by this lesson. The "Kansas Digital Imaging Guidelines for State Government Records," for example, is an adaptation of guidelines developed by the state of Alabama. When asked about the use of Dublin Core tags in state web sites, I often pointed people to the "User guide to Minnesota Metadata." The "Electronic Records Draft Guidelines" from Mississippi provided us with some helpful guidance. Along with several other states and the government of Canada, we also took our lead on web site issues from "Guidelines for Electronic Records Management on State and Federal Agency Websites" by Charles McClure and J. Timothy Sprehe.

The Kansas electronic records project benefited most, however, from work being carried out in Ohio. The Kansas Electronic Records Committee (ERC) was inspired by and modeled after the ERC in Ohio. Records series from the Ohio ERC General Schedule for Electronic Records Subcommittee served as templates for many of our own. We carried out a case study to implement recommendations from the Ohio ERC File Management Subcommittee as a way of managing the files on our own internal network at the Kansas State Historical Society. The Kansas ERC is currently evaluating email guidelines from a number of places, including Ohio, in order to create email guidelines for the state of Kansas.

In turn, the Ohio ERC has borrowed and modified the Kansas Electronic Records Management Guidelines (which also drew heavily from existing documents) for use in Ohio. Ohio and Kansas have also cooperated on an effort to borrow the TIS Handbook from Minnesota for application in other states.

Everyone can be a "techie."

I am frustrated when I hear an archivist claim, "I'm not a technical person, so I can't really talk about that." As I stated above, no one has a thorough understanding of every detail of computer systems, and more importantly, even people whose jobs are intimately tied to computers often only understand a tiny portion of what they could potentially know. In order to be a sales manager for a software company, for example, it's not very likely that you would need to know how to write programs in C++. In order to be a systems analyst for an Internet security company, it's also not likely that you would have to know the intimate details of how instruction sets differ on the Pentium III processor versus the Pentium II.

More important than any of the details of how particular computer systems work is the language used to describe them more generally. Learning this language takes some effort and ongoing vigilance, but it does not take a PhD in computer science. If I were trying to devise a plan for preserving a database, it would be a very good idea for me to know what tables, records and data dictionaries are, but I wouldn't have to have the entire Oracle 8i operator's manual memorized in order to take part in such a conversation. If someone uses a term with which I'm not familiar, I've learned to ask them to explain it, or look it up. The Electronic Recordkeeping Resources site includes links to numerous online dictionaries of technical terms. It's often a pleasant surprise to find that learning a handful of terms related to a given technical issue qualifies us to discuss preservation issues related to that issue. If a question arises about the relative worth of two different software applications, one can visit the web sites of the companies that make them and read reviews in the trade literature.

The point is not to become a "computer expert" (as if such a category were even truly meaningful). Instead, we must simply be able to articulate to those who are responsible for computer systems what it is that we're asking of them and how they might go about doing it. That final point is important to emphasize. If we expect programmers, system managers or anyone else to implement our requirements, it is best to articulate them as more than simply statements of need. Learning how to express concerns in terms of data models, use cases, business rules and functional requirements is extremely helpful in getting things done. Once again, you might be pleasantly surprised how easy some of this jargon is to pick up and use. It's not about the bits and bytes. It's about talking the talk, so others with skills in implementing our ideas can make them into a reality.

Open systems are your friends.

In order to manage the complexity of computer systems, breaking them into layers of abstraction, as described above, is only part of the story. It is also very helpful to develop and adopt conventions, generally called standards, for how those layers will work. That way, if I have the same layer on my system as you do on yours, we can be confident that they will be compatible. Important examples of these standards are protocols (such as TCP/IP and HTTP) and file formats (such as HTML). These standards allow us to exchange information through the Web, even though we don't all use the same software. In both the physical world and digital environment, standards address a common problem: each interface (i.e. point of contact between systems) adds complexity.

Standards turn an "N times M" problem into an "N plus M" problem. Stated another way, the number of technical pieces that must be built for components to exchange information with one another is greatly reduced when every component interfaces with a common standard rather than with each of the other components directly. This is much like the idea behind the language Esperanto, which could (if widely known) allow individuals from different countries to converse with one another in one common language rather than attempting to learn the native languages of all other countries in the world.
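To make the arithmetic concrete, here is a minimal sketch in Python. It assumes a simplified situation that is not drawn from any of the projects described in this article: N systems that produce records and M systems that need to read them. The function names and the sample counts are hypothetical, chosen only for illustration.

```python
def direct_converters(n_sources: int, m_targets: int) -> int:
    """Converters needed when every source is translated directly for every target."""
    return n_sources * m_targets


def converters_via_standard(n_sources: int, m_targets: int) -> int:
    """Converters needed when each source writes to, and each target reads from,
    a single shared standard."""
    return n_sources + m_targets


if __name__ == "__main__":
    # Hypothetical counts of producing and consuming systems.
    for n, m in [(3, 4), (10, 10), (50, 20)]:
        print(
            f"{n} sources, {m} targets: "
            f"direct={direct_converters(n, m)}, "
            f"via standard={converters_via_standard(n, m)}"
        )
```

With ten producing systems and ten consuming systems, the direct approach calls for 100 separate converters, while the shared standard calls for 20. The gap widens as more systems are added, which is why a common standard matters more as the number of systems grows.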

Data management, interchange, interoperability, migration and ongoing accessibility are greatly facilitated by the adoption of open standards, which serve this Esperanto role for computer systems. As stated in the Kansas Electronic Records Management Guidelines:

"Whenever feasible, file formats, protocols and other system specifications adopted by state agencies should be those developed and adopted by recognized standards bodies. Since the requirements for fulfilling these standards are both publicly documented and generally supported by more than one vendor, agencies that adopt them will be much less likely to find themselves stuck with valuable but inaccessible records than will agencies that adopt more closed systems. The appropriate standards body will depend upon the nature of the technology involved, but three particularly important sources of standards relevant to electronic records management are the International Organization for Standardization (ISO), Internet Engineering Task Force (IETF) and World Wide Web Consortium (W3C)."

The adoption of such standards will greatly simplify our lives as archivists, regardless of whether we are taking physical custody of electronic records ourselves or advising record creating entities (individuals, businesses, government agencies, etc.). Any time a number of systems conform to a standard, we can learn about the standard rather than all of the details of each system. If someone tells you that she is creating policy documents as web pages that comply strictly with the W3C Recommendation for the hypertext markup language (HTML), for example, then knowledge of HTML would allow you to instantly determine what many of the preservation implications are for those documents. The same cannot be said for documents created in XYZ Company's proprietary file format, which requires special software from the XYZ company to read.

Even standards change over time, and software vendors tend to add "extensions" to formats that only work in their own software. The recent CLIR report on "Risk Management of Digital Information" provides a discussion of the preservation implications of such extensions. Even HTML, which I mention above as an example of an industry standard, has fallen victim to this phenomenon. A big part of the "browser wars" between Netscape Navigator and Microsoft's Internet Explorer was the constant shifting of what non-standard HTML tags each browser recognized. This is still much better, however, than having to deal with completely different formats for every collection of electronic records.

We also need not take a passive role when it comes to standards development. A number of prominent metadata standards initiatives have benefited greatly from the participation of archivists and librarians who are concerned about the preservation of digital materials. As you learn about industry standards and identify issues that seem not to be addressed or elements that seem to be missing, you can make your concerns known to the appropriate standards body. Though they will not always operate as quickly as you might like, many of the standards development groups are surprisingly open to new contributions.

At a more local level, one of the greatest avenues for advocating electronic records issues in Kansas was the development of the Kansas Statewide Technical Architecture (KSTA). The KSTA is a broad document, providing guidance to state entities on how to develop, manage and maintain information technology. This is Kansas's own effort to manage some of the complexity involved in facilitating government services through computer systems across the state. Serendipitously, the KSTA development effort began right around the time that our second NHPRC project was getting started. As stated above, we were able to introduce electronic records provisions into a number of the KSTA chapters and eventually even developed an entire chapter for the KSTA on Electronic Records Management and Preservation. This process greatly increased the visibility of our concerns among the information technology managers of the state. The need to create and maintain the electronic records chapter for the KSTA was also a major selling point for the creation of the Kansas Electronic Records Committee (ERC).

Pick your battles.

I don't think I'll ever forget the meeting I had with the director of a small nonprofit case management agency in Michigan as part of a project to improve their document management systems and procedures. As a group of us huddled around a form that was a photocopy of a photocopy of a photocopy, with several items that were no longer appropriate and others that were no longer even readable, he explained that they had created a much easier version of the form to use internally. They still needed to use the old form, however, when sending a copy on to the agency that created it. "Couldn't you get the other agency to accept your new, improved form in place of the old one?" one of our project team members asked. The agency director shrugged and gave us a look that hinted at years of belabored arguments over minutiae such as copies of such forms. He stated simply, "That's not a hill worth dying on."

In our efforts to apply our guidelines in Kansas, we encountered a number of hills that we eventually decided to abandon. Some were agencies that had originally agreed to take part in case studies, then failed to return our repeated calls and email messages, or who seemed uninterested in ever adopting retention schedules for their records. Others were laws or regulations that did not quite reflect the spirit of our electronic records guidelines but would have cost months of effort and huge political capital to address. Still others related to annual budgeting constraints that did not support the sort of long-range planning that we knew was most appropriate for the preservation of electronic records.

We did have noteworthy successes along all of these lines:

Remain flexible.

All of the accomplishments listed above required both persistence and compromise. None of them turned out exactly as we'd planned. In fact, many of our most important objectives emerged over the course of the project. If we had attempted to stick too rigidly to a pre-established agenda, we would have missed some extremely important opportunities.

Learn the current concerns of your stakeholders.

In order to be an effective advocate for the preservation of electronic records, it is important to be aware of the current concerns of the parties involved. Most of this article has focused on state government records, but this lesson is important for other types of archives as well. Managing electronic records appropriately takes resources, in the form of mental energy, time, expertise and often technology. If we want to convince others to commit such resources, we must speak in terms of their current goals and values. For one person, this might be fear of legal risks; for another, it might be providing services more efficiently; for another, it could be the need for public accountability or a sense of her community's history.

This point is closely related to the need to remain flexible. If a local newspaper is running a series on how various agencies are complying with the state's open records (i.e. sunshine) laws, this is an excellent opportunity to raise the issue of managing electronic records. If your boss's boss's boss starts telling everyone that "knowledge management" is the wave of the future, it's probably a good idea to explain how the management of authentic electronic records is a pivotal component of knowledge management.

For those who serve a lot of genealogists, attending some of their meetings and speaking to them about digital preservation concerns could be a good idea. The more they know about the issues, the more likely they will be to make smart decisions about preserving their own digital materials and the more actively they will advocate for the allocation of public resources to address these issues.

If you have created documents to inform others about electronic records issues, it's also often a good idea to have multiple versions. Three documents that we used most often for this purpose in Kansas -- the guidelines, KSTA chapter and "Kansas Electronic Recordkeeping Strategy: A Whitepaper" -- provided largely the same content, but in very different styles and lengths. We found that the Whitepaper, for example, was a great document to give to someone who only wanted an "executive summary" of the issues.

It will only break if you don't play with it.

If we do not act to preserve electronic records, they will quickly become useless, through medium degradation, mismanagement and technological obsolescence. We know the general approaches for dealing with these issues, and there is an urgent need for us to apply them. As new situations arise, the only way to discover what techniques to apply to them is to make an attempt. Colin Webb has indicated in a recent interview that the National Library of Australia can attribute a great deal of its success on this front to an attitude of "learning by doing."

Archival work has never had the benefit of certainty. Appraisal decisions run the risk of destroying too much or not enough. Descriptive practices always run the risk of emphasizing attributes of our collections that will not best facilitate future research. Allocation of resources to conserve one collection rather than accessioning another (or vice versa) often looks foolish in retrospect. Finally, with electronic records, there is one certainty on which we can rely. Failure to act immediately will result in massive loss of cultural memory. With that certainty in mind, any ERM, even guerilla ERM, starts to look pretty good.

References:

CURL Exemplars for Digital ARchiveS (CEDARS), http://www.curl.ac.uk/projects/cedars.html

Electronic Recordkeeping Resources, http://www-personal.si.umich.edu/~calz/ermlinks/

Electronic Records Draft Guidelines, Mississippi, http://www.mdah.state.ms.us/arlib/erglnav.html

Functional Requirements for Evidence in Recordkeeping, http://www.sis.pitt.edu/~nhprc/

HCI Bibliography: Human-Computer Interaction Resources, http://www.hcibib.org/

Joint Electronic Records Repository (JERRI), http://www.state.oh.us/das/dcs/opp/jointrepository.htm

Kansas Digital Imaging Guidelines for State Government Records, http://www.kshs.org/archives/digimag.htm

Kansas Electronic Recordkeeping Strategy: A Whitepaper, http://www.kshs.org/archives/ermwhite.htm

Kansas Electronic Records Committee (ERC), http://da.state.ks.us/itab/erc/

Gregory W. Lawrence, William R. Kehoe, Oya Y. Rieger, William H. Walters, and Anne R. Kenney, Risk Management of Digital Information: A File Format Investigation, Council on Library and Information Resources, June 2000, http://www.clir.org/pubs/abstract/pub93abst.html

Charles R. McClure and J. Timothy Sprehe, Guidelines for Electronic Records Management on State and Federal Agency Websites, http://istweb.syr.edu/~mcclure/guidelines.html

The National Library of Australia's Digital Preservation Agenda, an Interview with Colin Webb, RLG DigiNews, February 15, 2001, http://www.rlg.org/preserv/diginews/diginews5-1.html

Jakob Nielsen, Guerrilla HCI: Using Discount Usability Engineering to Penetrate the Intimidation Barrier, http://www.useit.com/papers/guerrilla_hci.html

Ohio Electronic Records Guidelines, http://www.ohiojunction.net/erc/RMGuide/ERGuidelines.htm

Recordkeeping Metadata Standard for Commonwealth Agencies, http://www.naa.gov.au/recordkeeping/control/rkms/contents.html

Records Management and Preservation Architecture, Chapter of the Kansas Statewide Technical Architecture (KSTA), http://da.state.ks.us/itab/erc/reports/architecture.htm

Reference Model for an Open Archival Information System (OAIS), http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-1.pdf

Kenneth Thibodeau, Building the Archives of the Future: Advances in Preserving Electronic Records at the National Archives and Records Administration, D-Lib Magazine, February 2001, http://www.dlib.org/dlib/february01/thibodeau/02thibodeau.html

Red Rock Eater News Service, http://dlis.gseis.ucla.edu/people/pagre/rre.html

Trustworthy Information Systems Project, http://www.mnhs.org/preserve/records/tis/tis.html

User guide to Minnesota Metadata, http://bridges.state.mn.us/MMG-DCUserGuide.PDF

Ludwig Wittgenstein, Philosophical Investigations: The English Text of the Third Edition, New York: MacMillan, 1958.