Practices, Standards, and Arrangements

'The Tsetse' (Livingstone 1857:frontispiece). Copyright Royal Geographical Society (with IBG). Used by permission for academic purposes only.

This section enumerates and defines Livingstone Online's methodological practices, data production standards, and hosting and backup arrangements.

Introduction

Livingstone Online is a data-driven project. We promote access to and use of our entire digital collection. We work in a transparent and accountable manner. We seek to ensure the long-term digital preservation of our core data. As a result, we rely on a set of well-defined practices, standards, and arrangements to realize these objectives.

 

Methodological Practices

Code of conduct. The Livingstone Online Code sets out ideals that guide our work and determine our day-to-day collaboration with individuals and institutions around the world.

Open access and use. By design, we have made nearly all of Livingstone Online freely accessible to the public at large. We encourage the broad non-commercial use of our materials and, whenever possible, try to secure Creative Commons licenses for the materials we publish in an effort to promote educational dissemination.

User-friendly design. The Livingstone Online interface seeks to create a user-friendly experience. We have invested significant effort in developing a site that is easy to navigate, intuitive, and aesthetically enriching. We have also conducted extensive testing and have collaborated with a variety of end-users in an attempt to ensure that our site is as glitch- and bug-free as possible.

"Dr. Livingstone's Steam Launch Ma Robert," 1858. Copyright Wellcome Library, London. Creative Commons Attribution 4.0 (https://creativecommons.org/licenses/by/4.0/).

Quality control. Our image and transcription data and metadata undergo rigorous quality control. Each element of our core data passes through at least four stages of review, and often many more, prior to online publication. Based on our quality control practices, we conservatively estimate our data production error rate to be about 5%, although in most cases the error rate is probably much lower.

Documentation. Our project is the result of a sustained, decade-long collaboration among a variety of entities. We strive to make all of our work as open as possible, as befits a publicly funded project. In this spirit, we offer rich project histories that record every stage of project development via narrative text, images, and downloadable project documents. We encourage other interested parties to study and learn from our efforts as part of our mission to facilitate digital humanities knowledge transfer.

Credit. Livingstone Online is a collaborative project to which a wide range of individuals have made contributions of varying sorts. We have developed a complex credit model in order to recognize these contributions. Our model consists of several components:

  • Our staff page identifies and cites the roles of the core members of the Livingstone Online initiative as a whole.
  • Our project team pages list the participants responsible for developing Livingstone Online's editions and other projects.
  • Our main acknowledgments page enumerates the many individuals who are not Livingstone Online core or project team members, but who nonetheless have made important contributions to our work. There are also acknowledgment pages for our critical editions of Livingstone's Final Manuscripts, 1870 and 1871 Field Diaries, and Letter from Bambarre.
  • Our project histories describe and define collaborator contributions in more detail (see sub-section on "Documentation," above).

Livingstone Online thus aspires to digital humanities best practice through the combination of these strategies and seeks to foster a collaborative environment where the work of contributors is not only valued, but also recognized in a manner that does justice to their contributions and, ultimately, benefits both the contributors and the project as a whole.

"The Livingstone Mid Africa Film Corporation, I Presume?" Punch, June 1933, p.323. Copyright National Library of Scotland. Creative Commons Share-alike 2.5 UK: Scotland (https://creativecommons.org/licenses/by-nc-sa/2.5/scotland/).

Attribution. Our project has a standardized system for the attribution of individual site essays. As relevant, we identify first, second, and third authors as well as editors, peer-reviewing editors, and other such individuals. By default, the individuals who appear in the byline of a given piece are first authors unless otherwise specified: "Megan Ward" or "Megan Ward and Adrian S. Wisnicki." We use "with" to signal one or more second authors and "also with" to signal one or more third authors: "Ashanka Kumari, with Adrian S. Wisnicki, also with Keith Knox and Megan Ward." If necessary, parenthetical information may be added to distinguish between roles: "Adrian S. Wisnicki, with Megan Ward (authors); Justin Livingstone (peer-reviewing editor)."
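As an illustration only, the byline conventions described above can be sketched in Python. The function names are hypothetical and are not part of the Livingstone Online codebase. The serial-comma handling of three or more names within one tier is an assumption, since the examples above show at most two names per tier, and the parenthetical role labels are not covered.

```python
def join_names(names):
    """Join names as 'A', 'A and B', or 'A, B, and C'.

    The three-or-more form is an assumption; the source examples
    only show one or two names per tier.
    """
    names = list(names)
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]


def format_byline(first, second=(), third=()):
    """Build a byline from first, second, and third authors.

    First authors appear unmarked, second authors follow "with",
    and third authors follow "also with".
    """
    parts = [join_names(first)]
    if second:
        parts.append("with " + join_names(second))
    if third:
        parts.append("also with " + join_names(third))
    return ", ".join(parts)


print(format_byline(["Megan Ward", "Adrian S. Wisnicki"]))
print(format_byline(
    ["Ashanka Kumari"],
    second=["Adrian S. Wisnicki"],
    third=["Keith Knox", "Megan Ward"],
))
```

The two calls reproduce the sample bylines quoted in the paragraph above.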

Permissions. We take the rights of our collaborators seriously. All items published on Livingstone Online appear by permission. In all cases, we have secured permissions and agreements in writing in order to confirm the terms by which institutions and individuals have made items available to us and to ensure that we have written documentation available in the future, should any questions about collaborator agreements arise. For easy reference, we retain copies of all written permissions and agreements in the back end of our Drupal site using the Filebrowser integrated file management module. Wherever relevant, we prominently display the ownership, rights, and terms of use for the items we publish directly on our site.

 

Data Production Standards

File naming. The base file name of each item in our digital collection consists only of a "liv" prefix, an underscore, and a unique six-digit item number (for example, liv_000455). The item number has no meaning, and we make it a practice not to put any bibliographic information into our file names. Rather, we keep all such information in our metadata files. See Appendix 1 for more detail on how we use file names.
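A minimal sketch of how such file names might be parsed in Python follows. The pattern is inferred from the examples listed in Appendix 1 rather than taken from any project code, and the function and field names are hypothetical.

```python
import re

# Pattern inferred from the Appendix 1 examples (an assumption, not project code).
FILE_NAME = re.compile(
    r"liv_(?P<item>\d{6})"               # base name: prefix plus six-digit item number
    r"(?:_(?P<part>\d{4}|MODS|TEI))?"    # optional image sequence number or MODS/TEI label
    r"(?P<suffix>(?:\.[A-Za-z0-9]+)*)"   # zero or more file suffixes, e.g. ".tif.md5"
)


def parse_file_name(name):
    """Split a file name into its parts, or return None if it does not fit the pattern."""
    match = FILE_NAME.fullmatch(name)
    if match is None:
        return None
    part = match.group("part")
    return {
        "base": "liv_" + match.group("item"),
        "image_number": int(part) if part and part.isdigit() else None,
        "record_type": part if part in ("MODS", "TEI") else None,
        "suffix": match.group("suffix"),
    }


for example in ("liv_000455", "liv_000005_0003.tif", "liv_000149_MODS.xml"):
    print(example, parse_file_name(example))
```

Because the item number carries no meaning, the parser returns only structural information; anything bibliographic would have to come from the metadata files.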

Images. When possible, we request 8-bit TIFF images at 600 dpi, created to the TIFF 6.0 specification, from our collaborating institutions, and we provide the institutions with a clear set of imaging (and permission) guidelines. However, we have collected our core image data from an array of institutions over a period of more than ten years (2004-present). As a result, variations in image capture methods and specifications have been inevitable, due to differing institutional protocols and ongoing methodological changes in fields such as library science, imaging science, and the digital humanities.

Our usual practice is to crop images to one image per manuscript page, if images do not already conform to this format. However, we do not normally crop out any contextual elements that may appear in images, such as color charts or rulers, in the interests of giving users as much information as possible. Also, when we crop images, we always retain an archival backup copy of the uncropped image.

Gary Li, Sharon Messenger (in reflection), and Caroline Overy outside the Royal Society for Arts Library, 2007. Copyright Livingstone Online (Sharon Messenger, photographer). Creative Commons Attribution-NonCommercial 3.0 Unported (https://creativecommons.org/licenses/by-nc/3.0/).

Transcriptions. We have produced all our transcriptions in XML in conformance with the TEI P5 encoding guidelines. We have recorded our custom use of these guidelines in a special TEI document called an ODD (One Document Does-it-all). We use this ODD to generate both an HTML-based encoding manual to direct all our transcription efforts and an RNG schema to ensure that all our XML-based transcriptions are valid within the scope of our TEI customization.
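Full validation against an RNG schema requires a RELAX NG validator, which the Python standard library does not provide. As a rough sketch only, the following shows a lighter pre-check that could precede such validation: it confirms that a file is well-formed XML, that its root is a TEI element in the TEI P5 namespace, and that it has a teiHeader. The sample document and the function name are hypothetical and are not drawn from our transcription files.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # namespace used by TEI P5 documents

# Invented sample document, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical sample letter</title></titleStmt>
      <publicationStmt><p>Illustration only.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Sample text.</p></body></text>
</TEI>"""


def precheck_tei(xml_text):
    """Return (ok, message) after a well-formedness and root-element check.

    This is not schema validation; it only catches gross structural problems.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, f"not well-formed: {err}"
    if root.tag != f"{{{TEI_NS}}}TEI":
        return False, f"unexpected root element: {root.tag}"
    if root.find(f"{{{TEI_NS}}}teiHeader") is None:
        return False, "missing teiHeader"
    return True, "ok"


print(precheck_tei(SAMPLE))
```

A check of this kind only complements schema validation: it says nothing about whether a transcription conforms to the project's TEI customization.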

Bonus: Download our complete TEI transcription files (780 files).

Double-Bonus: Download PDF reading copies (823 files, including 50 "extra bonus" HTML annotated reading copies) of our TEI transcription files.

Triple-Bonus: Download our complete TEI transcription materials, which include our coding manual, transcription templates, ODD, and RNG schema.

Metadata. We build detailed metadata for each item in our digital collection in an XML file created according to Version 3 of the Metadata Object Description Schema (MODS). See Appendix 2 for more detail on how we use the MODS format.

Bonus: Download our complete MODS files (3032 records), which include additional metadata not available elsewhere in our site.

Index of Culpeper's Complete Herbal (London 1815), by Richard Evans. Copyright Livingstone Online (Gary Li, photographer). May not be reproduced without the express written consent of the National Trust for Scotland, on behalf of the Scottish National Memorial to David Livingstone Trust (David Livingstone Centre).

 

Hosting, Site Setup, and Backup Arrangements

Site hosting. The University of Maryland Libraries host Livingstone Online in an Islandora framework (version 7.1.7). The Islandora framework combines a front-end Drupal content management system (version 7.56) with a back-end Fedora digital asset management system (version 3.6.2). Thanks to this arrangement, Livingstone Online is integrated within the overall collection of the Libraries and so can be preserved and maintained as part of that larger collection.

Development access. An agreement between the University of Maryland Libraries and the Livingstone Online project team sets out the responsibilities of the Libraries in hosting the site and defines the basis on which the project team can access and develop all the site's content.

Site setup. Livingstone Online is built and deployed using a number of different tools. At the heart of our system, we use Docker (version 17.06) to deploy code and dependencies and Git to store and manage code and configuration. We have a number of GitHub repositories where we share our code and configuration. These repositories fall into three categories:

  1. Docker related - used to build Docker image(s) that are deployed to the server (prefixed with docker-);
  2. Code related - used by the site to implement functionality (prefixed with livingstone_online_); and
  3. Unrelated - not related to site setup (for instance, MODS files, TEI files, etc.).

Docker images are built automatically when changes are made to the repositories identified above. This is performed by the Docker Hub service. After Docker images are built they are automatically deployed by the Docker auto deploy application running on the site server. More information about the site setup is available here.
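The three repository categories listed above are distinguished by name prefix, which can be sketched as a simple classifier. The function name and the repository names in the usage lines are hypothetical placeholders, not actual Livingstone Online repositories.

```python
def classify_repository(name):
    """Assign a repository to one of the three categories by its name prefix."""
    if name.startswith("docker-"):
        return "docker"      # used to build Docker images deployed to the server
    if name.startswith("livingstone_online_"):
        return "code"        # used by the site to implement functionality
    return "unrelated"       # not related to site setup (e.g. MODS or TEI files)


# Placeholder names, for illustration only.
for repo in ("docker-example", "livingstone_online_example", "example-data"):
    print(repo, "->", classify_repository(repo))
```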

Site versions. Our project team uses two online versions of the site for development: stage and production. Stage provides iterations of the site where our programmers can experiment with design and test changes to code for review by project staff, while production provides the public-facing version of the site. Programmers also work with local versions of the site on their own computers.

Data and site backup. We employ multiple strategies to back up Livingstone Online's data. The University of Maryland Libraries create nightly incremental backups of the site using Commvault data protection systems, and deleted files are retained for fourteen days after deletion. All Livingstone Online data is also duplicated to tape storage held in the UMD Libraries' secure server room. Code, including TEI and MODS files, is versioned in GitHub. The site's Drupal database is backed up automatically to our production server on a daily basis using Drupal's Backup and Migrate module, while the whole Drupal files directory is also synced via a cron job to an external server. Finally, the project director maintains local backups of all core project data on a series of computers, external hard drives, and remote servers.

 

Appendix 1: File Naming

Each item in our collection receives only a simple base file name:

liv_000455

liv_013013

Images receive an additional four-digit segment that identifies the specific image in the item sequence:

liv_000005_0003.tif

liv_000005_0004.tif

MODS and TEI files have the relevant acronyms added to the base file name:

liv_000149_MODS.xml

liv_002010_TEI.xml

Consequently, the base file names enable the easy association and organization of all files related to an item, while additions to this name and/or file suffixes allow differentiation among the files:

liv_000017 – base file name

liv_000017_0001.jpg – JPEG image file

liv_000017_0001.tif – TIFF image file

liv_000017_0001.tif.md5 – MD5 data integrity verification file

liv_000017_0001.tif.txt – TXT Dublin Core and image capture and processing metadata file

liv_000017_0001.tif.xmp – XMP sidecar metadata file

liv_000017_MODS.xml – XML MODS metadata file

liv_000017_TEI.xml – XML TEI P5 transcription file
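To illustrate how an MD5 data integrity verification file might be used, the following Python sketch recomputes the checksum of an image and compares it with the value stored in the accompanying .md5 file. The layout of the .md5 file is an assumption here (the hexadecimal digest as the first whitespace-delimited token, as in the output of the common md5sum tool), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_sidecar(image_path):
    """Compare an image's MD5 digest with the one stored in '<image name>.md5'.

    Assumes the sidecar's first whitespace-delimited token is the hex digest.
    """
    image_path = Path(image_path)
    sidecar = image_path.with_name(image_path.name + ".md5")
    expected = sidecar.read_text().split()[0].lower()
    return md5_of_file(image_path) == expected
```

For example, verify_against_sidecar("liv_000017_0001.tif") would read liv_000017_0001.tif.md5 from the same directory and return True only if the image is unchanged since the checksum was recorded.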

Return to primary section on File Naming.

 

Appendix 2: MODS Elements

Our MODS files include the following elements:

identifier – the base file name for the given item; the genre and number of the item as set out in the Clendennen and Cunningham catalogues of Livingstone documents (1979, 1985); where relevant, the shelfmark of a copy of the item held by the National Library of Scotland

titleInfo.title – the title of the item with and without the date

name.namePart and name.description – the name and birth and death dates of the creator(s) and, if a letter, the authority name of the addressee(s); the authority name of the repository as set out, whenever possible, in the Library of Congress Name Authority File (NAF); a short set of biographical facts related to the addressee

genre – the genre of the item as drawn from the Getty Research Institute's Art and Architecture Thesaurus Online

originInfo.dateCreated – the date of the item in written day-month-year form; the date as expressed according to the ISO 8601 format

originInfo.place.placeTerm – the place where an item was created or composed, as specified by Livingstone himself or, if not specified, then as supplied by the Livingstone Online team based on contextual inference; the authority name of that place in the Library of Congress Name Authority File (NAF)

subject.cartographics – the approximate latitude and longitude of the place where an item was created or composed

physicalDescription.note and physicalDescription.extent – physical details relating to an item, including whether it takes the form of a manuscript, photocopy, typescript, newspaper item, or other printed format; the page length of the item; the size of the item in millimeters

location.shelfLocator – the repository shelfmark

accessCondition – the terms by which an item is available for use and reuse

relatedItem.identifier – the bibliographical details or URL for any previous full or partial publications of the item

The MODS metadata also forms the basis for the derivative XMP metadata that we add to the image header of each image and for the standalone XMP and TXT Dublin Core metadata that we produce for each image.
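As a rough illustration of how a few of the elements listed above could be read from a MODS file, the following Python sketch parses a small record with the standard library. The sample record is invented for this example and is not one of our MODS files; its attribute usage (such as encoding="iso8601" on dateCreated) is an assumption about how the two date forms might be distinguished, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}  # MODS Version 3 namespace

# Invented sample record, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <identifier type="local">liv_000000</identifier>
  <titleInfo><title>Hypothetical letter, 1 January 1860</title></titleInfo>
  <originInfo>
    <dateCreated>1 January 1860</dateCreated>
    <dateCreated encoding="iso8601">1860-01-01</dateCreated>
    <place><placeTerm>Invented place</placeTerm></place>
  </originInfo>
  <accessCondition>Illustration only</accessCondition>
</mods>"""


def summarize_mods(xml_text):
    """Extract the identifier, title, ISO 8601 date, and place from a MODS record."""
    root = ET.fromstring(xml_text)
    iso_date = None
    for node in root.findall("mods:originInfo/mods:dateCreated", NS):
        if node.get("encoding") == "iso8601":
            iso_date = node.text
    return {
        "identifier": root.findtext("mods:identifier", namespaces=NS),
        "title": root.findtext("mods:titleInfo/mods:title", namespaces=NS),
        "iso_date": iso_date,
        "place": root.findtext(
            "mods:originInfo/mods:place/mods:placeTerm", namespaces=NS
        ),
    }


print(summarize_mods(SAMPLE))
```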

Return to primary section on Metadata.
