The Data Capsule for Non-Consumptive Research: Final Report

Loading...
Thumbnail Image
Can’t use the file because of accessibility barriers? Contact us with the title of the item, permanent link, and specifics of your accommodation need.

Date

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Digital texts with access and use protections form a unique and fast growing collection of materials. Growing equally quickly is the development of text and data mining algorithms that process large text-based collections for purposes of exploring the content computationally. There is a strong need for research to establish the foundations for secure computational and data technologies that can ensure a non-consumptive environment for use-protected texts such as the copyrighted works in the HathiTrust Digital Library. Developing a secure computation and data environment for non-consumptive research for the HathiTrust Research Center is funded through a grant from the Alfred P. Sloan Foundation. In this research, researchers at HTRC and the University of Michigan are developing a “data capsule framework” that is founded on a principle of “trust but verify”. The project has resulted in a novel experimental framework that permits analytical investigation of a corpus but prohibits data from leaving the capsule. The HTRC Data Capsule is both a system architecture and set of policies that enable computational investigation over the protected content of the HT digital repository that is carried out and controlled directly by a researcher.

Description

Keywords

HathiTrust Research Center, HTRC Data Capsule, non-consumptive research

Citation

Journal

DOI

Link(s) to data and video for this item

Relation

Rights

Type

Technical Report