HTRC Data API Performance Study


Abstract

The HathiTrust Research Center (HTRC) allows users to access more than 3 million volumes through a service called the Data API. The Data API plays an important role in the HTRC infrastructure. It hides internal complexity from users, protects data against malicious or inadvertent damage, and separates the underlying storage solution from the interface, so that the storage layer can be replaced with a better solution without affecting client code. We carried out extensive evaluations of HTRC Data API performance during Spring 2013. Specifically, we evaluated the rate at which data can be retrieved from the Cassandra cluster under different conditions, the impact of different compression levels, and data transfer over HTTP versus HTTPS. The evaluation characterizes the performance of the individual software components in the Data API and guides us toward optimal settings for it.

Keywords

Cassandra, Performance, HTRC, API, performance evaluation, HathiTrust Research Center

Type

Technical Report