"Invisible Work" as a Lens for Understanding Humanware's Role in Research Cloud Computing: Evidence from an interview-based study

Researchers in various scientific disciplines are leveraging cloud computing resources to enhance the scale, speed, and portability of research processes and products. Like any other cyberinfrastructure, cloud computing deployment in scientific research requires arrangements of technologies, people, organizations, and institutions to ensure smooth functioning and sustainability. To date, published discussions of cloud computing's promise have often placed technological capabilities and financial benefits front-and-center. These accounts are in line with traditional return-on-investment models for assessing a new technology's viability. The attention to ROI, however, has left assessment of the "invisible" work (and costs) placed on human actors in reaching cloud computing's promise relatively unexplored. In this paper, I report on the findings of 45 interviews with career researchers, research support staff, and student researchers engaged in cloud vendor-enabled research. The purpose of conducting the interviews was to identify commonalities in the labor required to start, maintain, or migrate research processes to the cloud via vendors (e.g., AWS, Azure, and Google Cloud). Two central types of "invisible" labor emerged as themes across the interviews: absorbing the time costs of learning new skills to migrate research to the cloud and managing billing for multiple, decentralized projects. In the discussion, I contextualize these two types of seemingly mundane work in broader debates about the burden new cloud computing technologies might place on scientists and research support staff. I conclude by suggesting that continued documentation and analysis of cloud research-enabling labor is needed to ensure that these invisible costs are understood and, perhaps in the future, shared appropriately among vendors, universities, and researchers.


SOCIOTECHNICAL ARRANGEMENTS IN CLOUD RESEARCH COMPUTING: UNIVERSITIES, VENDORS, RESEARCHERS, AND "HUMANWARE"
Research computing is inherently a multi-stakeholder, collaborative process. Accordingly, cyberinfrastructure to support research computing is funded, created, and sustained by many different stakeholders and takes on various funding models and forms (e.g., public, private, public-private, or other models) [1,2]. Scholars and practitioners alike recognize that the success of any cyberinfrastructure, no matter who funds it or what it is used for, depends on human action and coordination that far exceeds the core technical development of the infrastructure [3]. In other words, the technological capabilities of an infrastructure do not alone determine its uptake, use, and maintenance; positive outcomes also depend on various other social, political, organizational, and institutional support mechanisms. The current status of cloud computing in scientific research is no exception. Various cloud computing service providers (vendors) are jockeying for position within the scientific research computing market and are thus shaping the sociotechnical arrangements arising in the space. Amazon Web Services, Microsoft Azure, and Google Cloud, for example, are forging partnerships with universities, research centers, small research groups, and other stakeholders to enable cutting-edge research, while simultaneously staking a claim to computing services for large segments of the scientific research community. Once in place, some of these partnerships will endure long into the future, just as frontrunning operating systems, programming languages, and other computing resources have entrenched themselves in scientific workflows in the past.
The importance of assessing what these providers offer, what their services cost, and their relative value compared to publicly funded options is well-established [5]. The traditional approach for assessing potential investments in cloud services is the return-on-investment model, where an array of startup, operating, maintenance, overhead, and other costs are pitted against the research value and cost of alternative service options. However, certain costs associated with cloud research computing, or any new technology implementation, are not easily quantifiable: Work is performed by individuals and groups who are not paid directly for their involvement in developing cyberinfrastructure, but rather for a body of work that includes many different research tasks. The central argument of this paper is that these types of cloud-enabling work are "invisible," or difficult to account for when assessing the economics of cloud services. In support of this argument, I present evidence from interviews conducted with practitioners who actively participate in and enable cloud research. The interviews sought to identify themes in the challenges cloud research practitioners and support staff encounter and, in turn, to bring visibility to work that otherwise may not be considered when making resource allocation decisions. Below, I articulate the research questions guiding this study and how I sought to answer them.

RESEARCH QUESTIONS
Cloud computing is widely viewed as a transformative approach to research computing, enabling enhanced size, speed, and efficiency of scientific analyses. However, few studies have explored the types of invisible labor required to facilitate cloud research, particularly tasks that are administrative, routine, and perhaps mundane in nature. Such forms of labor are not easily quantifiable, but they should be well understood by decision-makers in the cloud research computing space. Accordingly, the following research questions guided this study: 1. What types of "invisible" labor enable research cloud computing? 2. How do the workers who perform "invisible" labor perceive the work and its potential impact on their careers?

METHODS
To answer the research questions above, I conducted semi-structured interviews with 45 scientists and research support staff in astronomy at one R-1 university. I chose astronomy because of my familiarity with the technological tools used in the discipline and because of its history of computationally-intensive methods, which make it well suited to cloud computing applications. Semi-structured interviewing involves asking the same set of questions of all interviewees, with latitude for the interviewer to ask "probing questions" along the way [6]. Example questions on the interview protocol for this study included: 1. What is your role in your research team's cloud computing work? 2. What have been some unexpected challenges in doing research using [insert cloud service provider name]? How have you addressed these challenges? 3. What, if anything, would you change about the cloud research service you're receiving? 4. How do you think the skills you've acquired in doing cloud research will impact your career?
Semi-structured interview questions are intended to be sufficiently general that responses can be compared across interviewees; particularities are addressed in probing questions.
The interviews included in this study ranged from 29 minutes to 1 hour and 43 minutes. After recording the interviews, I produced written transcripts and compiled them into a single unit of analysis using ATLAS.ti qualitative data analysis software. I followed a grounded theory approach [7], in which the analyst layers interpretation onto the transcript data through repeated readings, identification of initial themes, and refinement of themes. Refinement of themes includes a process of selective coding, where the analyst looks for additional instances of recurring themes, and axial coding, where the analyst attempts to relate individual codes to one another through constant comparison and contrast of instances of the codes. Codes are continually split apart and/or merged until there is thematic consistency across all instances of each particular code.

FINDINGS: TYPES OF "INVISIBLE" WORK IN CLOUD RESEARCH COMPUTING
As discussed above, scholars and practitioners have emphasized the critical role of "humanware" in capitalizing on the promises of cloud computing in scientific research. I sought to document and describe the "invisible" tasks humans take on in support of cloud-enabled research. The themes generated from interviews with cloud computing practitioners supported the assumption that such invisible work exists and is prominent in cutting-edge scientific research. Additionally, the findings illustrate two particular types of "invisible" labor required of humans in the cloud computing loop that may otherwise go unnoticed in traditional ROI assessments.
A theme in nearly all interview responses about cloud computing labor involved the unfavorable tradeoff between the frequency or intensity of the labor and the labor's visibility. In particular, the early-career researchers I interviewed repeatedly discussed the amount of time and effort required to migrate their research workflows to the cloud in juxtaposition to the visibility and credit they received for the work. Graduate students and postdocs noted that their labs' leaders placed them in charge of the transition on the basis of their computing "proficiency," "acumen," "know-how," and "expertise." In 20 instances, interviewees expressed regret that they had been placed in charge of migrating research workflows to the cloud. As one postdoc respondent noted in a statement that encapsulates the point many respondents made: Similar responses emerged across early-career interviewees. On one hand, this finding is unsurprising: Graduate students, postdocs, and other non-tenure-track researchers have long engaged in invisible or otherwise undervalued labor to ensure that research gets done. On the other, though, the fruits of this invisible labor have historically benefitted the individual's lab, research group, institution, funding agency, or other entities closely tied to the individual's own success. In discussing cloud computing work, though, interviewees expressed reluctant expectations that third-party vendors would reap the largest monetary and technological rewards from the shift to the cloud. These themes emerged strongly in response to questions about the work required to migrate to the cloud. A graduate student, for example, expressed the following sentiment:

I'm a, how is it called, a guinea pig or a mouse for the future researchers who will do everything in the cloud. I do think one day it won't be so difficult to port everything, or it will just be done there from the very beginning. Now, it's a lot of work.
The second type of invisible work that respondents repeatedly discussed involved billing for the cloud services they were using. Two themes in the responses about billing emerged. First, researchers and research support staff expressed frustration with having to develop their own mechanisms for predicting the future costs of vendor-provided cloud computing services, and they lacked confidence in their estimates. Predictions, they noted, were difficult because the scale of data collection, storage, and analysis changes quickly. Furthermore, interviewees stated that they could not anticipate what cloud service vendors would charge for services. These unpredictable elements rendered grant budget allocations difficult to approximate. As one interviewee representatively stated: Similar to the requirement of continually learning new skills, the unpredictable and volatile nature of academic research for graduate students and postdocs is expected and unsurprising. However, respondents' awareness of cloud service providers' ability to alter costs on a whim was striking and stood in stark contrast to how they perceived other computing resources that were central to their work. Interestingly, 15 of the respondents cited open source software as a counterexample to the unpredictable costs of cloud computing. Interviewees made statements such as "I know the open source tools I use will always be free," "I wish they'd [vendors] adopt an open source model," and "At least with the software I use, I know it's not going to go away or be unaffordable anytime soon." The assumed future volatility of cloud computing costs was only one part of respondents' concern about billing. Strikingly, all of the interviewees who managed multiple projects (10 of the 45) noted that their cloud service providers made it difficult to parse out the costs attributed to each project.
For example, a manager of a lab that provides computing support to various astronomy research groups described his approach to divvying up the costs of the resources: Another respondent, also research support staff, stated that she spent hours parsing billing documents from another vendor for the labs she supported. When asked about the amount of time she spent on billing issues relative to other tasks, she remarked:

That work, 0 hours were written into any grant, it's not written into my job description here [at the support lab]. And I'm not getting a break on anything else I was doing. So we really, we are all-systems-go for the computational part, the research part, but we are no-systems-go for the administrative stuff.
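Neither cost forecasting nor per-project attribution is conceptually difficult; the burden interviewees described lies in having to build and maintain such tooling themselves, outside any funded job description. As a minimal sketch of that ad hoc work (the column names, project tags, and growth figures below are invented for illustration; real billing exports differ by vendor), a support-staff script might look like:

```python
# Hypothetical sketch of the ad hoc billing work interviewees described:
# splitting a vendor billing export by project tag and naively projecting
# a monthly bill forward for a grant budget. The schema ("project_tag",
# "cost_usd"), tags, and growth rate are all invented for illustration.
import csv
import io
from collections import defaultdict

def costs_by_project(csv_text):
    """Sum line-item costs per project tag; untagged items go to 'unattributed'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        tag = row.get("project_tag") or "unattributed"
        totals[tag] += float(row["cost_usd"])
    return dict(totals)

def project_forward(monthly_total, growth_rate, months):
    """Naively compound a monthly bill to estimate a grant-period total."""
    total, monthly = 0.0, monthly_total
    for _ in range(months):
        total += monthly
        monthly *= 1 + growth_rate  # assume usage, and thus cost, compounds
    return round(total, 2)

sample = """project_tag,cost_usd
galaxy-survey,120.50
galaxy-survey,15.25
,9.99
pulsar-timing,42.00
"""
per_project = costs_by_project(sample)  # {'galaxy-survey': 135.75, ...}
yearly_estimate = project_forward(sum(per_project.values()), 0.05, 12)
```

Because each vendor exposes its own export formats and tagging conventions, even a simple script like this must be rebuilt or adapted per provider, and its growth assumptions revisited whenever prices or research scale change; that recurring, unbudgeted maintenance is precisely the kind of administrative effort respondents reported.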
Universally, researchers and research support staff mentioned at least once in their interviews that the administrative tasks involved in deploying cloud research services were too burdensome, with billing being the most common source of ire. Although it may be tempting to brush such responses off as expected complaints from people charged with changing their everyday practices, the themes in the responses prompt questions that deserve further exploration from social scientists, university IT departments, cloud service vendors, and others in the cloud research computing landscape.

DISCUSSION
The promises of cloud computing in scientific research are undeniable. The scale, speed, and efficiency offered by cloud services stand to revolutionize computationally-intensive science, as evidenced by the innovative cloud deployments described by other authors in this track. We cannot, however, ignore the experiences of researchers and research support staff who experience and/or anticipate considerable challenges in pushing the frontiers of cloud research computing. Accordingly, documenting the "invisible labor" of cloud research-enabling practitioners should expand to include a wide variety of disciplines, universities, funding agencies, cloud service providers, and other stakeholders. We can then begin to place these experiences in conversation with other groups of workers who face technological disruption in their workflows, their labor arrangements, and their career trajectories.
In pursuing this research agenda, scholars might consider using a paragraph from the National Science Foundation Advisory Committee for Cyberinfrastructure Task Force's 2011 Report on Campus Bridging [8] (p. 108) as a motivating starting point, which reads: If we are as a nation to retain, within the fields of cyberinfrastructure and computational and data-enabled science and engineering, the best and brightest experts then the value proposition they face as individuals must be such that it is rational, and consistent with a good quality of life, to pursue and maintain a career in these fields.
The small sample of researchers who participated in this study, when asked directly, uniformly expressed apprehension about the present and future career implications of their ongoing work supporting cloud computing research. To be sure, the interviewees almost universally indicated that they were optimistic about the sustainability of cloud research and, in turn, the market desirability of the skills they had developed in transitioning their research teams to the cloud. But across the responses, mundane yet important cloud research-enabling tasks beleaguered the day-to-day experiences of the "humanware" core who make cloud research possible. In documenting and presenting these challenges, we might begin to develop solutions for distributing the invisible costs of cloud computing research among the stakeholder groups. Currently, researchers and research support staff appear to be undertaking a disproportionate amount of the labor advancing cloud research computing. In other words, cloud vendors are learning from and capitalizing upon the everyday labor researchers and research support staff put into migrating research workflows to the cloud, with little direct compensation to those workers.
The findings of this study could, of course, be used to support arguments that publicly-funded cloud options should be the way forward. However, the primary argument is that invisible labor, and thus invisible costs, should be considered no matter the provider of the cloud service. The limitations of this argument and the opportunities for future research are presented in the next section.

LIMITATIONS AND FUTURE RESEARCH DIRECTIONS
This study presents themes from a small sample of interviewees working on cloud research computing applications within a single discipline. Accordingly, the findings may not generalize to all types of practitioners engaging in cloud computing research using cloud services provided by vendors. However, the findings are intended to lay a foundation for further interrogation of the burdens cloud computing may place upon researchers and research support staff amidst the growing demand for cloud services.
The current study will be followed by a survey issued to a broader cross-section of researchers and research staff who use vendor-provided cloud services. The survey is intended to evaluate whether the invisible labor documented here is pervasive across disciplines, universities, and other contexts. Additionally, it will aim to uncover other types of invisible work that make cloud research possible. Such work might include organizing trainings or tutorials without financial support from the cloud service vendor; hiring staff with expertise in one or more cloud services to meet researcher demand; developing robust billing systems to handle the complexity of cloud research financial costs; or forging new partnerships with cloud service providers that blur the boundaries between stakeholders in the research process.