No. WP-02-10
Leveling the playing field, or expanding the bleachers?
Socio-Technical Interaction Networks and arXiv.org
November 4, 2002
Eric T. Meyer and Rob Kling
Center for Social Informatics
School of Library and Information Science
Main Library 011
Indiana University
Bloomington, IN 47405
http://www.slis.indiana.edu/SCIT
Contact Information:
Eric T. Meyer
Indiana University
791 Union Drive
Indianapolis, IN 46202
e-mail: etmeyer@indiana.edu
Phone: 317-274-4434
Fax: 317-274-1362
Rob Kling *
School of Library and Information Science
Main Library 011
Indiana University
Bloomington, IN 47405
e-mail: kling@indiana.edu
Phone: 812-855-9763
Fax: 812-855-6166
* Send communications to Rob Kling
Abstract
It has been argued that the use of electronic forums for scientific communication has numerous positive consequences, including serving as an important means of increasing the participation of scientists who are in peripheral locations, such as less research-intensive universities. ArXiv.org, the electronic research manuscript repository for physics and related fields, is examined to understand the level-playing-field story told about this kind of online resource. A random sample of research manuscript postings from 1993 and 1999 was coded and analyzed. We did not find evidence that arXiv.org has served as a leveling influence in the fields of theoretical high-energy physics, astrophysics and mathematics. As an alternative to the standard view of arXiv.org as a level playing field, the authors present a socio-technical interaction network model that better explains the roles of online scientific publishing within the matrix of resources that support the conduct of research.
1.0 Introduction
The New York Times carried a story on the front page of its science section in May, 2001 (Glanz, 2001) describing arXiv.org and its remarkable success as a research repository for leveling the playing field for physicists, mathematicians, and other scientists whose research it can help to communicate to others in their fields. ArXiv.org is a repository where scientists post their research manuscripts for access by the other members of their research fields, before the manuscripts have been published in peer-reviewed venues. Neither authors nor readers are charged for communicating via arXiv.org. ArXiv.org has become a popular repository for some physicists and mathematicians. By October 2002, it collected over 213,000 research manuscripts.
The New York Times story describes Lubos Motl, an undergraduate physics student in Prague, who was both ill and disconnected from others who understood his passion for string theory, a lively area within theoretical high energy physics. When Mr. Motl posted a research manuscript on arXiv.org, according to the New York Times article, “established string theorists were so impressed by his work that he ended up with a scholarship to Rutgers [University]…Mr. Motl is a striking example of how the archive is changing physics” (Glanz, 2001). According to Glanz, "The archive is transforming the quality of scientific research at institutions that are geographically isolated and, in many cases, small and financially precarious." One of Glanz's informants, a professor of physics at a research center in a small town 500 miles south of Santiago, Chile, is quoted as saying: "It freed the third world from the need to be in Princeton, Pasadena or Paris in order to do frontier research." Ironically, Mr. Motl moved to Rutgers rather than remaining in Prague.
The argument that repositories of research articles that do not charge readers or authors, such as arXiv.org, serve to level the playing field by allowing scholarly participation by scientists who have often been excluded has a great deal of appeal in scientific circles. This "leveling hypothesis," if true, can expand the pool of scientific talent, and has very few negative aspects. The simplest arguments made by the proponents of the leveling hypothesis are technologically deterministic in nature: if we build a high quality online repository that is available free of charge, participants will come, and the benefits of communication will accrue to all, regardless of their status or physical locations. While more complex arguments (for instance, see Brunner (2002)) may take other factors into account, such as the likelihood that elite scientists will mainly be hired by elite institutions, even these arguments use the language of leveled playing fields or similar metaphors to explain at least part of the benefit of developing and using scholarly resources on the Internet. A central question is whether these Internet resources enable groups that were previously excluded from the ongoing disciplinary dialogue and communications between scientists to be included and to participate. In the case of arXiv.org, does it function to expand participation in scientific dialogue? Are scientists who were previously peripheral in terms of scientific publication able to become more actively involved with publication in their field?
It is important to underline the distinction between access and participation. Providing access to electronic forums is relatively simple: develop a web site with suitable features, submit the URL to search engines, list-servs, and friends, and wait for people to show up and access the available information. Participation is an entirely different dimension of behavior. To participate actively in an electronic forum such as arXiv.org, one cannot just lurk and browse or read other people’s research manuscripts. People also must engage in the behavior necessary to have a voice in the online information resource in question. In the case of arXiv.org, scientists must also post their own research manuscripts for others to read (and discuss).
The forms of participation differ in different electronic forums. In discussion forums, such as bionet groups, Usenet newsgroups, or Stevan Harnad's "American Scientist Forum" about scholarly electronic publishing (http://amsci-forum.amsci.org/archives/september98-forum.html) participation can take the form of posting a comment or query, or briefly responding to another person's posting. In contrast, the threshold for participation in arXiv.org is much higher: potential participants must post documents that have the form of scientific manuscripts (i.e., titles, abstracts, introductions, analyses, conclusions, references, as well as scientific content). ArXiv.org does not include electronic forums for discussing research manuscripts that have been posted. Participation in arXiv.org requires the conduct of research and an ability to communicate it in the form of at least a brief report, not simply the ability to comment on others' research. One challenge with electronic forums for scholars is that providing access has proven to be a tractable issue as a result of the wide adoption of the Internet, while encouraging wider participation beyond an existing network of scholars has been much more difficult.
A more useful metaphor than the level playing field for arXiv.org is reflected in the title for this paper, Leveling the Playing Field, or Expanding the Bleachers? If a major function of arXiv.org is to allow wide and unrestricted access to scholars who wish to share their research findings, we will argue that rather than successfully leveling the playing field, a more descriptive metaphor would be that arXiv is expanding the bleachers and allowing additional spectators to view the results of the output of the original major players. This study examines the role of arXiv.org as a repository of scientific manuscripts in expanding participation by examining whether arXiv.org is actually "leveling the playing field" in its topical areas and allowing or encouraging many more participants with few if any barriers to participation, or if it is more often "expanding the bleachers" and creating more spectators. This level playing field claim has been made by arXiv.org’s founder: “The original objective of the e-print arXiv.org was to provide functionality that was not otherwise available, and to provide a level playing field for researchers at different academic levels and different geographic locations…” (Ginsparg, 2001). As long as the domain is limited to access to the repository itself, Ginsparg’s expectations were realized. Any interested reader can access arXiv.org manuscripts, and potential authors have the ability to post manuscripts to the repository.
However, with regard to participation, it is less clear whether the playing field has really been made level simply by making this useful repository available. If the playing field is truly level, all should be welcome on the playing field, almost regardless of status, geography, and similar characteristics that may be unrelated to the quality of their scientific contributions. If, contrary to the conventional perception, there is more of an expansion of the bleachers than a leveling of the playing field, that does not preclude some expansion in participation as well. More teams may be added, and more leagues may be allowed to play on the field. However, if international participation at a soccer field is expanding by the inclusion of elite European teams, this is different from an unrated team from a small country showing up and playing as equals. Aarseth (1997) points out that “a reader, however strongly engaged in the unfolding of a narrative, is powerless. Like a spectator at a soccer game, he may speculate, conjecture, extrapolate, even shout abuse, but he is not a player.” This soccer metaphor will be revisited in the discussion.
The argument that the use of internet-based resources has leveling effects in science has been empirically examined in only a few studies that we can identify. Hesse et al. (1993) argue that a computer network called SCIENCEnet increased productivity and participation in the field of oceanography.
Networks may have differential payoffs for peripheral and core scientists. “Peripheral” in our study did not mean unimportant, but removed from (or facing barriers to) those resources necessary for doing good oceanography -- remote instruments, geophysical data, global projects, disciplinary committees, important research programs, and colleagues…. We found some evidence that, at the margin, peripheral oceanographers benefit more from network usage than core oceanographers do. This lends credence to the theoretical pattern described [in which peripheral scientists’ scientific productivity increases more rapidly with increased network access to resources than does the productivity of core scientists] (Hesse et al., 1993, p. 91).
While cautioning that this conclusion is not necessarily valid in other scientific fields, Hesse nevertheless sets the stage for others to argue that computer networks are a leveling force for peripheral scientists. Other examples of arguments for computer networks increasing scholarly activity and participation are Cohen (1996), Walsh & Bayme (1996), VanAlstyne & Brynjolfsson (1996) and Wellman et al. (1996).
This article challenges the leveling hypothesis -- the claim that a major function of electronic repositories of research manuscripts, as illustrated by arXiv.org as the most cited model, is to level the playing field and substantially increase peripheral participation in scientific research.
2.0 ArXiv.org
What is now arXiv.org was first started as an e-print repository at Los Alamos National Laboratory in August 1991. Its founder has described its inception:
The first database, HEP-TH (for High Energy Physics -- Theory), was started in August of '91 and was intended for usage by a small sub community of less than 200 physicists, then working on a so-called "matrix model" approach to studying string theory and two dimensional gravity. (Mermin [Reference Frame, Physics Today, Apr 1992, p.9] later described the establishment of these electronic research archives for string theorists as potentially "their greatest contribution to science.") Within a few months, the original hep-th had quickly expanded in its scope to over 1000 users, and after little more than three years now has over 3600 users. More significantly, there are numerous other physics databases now in operation (see xxx physics e-print archives) that currently serve over 25,000 physicists and typically process more than 40,000 electronic transactions per day (i.e. as of 10/94).
These systems are entirely automated (including submission process and indexing of titles/authors/abstracts), and allow access via e-mail, anonymous ftp, and the World Wide Web. The communication of research results occurs on a dramatically accelerated timescale and much of the waste of the hardcopy distribution scheme is eliminated. In addition, researchers who might not ordinarily communicate with one another can quickly set up a virtual meeting ground, and ultimately disband if things do not pan out, all with infinitely greater ease and flexibility than is provided by current publication media (Ginsparg, 1996).
Ginsparg’s argument illustrates one of the unusual aspects of research collaboration within physics: it initially served a “small sub community of less than 200 physicists” who already engaged in sharing research manuscripts before publication.
“While the high energy physics community did have a pre-existing hardcopy preprint habit that had already largely supplanted journals as our primary communication medium, this is not a necessary initial condition for acceptance of an electronic preprint archive,” Ginsparg (1996) argues. Regardless of whether this initial condition is necessary for a repository’s success, however, the pre-existing system of research manuscript distribution illustrates one of the inequalities that arXiv.org was meant to reduce. Specifically, physicists located at major research universities were more likely to communicate with one another and to receive communications of research in a timely manner than physicists who were at less research-intensive universities. In other words, being part of the research-intensive core of physics represented a major advantage over those on the periphery in terms of maintaining currency in the field.
With the advent of the World Wide Web, the Los Alamos repository became known by its URL: xxx.lanl.gov. Participation as measured by the numbers of research manuscripts posted in the repository has increased substantially over the last decade. As Figure #1 shows, the growth in manuscript postings has followed a nearly perfectly linear path.
Figure 1: Monthly Submission Rate for arXiv.org
Source: Retrieved May 8, 2002 from http://arxiv.org/show_monthly_submissions
The question for this study, however, concerns not just the increasing number of postings but the socio-technical nature of how scholars use the arXiv.org repository. One possible explanation for the large increase in postings is that the server is proving successful at lowering barriers to participation in the physics and mathematics research communities. This is consistent with the arguments made by arXiv.org’s founders and supporters (for examples see Ginsparg (1996), Ginsparg (2001), Van de Sompel (2000), Hurd (1996), and Bachrach et al. (1998)). If arXiv.org is a leveling influence, scientists on the periphery should begin to publish more like their counterparts in the more research-intensive core universities.
3.0 Methods
This study is an authorial analysis of manuscript submissions to arXiv.org during 1993 and 1999. The data were collected and analyzed between January and August 2000. Since the manuscripts are listed chronologically in order of posting to the repository, a random sample of manuscripts from the HEP-TH, MATH and ASTRO repositories for the years 1993 and 1999 was obtained by selecting every third manuscript until a sample large enough to yield a 95% confidence interval had been collected.
In order to test the hypothesis that participation in the research communities served by arXiv.org has been broadened over time, 1,329 manuscripts were coded from three of its repositories (HEP-TH, MATH and Astrophysics), and authors were classified according to their institutional affiliation. An additional 2,922 manuscripts with international authorship were tallied but not ranked, as discussed below. These three fields were selected in order to examine areas of arXiv.org which have become fairly well-established within their subfields. Scientists publishing in string theory, for instance, are likely to be aware of, read from and post to the arXiv.org repository. In fact, string theory is the topical area of particle physics that Glanz (2001) cites in his New York Times story. Whether scientists are aware of and regularly read and post to the repository may be less true for other fields where arXiv.org has not become as much of a standardized communication medium.
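The every-third-manuscript selection described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function name `systematic_sample`, the starting position, and the identifiers are hypothetical, and only the step of three and the HEP-TH 1993 counts (2091 posted, 349 sampled) come from the text and Table 1. Selecting at a fixed interval is usually called systematic sampling.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(items: Sequence[T], step: int = 3,
                      target: int = 350, start: int = 0) -> List[T]:
    """Take every `step`-th item in posting order until `target` items are collected.

    `start` (the first item taken) is an assumption; the paper does not say
    where in the chronological listing the selection began.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    picked = list(items[start::step])  # every step-th item, in order
    return picked[:target]             # stop once the target size is reached


# Hypothetical identifiers standing in for the 2091 HEP-TH postings of 1993.
manuscripts = [f"ms-{i:04d}" for i in range(1, 2092)]
sample = systematic_sample(manuscripts, step=3, target=349)
print(len(sample), sample[:3])
```

Because the listing is chronological, a sample drawn this way until a target is reached covers the earlier part of the year more heavily than the later part whenever the target is smaller than one third of the postings.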
As a variable that represents the academic status of an institution reasonably well, the Carnegie classification of each author’s institution is a useful measure (Carnegie Foundation, 1994). While the Carnegie classification “is not intended to establish a hierarchy among higher learning institutions” (Carnegie Foundation, 1994), it does serve as a useful variable to group colleges and universities with other like institutions. The categories are based partly on the highest degree conferred and partly on federal funding. Thus while Research Universities all award at least 50 doctoral degrees a year, Research I institutions also receive more than $40 million in federal support annually. In contrast, Research II universities receive between $15.5 and $40 million in federal research funds annually.
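The two thresholds quoted above can be written as a small rule. This is a simplified sketch that uses only the criteria stated in this paragraph (doctoral degrees per year and annual federal support); the function name `research_tier` and its return labels are hypothetical, and the full 1994 Carnegie scheme has further categories and conditions that are not modeled here.

```python
def research_tier(doctorates_per_year: int, federal_support_musd: float) -> str:
    """Classify an institution as Research I or Research II.

    Only the two criteria mentioned in the text are applied:
    at least 50 doctoral degrees a year, and annual federal support
    (in millions of US dollars) above 40 or between 15.5 and 40.
    """
    if doctorates_per_year < 50:
        return "other"            # not a Research University under this rule
    if federal_support_musd > 40.0:
        return "Research I"
    if federal_support_musd >= 15.5:
        return "Research II"
    return "other"


# Hypothetical institutions, for illustration only.
print(research_tier(120, 85.0))   # Research I
print(research_tier(60, 20.0))    # Research II
print(research_tier(10, 100.0))   # other
```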
It is a reasonable supposition that some physicists located at less research-intensive universities are interested in trying to raise their research profiles. While not claiming that this necessarily occurs, it is a useful starting point when selecting the repositories. The third repository, astrophysics, was selected after an informal survey of physics departments at less research-intensive (Doctoral and MA Comprehensive) universities showed that it was the best represented subfield among these universities available as a single repository on arXiv.org. Ten out of twenty randomly selected physics departments at Doctoral 1, Doctoral 2 and MA Comprehensive universities offered astrophysics at the graduate level, as indicated by the web sites for their physics departments.
The 125 research universities in the United States account for only three percent of the total number of degree granting institutions. As indicated below, however, these same few institutions account for the majority of manuscripts posted on arXiv.org. There is no similar rating of institutional status for international universities that would allow the same sort of categorization for non-U.S. institutions. However, the number of internationally authored manuscripts was tallied and reported.
4.0 Results
Table 1. Posting rates for all author-instances (%)
                                 HEP-TH (Physics theory)              MATH                   Astrophysics
Carnegie Classification             1993     1999   change     1993     1999   change     1994     1999   change
Research I                          79.9     88.6    8.7**     74.1     72.5     -1.6     35.0     21.4  -13.6**
Research II                         10.0      2.1   -7.8**      7.1     11.6      4.4      2.0      1.7     -0.3
Doctoral I                           1.8      0.3    -1.5*      1.8      1.2     -0.6      0.3      1.4      0.9
Doctoral II                          0.2      0.1     -0.1      3.6      2.0      0.1      1.7      2.0      0.3
MA                                   1.5      1.0     -0.5      4.5      4.6      1.4      0.0      0.0      0.0
Liberal Arts                         0.3      0.5      0.1      0.0      1.4     -2.1      0.0      0.0      0.0
Other                                6.3      7.3      1.1      8.9      6.9        -     15.2     10.9     -4.3
International                          -        -        -        -        -        0     45.9     62.6   16.7**
Total Authors (n)                    580      670        0      125      551        0      383      294        0
Total manuscripts (n)                349      350        0       80      350      0.6      100      100        0
Mean authors                        1.66     1.91   15.1**     1.56     1.57        0     3.03     2.94     -0.9
Total # of manuscripts posted       2091     2825        0      219     2332        0     1032     5639        0
Sample size (% of posted)           16.6     12.1        0     37.4     14.8        0      9.7      1.8        0
Int'l manuscripts (n)                989     1179        0      100     1011        0        0        0        0
Int'l % of manuscripts              74.0     77.5     3.5*     54.9     65.9   10.9**        0        0        0
** p < .01, * p < .05
Table #1 summarizes the results of this study. Several interesting points emerge. Internationally authored manuscripts accounted for 74.0% of the HEP-TH postings in 1993, and increased by 3.5 percentage points to 77.5% in 1999. Likewise, both MATH (a 10.9 point increase in international authorship) and Astrophysics (a 16.7 point increase in international authorship) saw substantial increases in international participation. Below, the astrophysics repository is used to examine the makeup of this international participation. Here, however, it is worth noting that international authorship is hard to categorize because the status of non-U.S. institutions is difficult to assess accurately. Without knowing whether these manuscripts are coming from researchers at large, well-funded, and/or research-intensive universities or smaller, less well-funded, less research-intensive and/or more geographically dispersed institutions, it is impossible to say whether their participation represents an indication that arXiv.org has had a leveling influence for international researchers. We suspect, though, that since the rates are high in both 1993 and 1999, we are seeing participation by researchers at prestigious international research centers. This is based on the observation that good internet connectivity has been slower to be adopted in less highly developed countries, and thus would not yet have been strong in 1993. This would also be consistent with the patterns for the United States and for the astrophysics manuscript repository, both discussed below. It seems to be a reasonable hypothesis to tentatively accept.
The change in HEP-TH postings by authors in U.S. institutions (Table #1) does not support the leveling hypothesis for arXiv.org. The large proportion of author-instances coming from Research I universities (80% in 1993) does not decrease but instead increases to 89% in 1999 (p < .01). There are no significant shifts in the patterns of HEP-TH manuscripts other than a 7.8 percentage point decline in submissions from Research II universities (p < .01) and a smaller 1.5 point decline from Doctoral I universities (p < .05). For all other authorship from the remaining classes of universities, posting rates were low in 1993 and remained low in 1999.
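One way to check a change in proportions such as the Research I share is a pooled two-proportion z-test. The paper does not say which test produced its p-values, so the choice of test here is an assumption, as is the function name `two_proportion_z`; the shares (79.9% and 88.6%) and author-instance totals (580 and 670) are taken from Table 1. The test also treats author-instances as independent observations, which co-authorship within a manuscript may violate.

```python
import math
from typing import Tuple


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    x1 = p1 * n1                      # estimated count in group 1
    x2 = p2 * n2                      # estimated count in group 2
    pooled = (x1 + x2) / (n1 + n2)    # common proportion under the null
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# HEP-TH Research I share of author-instances, 1993 vs 1999 (Table 1).
z, p_value = two_proportion_z(0.799, 580, 0.886, 670)
print(f"z = {z:.2f}, two-sided p = {p_value:.5f}")
```

Under these assumptions the statistic comes out a little above 4, which is consistent with the p < .01 reported for this change.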
For the distribution of publications in the mathematics repository, a similar pattern emerges. For the U.S. institutions represented, there is a small 1.6% (non-significant) decline in manuscript submission by authors from Research I universities. But there is not a corresponding significant increase in authorship by researchers from universities in any of the other categories. Instead, the change seems to be fairly evenly spread across the institutions, and could easily be due more to the increase in number of postings of research manuscripts over this time period than to any other factor.
In the astrophysics repository, the change from 1994 to 1999 is similar to that in the other repositories. While the proportion of authorship by researchers at Research I universities declines significantly during this time period, the only significant corresponding increase is by international authors. In astrophysics, in fact, the increase in international authorship was even more marked than in mathematics and physics.
An interesting element in Table #1 is the increase in the mean number of authors per manuscript in the HEP-TH repository. In 1993, the mean was 1.66 and this increased to 1.91 in 1999. This 15.1% increase is statistically significant (p < .01) and indicates that, at least for these two years, there is an increase in collaboration. However, it does not appear that there are increased collaborations across classes of universities. Thus, the hypothesis that an Internet based repository will increase participation by more diverse types of scientists doing the type of research reported on arXiv.org is not supported.
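Table 1 mixes two kinds of "change": for the posting-rate rows it is a difference in percentage points, while for the mean number of authors it is a relative (percent) change. The short sketch below reproduces one figure of each kind from the table; the helper names are hypothetical.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def point_change(before_pct: float, after_pct: float) -> float:
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


# Mean authors per HEP-TH manuscript, 1993 vs 1999 (Table 1).
mean_authors_change = percent_change(1.66, 1.91)
# Research I share of HEP-TH author-instances, 1993 vs 1999 (Table 1).
research_one_change = point_change(79.9, 88.6)

print(f"mean authors: {mean_authors_change:.1f}% relative increase")
print(f"Research I share: {research_one_change:.1f} percentage points")
```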
4.1 Posting rates and Institutional Affiliations
The Internet and the presence of an easily accessible web-based repository of research manuscripts do not increase the variety of scholars participating in science as measured by posting research manuscripts to the repository. Why are researchers at less research-intensive institutions not posting their research manuscripts to arXiv.org?
Remember that Figure #1 shows a linear growth in submission rates for the arXiv.org repositories. This graph is easily accessible to visitors to the arXiv.org site, and certainly lends the impression that submissions are increasing at a relatively steep rate. In Figure #2, on the other hand, a different picture emerges. The individual repositories at arXiv.org do not show such phenomenal growth rates after an initial one-to-two-year period of growth. In the case of HEP-TH, this period of strong growth was centered in 1992 and 1993. After 1993, however, its growth rates were less than 10% annually. Thus, while there is generally growth in the number of submissions, it is not nearly as steep a line as that in the overall repository. Only astrophysics, which has had a more dramatic uptake that only now appears to be leveling off based on trends in 2000-2001, shows the linear growth one might expect from the trends on arXiv.org in general.
Similarly for the mathematics repository, there is a period of large growth, this time centered on the 1998 time period. This rapid uptake came at about the same time that the American Mathematical Society (AMS) endorsed arXiv.org for mathematics publications and some of the mathematics abstract services began to include the arXiv.org repositories in their updates (American Mathematical Society, 1999).
This indicates that for mathematicians, unlike physicists, the adoption of arXiv.org as a repository was less than automatic (a phenomenon that Kling & McKim (2000) have described in their manuscript on field differences in the use of electronic media). This recent change in widespread use of the repository also allows one to look at the results discussed above as similar to a before and after picture. In the MATH repository the general demographics of the posting authors did not change significantly, even though we studied two time periods with very different rates of submission. This is further indication that this particular repository is not acting as a strong leveling influence, especially for mathematicians at American universities.
Figure 2. Posting Rates for Selected arXiv.org Repositories, 1991-2001
Source: Data compiled from arXiv.org.
Another indication that arXiv.org is not leveling the playing field for science comes from examining the sources of submissions in regard to specific institutional affiliations. In 1993, the authors from each of three universities published more than 5% of the manuscripts sampled from the HEP-TH repository (Princeton University [8.6%, n=50], Massachusetts Institute of Technology [6.6%, n=38], and University of Texas at Austin [5.0%, n=29]). In 1999, the authors from two of these universities again published more than 5% of the sampled HEP-TH manuscripts (Princeton University [6.0%, n=44] and Massachusetts Institute of Technology [7.0%, n=47]). This is striking, and certainly suggests at least one explanation for submission trends: that having a network of active scholars co-located geographically increases publication rates, and thus participation. Princeton University, for instance, lists thirteen high energy physics theory faculty in 2000, and MIT shows fourteen on their web page. Having this large a group appears to contribute to the overall scholarly output, as one would expect. Of the universities with over 20 author-instances in the repository, there are unsurprisingly only Research I institutions represented.
The authors of manuscripts in the mathematics repository are less concentrated into a few university departments. Only three universities had more than twenty author-instances in this sample of the repository in 1999 (Harvard University, University of California at Berkeley and Rutgers University), and none contributed more than 5% of the authors. It appears, then, that mathematics may be a more distributed discipline, but as seen earlier, this does not translate into wide participation beyond the Research I universities.
Looking at Table #2, the origins of increased participation on arXiv.org for the astrophysics repository become clearer. There are significant increases in author-instances in Germany, Italy, France, other Western European countries, Japan and other Asian countries. These highly developed nations, while certainly benefiting from increased abilities to participate internationally, participated actively in elite scientific disciplines before the Internet.
Table 2. International Author Affiliations in ASTRO repository (all authors) (%)
                                                     Astrophysics
Country/Region                            1994 (N=139)  1999 (N=184)        Change
Other Western Europe not listed below              8.3          11.2         2.9**
Italy                                              3.0          10.5         7.5**
United Kingdom                                     9.9           9.9           0.0
Germany                                            4.3           8.5         4.2**
Japan                                              3.3           7.8         4.5**
France                                             0.7           5.1         4.4**
Other Asia not listed                              1.0           3.7         2.7**
Canada                                             3.3           2.0          -1.3
South America                                      2.0           1.4          -0.6
Australia/New Zealand                              4.3           1.0          -3.3
Other Eastern Europe not listed                    3.3           1.0          -2.3
Israel                                             2.6           0.0          -2.6
** p < .01, * p < .05
5.0 Discussion
This study assessed whether arXiv.org has been successful in increasing participation in the physics and mathematics research communities. One could assume that physicists and mathematicians who had something to contribute would have a reasonably high level of competence to participate, but not necessarily all of the necessary resources. Less well-funded universities generally provide faculty members with fewer funds for research, fewer travel funds to attend meetings and conferences, and fewer computer resources. They also usually have fewer and smaller Ph.D. programs, so their faculty members typically have less research assistance. The faculty at these institutions could be seen as suffering from information inequality.
Choosing to look at the posting behaviors of elite scientists is in contrast to studying something like a site directed to the public for the discussion of investment advice. In that case, many of the people accessing the information would not have enough knowledge about investing to realistically contribute. Physics and related disciplines, on the other hand, have an entrance fee: one has to start with an advanced degree.
Most professors of physics, astronomy and mathematics are not at Research I institutions for a number of reasons. These include a tight job market, an interest in teaching over research, personal preferences for a type of college or a geographic area, family reasons to live in a certain area, and others. An underlying question for this research is where, for scientists participating in highly specialized research, information sits among the constellation of resources that drive their participation in research. Other resources include funding levels, facilities, office space, research assistance, teaching loads, committee assignments, administrative jobs, and so on.
This study found that even though most colleges and universities in the U.S. can be assumed to have at least some level of internet access by 1999, there are very low levels of participation outside of Research I universities on the scientific manuscript repositories that were examined. This leads us to question the Standard Model of how scholarly repositories, and arXiv.org in particular, are understood.
Figure #3 portrays the Standard Model of participation in arXiv.org. ArXiv.org sits at the middle of a network of authors and readers. The network is simplified in this case to six individuals for illustrative purposes. In this Figure, three of the scholars depicted are both authors and readers, contributing manuscripts to the repository and reading manuscripts posted there. These people would fit the definition of active participants in scholarly research. Two of the scholars in the Figure are simply readers, accessing manuscripts from arXiv.org but not posting their own manuscripts. Understanding the detailed behavior of these people would be very difficult. To do this, one would have to use the access logs at arXiv.org and trace back IP addresses to the domain owners to try to determine the location of the people accessing the repository. In addition, since these people don’t post articles, they are not active participants even though they may be very regular passive participants. The last individual in Figure #3 is just an author. It is possible to post a manuscript (or have posted a manuscript on which one is not the primary author) without having a perceived need to access arXiv.org for accessing the manuscripts of others, although this would likely be fairly uncommon.
Figure 3. Standard Model of ArXiv.org
In the Standard Model, scholars (or disciplines) that don’t yet participate by posting manuscripts to working repositories modeled after arXiv.org are simply uninformed and will eventually see the benefit of participation (Harnad, 1999). This model, however, is much too simple and doesn’t take into account the socio-technical dynamics of the scientific research communities. Instead, a socio-technical interaction network (STIN) model helps represent the situation more completely (for a fuller discussion of STIN theory, see Kling, McKim & King, in press). Figure #4 depicts a model of one portion of the STIN that comprises arXiv.org. In this model, individual actors have a far more complex existence than simply that of author/reader seen in the Standard Model. In the STIN model, each scholar is a complex actor who is not only an author and a reader, but also a researcher, a teacher, a collaborator, a member of scholarly societies, an employee of an academic institution with all the requisite responsibilities, and a participant in a number of other communities and networks, including local non-scholarly networks of all sorts. To understand scholars’ publishing and electronic posting activities, it is necessary to see these other relationships that are part of their STIN as contributing to the likelihood that they will do the research that enables them to post manuscripts to scholarly research repositories.
In Figure #4, for instance, one can see that in the collaboration on the top half of the diagram, there are three collaborating authors, each fulfilling a wide variety of social roles. Author A, for instance, holds an appointment at Institution A. This appointment both demands responsibility on the part of the faculty member and confers advantages in terms of funding, status, support, and so forth. Authors B & C both hold appointments at Institution B, with a similar reciprocal relationship with their institution, hence the arrows indicating flows from the authors to the institution as well as vice versa. The double-ended arrows between the authors and their non-research responsibilities represent an ongoing negotiation in their careers as they balance the demands on their time between research and other activities. Likewise, the double-ended arrows between authors represent another sort of negotiation as they work on their research and its presentation.
Figure 4. Socio-Technical Interaction Network (STIN) Model
The larger arrow flowing from the collaboration to arXiv.org represents their participation through posting their manuscript to the repository. This arrow is larger to indicate a preference for participation in research projects over mere access to research (the smaller arrows flowing from arXiv.org back to the researchers). The bottom half of the diagram indicates some of the other types of authors and collaborations contributing to arXiv.org, each of which would have their own complex networks.
The networks drawn here are of course simplified: any given scholar would have a large, complex and heterogeneous network of interactions, including not just their academic life but also their personal and social life. The simplified networks nevertheless illustrate the point. Looking at our results from this point of view, a more coherent story emerges. Now, instead of viewing electronic repositories as an academic field of dreams (“if you build it, [they] will come”), we see that the likelihood of participation depends on additional features beyond accessibility. First, scholars must be well placed, through their academic affiliations, through relationships developed during graduate school or at conferences, and/or through other sources of face-to-face contacts, before they are likely to participate in these scholarly communities. Second, if scholars are at institutions with high non-research demands on their time, such as heavy teaching loads, numerous committee assignments, and large amounts of student advising, this aspect of their STIN will tend to reduce the likelihood that they will participate on arXiv.org.
6.0 Conclusion
This research is not aimed at criticizing arXiv.org or minimizing its value to the scientists who use it. Instead, we are questioning the claim made by some analysts that arXiv.org, and by extension other resources like it, are in and of themselves sufficiently leveling influences to broaden participation in scientific research. Without complementary resources, new researchers are unlikely to conduct research, publish and present their findings and to have a voice in the elite scientific communities served by arXiv.org. The presence of valuable information resources is not sufficient to change the requirements for doing research.
In the Socio-Technical Interaction Networks (STIN) model described above, the factors external to the central behavior are crucial for understanding the network of interactions. In the case of arXiv.org, the central behavior we looked at was posting research manuscripts to the repository. As shown, simply providing access to a useful and well-designed repository is not sufficient to change social relationships inherent in academe. Other policies and programs are necessary if peripheral scientists are to be brought closer to the core.
An example of a program that attempts to aid peripheral scientists in just this way is EPSCoR, the Experimental Program to Stimulate Competitive Research. EPSCoR is a federal program operating in 21 states that have historically received the least federal R&D funding in science and engineering research. The funding provided by EPSCoR takes the form of Research Infrastructure Grants, outreach visits of NSF staff to aid researchers in identifying funding sources, and partnerships with state and local agencies designed to enhance science and engineering research. Programs such as EPSCoR do not come with arXiv.org. To support the development of scientific participation among scientists at peripheral institutions, a bundle of resources is required that may include programs such as EPSCoR as well as a wide range of other resources aimed at developing scientific research expertise.
Mr. Motl laboring in obscurity in Prague is not the norm for scientific researchers. Nor does Mr. Motl even realistically represent a broadening of participation. He did not stay in Prague and continue working at a distance, utilizing information technology to minimize or eliminate the effects of distance and participate in a new, democratic model of physics research. Instead, his story is an example of an elite-reinforcing model: Mr. Motl was moved into the Rutgers University/Princeton University orbit of physics scholarship. To return to this paper’s metaphor, this is rather like baseball scouts doing their scouting by videotape submissions rather than visiting every high school. The scouts gain a way to bring more recruits into their elite ranks, and use it to add players who strengthen their team. This comes at a cost to another potential player, since there are only so many spots on the team, just as there are a relatively stable number of research positions available at Research I institutions. Furthermore, Mr. Motl’s Prague is not strengthened by his scholarship, just as a baseball player’s hometown team is not strengthened by the player’s recruitment to a major league team. The playing field’s barriers remain for subsequent potential participants.
Physics research is not and never will be a level playing field. While a few outstanding hopeful participants may sneak onto the field and awe the other participants, most potential players, who may be just as skilled as some of the existing participants, will remain in the stands as spectators. They will consume that which the participants have created. The question that arises from this study is how much this applies to other research arenas, especially those that require fewer complementary resources.
Acknowledgments
This work is based on data gathered from the Physics repository arXiv.org, which generously allowed the repetitive downloads required for this research. Portions of this research, including funds for research assistance and travel to conferences, have been funded by NSF Award #9872961 via the SCIT (Scholarly Communication and Information Technology) project (http://www.slis.indiana.edu/SCIT/) and the Center for Social Informatics (http://www.slis.indiana.edu/csi/), Indiana University, Bloomington, Indiana, USA.
References Cited
Aarseth, E. (1997). Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns Hopkins Press.
American Mathematical Society. (1999). American Mathematical Society Executive Committee and Board of Trustees May 1999 Meeting Minutes [On-line]. Available: http://www.ams.org/secretary/ecbt-minutes/ecbt-minutes-0599.html
Bachrach, S., Berry, R. S., Blume, M., von Foerster, T., Fowler, A., Ginsparg, P., Heller, S., Kestner, N., Odlyzko, A., Okerson, A., Wigington, R., & Moffat, A. (1998). Intellectual Property: Who should own scientific papers? Science, 281(5382), 1459-1460.
Brunner, R. J. (2002). Dr. Brunner Goes to Bloomington: Astrophysics Meets Informatics (Informatics Colloquium Series). Bloomington, IN: Indiana University.
Cohen, J. (1996). Computer mediated communication and publication productivity among faculty. Internet Research, 6(2/3), 41-63.
Carnegie Foundation for the Advancement of Teaching. (1994). A Classification of Institutions of Higher Education, 1994 Edition.
Ginsparg, P. (1996). Winners and Losers in the Global Research Village [On-line]. Available: http://arxiv.org/blurb/pg96unesco.html
Ginsparg, P. (2001, February 19-23). Creating a global knowledge network. Paper presented at the 2nd Joint ICSU Press - UNESCO Conference on Electronic Publishing in Science, UNESCO HQ, Paris.
Glanz, J. (2001, May 1). Web Archive Opens a New Realm of Research. New York Times, pp. D1-D2.
Harnad, S. (1999). Free at Last: The Future of Peer-Reviewed Journals. D-Lib Magazine, 5(12).
Hesse, B. W., Sproull, L. S., Kiesler, S. B., & Walsh, J. P. (1993). Returns to science: computer networks in oceanography. Communications of the ACM, 36(9), 90-101.
Hurd, J. M. (1996). Information technology: Catalyst for change in scientific communication, Proceedings of 1996 IATUL Conference, Irvine, CA.
Kling, R., & McKim, G. (2000). Not Just a Matter of Time: Field Differences in the Shaping of Electronic Media in Supporting Scientific Communication. Journal of the American Society for Information Science, 51(14), 1306-1320.
Kling, R., McKim, G., & King, A. (In press). A Bit More To IT: Scholarly Communication Forums as Socio-Technical Interaction Networks. Journal of the American Society for Information Science and Technology. (Early draft for m.s. reviewers at http://www.slis.indiana.edu/csi/WP/wp01-02B.html)
Palais, R. S. (1994). AMS preprint database and server (Usenet posting in sci.math.research group) [On-line]. Available: http://math.albany.edu:8010/g/Math/ejournals/articles/amsplans
Van de Sompel, H. (2000). The Santa Fe Convention of the Open Archives Initiative. D-Lib Magazine, 6(2).
VanAlstyne, M., & Brynjolfsson, E. (1996). Internet - Could the Internet balkanize science? Science, 274(5292), 1479-1480.
Walsh, J. P., & Bayma, T. (1996). Computer networks and scientific work. Social Studies of Science, 26(3), 661-703.
Wellman, B., Salaff, J., Dimitrova, D., Garton, L., Gulia, M., & Haythornthwaite, C. (1996). Computer Networks as Social Networks: Collaborative Work, Telework, and Virtual Community. Annual Review of Sociology, 22, 213-238.