No. WP-97-07

A Typology for Electronic-Journals:

Characterizing Scholarly Journals by their Distribution Forms


Rob Kling and Geoffrey McKim

kling@indiana.edu
mckimg@indiana.edu
September 1997
 

Center for Social Informatics
SLIS
Indiana University
Bloomington, IN 47405
http://www.slis.indiana.edu/csi



 

Introduction

The purpose of this short report is to provide a set of consistent terms that can be used when discussing scholarly communication in general, and electronic journals in particular.  Our motivation is the inconsistency that pervades the discourse on scholarly communication in the use of terms like “electronic journal” and “online journal”.  In the early 1990s, the term "electronic journal" was used to refer to journals that were distributed exclusively in electronic form. The term "electronic journal" contrasted sharply with "paper journal."

Around 1997, certain major scientific publishers began to distribute many of their paper journals in electronic form as well. Unfortunately, some analysts began to use the term "electronic journal" for any journal that was circulated in electronic form, even if its paper form was more widely read.  This use of "electronic journal" to refer to journals with substantial paper versions as well as electronic editions -- such as Science, the Journal of the American Society for Information Science, and the Journal of Astronomy -- simply confuses discussions. Questions about the legitimacy of electronic journals (Kling and Covi, 1995) and their cost structures (Fisher, 1996) can only be confusing when journals that are circulated purely in electronic form and those that have dual editions are lumped into one group.

This report recommends a nomenclature for electronic journals that helps to clarify rather than confuse discussions of  electronic journals. We will begin by placing journals in a larger model of scientific communication.
 

Elements of a basic model of scientific communication

We focus upon reports of scientific studies as the basic molecule of scientific communication. Scientists hear reports of studies in lectures and seminars and read reports of the same studies in pre-prints and  journal articles.  We want our basic model to help us understand the role of e-media in such reports (in contrast with traditional channels).  Incidentally, we are using the term “report” in a generic sense; a report need not actually be called a “report”.

Scientific Reports and Packages

We will introduce a refined vocabulary for characterizing three major elements: (1) scientific reports (in various versions); (2) groups of reports "packaged" together in forms such as journal issues and conference proceedings; and (3) the various channels through which scientists can communicate their reports to specific kinds of readers.

Scientific communication systems have important structural properties. For example, a molecular biologist who studies the genetics of hummingbirds may share preprints of her studies with a few other avian biologists. Many biologists share "working drafts" of the to-be-published articles with a group of 10-40 other labs that conduct related research. One has to be known and trusted by a biologist to be a recipient of reports in this informal but relatively stable kind of communications channel. When the article appears in print, in a journal such as Nature, it can be read by a much larger number of biologists -- and also members of the public, including newspaper reporters. The scientific journal packages a number of reports, and circulates them through specific channels, primarily subscription lists and research libraries.

We will treat the report as the generalized basic unit of scientific research product. A report may appear in many forms: as a journal article, as an article in a conference proceedings, as a talk at a seminar or a conference, as a working paper, as a pre-print, or as a chapter in a book.  By considering the various forms in which a report may be instantiated, we can characterize more concretely the trajectory of a report through its pre-publication and publication phases.  A report is not an atomic entity; rather, it consists of multiple elements, such as a title, author list, abstract, body, references, and appendices.  These parts of a report are in many cases extractable from the report and usable on their own: for example, the title and abstract may be extracted and included in an abstract database.

We can then define a similarity cluster as a collection of reports by a specific set of authors that a consensus of journal editors or reviewers in a given community would consider to be equivalent with respect to informative value.  In other words, if one report in a given similarity cluster were published in a journal, and another report in the same similarity cluster were submitted to the same journal, the editorial board of the journal would consider the second to have “already been published.”  The reports in a given similarity cluster will generally appear in a variety of forms.  For example, a report may begin as a draft, become part of a working paper series, be distributed as a pre-print at the time of submission to a journal, be published in a scientific journal, be distributed to colleagues of the authors as a reprint, be archived in a digital library, be indexed by a secondary indexing service such as Lexis-Nexis, and eventually be reprinted as a chapter in a book.  Although the specific text and ancillary attachments for these reports may be different, they are essentially the same in that a member of the readership community who has read one of the reports in the similarity cluster would find another report in the same similarity cluster to contain the same core argument with the same basic data.  Of course, a given reader may find an attachment or data table unique to one particular report in a similarity cluster to be of substantial informative value.  However, the core arguments and basic data remain the same for reports in a similarity cluster.

We can then characterize any report R as a 3-tuple, R=(auth, seq, ver), where auth is the set of authors of the report, seq is an index of the report itself, and ver is an index to a version of the report.  Reports with different sequence numbers would be considered to be different reports, and could thus both be published in a particular journal.  The sequence number is in itself arbitrary; in general, sequence numbers represent temporal progression – a report with sequence number 2 was probably written after one with sequence number 1.  However, the order of sequence numbers is in itself unimportant. In this notation, a similarity cluster would consist of reports in which the author set and the sequence number are the same; only the version numbers would be different.

For example, let MJPS be the author set of M. Jones & P. Smith, and let B be the sequence index for their work as part of a particular project.  Thus R1=(MJPS,B,1) might be the draft of their report, R2=(MJPS,B,2) a pre-print, and R3=(MJPS,B,3) an article in a journal.  Together, these three reports would constitute a similarity cluster.
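The notation above can be made concrete with a minimal sketch in Python. The names Report and similarity_clusters, and the extra report with sequence index "C", are our own illustrative assumptions and are not part of the model itself; the sketch only shows that grouping reports by author set and sequence index recovers the similarity cluster of the M. Jones & P. Smith example.

```python
from collections import defaultdict
from typing import NamedTuple, FrozenSet


class Report(NamedTuple):
    """A report R=(auth, seq, ver) as described in the text."""
    auth: FrozenSet[str]  # the set of authors of the report
    seq: str              # index of the report itself
    ver: int              # index of the version of the report


def similarity_clusters(reports):
    """Group reports that share an author set and a sequence index.

    Reports in one group differ only in their version index.
    """
    clusters = defaultdict(list)
    for report in reports:
        clusters[(report.auth, report.seq)].append(report)
    return dict(clusters)


# The example from the text: draft, pre-print, and journal article.
MJPS = frozenset({"M. Jones", "P. Smith"})
r1 = Report(MJPS, "B", 1)
r2 = Report(MJPS, "B", 2)
r3 = Report(MJPS, "B", 3)

# A hypothetical second project by the same authors (an assumption,
# added only to show that it falls into a different cluster).
other = Report(MJPS, "C", 1)

clusters = similarity_clusters([r1, r2, r3, other])
```

Here clusters[(MJPS, "B")] holds the three versions of the same report, while the hypothetical report with index "C" sits in a cluster of its own and could be published separately.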

The reason for our seemingly baroque notation and definition of the report, however, becomes clearer when we consider the fact that readers do not usually receive reports alone.  For example, journal articles are generally bundled together into an issue.  In order to talk about bundling, we will call these bundles of reports packages. Common packages of reports include journal issues and proceedings of research conferences.  Of course, packages can also include self-published singletons (documents posted on a personal home page, pre-prints or drafts sent to colleagues), working paper series, and books of articles, among other forms. The important concept is that readers do not generally receive reports in isolation, and that packaging influences who will read which reports.

In our notation, then, a package P would consist of a set of one or more reports R, generally with different author sets, sequence numbers, and version numbers.

P={R1, R2, ..., Rn}, n>0, where:

Rj=(Aj, Bj, Vj)
Aj is the author set of the report
Bj is the sequence number of the report
Vj is the version number of the report
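As a rough illustration of the package definition, the following Python sketch represents a package as a non-empty set of reports. The function name make_package and the author names and indices used in the sample data are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple, FrozenSet


class Report(NamedTuple):
    """A report Rj=(Aj, Bj, Vj)."""
    auth: FrozenSet[str]  # Aj: author set
    seq: str              # Bj: sequence number (index) of the report
    ver: int              # Vj: version number of the report


def make_package(*reports: Report) -> FrozenSet[Report]:
    """Build a package P={R1, ..., Rn}, enforcing n>0."""
    if not reports:
        raise ValueError("a package must contain at least one report (n > 0)")
    return frozenset(reports)


# A journal issue bundles reports by different author sets
# (hypothetical sample data).
issue = make_package(
    Report(frozenset({"M. Jones", "P. Smith"}), "B", 3),
    Report(frozenset({"A. Author"}), "A", 2),
)

# A self-published singleton is a package containing one report.
singleton = make_package(Report(frozenset({"A. Author"}), "C", 1))
```

Treating the self-published singleton as a package of size one keeps a single vocabulary for journal issues, proceedings, and documents posted on a personal home page.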
 

Editorial Packagers and Production Packagers

These packages are more than mere logistical and economic conveniences, however; as it turns out, the trust that a reader places in a particular report is tightly coupled with the degree of trust the reader has in the packager of the reports.  First, however, it is important to distinguish two different packager roles: that of the editorial packager and that of the production packager.  The editorial packager is the institution or individual that selects reports to appear in the package.  Typically in a journal, the editorial board and/or the sponsoring scholarly society is the editorial packager; in a self-published package, the author(s) may actually fulfill that role.  Over time, the name of an established journal generally becomes a proxy for the editorial board itself; thus the journals Science and Nature can themselves be thought of as the editorial packagers for the reports that they publish.  The production packager, on the other hand, transforms the reports from intellectual objects into packages that are material objects (construed broadly to include bits as well as atoms), and frequently also provides dissemination, marketing, and associated administrative processes.  The production packager is traditionally referred to as a publishing house or a “press”; of course, the increasing use of documents in electronic format makes the original press metaphor decreasingly relevant.

We make the distinction between the two types of packagers because they play different roles vis a vis the trust that readers put in the reports contained in the package.  In this discussion, trust in a report refers to the degree to which a reader is willing to rely upon a report as a basis for subsequent action.  Trust in the report typically means trust in the editorial packager (and, in particular, trust in their standards, peer-review procedures, knowledge, etc.).  Trust is placed in the production packager to a much lesser extent, although not to a negligible one: a new journal may attempt to boost its standing by affiliating with a high-status production packager, such as Cambridge University Press or MIT Press.  Consider Science Online, the Web-based version of Science, as an example of the distinction between the two types of packagers.  The editorial packager of Science Online is the editorial board of Science magazine, together with the American Association for the Advancement of Science.  Highwire Press, a Stanford University venture, is the production packager.  While Highwire Press plays a large role in making Science Online available to readers (and in giving it a professional appearance), scholars’ trust in Science Online does not depend upon trust in Highwire Press; rather, it depends upon the editorial board of Science and the AAAS.

As would be expected, trust emerges as a major theme in our discussion of package selection, and will be considered more extensively later in this report.  In that section, we will argue, using the vocabulary of reports and packages, that the factors that influence the trust that a reader has in the reports vary by reader community, particularly with regard to the reader’s centrality or peripherality in a particular target community.

Before researchers are able to read reports, they must first extract the report from its package, and convert it into a usable material form.  In its simplest form, extraction may consist merely of opening a new journal to the appropriate page and reading a specific article.  Increasing in complexity, extraction and conversion may also involve locating a journal issue in a library and photocopying the article, or searching an electronic archive, downloading the article as a PDF file (the Adobe Acrobat format, a page description language used in many electronic versions of paper journals), and printing the file.  Extraction and conversion are not automatic, however; they frequently entail many technical and social complexities, which add to the difficulty of making use of a given report in a given package. For example, on the technical side, the user may have to have a special viewer installed in a Web browser (or obtain an Internet connection at all, perhaps from a remote or home location).  On the social side, the user may have to negotiate access to a particular archive.
 

Communication Channels for Scholarly Reports

Finally, the third major component of our model of scientific communication is the concept of communication channels, which connect readers with reports.  We posit two distinct types of communication channels, both of which are critical to scholarly communication: announcement channels and access channels.  Announcement channels are those communication channels through which readers and potential readers learn about reports (and thus new results). Once again, the importance of announcement channels depends on the centrality of the reader: more centrally positioned active scholars frequently already know about upcoming results, so announcement channels may be more important for more peripherally located readers.  Announcement channels include subscription lists (to journals, to LISTSERVs), personal or institutional distribution lists (e.g. a pre-print distribution list or clearinghouse), current periodical shelves in libraries, current awareness services such as Current Contents, and even word of mouth.

Access channels, on the other hand, are channels through which scholars gain access to reports after they are released, at the time of use. Access channels are extensive and multifaceted. In its simplest form, an access channel can be the receipt of a paper journal itself; a reader who has received a paper journal need only open it to the appropriate place to access the report.  Access channels may also include secondary indexing services, such as BIOSIS and Medline, which allow researchers to obtain materials, or at least complete citations, from topical queries or partial citations; libraries; electronic archives of various sorts; publisher backfiles; etc.  Paper journals can confound the distinction between announcement channels and access channels; a subscriber who receives a paper journal has both received an announcement (perhaps through the table of contents, perhaps through the journal pages themselves) and gained access to the article itself.  On the other hand, Current Contents and many online abstract services more neatly separate the functions of the two types of communications channels.

Access channels vary in their stability over time.  For example, research libraries are generally highly stable over large spans of time (compared to the time span of a research project).  They obtain this stability through a variety of work practices, including cataloging, archiving, indexing, shelving, and the provision of interlibrary loan services.  On the other hand, the stability of many Internet-based document repositories and Web sites is frequently questioned, often depending upon the continued commitment and interest of a single person.   In general, the more institutionalized the channels become, the more stable they become.  The library, thus, as a highly institutionalized channel, provides a highly stable access channel over time.

As we will argue later in this paper, the existence and strength of these channels form a critical component of the scientific communication system.  In particular, we will characterize the strength of a publication act by the stability and audiences of a set of announcement and access channels.

Communications channels can exhibit some important structural properties.  When access channels for a particular class of publications, in combination with the author sets and reporting channels, become highly stable and form closed loops, they can be said to form institutional circuits.  For example, a certain fairly well-defined set of researchers tends to contribute to high-impact journals.  These same journals are distributed via subscription and research libraries, and are indexed by major indexing services. The scholars who subscribe to these journals and have ready access to research libraries are then the ones who can access the reports and journals.  The loop is closed as these same researchers are then able to contribute again to the journals.  The closed-loop nature of institutional circuits is a frequent target of criticism from those outside of the circuits, and consequently much of the more revolutionary rhetoric about the democratizing potential of electronic communication is aimed at attempting to open up or broaden these circuits.  We will return to the electronic publishing and democratization theme later in this report.

The importance of the package itself and its function within communication channels can be seen in the following illustration.  A researcher publishes an article in a high-impact journal.  Libraries make this work accessible over time by subscribing to the journal, receiving the packages, and cataloging and allocating shelf space to them.  On the other hand, consider another researcher who self-publishes a report.  The library will generally not, for a variety of reasons, obtain (or have any way of obtaining) that report, catalog it, and allocate shelf space to the individual article.  Even if the researcher copies and sends his report to the top 200 research libraries, it is unlikely that any of these libraries will accept, catalog, and make accessible this paper.  Hence that article may not be widely accessible through certain channels.  However, the “invisible college” of the author of the report frequently wants the report itself, often even in a working paper version, and not necessarily the whole journal package, frequently because of the delay that the packaging entails (Peskin, 1994).  This discontinuity between the needs of certain reader communities and the communications channels combined with the packaging process has encouraged the development of alternate packaging methods and communications channels, such as the XXX E-Print Archive in physics at Los Alamos National Laboratory and its paper predecessors, including the paper-based preprint clearinghouse at Michigan State University.

The ecology of the scientific communication system can thus be characterized structurally as being composed of reports and similarity clusters, packages and packagers, and access and announcement communications channels.  In the rest of the report, we will utilize this model and the vocabulary it provides in order to explain some key phenomena, as well as to explore potential transformations.
 
 

The Character of Electronic Journals

In most discussions of the scholarly communication system, the term electronic journal serves more to obscure than to clarify.  Consider the following four different glosses on the term.  Machovec, in his “Electronic Journal Market Overview - 1997”, provides as examples of electronic journal projects Project Muse at Johns Hopkins University Press, the Journal Storage (JSTOR) project, Elsevier Press, Springer-Verlag, Blackwell, Science, Highwire Press at Stanford University, and Academic Press.  All of his examples of electronic journals represent publishers or aggregators who are delivering conventional paper scholarly journals in electronic form in parallel.  On the other hand, MIT Press’s Janet Fisher (1996) writes that “In the period from 1993 to 1995, the number of e-journals has increased, but they are still almost entirely free and created almost entirely by dedicated groups of individuals without production subsidy from institutions or scholarly associations” (p. 231), referring primarily to journals or journal-like publications that exist in electronic form only, like Bryn Mawr Classical Review, Postmodern Culture, and Psycholoquy.  Odlyzko (1996) has yet another view of electronic journals, seeing them as more akin to Ginsparg’s preprints: collections of unpackaged, but potentially refereed, documents available for download from a central server:

“The new technologies, however, are making possible easy publication of electronic journals by scholars alone.  It is just as easy for editors to place manuscripts of refereed papers in a publicly accessible directory or preprint server as it is for them to do the same with their own preprints.   The number of electronic journals is small, but it is rising rapidly.” (p. 95)

Finally, an evolutionary biologist we talked to had yet another view.  When asked if biologists would ever accept electronic journals, he gave a categorical “no, not at all”.  When prompted with “Even if they were peer-reviewed?”, he revised his answer: “Oh, yes, of course they will if they are peer-reviewed.”  His initial assumption was that the term “electronic journal” did not imply, or even suggest, peer review; rather, he likely saw electronic journals as having the same status as papers posted on an individual’s home page.

One consequence of this lack of clarity on the meaning of electronic journal is widely varying estimates of the number of electronic journals in existence. A second, more serious consequence is the potential for misunderstanding and miscommunicating changes in use of electronic media for scholarly communication.  For example, suppose in our conversations with researchers we found that scientists are increasingly searching the electronic archives provided by conventional journal publishers, in order to track down articles published in paper journals.  By some conceptions, but not all, of the electronic journal, we could state that scientists are increasingly using electronic journals in their research.  However, this does not mean that scientists are using electronic journals in the sense that Odlyzko, or even Fisher meant.
 

We define an electronic journal broadly, as a package of peer-reviewed reports that can be accessed by readers through electronic communications channels.  Note that, in contrast to the evolutionary biologist mentioned above, we use the term journal to include the concept of peer review; all electronic journals, so defined, are ipso facto refereed.  Electronic scholarly communications that are not peer-reviewed can go under a variety of labels, including e-prints, working papers, electronic magazines, and electronic newsletters.  We refer to isolated reports made publicly and electronically available in non-peer-reviewed form, whether posted on an individual or organizational Web page or on a server such as the XXX E-Print Archive in physics at Los Alamos, as electronic working papers.

Electronic journals can thus include everything from Psycholoquy and the World Wide Web Journal of Biology, which are produced and accessed generally only in electronic form, to journals distributed as part of the JSTOR project, to electronic versions of conventional paper journals, such as Science Online and the American Astronomical Society’s Astrophysical Journal, Electronic Edition.  The defining characteristic is that they are accessed by readers electronically.

Pure Electronic Journals vs. Hybrid Journals

We can then make some distinctions between types of electronic journals.  We define a pure electronic journal as an electronic journal which is primarily distributed and accessed through electronic channels.  In doing so, we acknowledge that even a pure electronic journal may, and probably will, be printed out for reading, and possibly even stored in libraries in a printed form, for archival purposes.  However, pure electronic journals are accessed primarily in electronic form. Examples of pure electronic journals include Psycholoquy, the World Wide Web Journal of Biology, and the Internet Journal of Science: Biological Chemistry.  Thus far, there are remarkably few pure electronic journals in the sciences, and those that exist publish so few articles as to be a mere statistical blip.  As a point of comparison, when this paper was written, the Internet Journal of Science: Biological Chemistry had published 8 reports in 1997 (along with some conference information).  At the same point in time, the Journal of Biological Chemistry had published approximately 3500 reports in 1997.

We can then contrast the pure electronic journal with the hybrid paper-electronic journal (or pe-journal).  The pe-journal is a package of peer-reviewed reports available through electronic channels, but whose primary access channels are paper-based.  Examples of pe-journals include: Science Online, Cell, Nature, the Journal of Biological Chemistry, Astrophysical Journal, and the Journal of Neuroscience.  Since the criteria for distinguishing a pure electronic journal from a pe-journal are anchored in the readership, a pe-journal could certainly become a pure e-journal, if readership changed such that the journal was accessed primarily electronically, and paper copies were produced only for archives or even libraries.
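The distinctions drawn so far can be summarized as a small decision procedure. The Python sketch below is only one possible encoding, under our own assumptions: the names JournalType and classify, and the reduction of readership to a single primary_channel value of "electronic" or "paper", are simplifications introduced here for illustration.

```python
from enum import Enum


class JournalType(Enum):
    PURE_E_JOURNAL = "pure electronic journal"
    PE_JOURNAL = "hybrid paper-electronic journal (pe-journal)"
    PAPER_JOURNAL = "paper journal"
    NOT_A_JOURNAL = "not a journal (e.g. electronic working paper)"


def classify(peer_reviewed: bool, electronic_access: bool,
             primary_channel: str) -> JournalType:
    """Classify a package of reports by its distribution form.

    primary_channel is "electronic" or "paper" and stands in for how
    readers primarily access the package (a simplifying assumption).
    """
    if primary_channel not in ("electronic", "paper"):
        raise ValueError("primary_channel must be 'electronic' or 'paper'")
    if not peer_reviewed:
        # In this typology, a journal is peer-reviewed by definition.
        return JournalType.NOT_A_JOURNAL
    if not electronic_access:
        return JournalType.PAPER_JOURNAL
    if primary_channel == "electronic":
        return JournalType.PURE_E_JOURNAL
    return JournalType.PE_JOURNAL


# Peer-reviewed, available electronically, read mainly on paper:
# the situation the text describes for Science Online.
hybrid = classify(peer_reviewed=True, electronic_access=True,
                  primary_channel="paper")

# Peer-reviewed and accessed primarily through electronic channels.
pure = classify(peer_reviewed=True, electronic_access=True,
                primary_channel="electronic")
```

Because the classification depends on primary_channel, the same journal would move from PE_JOURNAL to PURE_E_JOURNAL if its readership shifted to primarily electronic access, which mirrors the point made above about readership.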

Some scholars still question the legitimacy of publishing in pure e-journals. In contrast, we know of no evidence that the legitimacy of any paper journal has declined when it became a hybrid pe-journal (such as Science and Science Online). Similarly, there have been claims that pure e-journals could be much less expensive to produce than paper journals. However, there is no evidence that the cost of publishing paper journals declined significantly when they offered electronic editions as well. In brief, this simple typology helps to maintain clearer contrasts in the discussions of the media of journal publication.

Electronic Access Channels

The electronic access channels need a bit of clarification as well.  First, if an electronic journal is available in some form over the Internet, it can be said to be online.  An electronic journal that is not available online may be available via CD-ROM, for instance.  Second, even electronic journals that are online are not necessarily available to everyone with an Internet connection; while many, such as Psycholoquy, are open, some may be access-controlled (using a variety of technical mechanisms for access control).  Some access-controlled journals may only require that readers sign up before accessing them (e.g. the Journal of Digital Information (JODI)), while others, such as Science, Cell, and Nature, require a subscription, which may be on an individual basis (such as Science) or an institutional basis (such as the Astrophysical Journal).
 

References

Fisher, Janet.  1996. Traditional Publishers and Electronic Journals. In: Peek, Robin and Newby, Gregory.  Scholarly Publishing: The Electronic Frontier.  Cambridge, MA: MIT Press.

Kling, Rob and Covi, Lisa.  1995. "Electronic Journals and Legitimate Media in the Systems of Scholarly Communication." The Information Society. 11(4), 261-271.

Machovec, George.  1997. “Electronic Journal Market Overview”.  Colorado Alliance of Research Libraries.  March.  Available from: http://www.coalliance.org/reports/ejournal.htm

Odlyzko, Andrew.  1996. Tragic Loss or Good Riddance? The Impending Demise of Traditional Scholarly Journals.  In: Peek, Robin and Newby, Gregory.  Scholarly Publishing: The Electronic Frontier.  Cambridge, MA: MIT Press.

Peskin, Michael E.  1994. “Reorganization of the APS Journals for the Era of Electronic Communication.” Sept. 12. Unpublished. Available from: http://publish.aps.org/EPRINT/peskin.html