Stephen J. Bensman and Stanley J. Wilder
In this paper, we analyze the structure of the library market for scientific and technical (ST) serials. The analysis takes the form of an exercise aimed at a theoretical reconstruction of the ST serials holdings of Louisiana State University (LSU) Libraries after almost a decade of massive cancellations and a policy of adding no new subscriptions. This exercise was done in conjunction with the LSU Serials Redesign Project (SRP), and it utilized an experimental computer program called the Serials Evaluator. Much of the paper is devoted to a discussion of the set definitions, measures, and algorithms necessary in the design of a computer program to appraise ST serials.
LSU faculty ratings were utilized as the main measure of ST value, and we investigated the nature as well as the strengths and weaknesses of faculty ratings. Chemistry played the role of the test discipline, and other ST fields were investigated to determine whether the processes affecting chemistry are also active in them. We develop the hypothesis that human knowledge functions on the same probability structure as biological nature and society. We show that this probability structure results in the highly skewed, stable distributions that characterize the social stratification system of science and technology as well as of the serials system based upon it.
Science and technology are seen in this paper as dominated by stable elites, who tend to center around traditionally prestigious institutions and publish their work in U.S. association journals. Consequently, U.S. association serials have higher ST value, and they play a dominant role not only in internal library use but also in interlibrary loan. Due to their higher ST value, U.S. association journals can be sold to libraries in greater numbers at cheaper prices than the journals of commercial publishers, and this causes the ST serials market to bifurcate, with ST value tending to concentrate on the U.S. association serials and costs on the commercial ones.
As a result of the highly skewed, stable nature of the ST serials system, the ST serials holdings of LSU Libraries were found to have suffered little damage, despite almost a decade of massive cancellations and no new subscriptions in the face of an exponentially growing serials population. To bring the serials holdings at LSU Libraries up to optimal level in 33 ST disciplines, it was estimated that only 118 new subscriptions costing $81,882 were needed, and of these much of the perceived value derived from 53 titles that cost $39,948. Moreover, it was still possible to cancel subscriptions to another 342 titles that cost $222,409 without materially affecting the perceived value of LSU Libraries' serials holdings in the 33 disciplines. We see no solution to the current crisis of the ST serials system in its present form, whether through technology, cooperative collection development, or consortia. We argue instead that librarians will have to change the nature of this system, using the new technology's capability of delivering information rapidly to move from subscriptions to a free market in ST information through document delivery.
In this paper, we describe an exploration of the structure of the library market for scientific and technical (ST) serials that was done in conjunction with the Louisiana State University (LSU) Serials Redesign Project (SRP). It is a continuation of Bensman (1996). The purpose of the exploration was to analyze the options open to academic libraries for resolving the current serials crisis. The exploration was done as a mock exercise in reconstructing the ST journal holdings of LSU Libraries after almost a decade of massive cancellations and a policy of adding no new subscriptions. An experimental computer program called the Serials Evaluator was designed and utilized in the reconstruction of these holdings.
This paper is divided into five main sections. The first section is historical, locating the roots of the current crisis in the nature of ST growth and price inflation and showing that these factors compelled academic libraries to begin the transition from ownership to access in their handling of ST serials. The section describes how the crisis forced LSU Libraries into massive serials cancellations and increased reliance on interlibrary loan borrowings, finally culminating in the birth of the SRP, a conscious attempt to integrate the concepts of ownership and access.
The next section is theoretical. We analyze the nature of set definitions and probability distributions in library and information science, together with their statistical ramifications. The system of probability distributions that biologists have developed to model patterns in nature is set forth, and we show how the key distribution of this system, the negative binomial distribution (NBD), has penetrated the information and social sciences because it models the stochastic processes underlying the highly skewed distributions typically found in these disciplines. Particular attention is given here to the controversy over the applicability of the NBD to external monographic circulation.
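To make the skewness of the NBD concrete, the following is a minimal sketch rather than code from the Serials Evaluator. It assumes the mean/exponent (m, k) parameterization commonly used in biology, under which the variance is m + m^2/k and therefore always exceeds the mean. The function names and the example values (m = 5, k = 0.5) are illustrative assumptions, not figures from this paper.

```python
import math


def nbd_pmf(x: int, mean: float, k: float) -> float:
    """Probability of exactly x events under a negative binomial
    distribution with the given mean and exponent k (assumed
    parameterization; small k means strong skew)."""
    log_p = (
        math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + mean))
        + x * math.log(mean / (k + mean))
    )
    return math.exp(log_p)


def poisson_pmf(x: int, mean: float) -> float:
    """Poisson probability of exactly x events, for comparison."""
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))


# Hypothetical example: both distributions have the same mean of 5
# events per item, but the NBD uses a small exponent k.
for x in (0, 5, 30):
    print(x, nbd_pmf(x, 5.0, 0.5), poisson_pmf(x, 5.0))
```

With the same mean, the NBD in this sketch places far more probability on the zero class and on very large counts than the Poisson does, which is the shape associated with highly skewed use and citation counts.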
Using chemistry as an example, we then illustrate with the aid of the National Research Council (NRC) database how the highly stratified social system of science and technology resulting from these stochastic processes is dominated by stable elite groups. We next demonstrate with chemistry data that the ST journal system is a reflection of this social structure, proving by citation analysis that the superiority of U.S. association journals derives from the elite group publishing in them. We conclude the theoretical section by describing how the ST journal system functions in much the same way as the social stratification system of science and technology, concentrating on the stability at the top of the citation distribution and the zero citation class.
Following the theoretical section are two practical sections in which we demonstrate the implementation of theory in an analysis of the LSU Libraries serials holdings in science and technology. The vehicle for this is a discussion of the set definitions and measures necessary for the design and operation of an experimental computer program called the Serials Evaluator. We begin by showing how the Library of Congress (LC) classification schedules were utilized to construct statistically valid subject sets. We then describe the way in which LSU faculty ratings of journals were quantified into an ST value measure called faculty score and our method for validating this score with citation-based measures as well as both external and internal library use.
Data from the University of Illinois at Urbana-Champaign (UIUC) Chemistry Library are employed to measure the effect of the operating algorithms of the Serials Evaluator in terms of cost-per-use. We then show that virtually all ST fields manifest the same phenomenon previously found in chemistry, i.e., a bifurcated pattern with ST value concentrating in the journals of the U.S. associations and costs in the titles of commercial publishers. We conclude by demonstrating how this fact was utilized to design a leveraged restructuring of LSU Libraries' ST serials holdings.
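The cost-per-use measure can be illustrated with a short sketch. This is not the Serials Evaluator's algorithm, which is described later in the paper. It assumes cost-per-use is simply annual subscription cost divided by annual recorded uses, and the `Title` record, the function names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Title:
    name: str
    annual_cost: float  # subscription price in dollars (hypothetical)
    annual_uses: int    # recorded uses in one year (hypothetical)


def cost_per_use(title: Title) -> float:
    """Annual cost divided by annual uses; unused titles are treated
    as infinitely expensive per use."""
    if title.annual_uses <= 0:
        return float("inf")
    return title.annual_cost / title.annual_uses


def rank_by_cost_per_use(titles: list[Title]) -> list[Title]:
    """Return titles ordered from cheapest to most expensive per use."""
    return sorted(titles, key=cost_per_use)


# Hypothetical holdings: a cheap, heavily used association journal,
# a costly, lightly used commercial journal, and an unused title.
holdings = [
    Title("Commercial Journal B", 900.0, 30),
    Title("Association Journal A", 100.0, 50),
    Title("Unused Journal C", 500.0, 0),
]
for t in rank_by_cost_per_use(holdings):
    print(t.name, cost_per_use(t))
```

In a ranking of this kind, titles at the bottom of the list are the natural candidates for cancellation and supply through document delivery.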
The last section is an economic one, and it delineates the contradiction between social and economic logic that leads to the paradox of an inefficient market in which libraries have to pay more money for the less important ST information. Analyzing the options available to librarians, we conclude that librarians will be compelled to continue the transition from ownership to access by moving from subscriptions to the free market of document delivery.
Such is the overall structure of the paper. However, a caveat must be issued before it is read. We present what can be called "a stick figure view" of the ST elite. This elite is much more complex than the depiction given here, where we analyze only the academic social stratification system of U.S. science and technology. Even here the picture may be oversimplified. The ST elite is not located entirely at the academic institutions repeatedly mentioned in this paper. These institutions are utilized as exemplars of the ST elite, which spreads out over other institutions in the manner typical of bibliometric distributions. Then there is the question of the role of research establishments in government and industry. Moreover, the presentation of the elite in this paper may be distorted from the international perspective. There is ample anecdotal evidence that the superiority of U.S. association journals may not be so much a function of the superiority of U.S. science and technology as of a globalization of world science and technology through the U.S. associations.
As a result of doing the research for this paper, we have formed the opinion that library and information science might be poised to rise from a social to a natural science. This is because library and information science appears to have a coherent probability structure, strong relationships, and stable phenomena, resulting in a high degree of predictability.
However, before library and information science can make this transition, two major problems have to be solved. The first is the crucial problem of set definition. The persistent failure to define proper sets obscured for years the strong correlation of citations with library use. Now this same problem appears to be complicating the uncovering of the true probability structure of human knowledge. Sets in library and information science are inherently ambiguous due to the way disciplines overlap and share the same literature. For example, during the course of the research, there were constant problems with biochemistry journals. The logic of the chemistry journal set used in this paper and its predecessor was defined by a survey of the Department of Chemistry without the participation of the Department of Biochemistry. This resulted in the biochemistry journals being more highly cited than warranted by their importance to the faculty of the Department of Chemistry alone, a characteristic particularly of the Journal of Biochemistry, which had a citation rate much higher than that of the most highly faculty-rated title, the Journal of the American Chemical Society. Consequently, sometimes the biochemistry journals fell out of the statistical models as outliers, and sometimes they remained in the models, distorting the parameters. This problem was crudely handled by running the models both with and without the outliers, but a far better solution would probably have been the application of fuzzy set theory.
The other major problem that has to be solved is the construction of better measures of ST value. These better measures must exhibit two primary characteristics. First, they have to reflect accurately the way the human mind perceives such value. From this perspective, major deficiencies were discovered in the impact factor citation measure published by the Institute for Scientific Information (ISI), even though ISI citations performed much better as predictors of library use than LSU faculty ratings, which not only suffered major perceptual failures but were also politically difficult and expensive to obtain. Unlike total citations, the ISI impact factor failed to correlate well with either faculty ratings or library use because it controls for journal size. The only way the impact factor could be used statistically was to construct from it crude ordinal variables for nonparametric models.
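One way to picture the construction of ordinal variables for nonparametric analysis is sketched below. It is an illustration under stated assumptions, not the procedure used in this paper: the cut points, function names, and data are hypothetical, and Spearman rank correlation is chosen here only as a familiar nonparametric measure.

```python
import bisect
import math


def to_ordinal(values, cutpoints):
    """Map each value to an ordinal class: 0 below the first cut
    point, 1 between the first and second, and so on."""
    return [bisect.bisect_right(cutpoints, v) for v in values]


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical journals: impact factors binned into three classes,
# then compared with hypothetical library use counts.
impact_factors = [0.4, 1.2, 3.5, 0.9, 2.1]
library_uses = [12, 40, 95, 8, 60]
classes = to_ordinal(impact_factors, [1.0, 3.0])
print(classes, spearman(classes, library_uses))
```

Binning discards most of the information in the impact factor, which is why such variables are described above as crude.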
The second necessary attribute of value measures is that they must accurately capture the stochastic processes underlying the production, utilization, and evaluation of information. On this attribute the traditional way of measuring the peer opinion of the scholarly quality of U.S. research-doctorate program faculty suffered a total failure. When tested, the peer ratings of the scholarly quality of chemistry research-doctorate program faculty resulted in a probability distribution that gave a false picture not only of the structure of these ratings but also of the stochastic processes by which this structure arose.