Accessing Electronic Journals on the Internet
Indiana University-Purdue University at Indianapolis
There has been rapid growth of scholarly electronic journals on the Internet. The ARL Directory of Electronic Journals, Newsletters and Academic Discussion Lists included 110 electronic journals and newsletters in its 1991 edition, 240 titles in the 1993 edition, and over 400 titles in the 1994 edition. In 1995 the number increased to 700 titles (up 66% from 1994), and in 1996 to 1,688 (a 257% increase since 1995). New electronic journals are announced at an average rate of 15 titles per day in NewJour, an update service for the ARL Directory.
The 1996 Directory notes that the number of e-journals published by commercial publishers and scholarly societies is increasing, as is the number of peer-reviewed journals available on the Internet.
The majority of these electronic journals are in fact online versions or editions of print journals. According to the "STM journal survey report" by Hitchcock, Carr, and Hall, 65% of the electronic journals in science, technology and medicine (STM) are electronic editions of print journals and 35% are published in electronic format only. The number of electronic-only STM journals was very small compared to the total number of science journals published in 1995. (Steve Hitchcock, Leslie Carr and Wendy Hall, "A survey of STM online journals 1990-95: the calm before the storm," in ARL 1996 Directory of Electronic Journals, Newsletters and Academic Discussion Lists.)
While both the 1996 ARL Directory and Harter's 1995 survey report that 90% of networked electronic journals are free, Hitchcock, Carr, and Hall find a much lower percentage (57%) of free e-journals in science, technology, and medicine.
Modes of Access
- Internet browsing lists and search engines such as Lycos, InfoSeek, WebCrawler, Yahoo, and AltaVista:
Of these, some are hierarchically organized subject browsing lists, and some are search engines that harvest data from documents on the Net, build a searchable database, offer sophisticated search capabilities, and present ranked retrieval results. Certain locator services such as Yahoo combine browsing and searching capabilities.
Search engines offer automated resource discovery and indexing, use natural language processing, and provide immediate access to a vast array of Internet resources. They are easy to use and maintenance-free. However, search engines offer no authority control for names or subject vocabulary, nor classification schemes for organizing information; neither do they provide cross-references or support field searching. Free-text searching on the documents themselves, file names, URLs, HTML titles, document headers, etc. often returns an extensive list of hits, requiring users to wade through masses of irrelevant and incomprehensible entries to get the information they need.
- Traditional library catalogs:
Library catalogs provide access to journal titles that have been carefully evaluated and selected for inclusion in a library's collection. Library catalogs use controlled author identification, subject vocabulary, and classification to organize bibliographic data in descriptive surrogate records, providing both selective and comprehensive search capabilities with a high precision rate. A number of next-generation library catalog systems are now capable of hot-linking the URLs in bibliographic records to the electronic documents being described. Catalog records supply content information that can help users evaluate a resource before accessing it, whereas indexing tools merely find a match and provide access. Likewise, the system requirements and access information on catalog records are critical for users, who must have a compatible computing environment in order to gain access.
Library catalogs should provide equal access to traditional print serials and electronic serials in one integrated system, so that users do not have to perform the same search in multiple systems. This is especially critical for titles split-published in multiple formats.
The applicability of cataloging rules and MARC formats to electronic journals has been challenged. Problematic cataloging areas such as editions, seriality, frequency, designation, and title continue to be addressed, and solutions are being sought by the cataloging community. Another weakness of the MARC record is its structure: designed for single-object description and linear access, it does not adapt well to describing multi-level hypertext objects on the Internet. Some people are skeptical about the usefulness and validity of the library cataloging approach to organizing Internet resources. Some question the need to create bibliographic surrogates for electronic resources when the digital objects are a mouse click away and their title pages and tables of contents are both online.
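As an illustration of the hot-linking just described, a catalog record for an electronic journal might carry an 856 (Electronic Location and Access) field holding the resource's URL. The sketch below is hypothetical; the title, ISSN, and URL are all invented for illustration, and the field layout is a simplified rendering rather than a complete record:

```text
022    $a 1234-5678
245 00 $a Journal of Example Studies $h [computer file]
856 40 $u http://www.example.edu/ejournals/jes/
       $z Connect to the electronic journal
```

A next-generation catalog can render the $u subfield as a live link, taking the user from the surrogate record directly to the journal itself.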
One of the major objections to cataloging Internet resources is that cataloging is too costly and labor-intensive, and that the cataloging process cannot keep pace with the proliferation of resources available on the Internet. The OCLC InterCat project, which calls for collaborative cataloging efforts, demonstrates that libraries can play an important role in providing access to Internet resources. A recent study by Taemin Park finds that 60.2% of the electronic journals listed in the 1996 ARL Directory have been cataloged on OCLC (Taemin Park, Bibliographic Control and Access of Networked Electronic Journals in OCLC and Local Libraries).
- Library Web pages:
These special electronic journal pages provide linked access to selected or acquired titles. Being small in scale and focused in coverage, these pages are more useful than general Internet finding aids such as Yahoo or AltaVista. Library e-journal pages are easy to navigate and can keep statistics for e-journal use studies. Some pages provide manually created annotations or unit records for each electronic journal included, serving as brief descriptions to help users determine whether a journal meets their information needs. Some e-journal directory pages (e.g., CICNet's EJC and NewJour) provide search capabilities on titles, topics, and/or descriptions. However, topic headings are generally too broad to be useful for file or document retrieval, and the headings assigned do not follow any conventional thesaurus, making them difficult to search unless one knows the vocabulary being used.
Library Web pages provide access by known titles only; there is no mechanism for variant-title access or serials management control. Creating annotation or description records is labor-intensive and redundant, duplicating the work of cataloging. For this very reason, a number of libraries are considering discontinuing the annotations or descriptions on their e-journal pages. For libraries that catalog electronic journals in their catalogs, keeping e-journal Web pages requires maintaining two databases.
In summary, Internet search engines retrieve too little information about the objects to be useful. The library catalog model provides rich description and efficient access, but is too costly for the large number of e-journals on the Internet. And library Web pages are merely browsable title lists with links to the electronic resources.
The convergence of Web and cataloging
A number of research projects and developments attempt either to adapt cataloging principles to improve Internet search engines or to use robots to assist the traditional cataloging process. For instance, a few developing systems make use of subject analysis tools to incorporate thesauri alongside natural language access, adding controlled vocabulary automatically (Carol Mandell & Robert Wolven). Some developing systems use a subset of MARC fields, classification schemes, prepared abstracts, subject headings, etc. However, none of these fully exploits cataloging methods and standards. Data collected automatically by robots are not the equivalent of catalog records. Moreover, none of the indexing tools could substitute for library catalogs.
Several research projects are under way to explore computer-assisted cataloging. Some approaches assume that less-than-full bibliographic description provides sufficient access to resources whose title pages and tables of contents are both online. Project Scorpion attempts to use DDC classification schedules to automate subject assignment for e-documents and to present the results to catalogers as a tool. There is also the meta-cataloging approach, which attempts to expedite the cataloging process by attaching metadata to the resources being described. This approach follows the traditional Cataloging in Publication model: descriptive data are supplied by the resource creators and can be mapped and translated into various codes for use by both search engines and traditional cataloging. Initiatives such as TEI headers, the Dublin Core data elements, and the OCLC Spectrum Project attempt to establish a standard subset of basic data elements necessary for resource identification and retrieval, using key metadata supplied by authors to create minimal-level surrogates that can be upgraded to full-level catalog records.
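The author-supplied metadata approach can be illustrated by embedding Dublin Core elements as META tags in the header of an HTML document, a convention discussed in connection with the Dublin Core element set. The journal title, author, and URL below are invented for illustration, and the encoding is a sketch rather than a definitive practice:

```html
<HEAD>
<TITLE>Journal of Example Studies</TITLE>
<META NAME="DC.title"      CONTENT="Journal of Example Studies">
<META NAME="DC.creator"    CONTENT="Doe, Jane">
<META NAME="DC.subject"    CONTENT="Electronic journals; Cataloging">
<META NAME="DC.date"       CONTENT="1997-05-09">
<META NAME="DC.identifier" CONTENT="http://www.example.edu/ejournals/jes/">
</HEAD>
```

Because the creator supplies these elements with the resource itself, a robot can harvest them for a search engine's index, or a cataloger can map them into MARC fields as the starting point for a full-level record.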
Despite these three options for accessing electronic journals, the basic question of journal article access remains unanswered. Although traditional catalog surrogate records have significantly enhanced the identification of, and access to, journal titles, they neither point to nor provide access to the journal content, that is, the articles themselves. The same is true of library e-journal home pages. The majority of library users searching for journals in the OPAC have either a citation in hand or a topic in mind. Internet search engines are the only one of the three modes discussed above that provide even limited author, title, and subject searching for journal articles on the Internet. Article-level access has traditionally been provided by commercial indexing and abstracting services; unfortunately, these services are lagging behind the publishers in providing access to electronic journals. On the other hand, some electronic publishers such as Project Muse and JSTOR are building search engines for full-text indexing of their own databases. As more and more index and abstract databases move away from CD-ROM technology to Web interfaces, the next logical step is for the service providers to supply links to the electronic journals on the Internet. Alternatively, following the "hooks-to-holdings" model, index and abstract databases can be linked to a library's holdings by ISSN and provide direct access to e-journals by URL (the 856 field) instead of the call numbers currently used in that model.
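The ISSN-based linking just described reduces to a simple lookup: given the ISSN from a citation in an index or abstract database, the system consults the library's holdings and returns the URL recorded in the catalog record's 856 field. The sketch below shows the idea; the holdings table, ISSNs, URLs, and function name are all hypothetical:

```python
# Hypothetical holdings table mapping ISSN to the URL recorded in the
# catalog record's 856 field. All values are invented for illustration.
holdings = {
    "1234-5678": "http://www.example.edu/ejournals/jes/",
    "2345-6789": "http://www.example.org/stm-review/",
}

def link_citation(issn):
    """Return the e-journal URL for a citation's ISSN, or None when the
    library has no electronic holdings for that title."""
    return holdings.get(issn)

print(link_citation("1234-5678"))  # the 856 URL for that title
print(link_citation("9999-0000"))  # None: no electronic holdings
```

The same lookup could just as easily return a call number for a print title, which is why the "hooks-to-holdings" model extends so naturally to URLs.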
Cataloging will not be replaced anytime soon, and cataloging Internet resources deserves a strong commitment from the library community. The development of computer-assisted cataloging is promising and should be further explored. Dynamic catalog records with hyperlink capability await further research. Future development and application of knowledge bases and robots for knowledge gathering and retrieval may have the potential to revolutionize or replace traditional cataloging operations. In the meantime, we live in an environment where multiple approaches and options prevail; no one approach is perfect, nor can any one approach provide access to all networked electronic resources for all people.
This paper was presented at the Chinese American Librarians Association Midwest Chapter 1997 Annual Meeting.
Copyright © 1997 Julie Su.
Submitted to CALA E-J on May 9, 1997.