
Non-Consuming Relevance: The Grub Street Project 1

Laura Mandell

To summarize my response to Professor Muri at the outset: yes. And now on to some details. I am director of a project that has been modeled upon and is partly sponsored by NINES (our host for this event), a project mentioned in Professor Muri's excellent paper. It is called 18thConnect, and, like NINES, it will aggregate and peer-review digital projects. 18thConnect differs from NINES in one major way, however:

ECCO. The Eighteenth-Century Collections Online database, mentioned by Professor Muri, contains page-images of 140,000 texts ranging in size from a two-page pamphlet to the 1500-page Clarissa. The Text Creation Partnership double-keyed 1,809 of these texts and then ran out of money. 18thConnect has a development team at the University of Illinois, headed by Robert Markley. They are augmenting an OCR program by designing it to read eighteenth-century texts, specifically. After meeting with Martin Mueller a few weeks ago, I have come to believe that 18thConnect can play a crucial part in creating not just an acceptable level of error in text files corresponding to ECCO's page images, but actually a vital, immense dataset of eighteenth-century texts, minable by machines, word-searchable by all. That thrilling prospect will involve not just OCR but automated TEI tagging, automated linguistic markup, automated dictionary look-up, and so on. I attach at the end of this essay a table of all the organizations that could be involved in producing this vast dataset and that could benefit from the text-cleaning process. If Martin Mueller's Project 2015 really works, as I believe it will, new kinds of knowledge will come from textual data, knowledge that will convince our institutions of the value of text as data, which will provide, as Professor Muri so aptly stated, the ground for making digital humanities a going concern. Academic institutions must become consumers clamoring for textual data if the humanities are to be relevant in the twenty-first century.
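The pipeline just described, OCR correction followed by automated markup, can be illustrated in miniature. Everything below is a hypothetical sketch: the regularization rules, function names, and minimal TEI wrapper are invented for illustration and are not 18thConnect's actual code.

```python
import re

# Hypothetical regularization rules for common OCR errors in
# eighteenth-century texts: the long s transcribed literally, or
# misread as "f". Real rules would be corpus-trained, not hand-listed.
OCR_FIXES = [
    (re.compile(r"ſ"), "s"),              # long s transcribed literally
    (re.compile(r"\bfirft\b"), "first"),  # long s misread as "f"
    (re.compile(r"\bfuch\b"), "such"),
]

def clean_ocr(text: str) -> str:
    """Apply each regularization rule to raw OCR output."""
    for pattern, replacement in OCR_FIXES:
        text = pattern.sub(replacement, text)
    return text

def to_minimal_tei(paragraphs: list[str], title: str) -> str:
    """Wrap cleaned paragraphs in a minimal TEI skeleton."""
    body = "\n".join(f"    <p>{p}</p>" for p in paragraphs)
    return (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0">\n'
        f"  <teiHeader><fileDesc><titleStmt><title>{title}</title>"
        "</titleStmt></fileDesc></teiHeader>\n"
        f"  <text><body>\n{body}\n  </body></text>\n</TEI>"
    )

raw = "The firft of fuch eſſays"
print(clean_ocr(raw))  # -> "The first of such essays"
```

The point of the sketch is only that each automated stage (regularization, then structural markup) is a mechanical transformation that can be run over an entire corpus at once, which is what makes a 140,000-text dataset thinkable.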

The phrase "consuming relevance," almost quoted in the title of my response, comes from Frank Kermode's book An Appetite for Poetry (1989). He says of the canon that it gives us books of timeless, consuming relevance, texts that have meanings and messages that were applicable to the time in which they were written but continue to be relevant and perhaps even prophetic, though the prophecy could of course be of the self-fulfilling sort, arising as a consequence of veneration. Less abstractly: did Shakespeare understand modern psychology, or did he partly produce it? The question concerned Harold Bloom.2 But Shakespeare's production of our inner lives wasn't his, as attendees at this conference know. Insofar as we have

1 This content is available online at <http://cnx.org/content/m34322/1.2/>.

2 Shakespeare: The Invention of the Human (New York: Penguin Putnam, 1998).

Available for free at Connexions <http://cnx.org/content/col11199/1.1>

been shaped by Shakespearean drama, it has been through the mechanisms of bardolatry instantiated in the eighteenth century and mediated by new print technologies that made it possible for Shakespeare's work to be disseminated beyond his wildest imaginings.3 The first modern editor and the first modern apparatus materialize in the 1720s around Hamlet.4

However, a much newer technology is evoked by the phrase "non-consumption," which my title almost quotes as well. The Google Research environment proposes to offer the non-consumptive use of texts, noted by John Unsworth, Lisa Spiro, and Don Waters in various essays and talks about the Google Settlement, anticipated and actual.5 18thConnect, like NINES, will provide an alternative to that research environment, one that has been constructed by scholars and librarians working together, a necessary competitor to Google.

But as Professor Muri noted, we cannot give away proprietary texts such as those in Gale-Cengage's ECCO catalogue. We have been given permission by Gale to ingest those texts into a search engine that allows for full-text searching and, we hope, data-mining. Users of 18thConnect whose libraries do not subscribe to ECCO will not be allowed to consume (i.e., read) the ECCO texts, but only to use them, use-value somehow being imagined as completely distinct from the exchange-value that prompts consumption and production, library budgets and Gale-Cengage profits. The Grub Street Project provides another way of using texts: mapping them. And this cultural analysis of textual relevance via mapping will open up worlds of thought as yet undefined.

As we move from the consuming relevance that prompted literary scholars virtually to memorize a circumscribed set of texts to the non-consumptive data-mining and mapping that will generate we-know-not-what kinds of knowledge, there is another form of consumption dogging us, the kind of consuming mentioned by Professor Muri in this paper: time-consuming tasks. And when she uses the phrase, she describes what is happening for her at a university that sounds identical to my own. She talks about how her research time is eaten up by the multiple problems, institutional and financial, confronting the Grub Street Project: ultimately her time is eaten up by public relations problems endemic to scholarly editing and the digital humanities. I want to reiterate all of Professor Muri's articulations of these problems, but I'll do so in the most textually economical way, through anecdote.

I was offered the opportunity to be a grant reviewer for SSHRC's Image, Text, Sound and Technology (ITST) projects in literary and textual studies, described in Professor Muri's Appendix A. The quality not just of the projects but of the applications for these funds was astounding; it actually made me realize how far ahead the digital humanities are in Canada, but this isn't the time for indulging in nationalistic sentiment. The participants on this adjudicating committee all knew that there would not be enough money to fund projects that should have been funded, no matter what. And in fact SSHRC is in the process of reorganizing its granting structure precisely in order to let excellent digital humanities projects compete against traditional projects of lesser quality and value.6 SSHRC itself wishes to improve those statistics found in Appendix A: they would say yes to Professor Muri's comments here.

Second, the MLA held a workshop designed and run by Susan Schreibman called "Evaluating Digital Work for Promotion and Tenure" (December 2009). Susan asked a number of us to create a tenure case for a digital

3 Margareta de Grazia, Shakespeare Verbatim: The Reproduction of Authenticity and the 1790 Apparatus (New York: Oxford University Press, 1991); Michael Dobson, The Making of the National Poet: Shakespeare, Adaptation and Authorship, 1660-1769 (New York: Oxford University Press, 1992).

4 See my "Special Issue: 'Scholarly Editing in the Twenty-First Century': A Conclusion," Literature Compass 7/2 (2010): 120-133, p. 125.

5 John Unsworth, "Computational Work with Very Large Text Collections: Google Books, HathiTrust, the Open Content Alliance, and the Future of TEI," Text Encoding Initiative Consortium Annual Members' Meeting, Ann Arbor, Michigan, November 13, 2009, <http://www3.isrl.illinois.edu/unsworth/TEI.MMO9.pptx>, accessed 1 March 2010; Lisa Spiro, "Digital Humanities in 2008, II: Scholarly Communication & Open Access," <http://digitalscholarship.wordpress.com/2009/02/24/digital-humanities-in-2008-ii-scholarly-communication-open-access/>, accessed 1 March 2010; Don Waters, "The Changing Role of Special Collections in Scholarly Communications," Research Library Issues 267 (December 2009), <http://tiny.cc/AC3QO>, accessed 1 March 2010.

6 See SSHRC's discussions about restructuring: <http://www.sshrc.ca/site/whatsnew-quoi_neuf/renewal-renouvellement-eng.aspx>.

humanist, real or imagined, in no more than four pages. She distributed these cases to a crowd of twenty-eight workshop participants, many of them chairs or deans, and then gave them a checklist to help facilitate discussions among groups as to whether each of these cases was tenurable. Dr. Schreibman noticed at the end of the workshop that these mock-tenure-case reviewers were complaining that the candidates for tenure had not adequately explained the research value of their digital work. We thought they had. When these non-digital humanists read the portions of the written case documents in which that value was explained, they complained about technical jargon and skipped those sections. This is a severe problem. Such misunderstanding can be corrected by peer-reviewing organizations such as NINES and 18thConnect, which would be able to instruct P&T committees on the value of a digital resource. My own Poetess Archive was the basis for promoting me to Full Professor because it has the NINES seal of approval. But there is a worse effect of this problem than the simple misunderstanding for which NINES or 18thConnect can compensate.

Professor Muri describes comprehension problems for reviewers of grant applications as well as of promotion and tenure cases. She never knows how much basic education in the digital humanities she must provide. What Dr. Schreibman and I discovered is that the basic education process in the mock tenure cases took too much time and space, supplanting some of the discussion about research value. The same thing is happening right now in this response: I am writing about institutional issues and problems rather than discussing what kinds of research questions might be askable and answerable in a literary mapping of London or in data-mining textual artifacts of the eighteenth century on a scale never yet thought possible.

But it gets worse. Because so much textual space in articles, essays, books, P&T documents, and grant applications is consumed by basic education, writers then justify their consumption of time and space by making untenable utopian claims about the value of the medium alone, regardless of what's in it. Imagine Samuel Johnson describing all the possibilities of the periodical medium rather than writing his essays for the Rambler, Adventurer, and Idler, essays which of course do perform such a media analysis, but not upon the periodical medium as if it were contentless, as if AutoTrader and the Atlantic Monthly were in any sense the same thing. On the contrary, Johnson's personae, his ramblers and adventurers, idly produce snippets of text and achieve specific ideas, both for the writer and his readers, by consuming short periods of time. The whole infamous database debate in PMLA was predicated on the idea that one could talk about the generic value of a database, as if the databases sustaining searches in the Whitman Archive were fundamentally the same as any use of a database.7 Such generalizations can spawn utopian, absurd discourse about the power of digital media that articles such as John Unsworth's excellent "The Importance of Failure" are designed to counteract.8

Dean Unsworth's essay lays out how to formulate digital projects that perform research as well as how to write about them. He says that digital projects (archives, editions, databases, etc.) ought to be able to declare the terms of their potential success or failure: chances are, such an explanation is only possible if the scholars really know what they are trying to accomplish. Moreover, real research is only being done if the possibility of failure exists. Failure in the research proposal "We want to know X" means being able to conclude, after the research has been performed, "we can't know X for the following reasons, but we can know Y." Even interpretive projects follow this rule. One is frequently thrilled to find an essay or book that implements the most thorough analysis of a particular text via one particular methodology because it allows one to see how far that particular theoretical vantage point will take us, and where it breaks down or becomes uninteresting. And Dean Unsworth insists that the methodology be formulated and stated explicitly for such projects. Once again, no matter what the result, the digital project is a success insofar as it implements that methodology to the utmost and reveals what results can and cannot be had from it. However, Professor Unsworth admits, this kind of successful failure is not really suitable for institutional consumption: "Frankly, the only metric that is likely to matter to the universities that sponsor such projects is their success in attracting outside funding," but scholars, designers, and funding agencies ought to care about more than those simple intrinsic criteria.

When I teach literature classes, I say to my students that their essays are essais in the French sense

7 PMLA 122 (October 2007).

8 John Unsworth, "Documenting the Reinvention of Text: The Importance of Failure," Journal of Electronic Publishing 3, no. 2 (December 1997).

of the word: they are trials. I tell students, you'll have your thesis, and you'll try to prove it by quoting the text, but when you get to the end, if you no longer believe your thesis, that too is an excellent result. You tried it, and it didn't work, and now you know that that idea, followed as far as you can take it, doesn't work.

What an amazing thing, really, to know that an idea you had while reading was really YOUR idea and not the author's, that the idea is not in fact supported by the text. So, I say, if you don't believe your thesis by the end of the paper, just say so, and I'll give you an A. But are these the kind of concluding white papers expected by the NEH? And there is no need to idealize the sciences: we know how much data has been falsified for purposes of funding and prestige. The social consumption of research sometimes contaminates it the way TB infects the lungs.

So what kinds of success and failure (success from failure) do we want from mappings of the sort undertaken by Professor Muri? Her work parallels that undertaken by Fiona Black and Bertrum MacDonald, the SSHRC-funded History of the Book in Canada project (http://www.hbic.library.utoronto.ca/home_en.htm9 ). It begins as a series of published volumes about this history and will end as a GIS project, mapping information given in the following chart:10

Figure 6.1

The books describing all this information are completed, published in six volumes and two languages

9 http://www.hbic.library.utoronto.ca/home_en.htm

10 This chart and the subsequent information about this project come from Bertrum H. MacDonald and Fiona A. Black, "Using GIS for Spatial and Temporal Analyses in Print Culture Studies," Social Science History 24.3 (Fall 2000): 505-535, p. 510.

by the University of Toronto Press. But the mapping is advancing with GIS technologies, in particular, a recent email from Fiona Black tells me, by capitalizing upon new geo-referencing technologies that enable the use of historical maps.11 MacDonald and Black argue in their article describing this undertaking that mapping information obtained from databases such as the British Book Trade Index (http://www.bbti.bham.ac.uk/12 ) enables answering the following kinds of questions, beautifully tabulated for us:

Figure 6.2

11 Sun, Mar 7, 2010 at 4:43 PM.

12 http://www.bbti.bham.ac.uk/

Figure 6.3

(Black and MacDonald 514).

It is never the case that such questions canNOT be answered using books or databases; of course they can. However, one would have to extract the information from the media in which it appears. The mapping pre-extracts it and lets you go on to think about what this information shows us, once mapped.
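What "pre-extracting" means here can be shown with a toy sketch: once records of who occupied which street in which years sit in structured form, a spatial question becomes a single query rather than entry-by-entry digging. The records and field layout below are invented for illustration; the BBTI's actual schema differs.

```python
from collections import defaultdict

# Toy records in the spirit of a book-trade index entry: a name and
# trade, a street, and the years an entity occupied that address.
# All three entries are invented for illustration.
records = [
    ("Maxted & Co., printer", "Fleet Street", 1775, 1789),
    ("J. Nichols, printer", "Fleet Street", 1778, 1800),
    ("T. Cadell, bookseller", "Strand", 1776, 1793),
]

def occupants_by_street(rows, year):
    """Pre-extract the spatial view: who occupied each street in a given year."""
    streets = defaultdict(list)
    for name, street, start, end in rows:
        if start <= year <= end:
            streets[street].append(name)
    return dict(streets)

print(occupants_by_street(records, 1780))
# Both Fleet Street printers overlap in 1780; Cadell is alone on the Strand.
```

The answer was always recoverable from the individual entries; what the mapped (here, grouped) form adds is that the relationships, who was whose neighbor and when, arrive already assembled.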

Every medium abstracts information. Here, pictorially, are some abstractions of book history information:

APPENDIX 41

Table 6.1

These particular views of the book history information (a closed book, a pile of microfilm, and a database search form) are lossy to say the least, obfuscating to say the most. Nothing can be known until you open the book, look at the microfilm, search the database.

Figure 6.4

These activities are trivial, you might say: the information is there. Is it? I want to trace one particular dataset that is there in many forms.

Ian Maxted published the book (inadequately pictured above) called The London Book Trades, 1775-1800: A Topographical Guide (Exeter: by the author, 1980). The book, clearly almost homemade, contains an introduction in typescript and a folder full of microfiche. The microfiche pictures a typed list, arranged by street, of the businesses involved in print culture, including the names of all who worked there. The typescript, we are told, was designed for Maxted's own personal use to check the accuracy of information as he was proofing The London Book Trades, 1775-1800, and it was subsequently given to the British Library so that they could derive cataloging information from it, identifying printers and/or places from inadequate title pages. The microfiche images are almost unreadable:

APPENDIX 43

Figure 6.5

Producing this PDF of that microfiche image required transforming it from negative to positive. Maxted recognized the problem of communicating this information in book form, and to me he almost seems like a soldier in the effort to overcome information loss due to the affordances of the book medium, as can be seen clearly in the copyright information to be found in his Topographical Guide:

©Ian Maxted, 1980

Subject to the provisions of the Copyright Acts any part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the permission of the author.

When I first read this sentence, I had to go back over it several times: surely I was missing the "not," surely it read "Subject to the provisions of the Copyright Acts any part of this publication may NOT be reproduced . . . ," but NOT is not there. A typo? The sentence following this one cleared up my confusion: "The author would however be interested to be informed of any major use made of the publication." Maxted is saying, in effect, PLEASE reproduce this document in a more useful (electronic) form than I was able to do. He just wants his work to be effectively used. And aware of the terrible quality of the microfiche, Maxted went further, reproducing the work in its entirety, with the rest of his works, on a blog (http://bookhistory.blogspot.com/):

Figure 6.6

Much of Maxted's work forms the basis of the British Book Trade Index, as one can see from its list of sources, though Maxted's Topographical Guide is not (yet) listed among them. And yet, to what extent is the information even THERE, in that database? It is extractable, yes, but here is what one must go through to extract it:


Figure 6.7

Figure 6.8

By carefully following these guidelines, I could extract Maxted's street-by-street list of booksellers, engravers, printers, etc. The addresses are given in the returns, beautifully, indicating the dates an entity occupied a particular address, but entry by entry. To get to that list of addresses by date, you have to click on each element in the returned list, each printer, bookseller, etc. So of course, to understand the relationships among these printers and booksellers, one thing I would need to do is to generate lists of where
