Wednesday, October 17, 2007

Collective Intelligence in the Institutional Repository: Making DSpace Personal

Surveys of open repository adopters over the past two or three years have clearly highlighted the "institutional" nature of institutional repositories. The motivations for implementing IRs have always been those of the host institution, while the stated benefits to the individual user and contributor have either been those of the institution projected "down" to them, or happen to be shared goals such as enabling greater access to information or providing managed, long-term preservation of artifacts. Meanwhile, some of those same surveys identify sustaining a constant stream of contributions from the community as the chronic threat to the health of repositories. All open repository platforms have been designed for self-service ingestion, yet the strongest and most current repositories are those with professional staff responsible for content management, a luxury few institutions can afford. Even those institutions that have implemented mandatory submission policies, especially in light of increasingly "enlightened" publishers' policies on Open Access, still have not been able to achieve high levels of participation. The simple truth is that participation in an IR today represents extra effort for the busy scholar, effort that doesn't add real value to their research, their authorship, or their collaboration with others in their field.

We'd like to give researchers strong incentives to "live" within DSpace --- features that motivate them to spend significant time there, manage their content there, and make formal submission of content into the IR an easier and more natural part of their work. In general, we'd like their personal space or "desktop" within DSpace to be an amplifier of their research activities. For starters, we believe the user should have basic (but in this Web2.0 world, expected) capabilities available to them for relating their current activities and interests to other artifacts in local collections, so we're experimenting with features like item bookmarking and tagging within local collections and using this constructed "context" as a basis for recommending related items. We'd like to leverage this further as a basis for identifying and retrieving related items within that repository's federation (see our earlier notes on pf-dspace in this blog and elsewhere) and especially for identifying colleagues with related interests. And we want to apply this to identifying and harvesting related materials from other, heterogeneous sources such as external blogs, wikis, and web sources.
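To make the idea of a constructed "context" a little more concrete, here is a minimal sketch in Python of how bookmarks and tags could drive recommendations. It is an illustration only, not the DSpace patches mentioned in this post and not any DSpace API: the function names `build_context` and `recommend`, the item identifiers, and the scoring rule (sum the context weight of every shared tag) are all assumptions made for the example.

```python
from collections import Counter


def build_context(bookmarks, item_tags):
    """Count how often each tag appears across a user's bookmarked items."""
    context = Counter()
    for item_id in bookmarks:
        context.update(item_tags.get(item_id, ()))
    return context


def recommend(bookmarks, item_tags, limit=3):
    """Rank non-bookmarked items by the summed context weight of shared tags."""
    context = build_context(bookmarks, item_tags)
    scores = {}
    for item_id, tags in item_tags.items():
        if item_id in bookmarks:
            continue  # never recommend what the user already bookmarked
        score = sum(context[tag] for tag in set(tags))
        if score > 0:
            scores[item_id] = score
    # Highest score first; ties broken by item id for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:limit]


# Hypothetical collection: item id -> tags applied by users.
item_tags = {
    "item-1": ["preservation", "metadata"],
    "item-2": ["metadata", "open-access"],
    "item-3": ["open-access", "policy"],
    "item-4": ["metadata", "preservation", "federation"],
    "item-5": ["genomics"],
}
bookmarks = {"item-1", "item-2"}

print(recommend(bookmarks, item_tags))
```

The same tag-weight context could, in principle, be compared against another user's context to suggest colleagues with related interests, though a real implementation would likely need something more robust than raw tag counts.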

This basic contextualization of the scholar's current focus is really just a starting point, because it represents only a few aspects of the scholarly workflow. The real value to both the scholar and their host institution comes when they can leverage other basic functions of the repository in their core research, including the management of both data and information artifacts, and especially using their repository to manage access to their materials for their distributed colleagues.

In terms of the management and versioning of artifacts, there are certain repository capabilities that the developer community has long come to expect from code management systems such as SVN and CVS that are curiously foreign to the IR space, but really shouldn't be. As scholarly journals increasingly demand that research be submitted as "packages" containing not only text but also data sets and other content that has been culled from the set of collaborators and authenticated using robust techniques, the proper management of research artifacts in more active ways will become a central function of the IR.

One of the truly exciting aspects of working with the DSpace open source community is that many of these objectives are already on the horizon for its members, and developers across the globe are hard at work implementing various pieces. That said, we still think there needs to be a focus on the needs of the individual scholar. As the Facebook(tm) generation takes its place in the DSpace user community, bringing with it high expectations for contextualized social networking in nearly everything it does, we want to ensure that DSpace is more than ready to work for them!

In the coming days Desmond Elliott, one of our ace developers working on DSpace at HPLabs in Bristol (UK), will use this space to describe his awesome patches to DSpace, which include item bookmarking, item recommendations and user tagging. In the near future we hope to be more specific about other aspects of this work, including its name...