DLIBRARY PLATFORM

ONLINE ACCESS TO ARCHIVAL AND LIBRARY COLLECTIONS

The online platform was developed by non-profit group dLibrary/InfoRost with a goal to improve access to archival and library collections over the Internet (Why online platform dLibrary/InfoRost was created and its advantages for archivists, librarians and researchers). The platform’s architecture was built to reflect the following main ideas and functionality requirements for online collections:

UNIVERSALITY AND SCALABILITY

Designed to handle a variety of material, access venues and services. The platform supports publishing of large, multi-million page collections of

  • archival and bibliographic records

  • documents

  • books, journals, newspapers

  • photographs, maps

  • audio/video files, etc.

The platform’s intuitive interface was developed based on the needs and information-seeking habits of a large audience of users. The interfaces and metadata displays are flexible and can be changed to reflect specific requirements of collections and material types. This approach supports integration of fragmented collections onto one central platform for improved access.  


Archives and libraries require a highly scalable access platform as their needs in digitization and online publishing evolve rapidly. InfoRost platform has a built-in capacity for expansion to accommodate for millions of new document pages, terabytes of data and spikes of simultaneous users during peak times. For this reason, the platform is also optimized for use with cloud computing technologies such as Amazon’s EC2/S3 services.

LINKED METADATA AND INDEX ACCESS

The platform supports an unlimited number of metadata fields which can be interlinked between relevant descriptions. The metadata fields are also aggregated and displayed in browse lists of terms, forming collection-level indexes:

Linked metadata and collection indexes expose relevant terms and provide scholars with multiple access points to documents. The platform features a multi-level hierarchy for the display of finding aids on the record group, series, folder, and document levels. Each collection level provides two views for the lists of lower level objects: a table view with sortable columns and a grid view for images.

The description at each hierarchy level includes the archival ID number, title, annotation, extent of the collection, start and end dates for documents, and descriptors (keywords). The latter includes types such as Author, Document type, Geographic location, Publications, Names, Organizations, Social group, Position, Subject, Forum, and language, as well as compound descriptors.

BIRDS-EYE VIEW BROWSING AND FULL TEXT SEARCHING

The platform interfaces provide several options for intuitive browsing of flat and complex hierarchies of description such as multi-level finding aids. The DocView browser is optimized to display and zoom in/out quickly on hundreds and thousands of document images providing scholars with comprehensive (bird’s-eye) visual overview of large collections. The browser includes grid view, filmstrip browse and page flipping option for bound material:

The full-text search engine allows quick searches with found keywords highlighted in the context of a document or its description. The advanced search handles more complex queries for individual fields of descriptions and their combinations (title, annotation, author, language, compound descriptors, start and end dates for documents). The search results can be displayed either as a table with sortable columns (doc IDs, titles, start and end dates) or as a grid of images:

ACCESS CONTROL AND USAGE STATISTICS

The administrative module allows multi-tiered access to records and documents to satisfy specific requirements of any collection. Access can be restricted by username/password or IP address to documents in high resolution only (descriptions and small size document page icons will be available to public) or to any parts of description and full image documents.

The usage statistics module allows monitoring and reporting of each collection’s usage.

SUPPORT OF HIGH VOLUME DIGITIZATION AND RESEARCH PROJECTS

The platform supports integration with scanning equipment and provides tools for automation of image/metadata processing and online publication. All platform tools can be used from local computers or remotely over the Internet. Scanner-platform integration and remote access increase productivity and decrease the cost of high-volume digitization and online publishing operations.

The platform is also optimized to serve as a testbed for research in:

  • new forms of online publishing and scholarly communication
  • social processing online - archival descriptions, interlinking of documents, improvement of OCR results
  • visualization, data mining, GIS

TECHNOLOGY USED

  • Ruby programming language
  • Ruby on Rails web framework
  • MySQL database management system (can be swapped for another DBMS when needed)
  • Solr Search

EVALUATION OF THE PLATFORM

The InfoRost platform was developed by a group of experts in archival and library science, digitization, online publishing and information technologies. The platform received high evaluation from a number of scholars who tested it on the example Records of International Brigades collection (http://interbrigades.inforost.org) at the following organizations:

  • Brill Publishers
  • Centre d’études des mondes russe, caucasien et centre-européen (Cercec)
  • Davis Center for Russian and Eurasian Studies at Harvard University
  • Department of Historical Information Science, Moscow State University
  • Faculty if History, Higher School of Economics, National Research University (Russia)
  • International Institute of Social History (Netherlands)
  • Italian Society for the Study of Modern and Contemporary History (SISSCO)
  • Russian state archive of social-political history (RGASPI) 
  • Slavic and East European Collections, Stanford  University Libraries
  • Stockholm School of Economics
  • Vereinigung zur Förderung des Archivwesens (VFA)

 

Contact information: kirill.fesenko@dlibrary.org

NEW PROJECTS ON THE PLATFORM

2016, Spring

Murzilka Complete Online Archive 1924 - 2016

MURZILKA is a popular Russian journal for 6- to 12-year-old children. This favorite kid’s journal has been published for 88 years since 1924 without a single interruption. Murzilka is listed in the Guinness Book of World Records as the longest-publishing journal for children. The collection includes over 1,000 issues of Murzilka.

http://elib.murzilka.org

http://elib.so-edinenie.org

2015, Fall

Archives of the Deaf-Blind Support Foundation

The project digitized and published online for public access several archives of prominent educators of deaf-blind children I.A. Sokoliansky and A.I. Mesheryakov, as well as materials of the so-called "Zagorsk experiment."

 

2015, Summer: Archival collection "German documents of the Second World War"

2014, Summer: Archival collection "German documents of the First World War"

2013, Fall: State Historic Public Library of Russia's Digital Library

2013, Summer: Archival collection "Documents of German secret services 1912-1945"

2010, Fall: Archival collection "Records of the International Brigades"

 

EXAMPLES OF FORMATS ON THE PLATFORM

BOOKS

NEWSPAPERS

MAPS

ARCHIVAL COLLECTIONS