Зографска електронна научноизследователска библиотека

Zograf Electronic Research Library. Integration of Electronic Resources

Andrej Bojadžiev

Introduction

The electronic description and publication of the literary sources from the Holy Monastery of Zograf is a process that unites several approaches and ideas. First, it is an attempt to integrate the methods for describing different kinds of written documents: manuscripts, charters and archives. Second, the sources are written in different languages, so cataloguing has to take into account the descriptive tradition of each of them. Third, the descriptions and editions need to be linked to information about personal and place names, as well as about the events mentioned in the sources. Last but not least, tools have to be developed for searching the descriptions and editions and for extracting data from them.

Development basis and related projects

The electronic version of the Zograf Research Library relies on the experience gained from the project “Repertorium of Old Bulgarian Literature and Letters” (Repertorium) for the description of the sources. The two projects describe the sources in a similar way; the differences between them stem from their different goals. Repertorium focuses mainly on describing the content of Slavic medieval manuscripts, whereas the inventory of the sources from the Zograf Monastery covers both the Slavic manuscripts and all other manuscripts and archives of the Holy Monastery. Moreover, Repertorium does not aim to describe a single collection; it is closer to an electronic unified catalog. Our work, in contrast, is focused on the description, study and publication of written sources from a single collection. Given this task, the electronic version of the catalog is meant to provide links and information about the people, places and events mentioned in the sources. This turns the site into a special kind of encyclopedic reference work on the history of a literary center, from the appearance of its first written records to the present day.

A similar unified catalogue to Repertorium, but with the task of describing the manuscript heritage in only one country, Sweden, is the Manuscripta project. Unlike Repertorium, it includes more paleographic and codicological data, as well as digital copies of the manuscripts themselves.

Another project, that of colleagues from the University of Hamburg, dedicated to the digitization of descriptions of Ethiopian manuscripts together with photographs of the sources themselves (Beta Masaheft), is in an early stage of development. In terms of its goals, it is very close to the project for the electronic description of the Zograf Holy Monastery, but has the character of a unified catalogue dedicated to only one linguistic tradition – Ethiopian.

Our efforts also come close to those of our colleagues developing an electronic environment for Syriac culture (Syriaca.org). However, their emphasis is on information about place and personal names, and the sources are used only as a source of data, without detailed descriptions.

In the field of electronic description of Islamic manuscripts, our initiative comes closest to the project for their unified catalogue in the UK (FIHRIST). However, we are not aware of a project dedicated to the digital description of Ottoman archives that would be accessible on the Internet.

The attribution of texts relies on various encyclopedic reference works, many of which are now available in electronic form. For Slavic manuscript texts, the Versiones Slavicae project is of particular interest: it presents information on titles and authors together with a bibliography of sources and links to the Pinakes project for Greek texts. This integration of data is extremely important because it allows links to be recorded and exchanged between the different projects, which greatly facilitates identifying sources and coordinating the forms and languages of names and titles. All this contributes to the unification and standardization of the individual forms. A similar initiative has been introduced in the Repertorium project, where the scholarly titles of the texts are given in three languages: Bulgarian, Russian and English.

A comparable initiative in the field of paper research is the Bernstein project (Memory of Paper). It allows researchers to identify a watermark and point to its exact digital copy, and thus to establish when the paper was produced. In addition, the project is preparing a multilingual dictionary and systematization of terms in the field of paper and watermarks.

The technology

In the modern humanities, it is increasingly accepted that data is first processed and prepared for publication in electronic form and only then, if necessary, presented in a version on paper. In recent years this approach has become predominant in the description and cataloguing of old manuscripts and archives. It has necessitated the creation of electronic description models that integrate photographic data and provide links to digital editions of texts. There is no longer any debate about whether a single model for describing manuscripts from different language traditions (Slavic, Greek, Ottoman or English) is possible; current technologies make this relatively easy. Views and approaches still differ on how to combine descriptions with editions and with other types of data, and how to link them in semantic networks (Semantic Web). In this respect the web is developing extremely rapidly, and it is becoming ever easier to connect a local project with similar initiatives in the digital humanities.

The Zograf Electronic Library is based on XML, the popular standard for exchanging, archiving and storing documents in electronic form. We use the model for describing manuscripts and archives shared with the Repertorium project (a variant of TEI), which has been developed, changed and enriched continuously since 1995. The novelty of the Zograf Library lies not so much in the technical correspondence between the two projects as in the requirement that the technology allow different types of source descriptions, corresponding to different linguistic traditions, to be combined. In other words, an attempt has been made to integrate different types of data (inventories, editions, terminological information, cross-references and bibliography) on the basis of a single computer model. One difference, for example, is the use of links to Internet maps (GeoNames) and to Versiones Slavicae, Pinakes and Repertorium for authors and scholarly titles of texts. In some cases, references to the Virtual International Authority File (VIAF) are also given.

This made it possible to use a large part of the XML-related technologies common in the digital humanities, such as languages for transforming and publishing data (XSLT, XSL-FO) and languages for searching and querying (XPath, XQuery). The data itself is stored in the eXist database, which lets us combine data extraction with presentation as HTML pages, using technologies such as CSS, JS and jQuery. The pages themselves are designed using Bootstrap, a software library that combines HTML with CSS and JS, and the general interface and design of the pages were taken from Bootswatch. Thus, the entire electronic presence of the Zograf Library is built on a combination of free and open-source technologies.

The project takes the idea of links between descriptions, editions and so-called authority files from the dsebaseapp initiative.

Electronic description and research approaches

Here we will focus on just a few of the important approaches to combining the data from the description with various reference works, on the one hand, and with previous catalog numbers and signatures (shelfmarks), on the other.

Watermarks

With freely accessible watermark databases now available on the Internet, the project can link the information extracted from the manuscript to them and verify it.

Example 1:

            <watermark>
               <motif xmlns="http://www.ilit.bas.bg/repertorium/ns/3.0" 
               facs="http://www.ksbm.oeaw.ac.at/_scripts/php/loadRepWmark.php?rep=briquet&amp;refnr=10731&amp;lang=fr">
               Ръкавица със звезда отгоре</motif>
               <term type="watermarks">подобен</term>
               <rs type="work" ref="#Briquet1923">Briquet 1923</rs>
               <num>10731</num>
               <date>1509 г.</date>
            </watermark>

The link to the external database is given as the value of the facs attribute. In this case, it leads to the electronic version of the Briquet album. The watermark is then characterized by one of the most general terms used for such comparisons, recorded in the term element; here its content is подобен (“similar”). This is followed by a bibliographic reference in rs (reference string), a catalog number (num) and a date (date). Where the identification cannot be linked to and checked against an online resource, a simple bibliographic reference is given.
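The way such a watermark record can be read by a program may be illustrated with a minimal sketch in Python, using only the standard library. The helper name read_watermark and the keys of the returned dictionary are assumptions made for this illustration and are not part of the project; the sample writes the ampersands of the URL as &amp;amp; so that the fragment parses as XML.

```python
import xml.etree.ElementTree as ET

# Namespace declared on <motif> in Example 1; the sibling elements carry none there.
REP_NS = "http://www.ilit.bas.bg/repertorium/ns/3.0"

WATERMARK = """
<watermark>
  <motif xmlns="http://www.ilit.bas.bg/repertorium/ns/3.0"
         facs="http://www.ksbm.oeaw.ac.at/_scripts/php/loadRepWmark.php?rep=briquet&amp;refnr=10731&amp;lang=fr">Ръкавица със звезда отгоре</motif>
  <term type="watermarks">подобен</term>
  <rs type="work" ref="#Briquet1923">Briquet 1923</rs>
  <num>10731</num>
  <date>1509 г.</date>
</watermark>
"""


def read_watermark(xml_text):
    """Hypothetical helper: collect the fields of one watermark record."""
    root = ET.fromstring(xml_text.strip())
    motif = root.find(f"{{{REP_NS}}}motif")
    source = root.find("rs")
    return {
        "motif": (motif.text or "").strip(),
        "facs": motif.get("facs"),                # link to the external database
        "similarity": root.findtext("term"),      # general term for the comparison
        "source": source.get("ref").lstrip("#"),  # key of the bibliographic entry
        "number": root.findtext("num"),
        "date": root.findtext("date"),
    }


record = read_watermark(WATERMARK)
print(record["source"], record["number"], record["facs"])
```

A record without a facs attribute would simply yield None for that field, which corresponds to the case where only a bibliographic reference is available.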

Toponyms in the sources

In the description, place names are given with a reference to an authority file containing a list of the relevant places. In descriptions of Slavic manuscripts it is common for parts of a manuscript to be kept today in other places around the world.

Example 2:

   
            <rs type="place" ref="#Sankt-Peterburg">Санкт Петербург</rs>

This reference indicates that the name is a place (type="place") and points to the entry in the file with the list of places (ref="#Sankt-Peterburg"). The corresponding place entry, with more data, looks like this:

Example 3:

               <place xml:id="Sankt-Peterburg">
                  <placeName type="pref" xml:lang="bul">Санкт Петербург</placeName>
                  <placeName type="alt" xml:lang="rus">Санкт-Петербург</placeName>
                  <location>
                     <geo decls="#LatLng">59.93863 30.31413</geo>
                  </location>
                  <idno>http://www.geonames.org/498817/</idno>
               </place>

This code can be read as follows. All information about a place, regardless of the type of toponym, is gathered in the place element, which carries an identifier as the value of its xml:id attribute. The name of the place can be given in several languages (xml:lang), and one form can be marked as preferred (type="pref") over the others (type="alt"). Next comes the location, with geographical coordinates entered in the geo element. This element always carries the attribute decls (declarations) with the value #LatLng (short for latitude and longitude). The idno element then records a reference to the Internet resource used (in this case, the GeoNames site). The purpose of this file is the gradual accumulation of geographical data that can be used in the description of all sources, as well as in electronic editions of texts. The data is displayed on a map using the Leaflet software library. The information in this file can also be combined with other data in searches: a reference to a description, or a cross-reference to personal names or terms.
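How a reference such as the one in Example 2 can be resolved against the list of places may be sketched as follows. This is only an illustration in Python with the standard library: the listPlace wrapper around the place entry and the function names index_places and resolve are assumptions, since the examples show fragments rather than the whole file.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The <listPlace> wrapper is assumed for the sake of a well-formed sample.
PLACES = """
<listPlace>
  <place xml:id="Sankt-Peterburg">
    <placeName type="pref" xml:lang="bul">Санкт Петербург</placeName>
    <placeName type="alt" xml:lang="rus">Санкт-Петербург</placeName>
    <location>
      <geo decls="#LatLng">59.93863 30.31413</geo>
    </location>
    <idno>http://www.geonames.org/498817/</idno>
  </place>
</listPlace>
"""

REFERENCE = '<rs type="place" ref="#Sankt-Peterburg">Санкт Петербург</rs>'


def index_places(xml_text):
    """Hypothetical helper: map each xml:id to its name, coordinates and URI."""
    index = {}
    for place in ET.fromstring(xml_text.strip()).iter("place"):
        # <geo> holds latitude and longitude separated by whitespace.
        lat, lng = (float(v) for v in place.findtext("location/geo").split())
        index[place.get(XML_ID)] = {
            "preferred": place.findtext("placeName[@type='pref']"),
            "lat": lat,
            "lng": lng,
            "uri": place.findtext("idno"),
        }
    return index


def resolve(rs_text, index):
    """Follow the ref attribute of an <rs type="place"> to its place entry."""
    rs = ET.fromstring(rs_text)
    return index[rs.get("ref").lstrip("#")]


place = resolve(REFERENCE, index_places(PLACES))
print(place["preferred"], place["lat"], place["lng"], place["uri"])
```

The latitude and longitude obtained in this way are the kind of values a map library such as Leaflet expects for placing a marker.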

Personal names in sources and descriptions

Place names, regardless of type, can in most cases be accompanied by geographical coordinates and limited to general information. Personal names are far more varied. A detailed classification is not possible here, but at least the following types are important in the description and edition of texts:

Name of the scribe (writer) of the manuscript;
Name of the author of the text;
Names mentioned in notes and appendices. In turn, these names can be those of monks, rulers of various ranks, or names of secular figures.

We can easily continue this classification. The important thing from the point of view of electronic presentation is that in some cases we have additional information, and in others it is not available or known to us at this stage.

Example 4:

                   <person xml:id="Кирил_монах">
                      <persName xml:lang="bul">
                         <forename>Кирил </forename>
                      </persName>
                      <note>
                         ...
                         <p>монах</p>
                      </note>
                   </person>

In this case, the only fact we know about Cyril so far is that he was a monk. We have considerably more information about the authors of texts, but instead of copying and reworking the data from existing reference works, it is better to link to them. Compare, for example, the following case:

Example 5:

                <person xml:id="Амфилохий_Иконийски">
                     <persName xml:lang="bul">
                        <addName>Иконийски</addName>
                        <forename>Амфилохий </forename>
                     </persName>
                      <note>
                         <p>Кесария Кападокийска ок. 330 – ок. 379 година Кесария Кападокийска</p>
                         <p>Автор</p>
                         <p>http://pinakes.irht.cnrs.fr/notices/auteur/152/</p>
                        ...
                      </note>
                  </person>

In this example, the first paragraph (p) gives the places and approximate dates of the author's birth and death, the second contains a general typology (Author), and the third provides a link to the Pinakes project for further information about the author and his works.
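The difference between the two kinds of person records can also be checked mechanically. The following sketch, in Python with the standard library, reports for each record whether its note already contains a link to an external resource. The function name summarize and the keys of the result are assumptions made for this illustration, and the elided parts of the notes (...) are simply left out of the samples.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

MONK = """
<person xml:id="Кирил_монах">
  <persName xml:lang="bul"><forename>Кирил </forename></persName>
  <note><p>монах</p></note>
</person>
"""

AUTHOR = """
<person xml:id="Амфилохий_Иконийски">
  <persName xml:lang="bul">
    <addName>Иконийски</addName>
    <forename>Амфилохий </forename>
  </persName>
  <note>
    <p>Кесария Кападокийска ок. 330 – ок. 379 година Кесария Кападокийска</p>
    <p>Автор</p>
    <p>http://pinakes.irht.cnrs.fr/notices/auteur/152/</p>
  </note>
</person>
"""


def summarize(xml_text):
    """Hypothetical helper: display name and external links of one person."""
    person = ET.fromstring(xml_text.strip())
    name = person.find("persName")
    # Forename first, then the additional name, skipping whichever is absent.
    parts = [name.findtext("forename", ""), name.findtext("addName", "")]
    paragraphs = [(p.text or "").strip() for p in person.iterfind("note/p")]
    links = [t for t in paragraphs if t.startswith(("http://", "https://"))]
    return {
        "id": person.get(XML_ID),
        "display_name": " ".join(s.strip() for s in parts if s.strip()),
        "links": links,
        "has_external_reference": bool(links),
    }


for sample in (MONK, AUTHOR):
    info = summarize(sample)
    print(info["display_name"], info["has_external_reference"])
```

Treating any paragraph that starts with http:// or https:// as a link is a simplification chosen here; a dedicated element for such references would make the test unnecessary.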

There is no doubt that as more information about the names in the Zografski Knižovniki accumulates, the information of the first type will gradually take on the form of the data in Example 5.

History of the sources in the collection

The history of how sources entered the collection, and of the changes in their catalog numbers and signatures, is among the hardest things to trace for the Slavic manuscripts in the Zograf Monastery, and it remains one of the little-studied aspects of the collection. One of the goals of the description is to shed light on the history of preservation by tracing the catalog numbers and signatures recorded in the printed descriptions and in the manuscript catalogs kept in the library. There are three places from which such information can be obtained:

Already printed inventories and catalogs;
Manuscript inventories kept in the monastery library;
Data from the bindings, endpapers, and the pages themselves of the manuscript books.

First, the current catalog number and signature are recorded:

Example 6:

                     <idno type="shelfmark">Зогр. 2</idno>
                     <idno type="catalogue" n="2">2</idno>
                     <idno type="mss_cat" n="2">2</idno>
                     <note place="inline">
                        <locus>защитен лист</locus> с мастило. Под него написано 
                        <quote xml:lang="bul">Миней</quote>, зачертано и добавено <quote xml:lang="bul">Лѣствица в 1638</quote>.
                        <rs type="work" ref="#Райков1994">Райков и др. 1994: 29</rs>
                        <rs type="work" ref="#Стоилов1903">Стоилов 1903</rs>
                        <rs type="work" ref="#Каталог1">Каталог 1</rs>
                     </note>

In the modern description of the collection, it was decided to use the catalog numbers from the latest catalog of Slavic manuscripts and to continue the numbering for manuscripts not yet cataloged. The catalog number thus also becomes the signature of the manuscript. In this example, the signature is entered in an idno element whose type attribute has the value shelfmark. When the numbers coincide, the idno element is repeated with different values of type, in this case type="catalogue" and type="mss_cat". This is followed by a clarification, placed in a note (note). The purpose of this note is to give additional information, to state where the information comes from and, if possible, to accompany it with bibliographic references. The locus element indicates the place in the manuscript where the number can be seen, together with any notes left by librarians or catalogers. The original text is placed in the quote element, with the language indicated (xml:lang="bul"). Bibliographic references follow, ordered from the most recent publication to the oldest. In this case, the information comes from a printed catalog, from the catalog of Anton Stoilov and from the first preserved manuscript catalog in the monastery itself.

Example 7:

  
                    <altIdentifier type="former">
                           <idno type="unioncatalogue">300</idno>
                           <note>
                              <rs type="bibl" ref="#Турилов2016">Турилов, Мошкова 2016</rs>
                           </note>
                    </altIdentifier>

A number from a union catalogue is placed in the altIdentifier element (alternative identifier); its idno carries the corresponding value of the type attribute (in this case unioncatalogue, i.e. union catalogue) and is accompanied by a bibliographic reference.

Next come the numbers from the remaining catalogs and manuscript inventories, arranged from the latest to the earliest, followed by the old signatures, again in the same order.

Example 8:

 
                           <altIdentifier type="former">
                               <idno type="catalogue">163 </idno>
                               <note>
                                  <locus>Преден защитен лист </locus> цифрата 2 зачертана с молив и написано със същия молив > Iljinskij!  
                                  <rs type="work" ref="#Каталог1937">Каталог 1937 </rs> 
                                  – допълнено по-късно с индекс а, т.е. 163а  <rs type="work" ref="#Ильинский1908">Ильинский 1908 </rs>
                               </note>
                            </altIdentifier>
                           
                            <altIdentifier type="former">
                               <idno type="mss_cat">45 </idno>
                               <note>
                                  <locus>1r </locus>номерът  <quote xml:lang="bul">148 </quote> 
                                  е зачертан с молив и написано отстрани  <quote xml:lang="bul">45 </quote>.
                               </note>
                            </altIdentifier>
                           
                            <altIdentifier type="former">
                               <idno type="mss_cat">148 </idno>
                               <note>
                                  <locus>1r </locus> с мастило:  <quote xml:lang="bul">З.Б. № 148 </quote>
                                  <rs type="work" ref="#Каталог2">Каталог 2 </rs>
                               </note>
                            </altIdentifier>

The idea of this approach is to gradually gather information about the various descriptions of the manuscripts and their movements within the library itself, linking them into a single chain of information. In this way, the history of the manuscript collection can be reconstructed more fully.
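The chain of former identifiers can be extracted from such a description with a few lines of code. The sketch below, in Python with the standard library, uses a shortened version of the entries in Example 8. The msIdentifier wrapper, the function name former_identifiers and the keys of the result are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened from Example 8; the <msIdentifier> wrapper is assumed here.
IDENTIFIERS = """
<msIdentifier>
  <altIdentifier type="former">
    <idno type="catalogue">163 </idno>
    <note>
      <rs type="work" ref="#Каталог1937">Каталог 1937 </rs>
      <rs type="work" ref="#Ильинский1908">Ильинский 1908 </rs>
    </note>
  </altIdentifier>
  <altIdentifier type="former">
    <idno type="mss_cat">45 </idno>
    <note><locus>1r </locus></note>
  </altIdentifier>
  <altIdentifier type="former">
    <idno type="mss_cat">148 </idno>
    <note>
      <locus>1r </locus>
      <rs type="work" ref="#Каталог2">Каталог 2 </rs>
    </note>
  </altIdentifier>
</msIdentifier>
"""


def former_identifiers(xml_text):
    """Hypothetical helper: list former numbers in document order."""
    history = []
    for alt in ET.fromstring(xml_text.strip()).iter("altIdentifier"):
        if alt.get("type") != "former":
            continue
        idno = alt.find("idno")
        history.append({
            "kind": idno.get("type"),
            "value": (idno.text or "").strip(),
            # Keys of the bibliographic entries that attest this number.
            "sources": [rs.get("ref").lstrip("#") for rs in alt.iterfind("note/rs")],
        })
    return history


history = former_identifiers(IDENTIFIERS)
# The description lists numbers from latest to earliest; reverse for a timeline.
timeline = [entry["value"] for entry in reversed(history)]
print(" -> ".join(timeline))
```

Because the entries are recorded from the latest to the earliest, reversing the list is enough to obtain a chronological sequence, provided that this ordering is kept consistently in the descriptions.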

Conclusion

There is no doubt that, as the project develops, some solutions and integrations between the individual parts of the software, as well as the connections with related projects, will change. What matters is that the connections already established between the individual structural units allow for specialized searches, data extraction and typology. Their further development will be driven by the need for specialized analyses, such as combining and visualizing the dates of the sources or the dependencies between personal names, the places where they occur and the documents. In our case, the description of the sources goes hand in hand with the building of the authority files and the bibliography, a necessity that is also dictated by the development of modern technologies.

References

Ильинский, Григорий А. 1908. Рукописи Зографского монастыря на Афоне. Известия Русского археологического института в Константинополе 13. 253–276.

Райков, Божидар, С. Кожухаров, Х. Миклас, Хр. Кодов. 1994. Каталог на славянските ръкописи в библиотеката на Зографския манастир в Света гора. София: CIBAL.

Стоилов, Антон. 1903. Преглед на славянските ръкописи в Зографския манастир. Библиотека (Приложение на „Църковен вестник“) 3 (7–9). 117–160.

Турилов, Анатолий, Л. Мошкова. 2016. Славянские рукописи Афонских обителей. 2. изд. Београд – Москва: Алетеа.

Bojadžiev, Andrej, A. Miltenova, D. Radoslavova. 2003. A Unified Model for the Description of Medieval Manuscripts? In: A. Miltenova, D. Birnbaum, S. Slevinski (eds.), Computational Approaches to the Study of Early and Modern Slavic Languages and Texts. Proceedings of the “Electronic Description and Edition of Slavic Sources” conference, 24–26 September 2002, Pomorie, Bulgaria, 113–136. Sofia: Boyan Penev Publishing Center.

Bojadžiev, Andrej. 2009. Constructing Repertorium 3.0. Another View of the TEI P5 Model for Medieval Manuscript Descriptions. Scripta & e-Scripta 7. 13–48.

Bojadžiev, Andrej. 2012. Guidelines to Repertorium Initiative Model for Manuscript Descriptions. Scripta & e-Scripta 10–11. 9–103. http://repertorium.obdurodon.org/Bojadziev_Scripta_11.pdf

Briquet, C. M. 1923. Les Filigranes. Dictionnaire historique des marques du papier dès leur apparition vers 1282 jusqu'en 1600. 4 Vol. 2ème édition. Leipzig: Hiersemann.

Frauenknecht, E., C. Kämmerer, P. Rückert, M. Stieglecker. 2018. Watermark-Terms. Vocabulary for Watermark Description. http://www.memoryofpaper.eu/products/watermark_terms_v11.1_en.pdf

Hillman, D. I., R. Guenther, A. Hayes. 2008. Metadata Standards and Applications. Trainee Manual. Washington: Library of Congress. https://www.loc.gov/catworkshop/courses/metadatastandards/pdf/MSTraineeManual.pdf

Miltenov, Yavor and Aneta Dimitrova. 2018. The “Versiones Slavicae” Database and the Old Church Slavonic Translations of John Chrysostom's Homilies. In: L. Sels, J. Fuchsbauer, V. Tomelleri & I. De Vos (eds.), Editing Medieval Texts from a Different Angle: Slavonic and Multilingual Traditions (Orientalia Lovaniensia Analecta, 276). 213–224. Leuven–Paris–Bristol, CT: Peeters.

Projects

Bernstein: Bernstein. Memory of Paper.

Beta Masaheft: Beta maṣāḥǝft: Manuscripts of Ethiopia and Eritrea. (Schriftkultur des christlichen Äthiopiens: eine multimediale Forschungsumgebung).

dsebaseapp: Digital (scholarly) Editions Base Application.

FIHRIST: FIHRIST. Union catalogue of manuscripts from the Islamicate world.

GeoNames: Geographical Database.

Manuscripta: Manuscripta. A digital catalogue of medieval and early modern manuscripts kept in Swedish libraries.

Pinakes: Pinakes | Πίνακες. Textes et manuscrits grecs.

Repertorium: „Репертоар на старобългарската литература и книжнина“ (Repertorium of Old Bulgarian Literature and Letters).

Syriaca.org: Syriaca.org: The Syriac Reference Portal.

Versiones Slavicae: Versiones Slavicae. Catalogue of medieval Slavonic translations and their Greek sources.

VIAF: The Virtual International Authority File.

Standards, technologies, software

Bootstrap: An open-source toolkit for developing with HTML, CSS, and JS.

Bootswatch: Free themes for Bootstrap.

CSS: Cascading Style Sheets.

eXist: A NoSQL document database and application platform.

HTML: HyperText Markup Language.

jQuery: JavaScript library.

JS: JavaScript / ECMAScript.

Leaflet: An open-source JavaScript library for mobile-friendly interactive maps.

Semantic Web: Web of Data.

TEI: Text Encoding Initiative.

XML: Extensible Markup Language (XML).

XPath: A language for addressing parts of an XML document.

XQuery: An XML Query Language.

XSL-FO: XSL Formatting Objects.

XSLT: The Extensible Stylesheet Language Family (XSL).