GALLOWS VARIANTS AS NULL CHARACTERS IN THE VOYNICH MANUSCRIPT
INTRODUCTION

The Voynich manuscript is a vellum-bound book held in Yale University's Beinecke Rare Book and Manuscript Library. It is at least 400 years old and may have been written as early as the thirteenth century. The manuscript is of interest principally because the text is encrypted in a code that has sent scholars and code breakers away in defeat for nearly 100 years. The Voynich manuscript is not written in any known language; even the character set is unique to the corpus. It is, for many, an irresistible puzzle.

The problem of the manuscript has attracted interest from many disparate groups: medievalists, linguists, computer scientists, and cryptographers. Members of the US intelligence community have been unofficially involved in attempts to break the Voynich cipher since the 1940s (Reeds, 1995). Little is known of the provenance and authorship of the manuscript, and theories about its content and origins abound (D'Imperio, 1977; Landini & Zandbergen, 1998).

Efforts have been made to examine the text on the assumption that it is a coherent document written in a natural language. These attempts have ruled out simple substitution ciphers and other elementary encryption schemes. Word length, character and term frequency, and comma counts all yield results that hint at something more than gibberish. Landini has demonstrated that the manuscript satisfies the rank-frequency, number-frequency, length-frequency and length-rank laws for natural language text (Landini, 1997).

Since the translation effort is a massive undertaking involving many researchers, this study posed a small but concrete question whose answer is intended to chip away at the mystery of the manuscript. Examining the text of the manuscript itself offers a useful avenue of approach. Because the character set is unique and its meaning unknown, there is a great deal of speculation about the position, composition, and formatting of characters, words, and word strings.
One possibility raised in discussion by Voynich scholars is that a specific and frequently occurring set of characters (the "gallows" variants) represents textual metadata, perhaps nulls or word breaks (Grant, 2000). Because this question can be investigated quantitatively, it became the focus of this study. Examining the manuscript under the assumption that the gallows characters are non-textual shed light on its underlying structure. Eliminating all eight characters, and various combinations of them, made it possible to compare the altered texts with the intact original and examine the differences statistically. The observed differences and similarities demonstrated the semantic importance of the gallows characters to the manuscript as a whole.
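The elimination procedure described above can be illustrated with a minimal Python sketch. It is not the study's actual code. It assumes an EVA-style transliteration in which the eight gallows variants are written `t`, `k`, `p`, `f`, `cth`, `ckh`, `cph`, and `cfh`; the study's own transcription alphabet may differ. The function names `strip_glyphs`, `summarize`, and `compare_variants` are hypothetical, and the summary statistics (word count, distinct word count, mean word length) are simple stand-ins for whatever statistical comparison is actually used.

```python
import re
from itertools import combinations

# Assumption: EVA-style spelling of the eight gallows variants.
# The transcription used in the study may encode them differently.
GALLOWS = ("t", "k", "p", "f", "cth", "ckh", "cph", "cfh")

# Match the three-letter gallows before falling back to single characters,
# so that "cth" is read as one glyph rather than as c + t + h.
GLYPH = re.compile(r"cth|ckh|cph|cfh|.", re.S)


def strip_glyphs(text, glyphs):
    """Return text with every glyph in `glyphs` deleted."""
    drop = set(glyphs)
    return "".join(g for g in GLYPH.findall(text) if g not in drop)


def summarize(text):
    """Return (word count, distinct word count, mean word length in characters)."""
    words = text.split()
    if not words:
        return 0, 0, 0.0
    mean_len = sum(len(w) for w in words) / len(words)
    return len(words), len(set(words)), mean_len


def compare_variants(text, max_size=2):
    """Summarize the intact text and each text with a gallows subset removed.

    Subsets of size 1..max_size are tried, plus the full set of eight.
    """
    results = {(): summarize(text)}
    sizes = set(range(1, max_size + 1)) | {len(GALLOWS)}
    for size in sorted(sizes):
        for combo in combinations(GALLOWS, size):
            results[combo] = summarize(strip_glyphs(text, combo))
    return results


if __name__ == "__main__":
    # Invented toy string in EVA-like spelling, not a transcription of any folio.
    sample = "qokeedy cthor otal qokeedy"
    for removed, stats in compare_variants(sample, max_size=1).items():
        print(removed or "intact", stats)
```

Reading the text glyph by glyph, rather than with plain string replacement, keeps the removal of `t` from silently turning `cth` into `ch`, so each variant can be eliminated independently of the others.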