Exploring DNA storage in the context of preserving and venerating religious text

A) The Diamond Sutra and its encodings

1. The Diamond Sutra (Skr. Vajracchedikā Prajñāpāramitā Sūtra) is a Mahāyāna Buddhist text composed in India sometime between the 1st and the 4th century. It belongs to a group of works today sometimes referred to as prajñāpāramitā (Perfection of Wisdom) literature. One of the main teachings of the Diamond Sutra is that all concepts and phenomena are ultimately without foundation (skr. apratiṣṭha); they have no firm ontological basis, but are momentary and not to be relied on. Be this as it may, even if the ultimate reality of all concepts is denied, the Diamond Sutra, like any other text needs a medium, a carrier for its information to persist. As such, the Diamond Sutra probably originated as a written Sanskrit text by an anonymous author and evolved over a few centuries. In India, Buddhists adopted writing early on and the earliest epigraphy and the earliest Indian manuscripts are connected to Buddhism.
[For an overview of prajñāpāramitā literature see Zacchetti (2015). On the various versions of the text see Harrison (2006, 2010, 2015)]

2. In 401 CE the Sanskrit text of the Diamond Sutra was translated into Chinese by the famous translator Kumārajīva. Although translated five more times over the next three hundred years, it was Kumārajīva's version that remained the most popular. When printing was invented in East Asia, Buddhists were again enthusiastic "early adopters" of the new medium. In 868 Kumārajīva's translation was carved on wood-blocks, the text was printed, and one copy of this print is today the oldest extant, dated book of our species.
[The print was discovered in Dunhuang, a city on the silkroad, in 1900 and is now preserved in the British Library.]

3. In the late 20th century information became increasingly stored in digital form. Buddhists again started to explore the new medium and digitized their sacred texts in various ways. Kumārajīva's translation is now part of a large digital corpus of Chinese Buddhist texts, produced and curated by an organization called CBETA. Having discarded analog ink and paper, it now exists digitally as bits and bytes encoded according to the Unicode standard.

4. In the early 21st century humans experiment with still other forms of information storage. Digital storage is brittle, the lifespan of storage media such as hard-drives or magnetic tapes is measured in decades. How could digital information be encoded for centuries? DNA molecules, which have been used by life forms on this planet successfully for c. 4 billion years to transmit information, seem worth a thought. Kept dry and dark, DNA can last for hundreds, even thousands of years, without the need for electricity. It also has a much higher information density than magnetic tape and one cubic centimeter of DNA is enough to encode all texts in the Library of Congress. Once created DNA can be copied cheaply and rapidly. The obvious bottlenecks are that encoding cultural information by synthesizing DNA is still expensive, and access and retrieval of stored information is slow. Nevertheless, as early adopters of information technology, Buddhists again explore how their texts can be stored and preserved in this new medium.
[For an overview of approaches to encoding/decoding digital data into quaternary DNA code see Wang et al., 2022. The image to the right is a Open AI DALL-E 2 output prompted with "a photorealistic image of transferring a sutra into a DNA" (2022-10-20)]

5. With the help of a Temple University Presidential Humanities and Arts Grant it was possible to produce 20 3D-printed stupas containing a capusule with DNA that encodes the Diamond Sutra in Kumārajīva's translation. The earliest printed book is now the first Buddhist text encoded in DNA. The Loretta C. Duckworth Scholars Studio at Temple Libraries organized a miniature stupa 3D-printing contest. The winner of the contest, Tom Leighton, crafted 20 stupas that we aim to distribute to interested stakeholders in the world of Buddhism. The stupas are linked to a Zenodo online repository via a QR code containing its DOI at the bottom of the stupa. We are interested in feedback about how such stupas can be integrated in ritual spaces and practices, and are open to follow-up projects concerning the long-term preservation of Buddhist texts in DNA.

B) The Project

In 2018, Marcus Bingenheimer, Rob J. Kulathinal, Matt Shoemaker, and Justin Brody decided to explore the feasibility of DNA storage for cultural heritage information, and in 2019 obtained funding from Temple University to explore encoding algorithms, produce the first Buddhist text in DNA, and embed this DNA in a 3D printed stupa. In the few years since, research in DNA digital data storage has progressed rapidly, and some of the questions we had in the beginning are already moot. Industry alliances are hard at work to standardize DNA storage. What looked as a niche interest in 2018 has become a dynamic research field at the interstices of computer science and bio-engineering.

One of the steps toward DNA storage is converting the binary digital data into a quaternary encoding scheme that expresses the four nucleic acid bases that encode information in the DNA. This necessitates the use of error correction algorithms. As part of our experiments we ported the DNA fountain algorithm by Erlich et al. (2017) to Python3 [on github].

As prices for this type of DNA synthesis were falling rapidly, it became clear the best solution would be to let an outside company handle the encoding and synthesis.

This project was made possible by a Temple University Presidential Humanities and Arts Grant (2019).

More data and documentation:
Questions? Please contact Marcus Bingenheimer

Erlich, Yaniv, Dina Zielinski.2017. “DNA Fountain enables a robust and efficient storage architecture.” Science 355: 950–954 (2017), 3 March 2017
Harrison, Paul. 2006. “Vajracchedikā Prajñāpāramitā A New English Translation of the Sanskrit Text Based on Two Manuscripts from Greater Gandhāra.” In J. Braarvig (ed.) Buddhist Manuscripts, Vol. III: 133–159. Oslo: Hermes Academic Publishing. (Manuscripts in the Schøyen Collection.)
Harrison, Paul. 2010. “Experimental core samples of Chinese translations of two Buddhist Sūtras analysed in the light of recent Sanskrit manuscript discoveries.” Journal of the International Association of Buddhist Studies Vol. 31-1/2: 205-250.
Harrison, Paul. 2015. “The British Library Vajracchedikā Manuscript - IOL San 383–387, 419–427.” In S. Karashima, J. Nagashima, K. Wille (eds.) Buddhist Manuscripts from Central Asia - The British Library Sanskrit Fragments Vol. III.2: 823-866. Tokyo: The International Research Institute for Advanced Buddhology Soka University.
Wang, C., Ma, G., Wei, D. et al. Mainstream encoding–decoding methods of DNA data storage. CCF Trans. HPC 4, 23–33 (2022).
Zacchetti, Stefano. 2015. “Prajñāpāramitā Sūtras.” In J. Silk (ed.) Brill‘s Encyclopedia of Buddhism Vol.1 (Literature and Languages), pp. 171–209. Leiden: Brill.