This ongoing project aims to develop datasets and best practices to test the knowledge of language models in the domain of Buddhist history. The focus is on open-weights models. Open-weights models are important for researchers because they allow researchers full control over their experiments. Experiments are repeatable without restrictions in the foreseeable future, because the models can be archived without constraints.
We use a dataset of multiple-four-choice questions pertaining to different periods of Buddhist history. Currently, Ollama is used as middle-ware for the experiments.
The aim of the project is to track the improvement of open-weights models in the domain of Buddhist history, to understand to what degree knowledge in that field is encoded in language models, and to identify the currently best models to talk to about Buddhist history.
As of , among the tested open-weight models Qwen3.5:27b 'knows' most about Buddhist history.
April 2026 - now
| Model | Test 2025-10 (150 questions) | Test 2026-04 (210 questions) |
|---|---|---|
| deepseek-r1:14b | 128/210 (61%) | |
| gemma2:9b | 81/150 (54%) | |
| gemma3:4b | 60/150 (40%) | |
| gemma3:12b | 93/150 (62%) | 134/210 (64%) |
| gemma3:27b | 96/150 (64%) | 139/210 (66%) |
| gemma4:e4b | 121/210 (58%) | |
| gemma4:31b | 159/210 (76%) | |
| glm-4.7-flash:q4_K_M | 158/210 (75%) | |
| llama3:8b | 84/150 (56%) | |
| llama3.1:8b | 85/150 (57%) | |
| llama4:16x17b | X[1] needs 60GB local memory | |
| mistral:7b | 72/150 (48%) | |
| mistral-nemo:12b | 124/210 (59%) | |
| mixtral:8x7b | 98/150 (65%) | 151/210 (72%) |
| olmo-3.1:32b | X[2] Olmo does not take “think=False”. Workaround slows the response time drastically. | |
| phi3:3.8b | 75/150 (50%) | |
| phi4:14b | 81/150 (54%) | 132/210 (63%) |
| qwen2.5:7b | 86/150 (57%) | |
| qwen2.5:14b | 114/150 (76%) | 158/210 (75%) |
| qwen2.5:32b | 120/150 (80%) | 165/210 (79%) |
| qwen3:8b | 102/150 (68%) | |
| qwen3:14b | 114/150 (76%) | 158/210 (75%) |
| qwen3:32b | 110/150 (73%) | |
| qwen3.5:9b | 153/210 (73%) | |
| qwen3.5:27b | 180/210 (86%) |