Aozora Bunko

This corpus represents the complete collection of out-of-copyright texts on the Aozora Bunko website. Like Project Gutenberg, Aozora Bunko is a crowd-sourced project, begun in 1997, whose volunteers have hand-transcribed thousands of works of Japanese fiction, poetry, drama, essays, and other genres. As of 2017, the site held over 15,000 texts, representing most of the major literary figures of the modern period. Our corpus incorporates the extensive metadata that Aozora volunteers provide for each text. We have also added dates of first publication for about half of the texts. The corpus will continue to be updated as texts and authors are added to Aozora Bunko.

Search the Aozora Bunko corpus