The JMdictDB Project
This page contains information about the development
of a PostgreSQL database to support Jim Breen's Japanese-English dictionary
projects. Jim runs these projects under the auspices of the
Electronic Dictionary Research and Development Group (EDRDG).
The goals of this project (in priority order) are:
- To create a database to serve as a master repository for the
information in the JMdict, EDICT, JMnedict, Examples, Kanjidic
and other related files distributed by Jim Breen and the EDRDG.
- To provide a web-based system for the submission, review, and
approval of corrections and new entries to these data.
- To provide freely available software to others who want to use
or build upon "JMdict in a database".
- To provide an open-source replacement for the principal author's
Microsoft Access-based JMdict database. :-)
Discussion of this project takes place on the email@example.com
mailing list.
Jim Breen maintains a web page describing the JMdict project's
use of JMdictDB at
There is also some older information at
The project code is still undergoing active development and no
promises are made regarding stability or backward compatibility.
However, it is currently in use as the primary repository for the
JMdict project dictionary data, and the web interface is in use for
submitting new entries and corrections to existing entries in WWWJDIC.
All the code developed for this project is GPL'd and maintained
in a publicly accessible Mercurial repository (links below).
Additional help is welcome; please post to the edict-jmdict
mailing list, or email the current principal developer at the
address at the bottom of this page.
The code currently consists of scripts to create and load JMdict (and
related data such as the JMnedict "Japanese names" file, or the Tatoeba
"examples" file) into a PostgreSQL database, some maintenance and other
command line tools, and a set of CGI scripts to allow access and updating
of the database using a web browser. The code was originally written in
Perl but was migrated entirely to Python in May 2008. The code
is developed and tested under Ubuntu Linux and Fedora 15 (both with Apache
web server), and Microsoft Windows XP (with IIS web server). More
information on prerequisites is in the README.txt file.
conj.py is a standalone Python program that uses the conjugation
tables developed for the JMdictDB project to demonstrate how simple
a table-based Japanese word conjugator can be when using this approach.
It has been moved out of JMdictDB to a separate, independent (git) project.
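As a rough illustration of the table-based idea (this is not conj.py
itself, and it does not use the JMdictDB conjugation tables), the sketch
below keeps every rule in a small lookup table. The table layout, the
form names ("negative", "polite", "past") and the handful of rules shown
are assumptions made for this example; only the part-of-speech codes
(v1, v5k, adj-i) are the usual JMdict ones.

```python
# Hypothetical, heavily reduced conjugation table.
# Key:   (JMdict part-of-speech code, form name chosen for this sketch)
# Value: (number of trailing characters to drop, ending to append)
CONJ_TABLE = {
    ("v1", "negative"): (1, "ない"),
    ("v1", "polite"): (1, "ます"),
    ("v1", "past"): (1, "た"),
    ("v5k", "negative"): (1, "かない"),
    ("v5k", "polite"): (1, "きます"),
    ("v5k", "past"): (1, "いた"),
    ("adj-i", "negative"): (1, "くない"),
    ("adj-i", "past"): (1, "かった"),
}


def conjugate(word, pos, form):
    """Conjugate a dictionary-form word using only a table lookup."""
    drop, ending = CONJ_TABLE[(pos, form)]
    # Slice by length so that a rule with drop == 0 keeps the whole word.
    stem = word[:len(word) - drop]
    return stem + ending


if __name__ == "__main__":
    print(conjugate("食べる", "v1", "negative"))  # 食べない
    print(conjugate("書く", "v5k", "polite"))     # 書きます
    print(conjugate("高い", "adj-i", "past"))     # 高かった
```

All of the language knowledge lives in the table, so supporting another
word class or form means adding rows rather than writing new code.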
The JMdictDB web interface now provides a page that lets users change
their saved settings (userid, name, email address, password). It is
accessed by clicking one's user name after logging in. An "administrator"
user privilege level has been added with which any user's settings can be
changed.
The templating system used by the JMdictDB web interface was changed from
an implementation of the TAL/TALES attribute language used in Zope to
Jinja2, a somewhat more readable
and easier-to-use template language.
The "Submissions" link at the top of each JMdictDB page now shows an index page
with links to updates made "Today", "Yesterday", each day of this year grouped
by month, and previous years. The appearance is very close to the databaseupdates.html
page that Jim Breen was maintaining. Formerly, the Submissions link showed a page of
the entries that were updated "today" only.
The contact email of the principal developer of JMdictDB has changed; the
various JMdictDB web pages and forms have been updated with the new address
in the footer.
Try it!
Access to the online test version of JMdictDB.
(Note that these links are to the web pages provided in the JMdictDB
source code. The pages linked to from WWWJDIC are very similar but
have been tweaked to the needs of WWWJDIC.)
Find and edit existing entries: search /
Add a new entry
Please feel free to try these out, including adding any real or junk
entries you want, but be aware that all changes will be thrown away
periodically and will NOT go into the real JMdict.
Code and Documentation
-- Browsable (read-only) access to the JMdictDB code Mercurial repository.
-- Issue tracker for the JMdictDB project software.
-- Download source code, latest development version (gzipped tar file).
-- The README file, includes install prerequisites and instructions (2010-03-10)
-- Comprehensive description of the database schema (2008-11-12).
-- Diagram of the database schema (200KB, 2008-11-12).
-- Source code for last version implemented in Perl, obsolete, 2008-05-03 (gzipped tar file).
Related files, but not part of JMdictDB...
The following HTML pages list all JMdict entries that share a common
kanji or reading text with at least one other entry. The entries
are sorted by the text, making it relatively easy to identify
entries that are very similar and possibly should be merged.
This data is based on the 2007-01-14 version of JMdict.
Shared kanji (800KB)
Shared readings (10MB)
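The grouping behind lists like these can be sketched in a few lines of
Python. This is only an illustration of the idea, not the script that
produced the pages above: the function name, the (entry id, texts) input
shape, and the entry ids in the sample data are all hypothetical, and
the sort is plain Unicode code point order.

```python
from collections import defaultdict


def shared_texts(entries):
    """Return (text, [entry ids]) pairs for texts found in 2+ entries.

    `entries` is an iterable of (entry_id, list_of_texts) pairs, where the
    texts are either the kanji forms or the readings of that entry.
    """
    ids_by_text = defaultdict(set)
    for entry_id, texts in entries:
        for text in texts:
            ids_by_text[text].add(entry_id)
    # Keep only texts shared by more than one entry, sorted by the text.
    return [
        (text, sorted(ids))
        for text, ids in sorted(ids_by_text.items())
        if len(ids) > 1
    ]


if __name__ == "__main__":
    # Made-up entry ids; 買う and 飼う are both read かう.
    sample = [
        (101, ["かう"]),    # 買う
        (102, ["かう"]),    # 飼う
        (103, ["たべる"]),  # 食べる
    ]
    for text, ids in shared_texts(sample):
        print(text, ids)
```

Running the same function once over kanji texts and once over reading
texts would give the two kinds of list.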
Matchup of Kale Stutzman's 2007-01-14
Google hit counts and corresponding JMdict entries
README.txt (also included in the .zip files)
kale-u.zip UTF-8 encoded files
kale-w.zip SJIS (Windows) encoded files
Kale Stutzman's original data file in alternate encodings:
edict-gfreq.euc EUC-JP encoded
edict-gfreq.utf UTF-8 encoded