A Clojure wrapper library for Sudachi, a Japanese morphological analyzer written in Java.


Quick Start

If you are using Leiningen, clone this repository into your project's checkouts/ directory, so that the library ends up in a subdirectory such as your-project/checkouts/sudachi-clj/.


With Leiningen's checkout dependencies feature, you can define the dependencies in your project.clj as follows:

:dependencies [[org.clojure/clojure "1.10.1"]
               [sudachi-clj "0.1.0"]]

After that, you can load this library with (require '[sudachi-clj.core :as sudachi]).

Sudachi Dictionary

The morphological analysis engine requires a Sudachi dictionary, which is not bundled with this library and must be downloaded separately.

Once you have downloaded the Sudachi dictionary file, place it in any location (e.g. /path/to/system_core.dic).

If you set the environment variable SUDACHI_DICTIONARY_FILE to the path of the dictionary file, the system locates the dictionary automatically. Alternatively, you can pass the location as an argument to the start function described below.
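Before starting the system, it can help to confirm that a dictionary file is actually reachable. The following is a minimal sketch, not the library's own lookup logic: the helper name resolve-dictionary-file is hypothetical, and it only assumes the two options described above (an explicit path, or the SUDACHI_DICTIONARY_FILE environment variable).

```clojure
(require '[clojure.java.io :as io])

(defn resolve-dictionary-file
  "Hypothetical helper, not part of sudachi-clj.
  Prefers explicit-path; falls back to the SUDACHI_DICTIONARY_FILE
  environment variable. Returns the path only if it names an existing
  regular file, otherwise nil."
  [explicit-path]
  (let [path (or explicit-path
                 (System/getenv "SUDACHI_DICTIONARY_FILE"))]
    (when (and path (.isFile (io/file path)))
      path)))

;; Pass nil to rely on the environment variable alone.
(resolve-dictionary-file nil)
```

A nil result means that neither option points at a readable file, which is worth checking before calling start.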


Let's try it out in the Leiningen REPL:

lein repl

To perform morphological analysis, you must first start the Sudachi system by calling the sudachi-clj.core/start function.

sudachi-clj.core=> (start)

Alternatively, you can pass the location of the Sudachi dictionary explicitly instead of setting the environment variable SUDACHI_DICTIONARY_FILE:

sudachi-clj.core=> (start :dictionary-file "/path/to/system_core.dic")

Now you are ready to analyze sentences. For example:

sudachi-clj.core=> (analyze "宇宙、生命、そして万物についての究極の疑問の答え")
[["宇宙" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
 ["、" ["補助記号" "読点" "*" "*" "*" "*"]]
 ["生命" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
 ["、" ["補助記号" "読点" "*" "*" "*" "*"]]
 ["そして" ["接続詞" "*" "*" "*" "*" "*"]]
 ["万物" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
 ["に" ["助詞" "格助詞" "*" "*" "*" "*"]]
 ["つい" ["動詞" "一般" "*" "*" "五段-カ行" "連用形-イ音便"]]
 ["て" ["助詞" "接続助詞" "*" "*" "*" "*"]]
 ["の" ["助詞" "格助詞" "*" "*" "*" "*"]]
 ["究極" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
 ["の" ["助詞" "格助詞" "*" "*" "*" "*"]]
 ["疑問" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
 ["の" ["助詞" "格助詞" "*" "*" "*" "*"]]
 ["答え" ["名詞" "普通名詞" "一般" "*" "*" "*"]]]
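
Because the result is a plain vector of [surface part-of-speech-tags] pairs, it can be processed with ordinary sequence functions. The sketch below uses a few entries copied from the output above rather than calling analyze, so it runs without a dictionary; the helper surfaces-with-pos is hypothetical and not part of sudachi-clj.

```clojure
;; Entries copied from the REPL output above: [surface pos-tags].
(def sample-result
  [["宇宙" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
   ["、" ["補助記号" "読点" "*" "*" "*" "*"]]
   ["生命" ["名詞" "普通名詞" "一般" "*" "*" "*"]]
   ["そして" ["接続詞" "*" "*" "*" "*" "*"]]])

(defn surfaces-with-pos
  "Hypothetical helper: returns the surface forms whose first
  (top-level) part-of-speech tag equals pos."
  [pos result]
  (for [[surface [top-level]] result
        :when (= top-level pos)]
    surface))

(surfaces-with-pos "名詞" sample-result)
;; => ("宇宙" "生命")
```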

Note that the sudachi-clj.core/analyze function returns nil if the system has not been started yet or has already been stopped.

After completing the morphological analysis, it's recommended to shut down the system to free resources, using sudachi-clj.core/stop.

sudachi-clj.core=> (stop)
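
One way to make sure stop always runs, even when the analysis throws, is a small try/finally wrapper. This is only a sketch: the name call-with-system is hypothetical and not part of sudachi-clj, and it takes the start and stop functions as arguments so that the helper itself does not depend on the library.

```clojure
(defn call-with-system
  "Hypothetical helper: calls (start-fn), then (f), and always calls
  (stop-fn) afterwards, even if f throws. Returns the result of f."
  [start-fn stop-fn f]
  (start-fn)
  (try
    (f)
    (finally
      (stop-fn))))

;; Intended use, assuming sudachi-clj is on the classpath and the
;; dictionary location is configured as described above.
(comment
  (require '[sudachi-clj.core :as sudachi])
  (call-with-system sudachi/start sudachi/stop
                    #(sudachi/analyze "宇宙、生命")))
```

Since analyze returns nil when the system is not running, any analysis should happen inside f rather than after the wrapper returns.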


The settings passed to the underlying Java Sudachi library are defined under the :sudachi-clj.config/json key in resources/config.edn. If you want to change the plug-in settings or other options, edit them there.


Copyright © 2019 sandmark

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http:

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at