Character encoding support for Elixir, built with Rustler and the "rust-encoding" crate. Supports Shift-JIS, EUC-JP, Big5, and the other WHATWG encodings.
If available in Hex, the package can be installed by adding mbcs_rs to your list of dependencies in mix.exs:
def deps do
  [
    {:mbcs_rs, "~> 0.1"}
  ]
end
Usage:
iex> MbcsRs.encode!("日本語", "SJIS") |> MbcsRs.decode!("SJIS")
"日本語"
iex> MbcsRs.encode!("你好,世界", "BIG5") |> MbcsRs.decode!("BIG5")
"你好,世界"
iex> MbcsRs.encode!("한국어", "EUC-KR") |> MbcsRs.decode!("EUC-KR")
"한국어"
iex> File.stream!("KEN_ALL.CSV") \
|> Stream.map(&MbcsRs.decode!(&1, "SJIS")) \
|> Stream.filter(&String.contains?(&1, "福岡市中央区")) \
|> Enum.to_list()
["40133,\"810 \",\"8100000\",\"フクオカケン\",\"フクオカシチユウオウク\",\"イカニケイサイガナイバアイ\",\"福岡県\",\"福岡市中央区\",\"以下に掲載がない場合\",0,0,0,0,0,0\n",
...
"40133,\"810 \",\"8100037\",\"フクオカケン\",\"フクオカシチユウオウク\",\"ミナミコウエン\",\"福岡県\",\"福岡市中央区\",\"南公園\",0,0,0,0,0,0\n",
"40133,\"810 \",\"8100022\",\"フクオカケン\",\"フクオカシチユウオウク\",\"ヤクイン\",\"福岡県\",\"福岡市中央区\",\"薬院\",0,0,1,0,0,0\n",
...]
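The streaming example above can also be used to re-encode a whole file. The following is a minimal sketch, not part of mbcs_rs: the module name SjisConvert and the function to_utf8/3 are hypothetical, and the only library call assumed is MbcsRs.decode!/2 exactly as used above. Splitting on newlines before decoding should be safe for Shift-JIS, since the byte 0x0A does not occur inside its multi-byte sequences.

```elixir
defmodule SjisConvert do
  @moduledoc """
  Hypothetical helper (not part of mbcs_rs): re-encodes a Shift-JIS text
  file as UTF-8, one line at a time, without loading it all into memory.
  """

  # `decode` defaults to the MbcsRs.decode!/2 call from the usage section.
  # It is a parameter only so that a different decoder can be swapped in.
  def to_utf8(src, dest, decode \\ &MbcsRs.decode!(&1, "SJIS")) do
    src
    |> File.stream!()                    # lazily read raw lines (still Shift-JIS bytes)
    |> Stream.map(decode)                # decode each line to a UTF-8 binary
    |> Stream.into(File.stream!(dest))   # write each decoded line to the destination
    |> Stream.run()                      # force the stream; returns :ok
  end
end
```

Calling SjisConvert.to_utf8("KEN_ALL.CSV", "KEN_ALL.utf8.csv") would then write a UTF-8 copy of the file; the output filename here is only an illustration.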
Supporting other encodings: see the WHATWG Encoding spec.
Requirements: the Rust compiler and Cargo.
Example for Alpine Linux:
apk add musl-dev rust cargo