@kuhumcst

Centre for Language Technology, University of Copenhagen

Popular repositories

  1. cstlemma Public

    Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefix, infix, suffix, circumfix). Rules are obtained by supervis…

    C++ 35 7

  2. stucco Public archive

    An experimental adaptive UI toolkit.

    Clojure 31 1

  3. xml-hiccup Public

    Convert XML into Hiccup in Clojure and ClojureScript.

    Clojure 20 1

  4. DanNet Public

    The Danish WordNet as an RDF graph.

    Clojure 19

  5. taggerXML Public

    Modernized version of Eric Brill's Part Of Speech tagger.

    C++ 17 6

  6. tf-idf Public

    A reasonably performant TF-IDF implementation.

    Clojure 12 1

Repositories

Showing 10 of 62 repositories
  • texton-Java Public

    Web-based workflow management system that computes candidate tool workflows given input file(s) and the user's requirements regarding the output. Afterwards, runs a workflow selected by the user from the list of candidates. Implemented in Bracmat (~75%) and Java (~25%).

    kuhumcst/texton-Java’s past year of commit activity
    Java 2 GPL-3.0 2 0 0 Updated Jul 23, 2024
  • texton Public

    Text Tonsorium: a toolbox that automatically arranges NLP tools into workflows and runs them on the user's input.

    kuhumcst/texton’s past year of commit activity
    PHP 4 0 2 0 Updated Jul 23, 2024
  • cstlemma Public

    Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefix, infix, suffix, circumfix). Rules are obtained by supervised learning from a full form - lemma list.

    kuhumcst/cstlemma’s past year of commit activity
    C++ 35 GPL-2.0 7 2 0 Updated Jul 23, 2024
  • trigram Public

    Extract trigrams from the lemmas in a tab-separated full form/lemma list. Count the number of occurrences of each trigram. Spaces are added before and after each lemma.

    kuhumcst/trigram’s past year of commit activity
    0 GPL-3.0 0 0 0 Updated Jul 22, 2024
  • texton-linguistic-resources Public

    Linguistic resources for several of the tools included in the Text Tonsorium

    kuhumcst/texton-linguistic-resources’s past year of commit activity
    Roff 1 0 0 0 Updated Jul 19, 2024
  • DanNet Public

    The Danish WordNet as an RDF graph.

    kuhumcst/DanNet’s past year of commit activity
    Clojure 19 MIT 0 29 0 Updated Jul 4, 2024
  • hiccup-tools Public

    Various functions for manipulating Hiccup data.

    kuhumcst/hiccup-tools’s past year of commit activity
    Clojure 0 0 0 0 Updated Jul 4, 2024
  • danish-semantic-reasoning-benchmark Public

    A Danish semantic reasoning benchmark compiled from lexical semantic resources

    kuhumcst/danish-semantic-reasoning-benchmark’s past year of commit activity
    3 0 0 0 Updated Jun 18, 2024
  • kuhumcst/GEHM_zoom_corpus’s past year of commit activity
    0 0 0 0 Updated Jun 7, 2024
  • korp-setups Public

    Docker setups for all Korp installations maintained by NorS.

    kuhumcst/korp-setups’s past year of commit activity
    JavaScript 0 MIT 0 3 1 Updated May 29, 2024
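
The trigram entry above describes a small counting procedure: take the lemma from each tab-separated full form/lemma line, add a space before and after it, and count every trigram. A minimal Python sketch of that description follows. It is not the repository's own code; the function name `count_lemma_trigrams`, the column order (full form first, lemma second), the use of character trigrams, and the handling of malformed lines are all assumptions made for illustration.

```python
from collections import Counter


def count_lemma_trigrams(lines):
    """Count character trigrams over the lemma column of the given lines.

    Assumption: each line is "<full form>\t<lemma>"; lines without a
    second column are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # assumption: ignore lines that lack a lemma column
        # Pad the lemma with one space on each side, as the description states.
        lemma = " " + fields[1] + " "
        # Slide a window of three characters over the padded lemma.
        for i in range(len(lemma) - 2):
            counts[lemma[i:i + 3]] += 1
    return counts


# Hypothetical sample data, not taken from the repository.
sample = ["cats\tcat", "cat\tcat", "dogs\tdog"]
counts = count_lemma_trigrams(sample)
```

In this sketch a lemma that appears on several lines is counted once per line; whether the actual tool deduplicates lemmas first is not stated in the description.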
