TurkuNLP/ATP_kurssi

ATP_kurssi

This page contains all the materials for the course KKLT0030 Automatic Text Processing (5 credits).

The course Moodle page hosts the non-public materials, such as announcements and any recordings.

Mon Oct 28

  • Getting started
  • Notebook 1
  • Commands
    • Getting data and printing stuff: wget, echo
    • Printing files: cat, head, tail
    • Copying, renaming, removing: cp, mv, rm
    • Others: wc -w, ls
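The commands listed above can be tried together in a short session. This is a minimal sketch, not taken from Notebook 1: the file names and the URL in the comment are made up for illustration, and the sample file is created with `printf` so that the example runs without a network connection.

```shell
# Work in a scratch directory so nothing important is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# wget would normally fetch the data, for example (hypothetical URL):
#   wget https://example.org/sample.txt
# Here a small three-line file is created locally instead.
printf 'one two three\nfour five\nsix\n' > sample.txt
echo "Created sample.txt"

cat sample.txt     # print the whole file
head sample.txt    # print the beginning (up to 10 lines by default)
tail sample.txt    # print the end (up to 10 lines by default)
wc -w sample.txt   # count the words in the file

cp sample.txt copy.txt     # copy the file
mv copy.txt renamed.txt    # rename the copy
rm sample.txt              # remove the original
ls                         # list what is left in the directory
```

After the last step only `renamed.txt` remains, which is a quick way to see the difference between `cp`, `mv` and `rm`.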

Thur Oct 31

  • Notebook 2
  • Commands: egrep, sort, uniq
  • Options
    • egrep -v, -i, -w, -c, -B, -A
    • head -n, tail -n
    • wc -l, -w
    • uniq -c, sort -r, -n
  • Pipes, especially frequency counts
    • sort | uniq -c | sort -rn
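The frequency count pipeline can be sketched on a toy word list as follows. The file and its contents are invented for illustration. The example uses `grep -E`, which is the current spelling of `egrep` and accepts the same options.

```shell
workdir=$(mktemp -d)
# One word per line: cat x3, dog x2, bird x1.
printf 'cat\ndog\ncat\nbird\ncat\ndog\n' > "$workdir/words.txt"

# sort puts identical lines next to each other, uniq -c collapses each
# group into one line with a count, and sort -rn orders the counts
# numerically from largest to smallest.
sort "$workdir/words.txt" | uniq -c | sort -rn

# -c counts the matching lines instead of printing them.
grep -E -c 'cat' "$workdir/words.txt"

# -v keeps the lines that do NOT match; -i ignores case.
grep -E -v -i 'CAT' "$workdir/words.txt"

# head -n limits the output to the top of the list.
sort "$workdir/words.txt" | uniq -c | sort -rn | head -n 2
```

The first `sort` is needed because `uniq` only merges adjacent identical lines.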

Mon Nov 4

  • Notebook 3 exercises

Thur Nov 7

  • Notebook 4
  • git clone for cloning GitHub repositories
  • Gzipped files using gzip and zcat
  • Changing characters using tr
    • Combining tr with a frequency list pipeline
    • Using tr to normalize
  • Regular expressions
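As a rough illustration of how these pieces fit together, the sketch below compresses a small text, reads it back with `zcat`, normalizes it with `tr`, and feeds it to the frequency list pipeline. The text and file names are made up, and the normalization steps are one possible choice, not necessarily the ones used in Notebook 4.

```shell
workdir=$(mktemp -d)
printf 'The cat saw the Cat.\nThe dog saw the cat!\n' > "$workdir/text.txt"

# gzip replaces text.txt with the compressed file text.txt.gz.
gzip "$workdir/text.txt"

# zcat prints the uncompressed content without unpacking the file on disk
# (gzip -dc does the same). The tr steps then:
#   1. lowercase everything, so that "Cat" and "cat" are counted together
#   2. delete the punctuation characters . and !
#   3. turn spaces into newlines, giving one word per line
zcat "$workdir/text.txt.gz" |
  tr 'A-Z' 'a-z' |
  tr -d '.!' |
  tr ' ' '\n' |
  sort | uniq -c | sort -rn

# A regular expression with alternation: count the lines that start
# with "The cat" or "The dog".
zcat "$workdir/text.txt.gz" | grep -E -c '^The (cat|dog)'
```

Without the lowercasing step, "The" and "the" would show up as two separate entries in the frequency list.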

Mon Nov 11

  • Notebook 5 exercises

Thur Nov 14

  • Notebook 6
  • Dependency syntax analysis pipeline
  • Sentence + token segmentation, lemmatisation, POS, dependencies
  • CoNLL-U format
  • Universal Dependencies treebanks
  • Trankit parser
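Because CoNLL-U is a tab-separated text format with one token per line, parser output can be explored with the same command-line tools. The sketch below uses a hand-written three-token sample, not real Trankit output, and `cut -f`, which may not be among the commands covered in the notebooks. In CoNLL-U the third column is the lemma and the fourth is the universal POS tag; comment lines start with `#` and sentences are separated by blank lines.

```shell
workdir=$(mktemp -d)
# Hand-written sample sentence; the columns are
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
{
  printf '# text = Dogs bark.\n'
  printf '1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n'
  printf '2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n'
  printf '3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n'
  printf '\n'
} > "$workdir/sample.conllu"

# Token lines start with a digit; this skips comments and blank lines.
# Column 4 is the POS tag, so this gives a POS frequency list.
grep -E '^[0-9]' "$workdir/sample.conllu" | cut -f 4 | sort | uniq -c | sort -rn

# Lemmas (column 3) of all tokens tagged as NOUN.
grep -E '^[0-9]' "$workdir/sample.conllu" | grep -E -w 'NOUN' | cut -f 3
```

Matching `NOUN` with `grep -w` checks the whole line, so on real data it is safer to test the POS column specifically; the simple version is enough for this sample.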

Mon Nov 18

  • Notebook 7
  • Running Python scripts

Thur Nov 21

  • Notebook 8
  • Working on the server (note that the exam will be held on the server!)

Mon Nov 25

  • Notebook 8 cont'd
  • Scripts

Thur Nov 28

  • Notebook 9

Mon Dec 2

  • Notebook 9

Thur Dec 5

  • Notebook 10
  • For loops
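A for loop repeats the same commands for each item in a list, for example for every file in a directory. The following is a minimal sketch with two invented files; it is not taken from Notebook 10.

```shell
workdir=$(mktemp -d)
printf 'a b c\n' > "$workdir/one.txt"
printf 'd e\nf\n' > "$workdir/two.txt"

total=0
# The pattern *.txt expands to every matching file; the loop body
# runs once per file with the current path stored in $file.
for file in "$workdir"/*.txt; do
  # Reading from standard input makes wc print the number only;
  # tr removes the padding spaces that some wc versions add.
  words=$(wc -w < "$file" | tr -d ' ')
  echo "$(basename "$file"): $words words"
  total=$((total + words))
done
echo "Total: $total words"
```

The same pattern works for any command that should be run on many files, such as running a frequency count for each text in a corpus.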

Mon Dec 9

Thur Dec 12

  • Exam, option 1
  • (TBA)

Thur Dec 19

  • Exam, option 2
  • (TBA)
