Analysing French Universal Dependencies
Starting at release 5.1.2, two options are available when analysing French Universal Dependencies.
First, download the latest release, language packs, and configuration files from: https://github.com/joliciel-informatique/talismane/releases
With the first option, the output contains no compound postags. If "des" is analysed as ADP+DET, it will be split into two lines. If it is analysed as DET, it will remain a single line.
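The splitting rule above can be illustrated with a minimal Python sketch. It is not Talismane's implementation: the `CONTRACTIONS` table, the `split_compound` function, and the `(form, postag)` row shape are all hypothetical, chosen only to show how one token with a compound postag becomes two lines while a token with a simple postag stays on one.

```python
# Hypothetical lookup of French contractions, for illustration only.
# It is NOT taken from Talismane's lexicon or output format.
CONTRACTIONS = {
    "du": ("de", "le"),
    "des": ("de", "les"),
    "au": ("à", "le"),
    "aux": ("à", "les"),
}


def split_compound(form, postag):
    """Return one (form, postag) row per simple postag in `postag`."""
    tags = postag.split("+")
    if len(tags) == 1:
        # Simple postag such as DET: the token remains a single line.
        return [(form, postag)]
    parts = CONTRACTIONS.get(form.lower())
    if parts is None or len(parts) != len(tags):
        # Unknown contraction: repeat the surface form on every row.
        parts = (form,) * len(tags)
    return list(zip(parts, tags))


print(split_compound("des", "ADP+DET"))  # two rows
print(split_compound("des", "DET"))      # one row
```

The actual content of the split lines produced by Talismane may differ; check the output file to see the exact form.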
The configuration file to use (replace 5.2.0 with the latest version of Talismane) is: talismane-fr-ud-output-5.2.0.conf
Use a command similar to:
java -Xmx2G -jar -Dconfig.file=talismane-fr-ud-output-5.2.0.conf talismane-core-5.2.0.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=data/frTest.txt --outFile=data/frTest-ud.tal --logConfigFile=examples/conf/logback.xml
Alternatively, you may prefer compound postags, which are not part of the UD tagset: ADP+DET (for words like "du" and "aux") and ADP+PRON (for words like "duquel" and "auxquelles").
The configuration file to use (replace 5.2.0 with the latest version of Talismane) is: talismane-fr-ud-5.2.0.conf
Use a command similar to:
java -Xmx2G -jar -Dconfig.file=talismane-fr-ud-5.2.0.conf talismane-core-5.2.0.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=data/frTest.txt --outFile=data/frTest-ud.tal --logConfigFile=examples/conf/logback.xml
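Since the two commands differ only in the configuration file, and both embed the version number, a small helper can build either one. The sketch below is an illustration rather than part of Talismane: the function name `talismane_command` and its parameters are hypothetical, and it assumes that other releases follow the same file naming pattern as 5.2.0.

```python
import shlex


def talismane_command(version, compound_postags, in_file, out_file):
    """Build the java command line for one of the two French UD options.

    Assumption: file names for other releases follow the 5.2.0 pattern.
    """
    # "fr-ud" keeps compound postags; "fr-ud-output" splits them.
    variant = "fr-ud" if compound_postags else "fr-ud-output"
    conf = f"talismane-{variant}-{version}.conf"
    return [
        "java", "-Xmx2G", "-jar",
        f"-Dconfig.file={conf}",
        f"talismane-core-{version}.jar",
        "--analyse", "--sessionId=fr", "--encoding=UTF8",
        f"--inFile={in_file}",
        f"--outFile={out_file}",
        "--logConfigFile=examples/conf/logback.xml",
    ]


cmd = talismane_command("5.2.0", False, "data/frTest.txt", "data/frTest-ud.tal")
# Print a copy-pastable shell command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Returning a list of arguments instead of a single string avoids quoting problems if a file path contains spaces.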