pippokill/rir

Revised Random Indexing

Random Indexing for textual corpora

Random Indexing class: di.uniba.it.rir.cli.RI

usage: Execute Random Indexing [-c ] [-dim ] [-i ] [-lf ] [-ns ] [-o ]
       [-ri] [-rseed ] [-seed ] [-st] [-sw ] [-t ] [-txt] [-w ]
-c Discard words that appear fewer than the given number of times; default is 5
-dim The vector dimension (optional, default 300)
-i Input corpus (compressed GZIP files are supported)
-lf Enable letter filter
-ns Number of negative samples (optional, default 0)
-o Output file
-ri Save random vectors
-rseed Seed for random initialization
-seed The number of seeds (optional, default 10)
-st Use standard analyzer (default false)
-sw Stop word file
-t Threshold for downsampling frequent words (optional, default 0.001)
-txt Enable textual output format
-w Window size (optional, default 5)
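
To make the roles of -dim, -seed, -w and -rseed more concrete, here is a minimal sketch of classic Random Indexing in Java. It is not the code of this repository: the class and method names (RandomIndexingSketch, randomVector, train) are hypothetical, and it assumes that -seed is the number of non-zero entries in each random vector, which is the usual meaning in Random Indexing but is not confirmed by the usage text. Frequency filtering (-c), negative sampling (-ns), downsampling (-t) and any "revised" behaviour are left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of plain Random Indexing; not the project's implementation.
class RandomIndexingSketch {

    // Build a sparse ternary vector of length dim with `seeds` non-zero
    // entries, alternating +1 and -1 at random positions.
    static int[] randomVector(int dim, int seeds, Random rnd) {
        if (seeds > dim) {
            throw new IllegalArgumentException("seeds must not exceed dim");
        }
        int[] v = new int[dim];
        int placed = 0;
        while (placed < seeds) {
            int pos = rnd.nextInt(dim);
            if (v[pos] == 0) {
                v[pos] = (placed % 2 == 0) ? 1 : -1;
                placed++;
            }
        }
        return v;
    }

    // For every token, sum the random vectors of the tokens that occur
    // within `window` positions to its left and right.
    static Map<String, float[]> train(List<String> tokens, int dim, int seeds,
                                      int window, long rseed) {
        Random rnd = new Random(rseed);
        Map<String, int[]> randomVectors = new HashMap<>();
        for (String t : tokens) {
            if (!randomVectors.containsKey(t)) {
                randomVectors.put(t, randomVector(dim, seeds, rnd));
            }
        }
        Map<String, float[]> semantic = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            float[] target = semantic.computeIfAbsent(tokens.get(i), k -> new float[dim]);
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j == i) {
                    continue; // a token is not its own context
                }
                int[] r = randomVectors.get(tokens.get(j));
                for (int d = 0; d < dim; d++) {
                    target[d] += r[d];
                }
            }
        }
        return semantic;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        // Values mirror the documented defaults: dim 300, 10 seeds, window 5.
        Map<String, float[]> vectors = train(tokens, 300, 10, 5, 42L);
        System.out.println("words: " + vectors.size());
    }
}
```

Because the random generator is created from a fixed seed, two runs with the same input and the same rseed value produce identical vectors in this sketch.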

Random Indexing for graph

Random Indexing class: di.uniba.it.rir.cli.RIgraph

usage: Execute Graph Random Indexing [-c ] [-dim ] [-i ]
[-o ] [-r ] [-rseed ] [-seed ] [-txt] [-u]
-c Discard vertices that appear fewer than the given number of times; default is 5
-dim The vector dimension (optional, default 300)
-i Input file (compressed GZIP files are supported)
-o Output file
-r Reflective steps (optional, default 2)
-rseed Seed for random initialization
-seed The number of seeds (optional, default 10)
-txt Enable textual output format
-u Undirected graph

Graph format: one edge per line (see the "sample_graph" file). Comment lines must start with '#'.
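
As an illustration of the format rules above, the following Java sketch reads an edge list while skipping comment and blank lines. It is only a sketch under assumptions: the class and method names (EdgeListSketch, parseEdges) are hypothetical, and the separator between the two vertices is assumed to be whitespace, which the README does not state. The "sample_graph" file remains the reference for the actual layout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for a one-edge-per-line file; the whitespace
// separator between source and target is an assumption.
class EdgeListSketch {

    // Return each edge as a {source, target} pair.
    static List<String[]> parseEdges(String text) {
        List<String[]> edges = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and '#' comments
            }
            String[] parts = line.split("\\s+");
            if (parts.length >= 2) {
                edges.add(new String[] {parts[0], parts[1]});
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        String sample = "# a comment line\nA B\nB C\n";
        for (String[] e : parseEdges(sample)) {
            System.out.println(e[0] + " -> " + e[1]);
        }
    }
}
```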
