A basic implementation of the SparseSCVB0 algorithm for LDA, as described in: Rashidul Islam and James Foulds. "Scalable Collapsed Inference for High-Dimensional Topic Models." Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

rashid-islam/SparseSCVB0

SparseSCVB0

Prerequisites

  • Julia (tested on v0.6.4.1)
  • Optional Julia packages: MAT and JLD (to save the generated results)

The code has been tested on Windows and Linux, and it should work on other platforms as well.

Data format

The input is a single file with one document per line, where the words are separated by spaces. Each word is represented by its one-based index in the dictionary. The demo uses the NIPS corpus, due to Sam Roweis. See the data folder, where NIPS.txt and NIPSdict.txt contain the corpus and the dictionary, respectively.
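
As a rough illustration of this format, the sketch below reads such a file into a vector of integer vectors, one per document. The function name `read_corpus` and the tiny sample file are hypothetical and are not part of the repository. The sketch is written against current Julia syntax and has not been checked on v0.6.4.1.

```julia
# Hypothetical helper (not from the repository): parse a corpus file in which
# each line is one document and each token is a one-based dictionary index.
function read_corpus(path)
    docs = Vector{Vector{Int}}()
    open(path) do io
        for line in eachline(io)
            tokens = split(strip(line))   # split on whitespace
            # A blank line yields an empty document, so line numbers
            # and document numbers stay aligned.
            push!(docs, Int[parse(Int, t) for t in tokens])
        end
    end
    return docs
end

# Write a two-document sample file (made-up data) and read it back.
path, io = mktemp()
write(io, "1 3 3 2\n2 2 5\n")
close(io)

docs = read_corpus(path)
num_docs = length(docs)
# The largest index seen is a lower bound on the dictionary size.
max_index = maximum(maximum(d) for d in docs if !isempty(d))
println("documents: ", num_docs, ", largest word index: ", max_index)
```

Because the indices are one-based, they can be used directly to index a Julia array holding the dictionary words.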

Running SparseSCVB0

To run SparseSCVB0 for LDA on the NIPS corpus, simply run the Julia file "runSparseSCVB0.jl".
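
One way to do this from an interactive Julia session is with `include`. This is a minimal sketch that assumes the session's working directory is the repository root, so that "runSparseSCVB0.jl" can be found there. The existence check is added here for illustration only.

```julia
# Assumption: the current working directory is the repository root.
script = "runSparseSCVB0.jl"

if isfile(script)
    include(script)   # runs the demo script in the current session
else
    println("Cannot find ", script, " in ", pwd(),
            "; change to the repository root first.")
end
```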

Author

License

The SparseSCVB0 code is licensed under the Apache License, Version 2.0.

Acknowledgments

Many parts of the implementation are based on the Julia code implementing the original SCVB0.
