Naver Economic News Crawler and Keyword Extractor

This project is a Python-based web crawler that searches for news articles related to the economy and extracts keywords from them.

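All characters other than Korean and English letters (가-힣a-zA-Z) are removed. Nouns are then extracted with soynlp's unsupervised learning, and keywords are defined from the occurrence probability of words on the ref_date days relative to trg_date. More details are available here.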
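The cleaning and scoring steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `clean_text` and `keyword_scores` are hypothetical, replacing removed characters with a space (rather than deleting them outright) is an assumption made to keep word boundaries, and the score formula is only one plausible way to compare word probabilities between the target date and the reference dates. The soynlp noun extraction step is left out, so the scoring function takes already-extracted nouns.

```python
import re
from collections import Counter

# Matches any run of characters that are neither Hangul syllables nor ASCII letters.
_NON_KO_EN = re.compile(r"[^가-힣a-zA-Z]+")


def clean_text(text: str) -> str:
    """Keep only Korean and English letters.

    Assumption: removed runs become a single space so words stay separated.
    """
    return _NON_KO_EN.sub(" ", text).strip()


def keyword_scores(trg_nouns, ref_nouns):
    """Score each noun seen on the target date (hypothetical formula).

    score = P(word | target) / (P(word | target) + P(word | reference))
    A score near 1.0 means the word is far more frequent on the target date
    than on the reference dates.
    """
    trg = Counter(trg_nouns)
    ref = Counter(ref_nouns)
    trg_total = sum(trg.values())
    ref_total = sum(ref.values())

    scores = {}
    for word, count in trg.items():
        p_trg = count / trg_total
        p_ref = ref[word] / ref_total if ref_total else 0.0
        scores[word] = p_trg / (p_trg + p_ref)
    return scores


if __name__ == "__main__":
    print(clean_text("코스피 3,000 돌파! KOSPI up 1.2%"))
    print(keyword_scores(["금리", "금리", "환율"], ["환율", "주식"]))
```

Words that appear only on the target date get the highest score under this formula, which matches the intuition that a keyword is a word that is unusually frequent on trg_date.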

Requirements

Python 3.x
aiohttp
beautifulsoup4
pandas

Installation

Clone the repository:

git clone https://github.com/kco4776/naver-news-keyword.git

Install the dependencies:

pip install -r requirements.txt

Usage

python main.py -t <target_date> -g <gap>

where <target_date> is the target date in the format YYYYMMDD and <gap> is the number of days to go back from the target date when collecting articles.
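As a rough sketch of how these two options might be handled, the snippet below parses `-t` and `-g` and derives the list of earlier dates. Only the `-t` and `-g` flags and the YYYYMMDD format come from this README; the function names `parse_args` and `reference_dates` are hypothetical, and treating the gap as the `gap` days strictly before the target date (target date excluded) is an assumption.

```python
import argparse
from datetime import datetime, timedelta

DATE_FORMAT = "%Y%m%d"  # YYYYMMDD, as stated in the usage text


def parse_args(argv=None):
    """Parse the -t and -g options (attribute names are illustrative)."""
    parser = argparse.ArgumentParser(description="Naver economic news keyword extractor")
    parser.add_argument("-t", dest="target_date", required=True,
                        help="target date in YYYYMMDD format")
    parser.add_argument("-g", dest="gap", type=int, required=True,
                        help="number of days to go back from the target date")
    return parser.parse_args(argv)


def reference_dates(target_date: str, gap: int):
    """Return the `gap` dates before target_date, newest first.

    Assumption: the target date itself is not part of the reference range.
    strptime raises ValueError if target_date is not a valid YYYYMMDD date.
    """
    target = datetime.strptime(target_date, DATE_FORMAT)
    return [(target - timedelta(days=d)).strftime(DATE_FORMAT)
            for d in range(1, gap + 1)]


if __name__ == "__main__":
    args = parse_args()
    print(args.target_date, reference_dates(args.target_date, args.gap))
```

For example, `python main.py -t 20230301 -g 3` would, under these assumptions, compare 20230301 against 20230228, 20230227, and 20230226.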

The crawler extracts articles from three news sites: 이데일리 (Edaily), 뉴스1 (News1), and 서울경제 (Seoul Economic Daily).

Contributing

If you want to contribute to this project, please fork the repository and create a pull request.

License

This project is licensed under the MIT License.
