
NERtwork is a collection of scripts to help you create a network graph of co-occurring named entities using open source tools. This is done by using Stanford Named Entity Recognizer to identify named entities in the documents, then using NetworkX to create a bipartite projected network and exporting the node and edge lists for use in network vis…


brandontlocke/NERtwork


Batch script for Named Entity Recognition


Requirements

At the moment, this only works on macOS (Mac OS X).

The script needs one or more text files (ending in .txt) to work.

You will need to download Stanford Named Entity Recognizer (Stanford NER).

Folder Setup

This script, as is, runs Stanford NER on every text file in a folder. It expects the stanford-ner-2018-02-27 folder, all of the text files, and the batchner.sh script to be in the same folder.

project folder
├── stanford-ner-2018-02-27
├── batchner.sh
├── file1.txt
├── file2.txt
├── file3.txt
├── file4.txt
├── file5.txt
├── file6.txt
└── etc.

If you're familiar with shell scripting and file navigation, you can fairly easily restructure this.
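Before running anything, it can help to confirm that the folder matches the layout described above. The following is a minimal sketch of such a check, not part of batchner.sh; the function name check_layout and its two optional arguments (project folder and NER folder name) are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of batchner.sh.
# Usage: check_layout [project_dir] [ner_folder_name]
check_layout() {
    dir="${1:-.}"
    ner_dir="${2:-stanford-ner-2018-02-27}"
    status=0

    # The Stanford NER folder must sit next to the script and text files.
    if [ ! -d "$dir/$ner_dir" ]; then
        echo "missing folder: $ner_dir"
        status=1
    fi

    # The batch script itself.
    if [ ! -f "$dir/batchner.sh" ]; then
        echo "missing script: batchner.sh"
        status=1
    fi

    # At least one .txt file; an unmatched glob stays literal and fails -e.
    set -- "$dir"/*.txt
    if [ ! -e "$1" ]; then
        echo "no .txt files found"
        status=1
    fi

    return "$status"
}
```

After sourcing the function, `check_layout .` prints one line per missing item and returns a non-zero status if anything is missing.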

Running the Script

In Terminal, navigate to the folder containing these files and run sh batchner.sh. The script takes a while to run and writes all of the results to a file called entities.csv in the same folder.
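For readers who want a rough idea of what a batch run involves, here is a minimal sketch of a loop that tags every .txt file and collects the output. It is not the contents of batchner.sh, and it does not reproduce the format of entities.csv. The function names run_ner and batch_ner, the NER_DIR variable, and the output name entities_sketch.tsv are all assumptions made for illustration. The java invocation follows the command-line usage given in the Stanford NER README.

```shell
#!/bin/sh
# Sketch only: the real batchner.sh is not reproduced here.
# NER_DIR is an assumed variable pointing at the Stanford NER folder.
NER_DIR="${NER_DIR:-stanford-ner-2018-02-27}"

# Run Stanford NER on one file and print the tagger's tab-separated output.
run_ner() {
    java -mx600m -cp "$NER_DIR/stanford-ner.jar:$NER_DIR/lib/*" \
        edu.stanford.nlp.ie.crf.CRFClassifier \
        -loadClassifier "$NER_DIR/classifiers/english.all.3class.distsim.crf.ser.gz" \
        -outputFormat tabbedEntities \
        -textFile "$1"
}

# Tag every .txt file in the current folder.
# Each output line is prefixed with the name of the file it came from.
batch_ner() {
    out="${1:-entities_sketch.tsv}"
    : > "$out"                      # start with an empty output file
    for f in ./*.txt; do
        [ -e "$f" ] || continue     # skip the literal pattern if no match
        run_ner "$f" | while IFS= read -r line; do
            printf '%s\t%s\n' "${f#./}" "$line"
        done >> "$out"
    done
}
```

Calling `batch_ner` from the project folder writes to entities_sketch.tsv, so it does not overwrite the entities.csv produced by the real script.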


Notes

As new versions of Stanford NER come out, the folder name (currently stanford-ner-2018-02-27) will change, and the file path in the script will need to be updated to match.
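One way to avoid editing the path by hand is to look the folder up by pattern. The sketch below assumes the folder name always starts with stanford-ner-, and the helper name find_ner_dir is hypothetical. It returns the last match in lexicographic order, which for date-stamped names such as stanford-ner-2018-02-27 is the most recent one.

```shell
#!/bin/sh
# Hypothetical helper: find a stanford-ner-* folder instead of hardcoding it.
# Usage: find_ner_dir [project_dir]
find_ner_dir() {
    base="${1:-.}"
    latest=""
    # Globs expand in sorted order, so the last directory seen sorts highest.
    for d in "$base"/stanford-ner-*/; do
        [ -d "$d" ] || continue
        latest="${d%/}"             # drop the trailing slash
    done
    # Print the match; return non-zero if no folder was found.
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

A script could then set its path with something like `NER_DIR=$(find_ner_dir .) || exit 1`.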
