
Commit

Fixing classpath, formatting in loglinear, jar naming. Also moved CoNLLBenchmark to itest.
keenon authored and Stanford NLP committed Oct 5, 2015
1 parent 51b26f4 commit 77c5e80
Showing 59 changed files with 2,437 additions and 2,997 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Stanford CoreNLP
================

Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, and Spanish. Stanford CoreNLP is an integrated framework, which make it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government.
Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, and mark up the structure of sentences in terms of phrases and word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for Arabic, (mainland) Chinese, French, German, and Spanish. Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools on it with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, government, and industry.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute.

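A minimal sketch of the "two lines of code" setup the README describes. The annotator list is the standard one from the CoreNLP documentation; the pipeline construction and annotation calls are left commented because they require the CoreNLP jars and models on the classpath, so only the configuration step runs here:

```java
import java.util.Properties;

public class PipelineSketch {
    // Builds the standard annotator configuration for a full CoreNLP pipeline.
    static Properties defaultProps() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,dcoref");
        return props;
    }

    public static void main(String[] args) {
        Properties props = defaultProps();
        // With the CoreNLP jars on the classpath, these two lines run the whole stack:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        //   pipeline.annotate(new Annotation("Stanford is in California."));
        System.out.println(props.getProperty("annotators"));
    }
}
```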
2 changes: 1 addition & 1 deletion doc/loglinear/QUICKSTART.txt
@@ -2,7 +2,7 @@ loglinear package quickstart:

First, read the ConcatVector section in ARCH.txt.

To jump straight into working code, go read generateSentenceModel() in edu.stanford.nlp.loglinear.learning.CoNLLBenchmark.
To jump straight into working code, go read generateSentenceModel() in edu.stanford.nlp.loglinear.CoNLLBenchmark.

#####################################################

2 changes: 1 addition & 1 deletion doc/loglinear/README.txt
@@ -1,6 +1,6 @@
For an explanation of how everything fits together, see ARCH.txt

For a quick runnable object, go run edu.stanford.nlp.loglinear.learning.CoNLLBenchmark in core's test package.
For a quick runnable object, go run edu.stanford.nlp.loglinear.CoNLLBenchmark in core's test package.

For a tutorial, see QUICKSTART.txt

@@ -1,6 +1,9 @@
package edu.stanford.nlp.loglinear.learning;
package edu.stanford.nlp.loglinear;

import edu.stanford.nlp.loglinear.inference.CliqueTree;
import edu.stanford.nlp.loglinear.learning.AbstractBatchOptimizer;
import edu.stanford.nlp.loglinear.learning.BacktrackingAdaGradOptimizer;
import edu.stanford.nlp.loglinear.learning.LogLikelihoodFunction;
import edu.stanford.nlp.loglinear.model.ConcatVector;
import edu.stanford.nlp.loglinear.model.GraphicalModel;
import edu.stanford.nlp.util.HashIndex;
32 changes: 6 additions & 26 deletions itest/src/edu/stanford/nlp/naturalli/OpenIEITest.java
@@ -10,6 +10,7 @@
import org.junit.Test;

import java.util.*;
import java.util.stream.Collectors;

import static org.junit.Assert.*;

@@ -48,16 +49,13 @@ public void assertExtracted(String expected, String text) {
found = true;
}
}
assertTrue("The extraction (" + expected.replace("\t", "; ") + ") was not found in '" + text + "'", found);
assertTrue("The extraction '" + expected + "' was not found in '" + text + "'", found);
}

public void assertExtracted(Set<String> expectedSet, String text) {
public void assertExtracted(Set<String> expected, String text) {
Collection<RelationTriple> extractions = annotate(text).get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
String actual = StringUtils.join(
extractions.stream().map(x -> x.toString().substring(x.toString().indexOf("\t") + 1).toLowerCase()).sorted(),
"\n");
String expected = StringUtils.join(expectedSet.stream().map(String::toLowerCase).sorted(), "\n");
assertEquals(expected, actual);
Set<String> guess = extractions.stream().filter(x -> x.confidence > 0.1).map(RelationTriple::toString).collect(Collectors.toSet());
assertEquals(StringUtils.join(expected.stream().sorted(), "\n").toLowerCase(), StringUtils.join(guess.stream().map(x -> x.substring(x.indexOf("\t") + 1)).sorted(), "\n").toLowerCase());
}

public void assertEntailed(String expected, String text) {
@@ -88,29 +86,12 @@ public void testBasicExtractions() {
assertExtracted("cats\thave\ttails", "some cats have tails");
}

@Test
public void testPaperExamples() {
// assertExtracted("Fish\tlike to\tswim", "Fish like to swim"); // Parse is persistently broken

assertExtracted("Tom\tfighting\tJerry", "Tom and Jerry are fighting.");
assertExtracted("cats\tis with\ttails", "There are cats with tails.");
assertExtracted("IBM\thas\tresearch group", "IBM's research group.");
assertExtracted("rabbits\teat\tvegetables", "All rabbits eat vegetables.");
}

@Test
public void testOtherExamples() {
// Preconj (but, parser currently fails)
// assertExtracted("Mary\tis\tbeautiful", "Mary is both beautiful and smart.");
// assertExtracted(Collections.EMPTY_SET, "Mary is neither beautiful and smart.");
}

@Test
public void testExtractionsGeorgeBoyd() {
assertExtracted(new HashSet<String>() {{
add("George Boyd\tjoined on\t21 february 2013");
add("George Boyd\tjoined for\tremainder");
add("George Boyd\tjoined for\tremainder of season");
add("George Boyd\tjoined on\t21 february 2013");
add("George Boyd\tjoined on\tloan");
add("George Boyd\tjoined on\tloan from peterborough united");
}}, "On 21 February 2013 George Boyd joined on loan from Peterborough United for the remainder of the season.");
@@ -151,7 +132,6 @@ public void testExtractionsObamaWikiThree() {
add("He\twas\tcommunity organizer");
// add("He\tearning\tlaw degree");
add("He\tearning\this law degree");
add("community organizer\tis in\tChicago");
}}, "He was a community organizer in Chicago before earning his law degree.");
}

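The rewritten `assertExtracted` above compares the expected and extracted triples as order-independent canonical strings: strip the confidence prefix up to the first tab, lowercase, sort, and join with newlines. A self-contained sketch of that comparison technique (the triples and confidence values here are made up for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class CanonicalCompare {
    // Canonical form of a set of strings: lowercase each element, sort,
    // and join with newlines, so equality ignores order and case.
    static String canonical(Set<String> items) {
        return items.stream()
                .map(String::toLowerCase)
                .sorted()
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Hypothetical extractions: a confidence score, a tab, then the triple.
        Set<String> raw = new HashSet<>(Arrays.asList(
                "0.98\tcats\thave\ttails",
                "0.42\tDogs\tchase\tcats"));
        // Strip everything up to and including the first tab.
        Set<String> guess = raw.stream()
                .map(x -> x.substring(x.indexOf('\t') + 1))
                .collect(Collectors.toSet());
        Set<String> expected = new HashSet<>(Arrays.asList(
                "cats\thave\ttails",
                "dogs\tchase\tcats"));
        // Comparing canonical forms ignores ordering and case differences.
        System.out.println(canonical(expected).equals(canonical(guess))); // prints "true"
    }
}
```

Comparing joined strings rather than raw sets has the practical benefit that a failing `assertEquals` shows a readable line-by-line diff of the two triple lists.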
@@ -46,15 +46,15 @@ public void testDependencyParserEnglishSD() {
}

// Lower because we're evaluating on PTB + extraDevTest, not just PTB
private static final double EnglishUdLas = 88.72648417258083;
private static final double EnglishUdLas = 84.9873;

/**
* Test that the NN dependency parser performance doesn't change.
*/
public void testDependencyParserEnglishUD() {
DependencyParser parser = new DependencyParser();
parser.loadModelFile("/u/nlp/data/depparser/nn/distrib-2015-04-16/english_UD.gz");
double las = parser.testCoNLL("/u/nlp/data/depparser/nn/data/dependency_treebanks/UD-converted/dev.conll", null);
double las = parser.testCoNLL("/u/nlp/data/depparser/nn/data/dependency_treebanks/USD/dev.conll", null);
assertEquals(String.format("English UD LAS should be %.2f but was %.2f",
EnglishUdLas, las), EnglishUdLas, las, 1e-4);
}
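The regression test above pins the LAS score to a stored constant and allows a 1e-4 absolute tolerance via JUnit's three-argument `assertEquals(expected, actual, delta)`. A tiny sketch of that tolerance check (the scores here are illustrative):

```java
public class TolCheck {
    // Absolute-tolerance comparison, as JUnit's assertEquals(expected, actual, delta) does.
    static boolean withinDelta(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        double pinned = 84.9873;                                  // the stored benchmark score
        System.out.println(withinDelta(pinned, 84.98735, 1e-4));  // prints "true"
        System.out.println(withinDelta(pinned, 84.99, 1e-4));     // prints "false"
    }
}
```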
@@ -257,7 +257,7 @@ public void testSentiment() {

@Test
public void testOpenie() {
testAnnotators("tokenize,ssplit,pos,lemma,depparse,natlog,openie");
testAnnotators("tokenize,ssplit,pos,depparse,natlog,openie");
}

@Test
@@ -415,12 +415,6 @@ public void testSerializeNatLog() {
testAnnotators("tokenize,ssplit,pos,lemma,depparse,natlog");
}


@Test
public void testGender() {
testAnnotators("tokenize,ssplit,pos,gender");
}

/**
* Is the protobuf annotator "CoreNLP complete?"
* That is, does it effectively save every combination of annotators possible?
153 changes: 153 additions & 0 deletions liblocal/README
@@ -0,0 +1,153 @@
-----------------------------------------------------------------------
antlr-runtime.jar

ORIGINAL JAR NAME: antlr-runtime-3.1.2.jar

VERSION: 3.1.2

RELEASE DATE: Feb 21, 2009

SOURCE AVAILABLE: Maven Central

DESCRIPTION: ANTLR runtime, required at runtime by ANTLR-generated code

URL: http://www.antlr.com

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
hamcrest-core.jar

ORIGINAL JAR NAME: hamcrest-core-1.3.jar

VERSION: 1.3

RELEASE DATE: Jul, 2010

SOURCE AVAILABLE: Maven Central

DESCRIPTION: Hamcrest matchers, for JUnit

URL: http://www.hamcrest.org

USED BY:
The JUnit library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
javaruntype.jar

ORIGINAL JAR NAME: javaruntype-1.2.jar

VERSION: 1.2

RELEASE DATE: Aug, 2010

SOURCE AVAILABLE: Maven Central

DESCRIPTION: Java runtime type representation library, used by Quickcheck

URL: http://www.javaruntype.org

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
junit-quickcheck-core.jar

ORIGINAL JAR NAME: junit-quickcheck-core-0.4-beta-3.jar

VERSION: 0.4-beta-3

RELEASE DATE: Nov, 2013

SOURCE AVAILABLE: Maven Central

DESCRIPTION: Quickcheck, runs random inputs and validates outputs

URL: https://github.com/pholser/junit-quickcheck

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
junit-quickcheck-generators.jar

ORIGINAL JAR NAME: junit-quickcheck-generators-0.4-beta-3.jar

VERSION: 0.4-beta-3

RELEASE DATE: Nov, 2013

SOURCE AVAILABLE: Maven Central

DESCRIPTION: Default value generators for junit-quickcheck

URL: https://github.com/pholser/junit-quickcheck

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
junit-theories.jar

ORIGINAL JAR NAME: junit-theories-4.12.jar

VERSION: 4.12

RELEASE DATE: Dec, 2014

SOURCE AVAILABLE: Maven Central

DESCRIPTION: JUnit's Theories runner, which runs a test against many inputs

URL: junit.org

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling

-----------------------------------------------------------------------
ognl.jar

ORIGINAL JAR NAME: ognl-3.05.jar

VERSION: 3.05

RELEASE DATE: Apr, 2012

SOURCE AVAILABLE: Maven Central

DESCRIPTION: Object graph navigation library, used by Quickcheck

URL: https://commons.apache.org/proper/commons-ognl/

USED BY:
The Quickcheck library

LAST UPDATE: 2015/10/5

LAST UPDATE BY: Keenon Werling
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -72,7 +72,6 @@ public class LogConditionalObjectiveFunction<L, F> extends AbstractStochasticCac

/** Multithreading gradient calculations is a bit cheaper if you reuse the threads. */
protected int threads = Execution.threads;
protected ExecutorService executorService = Executors.newFixedThreadPool(threads);

@Override
public int domainDimension() {
@@ -326,7 +325,7 @@ private void calculateCLbatch(double[] x) {
CountDownLatch latch = new CountDownLatch(threads);
for (int i = 0; i < threads; i++) {
runnables[i] = new CLBatchDerivativeCalculation(threads, i, null, x, derivative.length, latch);
executorService.execute(runnables[i]);
new Thread(runnables[i]).start();
}
try {
latch.await();
@@ -684,7 +683,7 @@ public double calculateStochasticUpdate(double[] x, double xscale, int[] batch,
CountDownLatch latch = new CountDownLatch(threads);
for (int i = 0; i < threads; i++) {
runnables[i] = new CLBatchDerivativeCalculation(threads, i, batch, x, x.length, latch);
executorService.execute(runnables[i]);
new Thread(runnables[i]).start();
}
try {
latch.await();
@@ -1005,7 +1004,7 @@ protected void rvfcalculate(double[] x) {
CountDownLatch latch = new CountDownLatch(threads);
for (int i = 0; i < threads; i++) {
runnables[i] = new RVFDerivativeCalculation(threads, i, x, derivative.length, latch);
executorService.execute(runnables[i]);
new Thread(runnables[i]).start();
}
try {
latch.await();
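The patch above drops the shared `ExecutorService` field and instead starts a fresh `Thread` for each batch runnable, with the existing `CountDownLatch` still coordinating completion. A self-contained sketch of that fan-out/latch pattern (the per-thread work is a stand-in for the derivative calculation):

```java
import java.util.concurrent.CountDownLatch;

public class LatchFanOut {
    // Spawn one fresh thread per slice and wait for all of them on a latch,
    // mirroring the patched loop that replaced executorService.execute(...).
    static double run(int threads) throws InterruptedException {
        double[] partial = new double[threads];
        CountDownLatch latch = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            final int id = i;
            new Thread(() -> {
                partial[id] = id * 1.5;  // stand-in for one slice of the gradient
                latch.countDown();       // signal this worker's slice is done
            }).start();
        }
        latch.await();  // block until every worker has counted down
        double sum = 0;
        for (double p : partial) sum += p;
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4)); // prints "9.0"
    }
}
```

One trade-off worth noting: a fixed pool amortizes thread creation, but a pool that is never shut down holds non-daemon threads that can keep the JVM alive; spawning short-lived threads per call avoids that lifecycle concern at the cost of per-call creation overhead.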
