public class ClauseSplitterSearchProblem
extends java.lang.Object
For usage at test time, load a model with
ClauseSplitter.load(String), and then take the top clauses of a given tree
with topClauses(double, int), yielding a list of
SentenceFragments.
// apply(tree, assumedTruth) builds the search problem for one dependency tree
ClauseSplitter splitter = ClauseSplitter.load("/model/path/");
List<SentenceFragment> sentences = splitter.apply(tree, true).topClauses(threshold, maxClauses);
For training, see ClauseSplitter.train(Stream, File, File).
| Modifier and Type | Class and Description |
|---|---|
| static interface | ClauseSplitterSearchProblem.Action - An action being taken; that is, the type of clause splitting going on. |
| static interface | ClauseSplitterSearchProblem.Featurizer - Mostly just an alias, but make sure our featurizer is serializable! |
| class | ClauseSplitterSearchProblem.State - A search state. |
| static class | ClauseSplitterSearchProblem.TrainingOptions - The options used for training the clause searcher. |

| Modifier and Type | Field and Description |
|---|---|
| boolean | assumedTruth - The assumed truth of the original clause. |
| static ClauseSplitterSearchProblem.Featurizer | DEFAULT_FEATURIZER - The default featurizer to use during training. |
| protected static java.util.Map<java.lang.String,java.util.List<java.lang.String>> | HARD_SPLITS - A specification for clause splits we _always_ want to do. |
| protected static java.util.Set<java.lang.String> | INDIRECT_SPEECH_LEMMAS - A set of words which indicate that the complement clause is not factual, or at least not necessarily factual. |
| int | sentenceLength - The length of the sentence, as determined from the tree. |
| SemanticGraph | tree - The tree to search over. |

| Modifier | Constructor and Description |
|---|---|
|  | ClauseSplitterSearchProblem(SemanticGraph tree, boolean assumedTruth) - Create a clause searcher which searches naively through every possible subtree as a clause. |
| protected | ClauseSplitterSearchProblem(SemanticGraph tree, boolean assumedTruth, java.util.Optional<Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String>> isClauseClassifier, java.util.Optional<java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>>> featurizer) - Create a searcher manually, supplying a dependency tree, an optional classifier for when to split clauses, and a featurizer for that classifier. |

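The featurizer turns a (state, action, state) transition into a sparse bag of named features, which is then scored against the classifier's weights. The following is a minimal sketch of that scoring step only, not the library's implementation: it assumes a plain Map<String, Double> as a stand-in for Counter<String>, and the class name FeatureScoreSketch and the feature names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: Map<String, Double> stands in for Counter<String>.
final class FeatureScoreSketch {

    /** Sparse dot product: features with no matching weight contribute 0. */
    static double dot(Map<String, Double> features, Map<String, Double> weights) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : features.entrySet()) {
            Double w = weights.get(e.getKey());
            if (w != null) {
                sum += e.getValue() * w;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up feature names, for illustration only.
        Map<String, Double> features = new HashMap<>();
        features.put("edge=ccomp", 1.0);
        features.put("depth=2", 1.0);

        Map<String, Double> weights = new HashMap<>();
        weights.put("edge=ccomp", 0.5);
        weights.put("unused", 3.0);

        System.out.println(dot(features, weights)); // prints 0.5
    }
}
```

Because the score depends on feature names lining up with weight names, a featurizer that does not match the one used in training would silently produce features with no weight.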
| Modifier and Type | Method and Description |
|---|---|
| protected void | search(IndexedWord root, java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments, Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String> classifier, java.util.Map<java.lang.String,? extends java.util.List<java.lang.String>> hardCodedSplits, java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>> featurizer, java.util.Collection<ClauseSplitterSearchProblem.Action> actionSpace, int maxTicks) - The core implementation of the search. |
| void | search(java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments) - Search, using the default weights / featurizer. |
| void | search(java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments, Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String> classifier, java.util.Map<java.lang.String,java.util.List<java.lang.String>> hardCodedSplits, java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>> featurizer, int maxTicks) - Search from the root of the tree. |
| java.util.List<SentenceFragment> | topClauses(double thresholdProbability, int maxClauses) - Get the top few clauses from this searcher, cutting off at the given minimum probability. |

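The two arguments of topClauses(double, int) act as two independent cutoffs: a minimum probability and a hard cap on the result count. The sketch below shows one way to read that contract. It is an assumption about the behavior, not the library's code: the class TopClausesSketch, the Scored stand-in for a scored SentenceFragment, and the best-first ordering of the input are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a probability cutoff combined with a count limit.
final class TopClausesSketch {

    /** Stand-in for a SentenceFragment together with its probability. */
    static final class Scored {
        final String text;
        final double probability;

        Scored(String text, double probability) {
            this.text = text;
            this.probability = probability;
        }
    }

    /**
     * Walks candidates that are assumed to be sorted best-first and stops
     * at the first one below the threshold, or once maxClauses are kept.
     */
    static List<String> select(List<Scored> bestFirst, double thresholdProbability, int maxClauses) {
        List<String> kept = new ArrayList<>();
        for (Scored s : bestFirst) {
            if (s.probability < thresholdProbability || kept.size() >= maxClauses) {
                break;
            }
            kept.add(s.text);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Scored> candidates = Arrays.asList(
                new Scored("the cat sat", 0.9),
                new Scored("the cat", 0.6),
                new Scored("sat", 0.2));

        System.out.println(select(candidates, 0.5, 5)); // prints [the cat sat, the cat]
        System.out.println(select(candidates, 0.1, 1)); // prints [the cat sat]
    }
}
```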
protected static final java.util.Map<java.lang.String,java.util.List<java.lang.String>> HARD_SPLITS
protected static final java.util.Set<java.lang.String> INDIRECT_SPEECH_LEMMAS
public final SemanticGraph tree
public final boolean assumedTruth
public final int sentenceLength
public static final ClauseSplitterSearchProblem.Featurizer DEFAULT_FEATURIZER
protected ClauseSplitterSearchProblem(SemanticGraph tree, boolean assumedTruth, java.util.Optional<Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String>> isClauseClassifier, java.util.Optional<java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>>> featurizer)
Prefer ClauseSplitter.load(String) over this constructor.
Parameters:
tree - The dependency tree to search over.
assumedTruth - The assumed truth of the tree (relevant for natural logic inference). If in doubt, pass in true.
isClauseClassifier - The classifier for whether a given dependency arc should be a new clause. If this is not given, all arcs are treated as clause separators.
featurizer - The featurizer for the classifier. If no featurizer is given, one should be given in search(java.util.function.Predicate, Classifier, Map, java.util.function.Function, int), or else the classifier will be useless.
See Also:
ClauseSplitter.load(String)

public ClauseSplitterSearchProblem(SemanticGraph tree, boolean assumedTruth)
Parameters:
tree - The dependency tree to search over.
assumedTruth - The truth of the premise. Almost always true.

public java.util.List<SentenceFragment> topClauses(double thresholdProbability, int maxClauses)
Parameters:
thresholdProbability - The threshold under which to stop returning clauses. This should be between 0 and 1.
maxClauses - A hard limit on the number of clauses to return.
Returns:
A list of SentenceFragment objects, representing the top clauses of the sentence.

public void search(java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments)
Note that topClauses(double, int) may be a more convenient method for an end user.
Parameters:
candidateFragments - The callback function for results. The return value defines whether to continue searching.

public void search(java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments, Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String> classifier, java.util.Map<java.lang.String,java.util.List<java.lang.String>> hardCodedSplits, java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>> featurizer, int maxTicks)
Parameters:
candidateFragments - The callback function.
classifier - The classifier for whether an arc should be on the path to a clause split, a clause split itself, or neither.
featurizer - The featurizer to use during search; its output is scored by a dot product with the weights.
See Also:
search(Predicate)

protected void search(IndexedWord root, java.util.function.Predicate<Triple<java.lang.Double,java.util.List<Counter<java.lang.String>>,java.util.function.Supplier<SentenceFragment>>> candidateFragments, Classifier<ClauseSplitter.ClauseClassifierLabel,java.lang.String> classifier, java.util.Map<java.lang.String,? extends java.util.List<java.lang.String>> hardCodedSplits, java.util.function.Function<Triple<ClauseSplitterSearchProblem.State,ClauseSplitterSearchProblem.Action,ClauseSplitterSearchProblem.State>,Counter<java.lang.String>> featurizer, java.util.Collection<ClauseSplitterSearchProblem.Action> actionSpace, int maxTicks)
Parameters:
root - The root word to search from. Traditionally, this is the root of the sentence.
candidateFragments - The callback for the resulting sentence fragments. This is a predicate over a triple of values; the return value of the predicate determines whether we should continue searching. Per the signature, the triple consists of a Double score, a List of Counter<String> feature counters, and a Supplier that lazily produces the SentenceFragment.
classifier - The classifier for whether an arc should be on the path to a clause split, a clause split itself, or neither.
featurizer - The featurizer to use. Make sure this matches the weights!
actionSpace - The action space we are allowed to take. Each action defines a means of splitting a clause on a dependency boundary.
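
The callback contract described for candidateFragments (the search offers each candidate to a predicate and stops when the predicate returns false) can be sketched without the library. Everything below is a hypothetical stand-in: the class CallbackSketch, the Candidate type (which keeps only the score and the lazy fragment, omitting the feature list), and the driver loop are assumptions for illustration, not the actual search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical illustration of a search that is steered by a Predicate callback.
final class CallbackSketch {

    /** Simplified stand-in for the (score, features, lazy fragment) triple. */
    static final class Candidate {
        final double score;
        final Supplier<String> fragment;

        Candidate(double score, Supplier<String> fragment) {
            this.score = score;
            this.fragment = fragment;
        }
    }

    /** Offers candidates in order; stops once the callback returns false. Returns how many were offered. */
    static int search(List<Candidate> candidates, Predicate<Candidate> callback) {
        int offered = 0;
        for (Candidate c : candidates) {
            offered++;
            if (!callback.test(c)) {
                break;
            }
        }
        return offered;
    }

    /** Collects fragments until maxResults have been gathered, then tells the search to stop. */
    static List<String> collect(List<Candidate> candidates, int maxResults) {
        List<String> results = new ArrayList<>();
        search(candidates, c -> {
            results.add(c.fragment.get()); // the fragment is only built when requested
            return results.size() < maxResults;
        });
        return results;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate(-0.1, () -> "first"),
                new Candidate(-0.7, () -> "second"),
                new Candidate(-2.3, () -> "third"));

        System.out.println(collect(candidates, 2)); // prints [first, second]
    }
}
```

Returning the fragment through a Supplier lets a callback reject a candidate on its score alone without paying the cost of building the fragment.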