Machine Learning and NLP – Page 2

FSharp.NLP.Stanford.Parser justification or StackOverflow questions understanding.

21/07/2013 | 25/02/2021 | F#, Machine Learning and NLP | 8 Comments

Some weeks ago, I announced FSharp.NLP.Stanford.Parser and now I want to clarify the goals of this project and show an example of usage.

First of all, this is not an attempt to re-implement any functionality of Stanford Parser. It is just a thin layer that aims to simplify interaction with Java collections (especially the Iterable interface) and to bring the power of F# constructs (such as pattern matching and discriminated unions) to code that deals with tagging results.

Task

Let’s start with a sample NLP task: we want to show related questions before a user asks a new one (as StackOverflow does). There are many possible solutions. Let’s look at one that first tries to identify the key phrases of the question and then runs a search using them.

Approach

First of all, let’s choose some real questions from StackOverflow to analyze them:

Now we can use Stanford Parser GUI to visualize the structure of these questions:

As you can see, this question is about “F# project” and “object browser”.

This question is about “WebSharper”, “Mono 3.0” and “Mac”.

This one is about “extra methods”, “type providers” and “F#”.

The last one is about “MonoDevelop” and “F# projects”.

Notice that all the phrases we have selected are parts of noun phrases (NP). As a first solution, we can analyze the tags in the tree and select every NP that contains a word-level noun tag (NN, NNS, NNP or NNPS).

Solution

#r @"..\packages\IKVM.7.3.4830.0\lib\IKVM.Runtime.dll"
#r @"..\packages\IKVM.7.3.4830.0\lib\IKVM.OpenJDK.Core.dll"
#r @"..\packages\Stanford.NLP.Parser.3.2.0.0\lib\ejml-0.19-nogui.dll"
#r @"..\packages\Stanford.NLP.Parser.3.2.0.0\lib\stanford-parser.dll"

open edu.stanford.nlp.parser.lexparser
open edu.stanford.nlp.trees
open System

let model = @"d:\englishPCFG.ser.gz";

let options = [|"-maxLength"; "500";"-retainTmpSubcategories"; "-MAX_ITEMS"; "500000";"-outputFormat"; "penn,typedDependenciesCollapsed"|]
let lp = LexicalizedParser.loadModel(model, options)

let tlp = PennTreebankLanguagePack();
let gsf = tlp.grammaticalStructureFactory();

open java.util
let toSeq (iter:Iterator) =
    // Check hasNext() before every next() so that an empty iterator yields an empty sequence
    seq { 
        while iter.hasNext() do
            yield iter.next()
    }

let getTree question = 
    let toke = tlp.getTokenizerFactory().getTokenizer(new java.io.StringReader(question));
    let sentence = toke.tokenize();
    lp.apply(sentence)

let getKeyPhrases (tree:Tree) = 
    let isNPwithNNx (node:Tree)= 
        if (node.label().value() <> "NP") then false
        else node.getChildrenAsList().iterator()
             |> toSeq 
             |> Seq.cast<Tree>
             |> Seq.exists (fun x-> 
                let y = x.label().value()
                y= "NN" || y = "NNS" || y = "NNP" || y = "NNPS")
    let rec foldTree acc (node:Tree) = 
        let acc = 
            if (node.isLeaf()) then acc
            else node.getChildrenAsList().iterator()
                 |> toSeq 
                 |> Seq.cast<Tree>
                 |> Seq.fold 
                    (fun state x -> foldTree state x)
                    acc
        if isNPwithNNx node 
          then node :: acc
          else acc
    foldTree [] tree

let questions = 
    [|"How to make an F# project work with the object browser";
      "How can I build WebSharper on Mono 3.0 on Mac?";
      "Adding extra methods as type extensions in F#";
      "How to get MonoDevelop to compile F# projects?"|]

questions
|> Seq.iter (fun question ->
    printfn "Question : %s" question
    question 
    |> getTree 
    |> getKeyPhrases
    |> List.rev
    |> List.iter (fun p ->
        p.getLeaves().iterator() 
        |> toSeq 
        |> Seq.cast<Tree> 
        |> Seq.map(fun x-> x.label().value()) 
        |> Seq.toArray
        |> printfn "\t%A")
)

If you run this script, you will see the following:

Question : How to make an F# project work with the object browser
    [|"an"; "F"; "#"; "project"; "work"|]
    [|"the"; "object"; "browser"|]
Question : How can I build WebSharper on Mono 3.0 on Mac?
    [|"WebSharper"|]
    [|"Mono"; "3.0"|]
    [|"Mac"|]
Question : Adding extra methods as type extensions in F#
    [|"extra"; "methods"|]
    [|"type"; "extensions"|]
    [|"F"; "#"|]
Question : How to get MonoDevelop to compile F# projects?
    [|"MonoDevelop"|]
    [|"F"; "#"; "projects"|]

This is almost what we expected. The results are good enough, but we can simplify the code and make it more readable using FSharp.NLP.Stanford.Parser.

#r @"..\packages\IKVM.7.3.4830.0\lib\IKVM.Runtime.dll"
#r @"..\packages\IKVM.7.3.4830.0\lib\IKVM.OpenJDK.Core.dll"
#r @"..\packages\Stanford.NLP.Parser.3.2.0.0\lib\ejml-0.19-nogui.dll"
#r @"..\packages\Stanford.NLP.Parser.3.2.0.0\lib\stanford-parser.dll"
#r @"..\packages\FSharp.NLP.Stanford.Parser.0.0.3\lib\FSharp.NLP.Stanford.Parser.dll"

open edu.stanford.nlp.parser.lexparser
open edu.stanford.nlp.trees
open System
open FSharp.IKVM.Util
open FSharp.NLP.Stanford.Parser

let model = @"d:\englishPCFG.ser.gz";

let options = [|"-maxLength"; "500";"-retainTmpSubcategories"; "-MAX_ITEMS"; "500000";"-outputFormat"; "penn,typedDependenciesCollapsed"|]
let lp = LexicalizedParser.loadModel(model, options)

let tlp = PennTreebankLanguagePack();
let gsf = tlp.grammaticalStructureFactory();

let getTree question = 
    let toke = tlp.getTokenizerFactory().getTokenizer(new java.io.StringReader(question));
    let sentence = toke.tokenize();
    lp.apply(sentence)

let getKeyPhrases (tree:Tree) = 
    let isNNx = function
        | Label NN | Label NNS | Label NNP | Label NNPS -> true
        | _ -> false
    let isNPwithNNx = function
        | Label NP as node 
            when node.getChildrenAsList() |> Iterable.castToSeq<Tree> |> Seq.exists isNNx
            -> true
        | _ -> false
    let rec foldTree acc (node:Tree) = 
        let acc = 
            if (node.isLeaf()) then acc
            else node.getChildrenAsList()
                 |> Iterable.castToSeq<Tree>
                 |> Seq.fold 
                    (fun state x -> foldTree state x)
                    acc
        if isNPwithNNx node 
          then node :: acc
          else acc
    foldTree [] tree

let questions = 
    [|"How to make an F# project work with the object browser";
      "How can I build WebSharper on Mono 3.0 on Mac?";
      "Adding extra methods as type extensions in F#";
      "How to get MonoDevelop to compile F# projects?"|]

questions
|> Seq.iter (fun question ->
    printfn "Question : %s" question
    question 
    |> getTree 
    |> getKeyPhrases
    |> List.rev
    |> List.iter (fun p ->
        p.getLeaves()
        |> Iterable.castToArray<Tree>
        |> Array.map(fun x-> x.label().value()) 
        |> printfn "\t%A")
)

Look more carefully at the getKeyPhrases function. All tags are now strongly typed. You can be confident that you have not mistyped a tag, and the code is more readable and self-explanatory:

STTags
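To make the idea of typed tags concrete, here is a minimal, self-contained sketch of how a `Label` active pattern over a discriminated union of tags could be built. It is not the library's actual implementation: the `Tag` union, the `Node` record (a stand-in for `edu.stanford.nlp.trees.Tree`) and the `leaf` helper are all hypothetical names introduced for illustration.

```fsharp
// Hypothetical stand-ins for illustration only; the real code works on
// edu.stanford.nlp.trees.Tree and the tag set shipped with the library.
type Tag = NP | VP | NN | NNS | NNP | NNPS | DT | Other of string

type Node = { Value : string; Children : Node list }

// Total active pattern: converts the raw label string of a node into a typed tag.
let (|Label|) (node : Node) =
    match node.Value with
    | "NP" -> NP
    | "VP" -> VP
    | "NN" -> NN
    | "NNS" -> NNS
    | "NNP" -> NNP
    | "NNPS" -> NNPS
    | "DT" -> DT
    | other -> Other other

// Same shape as the predicates in getKeyPhrases, but over the toy Node type.
let isNNx = function
    | Label NN | Label NNS | Label NNP | Label NNPS -> true
    | _ -> false

let isNPwithNNx = function
    | Label NP as node when node.Children |> List.exists isNNx -> true
    | _ -> false

// Builds a pre-terminal node such as (NN browser).
let leaf tag word = { Value = tag; Children = [ { Value = word; Children = [] } ] }

let np = { Value = "NP"; Children = [ leaf "DT" "the"; leaf "NN" "object"; leaf "NN" "browser" ] }
let dtOnly = { Value = "NP"; Children = [ leaf "DT" "this" ] }

printfn "%b %b" (isNPwithNNx np) (isNPwithNNx dtOnly)
```

The first node is accepted because it has noun children, while the second is rejected because its only child is a determiner.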

Rattle for F# devs

16/07/2013 | 25/02/2021 | F#, Machine Learning and NLP | 1 Comment

It is strange: Rattle is an awesome tool, but it is not as well known among developers as it should be. We definitely need to fix this.

Rattle (the R Analytical Tool To Learn Easily) presents statistical and visual summaries of data, transforms data into forms that can be readily modelled, builds both unsupervised and supervised models from the data, presents the performance of models graphically, and scores new datasets.

First, we need to install a new package from CRAN. To do so, just open the R console and type the following:

install.packages("rattle")

Next, check that you have RProvider installed.

Install-Package RProvider

Now we are ready to start.

#I @"..\packages\RProvider.1.0.0\lib"
#r "RDotNet.dll"
#r "RProvider.dll"

open RProvider.rattle
R.rattle() |> ignore

Execute this short snippet and you should see the Rattle start screen. You are now ready to study your data without writing a single line of code.

Load your data from a wide range of sources:

Explore your data using powerful statistical techniques:

Test the nature of your data:

Transform your data:

Cluster your data:

Identify relationships or affinities:

Experiment with different models on your data, before implementing any of them in your favorite language:

Evaluate quality of your model:

Learn your data!

Upd: If you are interested in it, then I can recommend the following book.

Stanford Log-linear Part-Of-Speech Tagger is available on NuGet

14/07/2013 | 25/02/2021 | F#, Machine Learning and NLP | 35 Comments

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest version of the samples is available on the new Stanford.NLP.NET site.

One more tool became available on NuGet today: the Stanford Log-linear Part-Of-Speech Tagger. This is the third Stanford NuGet package I have published; the previous ones were “Stanford Parser” and “Stanford Named Entity Recognizer (NER)”. I have already posted about this tool, with guidance on how to recompile it and use it from F# (see “NLP: Stanford POS Tagger with F# (.NET)”). Follow these steps to get started:

Install-Package Stanford.NLP.POSTagger
Download models from The Stanford NLP Group site.
Extract models from the ‘models’ folder.
You are ready to start.

F# Sample

For more details see source code on GitHub.

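open java.io
open java.util
open edu.stanford.nlp.ling
open edu.stanford.nlp.tagger.maxent
// Iterable.toSeq is not defined here; it is assumed to come from the helper code in the GitHub sample.

let model = @"..\..\..\..\temp\stanford-postagger-2013-06-20\models\wsj-0-18-bidirectional-nodistsim.tagger"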

let tagReader (reader:Reader) =
    let tagger = MaxentTagger(model)
    MaxentTagger.tokenizeText(reader)
    |> Iterable.toSeq
    |> Seq.iter (fun sentence ->
        let tSentence = tagger.tagSentence(sentence :?> List)
        printfn "%O" (Sentence.listToString(tSentence, false))
    )

let tagFile (fileName:string) =
    tagReader (new BufferedReader(new FileReader(fileName)))

let tagText (text:string) =
    tagReader (new StringReader(text))

C# Sample

For more details see source code on GitHub.

public static class TaggerDemo
{
    public const string Model =
        @"..\..\..\..\temp\stanford-postagger-2013-06-20\models\wsj-0-18-bidirectional-nodistsim.tagger";

    private static void TagReader(Reader reader)
    {
        var tagger = new MaxentTagger(Model);
        foreach (List sentence in MaxentTagger.tokenizeText(reader).toArray())
        {
             var tSentence = tagger.tagSentence(sentence);
             System.Console.WriteLine(Sentence.listToString(tSentence, false));
        }
    }

    public static void TagFile (string fileName)
    {
        TagReader(new BufferedReader(new FileReader(fileName)));
    }

    public static void TagText(string text)
    {
        TagReader(new StringReader(text));
    }
}

Both samples produce the same output. For example, if you start the program with these parameters:

text "A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads
text in some language and assigns parts of speech to each word (and other token), 
such as noun, verb, adjective, etc., although generally computational 
applications use more fine-grained POS tags like 'noun-plural'."

Then you will see the following on your screen:

A/DT Part-Of-Speech/NNP Tagger/NNP -LRB-/-LRB- POS/NNP Tagger/NNP -RRB-/-RRB- 
is/VBZ a/DT piece/NN of/IN software/NN that/WDT reads/VBZ text/NN in/IN some/DT 
language/NN and/CC assigns/VBZ parts/NNS of/IN speech/NN to/TO each/DT word/NN 
-LRB-/-LRB- and/CC other/JJ token/JJ -RRB-/-RRB- ,/, such/JJ as/IN noun/JJ ,/, 
verb/JJ ,/, adjective/JJ ,/, etc./FW ,/, although/IN generally/RB computational/JJ 
applications/NNS use/VBP more/RBR fine-grained/JJ POS/NNP tags/NNS like/IN `/`` 
noun-plural/JJ '/'' ./.
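The tagger prints each token as `word/TAG`. If you want to work with that text output rather than with the tagged objects, a small parser is enough. The following is a minimal sketch under that assumption; `splitTagged` and `parseTagged` are hypothetical helper names, not part of the Stanford API.

```fsharp
// Splits one "word/TAG" token at the LAST '/', so tokens such as
// "-LRB-/-LRB-" or ",/," are still separated correctly.
let splitTagged (token : string) =
    match token.LastIndexOf('/') with
    | -1 -> token, ""                                      // no tag found
    | i -> token.Substring(0, i), token.Substring(i + 1)

// Turns a whole line of tagger output into (word, tag) pairs.
let parseTagged (line : string) =
    line.Split([| ' '; '\n'; '\r' |], System.StringSplitOptions.RemoveEmptyEntries)
    |> Array.map splitTagged

// A fragment of the output shown above.
let sample = "A/DT Part-Of-Speech/NNP Tagger/NNP -LRB-/-LRB- POS/NNP Tagger/NNP -RRB-/-RRB- is/VBZ"

// Keep only the words whose tag starts with NN (NN, NNS, NNP, NNPS).
let nouns =
    parseTagged sample
    |> Array.filter (fun (_, tag) -> tag.StartsWith("NN"))
    |> Array.map fst

printfn "%A" nouns
```

Splitting at the last slash rather than the first is a deliberate choice: it keeps words that themselves contain a slash intact.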

Stanford Named Entity Recognizer (NER) is available on NuGet

12/07/2013 | 25/02/2021 | F#, Machine Learning and NLP | 18 Comments

Update (2017, July 24): Links and/or samples in this post might be outdated. The latest version of the samples is available on the new Stanford.NLP.NET site.

One more tool from the Stanford NLP product line became available on NuGet today. It is the second library that was recompiled and published to NuGet: the first was the “Stanford Parser”, and the second is Stanford Named Entity Recognizer (NER). I have already posted about this tool, with guidance on how to recompile it and use it from F# (see “NLP: Stanford Named Entity Recognizer with F# (.NET)”). NER is something of a hot topic: I recently saw a question about C# NER on CodeProject, and Flo asked me about NER in a comment on another post. So I am happy to make it more widely available. The workflow is as follows:

Install-Package Stanford.NLP.NER
Download models from The Stanford NLP Group site.
Extract models from the ‘classifiers’ folder.
You are ready to start.

F# Sample

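The F# sample is pretty much the same as in the “NLP: Stanford Named Entity Recognizer with F# (.NET)” post. For more details see source code on GitHub.

open System.IO
open edu.stanford.nlp.ie.crf
open edu.stanford.nlp.ling
// Iterable.toSeq is not defined here; it is assumed to come from the helper code in the GitHub sample.

let main file =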
    let classifier =
        CRFClassifier.getClassifierNoExceptions(
             @"..\..\..\..\temp\stanford-ner-2013-06-20\classifiers\english.all.3class.distsim.crf.ser.gz")
    // For either a file to annotate or for the hardcoded text example,
    // this demo file shows two ways to process the output, for teaching
    // purposes.  For the file, it shows both how to run NER on a String
    // and how to run it on a whole file.  For the hard-coded String,
    // it shows how to run it on a single sentence, and how to do this
    // and produce an inline XML output format.
    match file with
    | Some(fileName) ->
        let fileContents = File.ReadAllText(fileName)
        classifier.classify(fileContents)
        |> Iterable.toSeq
        |> Seq.cast<java.util.List>
        |> Seq.iter (fun sentence ->
            sentence
            |> Iterable.toSeq
            |> Seq.cast<CoreLabel>
            |> Seq.iter (fun word ->
                 printf "%s/%O " (word.word()) (word.get(CoreAnnotations.AnswerAnnotation().getClass()))
            )
            printfn ""
        )
    | None ->
        let s1 = "Good afternoon Rajat Raina, how are you today?"
        let s2 = "I go to school at Stanford University, which is located in California."
        printfn "%s\n" (classifier.classifyToString(s1))
        printfn "%s\n" (classifier.classifyWithInlineXML(s2))
        printfn "%s\n" (classifier.classifyToString(s2, "xml", true));
        classifier.classify(s2)
        |> Iterable.toSeq
        |> Seq.iteri (fun i coreLabel ->
            printfn "%d\n:%O\n" i coreLabel
        )

C# Sample

The C# version is quite similar. For more details see source code on GitHub.

class Program
{
    public static CRFClassifier Classifier =
        CRFClassifier.getClassifierNoExceptions(
             @"..\..\..\..\temp\stanford-ner-2013-06-20\classifiers\english.all.3class.distsim.crf.ser.gz");

    // For either a file to annotate or for the hardcoded text example,
    // this demo file shows two ways to process the output, for teaching
    // purposes.  For the file, it shows both how to run NER on a String
    // and how to run it on a whole file.  For the hard-coded String,
    // it shows how to run it on a single sentence, and how to do this
    // and produce an inline XML output format.

    static void Main(string[] args)
    {
        if (args.Length > 0)
        {
            var fileContent = File.ReadAllText(args[0]);
            foreach (List sentence in Classifier.classify(fileContent).toArray())
            {
                foreach (CoreLabel word in sentence.toArray())
                {
                    Console.Write( "{0}/{1} ", word.word(), word.get(new CoreAnnotations.AnswerAnnotation().getClass()));
                }
                Console.WriteLine();
            }
        } else
        {
            const string S1 = "Good afternoon Rajat Raina, how are you today?";
            const string S2 = "I go to school at Stanford University, which is located in California.";
            Console.WriteLine("{0}\n", Classifier.classifyToString(S1));
            Console.WriteLine("{0}\n", Classifier.classifyWithInlineXML(S2));
            Console.WriteLine("{0}\n", Classifier.classifyToString(S2, "xml", true));

            var classification = Classifier.classify(S2).toArray();
            for (var i = 0; i < classification.Length; i++)
            {
                Console.WriteLine("{0}\n:{1}\n", i, classification[i]);
            }
        }
    }
}

As a result of both samples you will see the following output:

Don/PERSON Syme/PERSON is/O an/O Australian/O computer/O scientist/O and/O a/O 
Principal/O Researcher/O at/O Microsoft/ORGANIZATION Research/ORGANIZATION ,/O 
Cambridge/LOCATION ,/O U.K./LOCATION ./O He/O is/O the/O designer/O and/O 
architect/O of/O the/O F/O #/O programming/O language/O ,/O described/O by/O 
a/O reporter/O as/O being/O regarded/O as/O ``/O the/O most/O original/O new/O 
face/O in/O computer/O languages/O since/O Bjarne/PERSON Stroustrup/PERSON 
developed/O C/O +/O +/O in/O the/O early/O 1980s/O ./O
Earlier/O ,/O Syme/PERSON created/O generics/O in/O the/O ./O NET/O Common/O 
Language/O Runtime/O ,/O including/O the/O initial/O design/O of/O generics/O 
for/O the/O C/O #/O programming/O language/O ,/O along/O with/O others/O 
including/O Andrew/PERSON Kennedy/PERSON and/O later/O Anders/PERSON 
Hejlsberg/PERSON ./O Kennedy/PERSON ,/O Syme/PERSON and/O Yu/PERSON also/O 
formalized/O this/O widely/O used/O system/O ./O
He/O holds/O a/O Ph.D./O from/O the/O University/ORGANIZATION of/ORGANIZATION 
Cambridge/ORGANIZATION ,/O and/O is/O a/O member/O of/O the/O WG2/O .8/O 
working/O group/O on/O functional/O programming/O ./O He/O is/O a/O co-author/O 
of/O the/O book/O Expert/O F/O #/O 2.0/O ./O
In/O the/O past/O he/O also/O worked/O on/O formal/O specification/O ,/O 
interactive/O proof/O ,/O automated/O verification/O and/O proof/O description/O 
languages/O ./O
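In this output every token carries its own label, so a multi-word entity such as "Don Syme" appears as two separate PERSON tokens. A common follow-up step is to merge adjacent tokens that share a label. The following is a minimal sketch of that step working on the printed text; `parseToken` and `entities` are hypothetical helper names, not part of the Stanford API. Note that two different entities of the same type standing directly next to each other would be merged into one.

```fsharp
// Splits "word/LABEL" at the last '/'; tokens without a label are treated as "O".
let parseToken (token : string) =
    let i = token.LastIndexOf('/')
    if i < 0 then token, "O"
    else token.Substring(0, i), token.Substring(i + 1)

// Merges runs of adjacent tokens that carry the same non-"O" label.
let entities (line : string) =
    // State: (entities found so far, newest first; label of the previous token)
    let step (acc, prev) (word, label) =
        match acc with
        | (l, words) :: rest when label <> "O" && prev = label ->
            ((l, word :: words) :: rest), label      // continue the current entity
        | _ when label <> "O" ->
            ((label, [ word ]) :: acc), label        // start a new entity
        | _ -> acc, label                            // token outside any entity
    line.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries)
    |> Array.map parseToken
    |> Array.fold step ([], "O")
    |> fst
    |> List.rev
    |> List.map (fun (label, words) -> label, String.concat " " (List.rev words))

// A fragment of the output shown above.
let sample =
    "Don/PERSON Syme/PERSON is/O a/O Principal/O Researcher/O at/O " +
    "Microsoft/ORGANIZATION Research/ORGANIZATION ,/O Cambridge/LOCATION ,/O U.K./LOCATION ./O"

entities sample |> List.iter (fun (label, text) -> printfn "%s: %s" label text)
```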

Stanford Parser is available on NuGet for F# and C#

11/07/2013 | 25/02/2021 | F#, Machine Learning and NLP | 55 Comments

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest version of the samples is available on the new Stanford.NLP.NET site.

I have already written a short series of posts about porting Stanford NLP products to .NET using IKVM.NET. The first, “NLP: Stanford Parser with F# (.NET)”, was about Stanford Parser and shows how to recompile the parser and use it from F#. Recently I wrote one more post, “FSharp.NLP.Stanford.Parser available on NuGet”, announcing an already recompiled version of Stanford Parser in a NuGet package with some helper functionality for F# developers.

As I see it, this is still not as simple as it should be. I have sometimes seen questions from C# developers about different NLP tasks with answers pointing to my “The Stanford Natural Language Processing Samples, in F#” repository (like this). It is probably not easy to find the latest version of the IKVM.NET compiler (it is not included in the IKVM.NET NuGet package) and to quickly rebuild Stanford Parser from scratch for the first time.

I have decided to create a NuGet package that is a clean port of Stanford Parser to .NET, with strongly signed assemblies and no dependency on F#. My primary goal has been to find a clear, simple and intuitive way for all NLP lovers to try NLP magic from .NET. Now it is simpler than ever:

Install-Package Stanford.NLP.Parser
Download models from The Stanford NLP Group site.
Extract models from ‘stanford-parser-3.2.0-models.jar’ (just unzip it).
You are ready to start.

F# Sample

The F# sample is not much different from the one in the “NLP: Stanford Parser with F# (.NET)” post. For more details see source code on GitHub.

let demoDP (lp:LexicalizedParser) (fileName:string) =
    // This option shows loading, sentence-segmenting and tokenizing
    // a file using DocumentPreprocessor
    let tlp = PennTreebankLanguagePack();
    let gsf = tlp.grammaticalStructureFactory();
    // You could also create a tokenizer here (as below) and pass it
    // to DocumentPreprocessor
    DocumentPreprocessor(fileName)
    |> Iterable.toSeq
    |> Seq.cast<List>
    |> Seq.iter (fun sentence ->
        let parse = lp.apply(sentence);
        parse.pennPrint();

        let gs = gsf.newGrammaticalStructure(parse);
        let tdl = gs.typedDependenciesCCprocessed(true);
        printfn "\n%O\n" tdl
    )

let demoAPI (lp:LexicalizedParser) =
    // This option shows parsing a list of correctly tokenized words
    let sent = [|"This"; "is"; "an"; "easy"; "sentence"; "." |]
    let rawWords = Sentence.toCoreLabelList(sent)
    let parse = lp.apply(rawWords)
    parse.pennPrint()

    // This option shows loading and using an explicit tokenizer
    let sent2 = "This is another sentence."
    let tokenizerFactory = PTBTokenizer.factory(CoreLabelTokenFactory(), "")
    use sent2Reader = new StringReader(sent2)
    let rawWords2 = tokenizerFactory.getTokenizer(sent2Reader).tokenize()
    let parse = lp.apply(rawWords2)

    let tlp = PennTreebankLanguagePack()
    let gsf = tlp.grammaticalStructureFactory()
    let gs = gsf.newGrammaticalStructure(parse)
    let tdl = gs.typedDependenciesCCprocessed()
    printfn "\n%O\n" tdl

    let tp = new TreePrint("penn,typedDependenciesCollapsed")
    tp.printTree(parse)

let main fileName =
    let lp = LexicalizedParser.loadModel(@"...\englishPCFG.ser.gz")
    match fileName with
    | Some(file) -> demoDP lp file
    | None -> demoAPI lp

C# Sample

The C# version is quite similar. For more details see source code on GitHub.

public static class ParserDemo
{
    public static void DemoDP(LexicalizedParser lp, string fileName)
    {
        // This option shows loading, sentence-segmenting and tokenizing
        // a file using DocumentPreprocessor
        var tlp = new PennTreebankLanguagePack();
        var gsf = tlp.grammaticalStructureFactory();
        // You could also create a tokenizer here (as below) and pass it
        // to DocumentPreprocessor
        foreach (List sentence in new DocumentPreprocessor(fileName))
        {
            var parse = lp.apply(sentence);
            parse.pennPrint();

            var gs = gsf.newGrammaticalStructure(parse);
            var tdl = gs.typedDependenciesCCprocessed(true);
            System.Console.WriteLine("\n{0}\n", tdl);
        }
    }

    public static void DemoAPI(LexicalizedParser lp)
    {
        // This option shows parsing a list of correctly tokenized words
        var sent = new[] { "This", "is", "an", "easy", "sentence", "." };
        var rawWords = Sentence.toCoreLabelList(sent);
        var parse = lp.apply(rawWords);
        parse.pennPrint();

        // This option shows loading and using an explicit tokenizer
        const string Sent2 = "This is another sentence.";
        var tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
        var sent2Reader = new StringReader(Sent2);
        var rawWords2 = tokenizerFactory.getTokenizer(sent2Reader).tokenize();
        parse = lp.apply(rawWords2);

        var tlp = new PennTreebankLanguagePack();
        var gsf = tlp.grammaticalStructureFactory();
        var gs = gsf.newGrammaticalStructure(parse);
        var tdl = gs.typedDependenciesCCprocessed();
        System.Console.WriteLine("\n{0}\n", tdl);

        var tp = new TreePrint("penn,typedDependenciesCollapsed");
        tp.printTree(parse);
    }

    public static void Start(string fileName)
    {
         var lp = LexicalizedParser.loadModel(Program.ParserModel);
         if (!String.IsNullOrEmpty(fileName))
              DemoDP(lp, fileName);
         else
              DemoAPI(lp);
    }
}

As a result of both samples you will see the following output:

Loading parser from serialized file ..\..\..\..\StanfordNLPLibraries\
stanford-parser\stanford-parser-2.0.4-models\englishPCFG.ser.gz ... 
done [1.5 sec].
(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT an) (JJ easy) (NN sentence)))
    (. .)))

[nsubj(sentence-4, This-1), cop(sentence-4, is-2), det(sentence-4, another-3), 
root(ROOT-0, sentence-4)]
(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT another) (NN sentence)))
    (. .)))
nsubj(sentence-4, This-1)
cop(sentence-4, is-2)
det(sentence-4, another-3)
root(ROOT-0, sentence-4)
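Each typed dependency is printed as `relation(governor-index, dependent-index)`. If you only have this text output and want structured data, a regular expression is usually enough. The following is a minimal sketch under that assumption; the `Dependency` record and `tryParseDependency` are hypothetical names introduced here, not part of the Stanford API, and the pattern only covers the plain form shown above.

```fsharp
open System.Text.RegularExpressions

// Hypothetical record for one printed dependency, e.g. nsubj(sentence-4, This-1).
type Dependency =
    { Relation : string
      Governor : string
      GovernorIndex : int
      Dependent : string
      DependentIndex : int }

// relation(word-index, word-index); the word itself may contain '-' characters.
let dependencyPattern = Regex(@"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")

let tryParseDependency (line : string) =
    let m = dependencyPattern.Match(line.Trim())
    if m.Success then
        Some { Relation = m.Groups.[1].Value
               Governor = m.Groups.[2].Value
               GovernorIndex = int m.Groups.[3].Value
               Dependent = m.Groups.[4].Value
               DependentIndex = int m.Groups.[5].Value }
    else
        None

// The dependency lines from the output shown above.
let lines =
    [| "nsubj(sentence-4, This-1)"
       "cop(sentence-4, is-2)"
       "det(sentence-4, another-3)"
       "root(ROOT-0, sentence-4)" |]

let dependencies = lines |> Array.choose tryParseDependency

dependencies
|> Array.iter (fun d -> printfn "%s: %s -> %s" d.Relation d.Governor d.Dependent)
```

Lines that do not match the pattern are skipped by `Array.choose` rather than raising an exception.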