Update (2017, July 24): Links and/or samples in this post might be outdated. The latest version of the samples is available on the new Stanford.NLP.NET site.
One more tool from the Stanford NLP product line became available on NuGet today. It is the second library that has been recompiled and published to NuGet: the first was the “Stanford Parser”, and the second is the Stanford Named Entity Recognizer (NER). I have already posted about this tool, with guidance on how to recompile it and use it from F# (see “NLP: Stanford Named Entity Recognizer with F# (.NET)”). NER also seems to be a hot topic at the moment: I recently saw a question about C# NER on CodeProject, and Flo asked me about NER in a comment on another post. So I am happy to make it more widely available. The usage flow is as follows:
- Install-Package Stanford.NLP.NER
- Download the models from The Stanford NLP Group site.
- Extract the models from the 'classifiers' folder.
- You are ready to start.
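Both samples below load the model through a relative path, which is an easy place for things to go wrong. As a minimal sketch (not part of the package), the check below confirms that a model file is where you expect it before you hand the path to the classifier. The helper name resolveModel is hypothetical, and the folder and file names are simply the ones used in the samples in this post; adjust them to wherever you extracted the models.

```fsharp
open System.IO

/// Hypothetical helper: returns the full path of a model file,
/// or an error message describing what was found instead.
let resolveModel (classifiersDir: string) (modelName: string) : Result<string, string> =
    let path = Path.GetFullPath(Path.Combine(classifiersDir, modelName))
    if File.Exists(path) then
        Ok path
    elif Directory.Exists(classifiersDir) then
        // The folder is there, so list the serialized models it does contain.
        let found =
            Directory.GetFiles(classifiersDir, "*.ser.gz")
            |> Array.map (fun (f: string) -> Path.GetFileName(f))
        Error (sprintf "Model '%s' not found in '%s'. Available: %s"
                   modelName classifiersDir (String.concat ", " found))
    else
        Error (sprintf "Folder '%s' not found (current directory: %s)"
                   classifiersDir (Directory.GetCurrentDirectory()))

// Path and model name taken from the samples in this post.
match resolveModel @"..\..\..\..\temp\stanford-ner-2013-06-20\classifiers"
                   "english.all.3class.distsim.crf.ser.gz" with
| Ok path -> printfn "Model found: %s" path
| Error message -> eprintfn "%s" message
```

Relative paths are resolved against the current working directory, so the error message includes it to make a wrong starting folder easy to spot.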
F# Sample
The F# sample is pretty much the same as in the “NLP: Stanford Named Entity Recognizer with F# (.NET)” post. For more details, see the source code on GitHub.
let main file =
    let classifier =
        CRFClassifier.getClassifierNoExceptions(
            @"..\..\..\..\temp\stanford-ner-2013-06-20\classifiers\english.all.3class.distsim.crf.ser.gz")
    // For either a file to annotate or for the hardcoded text example,
    // this demo file shows two ways to process the output, for teaching
    // purposes. For the file, it shows both how to run NER on a String
    // and how to run it on a whole file. For the hard-coded String,
    // it shows how to run it on a single sentence, and how to do this
    // and produce an inline XML output format.
    match file with
    | Some(fileName) ->
        let fileContents = File.ReadAllText(fileName)
        classifier.classify(fileContents)
        |> Iterable.toSeq
        |> Seq.cast<java.util.List>
        |> Seq.iter (fun sentence ->
            sentence
            |> Iterable.toSeq
            |> Seq.cast<CoreLabel>
            |> Seq.iter (fun word ->
                printf "%s/%O " (word.word()) (word.get(CoreAnnotations.AnswerAnnotation().getClass()))
            )
            printfn ""
        )
    | None ->
        let s1 = "Good afternoon Rajat Raina, how are you today?"
        let s2 = "I go to school at Stanford University, which is located in California."
        printfn "%s\n" (classifier.classifyToString(s1))
        printfn "%s\n" (classifier.classifyWithInlineXML(s2))
        printfn "%s\n" (classifier.classifyToString(s2, "xml", true))
        classifier.classify(s2)
        |> Iterable.toSeq
        |> Seq.iteri (fun i coreLabel ->
            printfn "%d\n:%O\n" i coreLabel
        )
C# Sample
The C# version is quite similar. For more details, see the source code on GitHub.
class Program
{
    public static CRFClassifier Classifier =
        CRFClassifier.getClassifierNoExceptions(
            @"..\..\..\..\temp\stanford-ner-2013-06-20\classifiers\english.all.3class.distsim.crf.ser.gz");

    // For either a file to annotate or for the hardcoded text example,
    // this demo file shows two ways to process the output, for teaching
    // purposes. For the file, it shows both how to run NER on a String
    // and how to run it on a whole file. For the hard-coded String,
    // it shows how to run it on a single sentence, and how to do this
    // and produce an inline XML output format.
    static void Main(string[] args)
    {
        if (args.Length > 0)
        {
            var fileContent = File.ReadAllText(args[0]);
            foreach (List sentence in Classifier.classify(fileContent).toArray())
            {
                foreach (CoreLabel word in sentence.toArray())
                {
                    Console.Write("{0}/{1} ", word.word(), word.get(new CoreAnnotations.AnswerAnnotation().getClass()));
                }
                Console.WriteLine();
            }
        }
        else
        {
            const string S1 = "Good afternoon Rajat Raina, how are you today?";
            const string S2 = "I go to school at Stanford University, which is located in California.";
            Console.WriteLine("{0}\n", Classifier.classifyToString(S1));
            Console.WriteLine("{0}\n", Classifier.classifyWithInlineXML(S2));
            Console.WriteLine("{0}\n", Classifier.classifyToString(S2, "xml", true));
            var classification = Classifier.classify(S2).toArray();
            for (var i = 0; i < classification.Length; i++)
            {
                Console.WriteLine("{0}\n:{1}\n", i, classification[i]);
            }
        }
    }
}
Both samples produce output like the following:
Don/PERSON Syme/PERSON is/O an/O Australian/O computer/O scientist/O and/O a/O Principal/O Researcher/O at/O Microsoft/ORGANIZATION Research/ORGANIZATION ,/O Cambridge/LOCATION ,/O U.K./LOCATION ./O He/O is/O the/O designer/O and/O architect/O of/O the/O F/O #/O programming/O language/O ,/O described/O by/O a/O reporter/O as/O being/O regarded/O as/O ``/O the/O most/O original/O new/O face/O in/O computer/O languages/O since/O Bjarne/PERSON Stroustrup/PERSON developed/O C/O +/O +/O in/O the/O early/O 1980s/O ./O Earlier/O ,/O Syme/PERSON created/O generics/O in/O the/O ./O NET/O Common/O Language/O Runtime/O ,/O including/O the/O initial/O design/O of/O generics/O for/O the/O C/O #/O programming/O language/O ,/O along/O with/O others/O including/O Andrew/PERSON Kennedy/PERSON and/O later/O Anders/PERSON Hejlsberg/PERSON ./O Kennedy/PERSON ,/O Syme/PERSON and/O Yu/PERSON also/O formalized/O this/O widely/O used/O system/O ./O He/O holds/O a/O Ph.D./O from/O the/O University/ORGANIZATION of/ORGANIZATION Cambridge/ORGANIZATION ,/O and/O is/O a/O member/O of/O the/O WG2/O .8/O working/O group/O on/O functional/O programming/O ./O He/O is/O a/O co-author/O of/O the/O book/O Expert/O F/O #/O 2.0/O ./O In/O the/O past/O he/O also/O worked/O on/O formal/O specification/O ,/O interactive/O proof/O ,/O automated/O verification/O and/O proof/O description/O languages/O ./O
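The word/TAG output above is easy to read, but in practice you often want whole entities rather than tagged tokens. The sketch below is plain string processing and does not use the Stanford API. It assumes that consecutive tokens sharing the same non-O tag belong to one entity. The function names parseToken and extractEntities are made up for this illustration, and the input is the first sentence of the output above.

```fsharp
open System

/// Splits "word/TAG" at the last '/', so words that contain '/' still parse.
let parseToken (token: string) =
    let i = token.LastIndexOf('/')
    if i < 0 then token, "O"
    else token.Substring(0, i), token.Substring(i + 1)

/// Merges consecutive tokens that share the same non-"O" tag into one entity.
let extractEntities (tagged: string) =
    let tokens =
        tagged.Split([| ' ' |], StringSplitOptions.RemoveEmptyEntries)
        |> Array.map parseToken
        |> List.ofArray
    // Moves the entity being built (if any) onto the accumulated results.
    let flush current acc =
        match current with
        | Some (tag, words) -> (tag, String.Join(" ", List.rev words)) :: acc
        | None -> acc
    let rec loop current acc tokens =
        match tokens, current with
        | [], _ -> List.rev (flush current acc)
        | (_, "O") :: rest, _ -> loop None (flush current acc) rest
        | (word, tag) :: rest, Some (curTag, words) when tag = curTag ->
            loop (Some (curTag, word :: words)) acc rest
        | (word, tag) :: rest, _ -> loop (Some (tag, [ word ])) (flush current acc) rest
    loop None [] tokens

let sample =
    "Don/PERSON Syme/PERSON is/O an/O Australian/O computer/O scientist/O and/O a/O Principal/O Researcher/O at/O Microsoft/ORGANIZATION Research/ORGANIZATION ,/O Cambridge/LOCATION ,/O U.K./LOCATION ./O"

extractEntities sample
|> List.iter (fun (tag, text) -> printfn "%s: %s" tag text)
// PERSON: Don Syme
// ORGANIZATION: Microsoft Research
// LOCATION: Cambridge
// LOCATION: U.K.
```

Note the limitation of this grouping rule: two different entities of the same type that directly follow each other, with no O token between them, would be merged into one.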

Nowadays, Atlassian products are becoming more and more popular. Various companies and teams are starting to use Jira and Confluence for project management, so it would be good to be able to communicate with these services from .NET. As you probably know, Jira and Confluence are pure Java applications. Both provide SOAP and REST services. REST is the new target for Atlassian: they are focused on it and no longer touch SOAP. So the SOAP services live on with all their bugs inside and even 







