Sergey Tihon's Blog

Dropbox for .NET developers

29/09/2013 | F# | 14 Comments

A few days ago, I was tasked with developing a Dropbox connector that can enumerate and download files from Dropbox. The ideal for me would be a wrapper library for .NET 3.5 that can authorize with Dropbox without user interaction. The .NET libraries/components currently available include SharpBox, DropNet, Spring.NET and a Xamarin component.

Spring.NET and the Xamarin component are not options for me right now. DropNet does not fit my needs either, because it is .NET 4+ only; if your application targets .NET 4+, though, DropNet is probably the best choice for you. I chose SharpBox. It looks like a dead project (no commits since 2011), but the latest version is nevertheless available on NuGet.

First, go to the Dropbox App Console and create a new app: click the “Create app” button and answer the questions it asks.

When you finish these steps, you will get an App key and an App secret. Copy them somewhere, because you will need them later. Now we are ready to create our application. Let’s create a new F# project and add the AppLimit.CloudComputing.SharpBox package from NuGet.

After the package is downloaded, go to the packages\AppLimit.CloudComputing.SharpBox.1.2.0.542\lib\net40-full folder, then find and start the DropBoxTokenIssuer.exe application.

Fill in Application Key and Application Secret with the values you received during app creation, set the Output-File path to c:\token.txt and click “Authorize”. Wait a few seconds (depending on your Internet connection) and follow the steps that appear in the browser control on the form: you need to sign in with your Dropbox account and grant your app access to your files. Once the token file has been created, you can click the “Test Token” button to make sure that it is correct.

Using the token file, you can work with Dropbox files without direct user interaction, as shown in the sample below:

open System.IO
open AppLimit.CloudComputing.SharpBox

[<EntryPoint>]
let main argv =
    let dropBoxStorage = new CloudStorage()
    let dropBoxConfig = CloudStorage.GetCloudConfigurationEasy(nSupportedCloudConfigurations.DropBox)
    // load a valid security token from file
    use fs = File.Open(@"C:\token.txt", FileMode.Open, FileAccess.Read, FileShare.None)
    let accessToken = dropBoxStorage.DeserializeSecurityToken(fs)
    // open the connection
    let storageToken = dropBoxStorage.Open(dropBoxConfig, accessToken);

    for folder in dropBoxStorage.GetRoot() do
        printfn "%s" (folder.Name)

    dropBoxStorage.Close()
    0

F# Weekly #38, 2013

23/09/2013 (updated 04/10/2013) | F# Weekly | 1 Comment

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

F# support will be a standard part of Xamarin Studio (not a plugin).
New F# Refactor integration in FSharpBinding.
Phillip Trelford presented Mario jumps on FunScript (source code)
TsunamiIDE updated the run button.
MonoDevelop 4.1.9 was released on Gentoo.

Videos/Presentations

“Linear Data Structures” by Jack Fox. (slides)
“FSharp Agents” by Rachel Reese (slides and code)
Episode 283: Rachel Reese on F# Agents
“F# Server-side programming” by Dave Thomas.
“F# in the enterprise” by Dave Thomas.

Blogs

Isaac Abraham wrote “Lightweight callsite caching with F# and C#“.
EFYTimes.com posted “10 Programming Languages That Will Change The IT World“.
Don Syme wrote about “A New F# Meetup – The Paris F# Group“.
Michael Newton blogged “Teaching F# to C# Devs“.
Tomas Petricek posted “How many tuple types are there in C#?“.
Nicolas wrote about “Continuation Passing Style“.
Neil Danson posted “Simple SpriteKit demo in F#“.
Dave Thomas blogged “Spritekit Particle Fun“.
Jon Harrop wrote about “Downloading stock prices“.

That’s all for now. Have a great week.

Previous F# Weekly edition – #37

F# Weekly #37 2013

16/09/2013 | F# Weekly | 1 Comment

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

FSharp.Data 1.1.10 was released with a bunch of bugfixes.
RProvider 1.0.1 was released with latest R.NET and RDotNet.FSharp inside.
Francesco De Vittori works on type inference for Smalltalk.
Immo Landwerth announced that “Immutable collections are now RC“.

Blogs

Visual Studio FSharp Team posted “Visual Studio 2013 RC and Send to FSI“.
Sergey Tihon blogged “Stanford Word Segmenter is available on NuGet“.
Phil Trelford wrote about “k-means clustering“.
Valdis Iljuconoks posted “Matching the patterns“.
Danny Tuppeny blogged “Opt-in Nulls; an F# Feature Worth Switching For?“.
Jon Harrop wrote
- “Ten F# one-liners to impress your friends“
- “A simple Fast Fourier Transform (FFT)” (another FFT sample code)
- “Color wheel“
Rachel Appel wrote “Understanding Your Language Choices for Developing Modern Apps“.
brianberns posted “F# language warts“.
Kapil Garg shared “FSharp -Top tweets (10/09/2013)“.

That’s all for now. Have a great week.

Previous F# Weekly edition – #36

Stanford Word Segmenter is available on NuGet

09/09/2013 (updated 25/02/2021) | F#, Machine Learning and NLP | 2 Comments

Update (2014, January 3): Links and/or samples in this post might be outdated. The latest versions of the samples are available on the new Stanford.NLP.NET site.

Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages require more extensive token pre-processing, which is usually called segmentation.
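To make the English case concrete, here is a minimal sketch of punctuation splitting and possessive separation. It is not the Stanford tokenizer: the function name tokenizeEnglish and the regular expression are assumptions made only for this illustration, and real tokenizers handle many more cases (contractions, abbreviations, numbers and so on).

```fsharp
open System.Text.RegularExpressions

// Hypothetical helper, for illustration only.
// The pattern tries, in order: a possessive 's, a run of word characters,
// or a single non-space, non-word character (punctuation).
let tokenizeEnglish (text: string) =
    Regex.Matches(text, @"'s\b|\w+|[^\w\s]")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Value)
    |> List.ofSeq

// The possessive and the final period become separate tokens.
tokenizeEnglish "John's dog barks."
|> List.iter (printfn "%s")
```

Languages such as Chinese have no spaces between words, so a pattern like this cannot find the word boundaries. That is the job of a segmenter.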

The Stanford Word Segmenter currently supports Arabic and Chinese. The provided segmentation schemes have been found to work well for a variety of applications.

One more tool from the Stanford NLP Software Package became available on NuGet today: the Stanford Word Segmenter. This is the fourth Stanford NuGet package I have published; the previous ones were the “Stanford Parser”, “Stanford Named Entity Recognizer (NER)” and “Stanford Log-linear Part-Of-Speech Tagger”. Follow these steps to get started:

Install-Package Stanford.NLP.Segmenter
Download models from The Stanford NLP Group site.
Extract models from the 'data' folder.
You are ready to start.

F# Sample

For more details see source code on GitHub.

open java.util
open edu.stanford.nlp.ie.crf

[<EntryPoint>]
let main argv =
    if (argv.Length <> 1) then
        printf "usage: StanfordSegmenter.Csharp.Samples.exe filename"
    else
        let props = Properties();
        props.setProperty("sighanCorporaDict", @"..\..\..\..\temp\stanford-segmenter-2013-06-20\data") |> ignore
        props.setProperty("serDictionary", @"..\..\..\..\temp\stanford-segmenter-2013-06-20\data\dict-chris6.ser.gz") |> ignore
        props.setProperty("testFile", argv.[0]) |> ignore
        props.setProperty("inputEncoding", "UTF-8") |> ignore
        props.setProperty("sighanPostProcessing", "true") |> ignore

        let segmenter = CRFClassifier(props)
        segmenter.loadClassifierNoExceptions(@"..\..\..\..\temp\stanford-segmenter-2013-06-20\data\ctb.gz", props)
        segmenter.classifyAndWriteAnswers(argv.[0])
    0

C# Sample

For more details see source code on GitHub.

using java.util;
using edu.stanford.nlp.ie.crf;

namespace StanfordSegmenter.Csharp.Samples
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length != 1)
            {
                System.Console.WriteLine("usage: StanfordSegmenter.Csharp.Samples.exe filename");
                return;
            }

            var props = new Properties();
            props.setProperty("sighanCorporaDict", @"..\..\..\..\temp\stanford-segmenter-2013-06-20\data");
            props.setProperty("serDictionary", @"..\..\..\..\temp\stanford-segmenter-2013-06-20\data\dict-chris6.ser.gz");
            props.setProperty("testFile", args[0]);
            props.setProperty("inputEncoding", "UTF-8");
            props.setProperty("sighanPostProcessing", "true");

            var segmenter = new CRFClassifier(props);
            segmenter.loadClassifierNoExceptions(@"..\..\..\..\temp\stanford-segmenter-2013-06-20\data\ctb.gz", props);
            segmenter.classifyAndWriteAnswers(args[0]);
        }
    }
}

F# Weekly #36 2013

09/09/2013 | F# Weekly | 3 Comments

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

A new feature was proposed for F#: implement interface by expression (example is here).
A new F# Cheatsheet in PDF and HTML formats, built with the FSharp.Formatting tool, was published.
James announced a new F# game, Twaddle (FunScript), that is available on Android.
F# User Group Sydney was registered.
Taha Hachana presented “Geolocation Options“.
Canopy was updated to 0.8.5.

Videos/Presentations

“F# Driver for MongoDB” by Max Hirschhorn.
“Functional Programming, F#, Type Providers and Dynamic Languages” by Richard Minerich.
“Windows Phone 7.5 Application Development with F#” by Lohith G.N.

Blogs

Mathias Brandewinder blogged “First steps with Accord.NET SVM in F#“.
Mathias Brandewinder wrote “Field notes from the F# tour“.
Jon Harrop wrote “F# for Visualization update“.
bratfizyk posted “Learning F# – Graph Coloring using Microsoft Solver Foundation“.
Faisal Waris wrote about “Talking to your car – with OpenXC, Android, Xamarin & F#“.
Paulmichael Blasucci blogged “The Month NYC Ran Out of Excuses Not to Learn F#“.
Phil Trelford posted “Try 10 Programming Languages in 10 minutes“.
Christian Horsdal published “Playtime: Riak, Azure, F#“.
Nikos Baxevanis blogged “How to enable Code Analysis for F# projects“.
Kapil Garg wrote “Top Tweets“.
Yan Cui published “Binary and JSON serializer benchmarks updated“.

That’s all for now. Have a great week.

Previous F# Weekly edition – #35

F# Weekly #35 2013

02/09/2013 | F# Weekly | 1 Comment

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

Canopy 0.8.0 was released with PhantomJS support.
Brahma.FSharp (F# quotation to OpenCL translator) is available on NuGet.
Will Smith embedded FSI in Quake3 (part 2).
Paris F# User Group was registered and is growing.
Code, build and run – WebSharper mobile apps in CloudSharper
Taha Hachana presented “Native HTML5 Drag & Drop“.

Videos/Presentations

“But we are a C# shop” by ericksoa07.
“Piglets to the rescue” by Loic Denuziere, Ernesto Rodriguez, Adam Granicz.

Blogs

Onorio Catenacci wrote “First Detroit F# Meetup“.
Phil Trelford posted “F# on Android“.
Tomas Petricek shared “Hello New York. Learn some F#!“.
Onorio Catenacci wrote “F# Tip of the Week (26 August 2013)“
Phil Trelford posted “Balls“.
Tomas Petricek shared “Update on the F# Deep Dives book“.
Dmitry Morozov posted “More on “Clarity of Intent”“.
Dmitry Morozov posted “Progressive F# Tutorials Teaser“.
Anton Kropp published “ParsecClone on nuget“.
Phil Trelford wrote “Building a game in a day“.
Mauricio Scheffer shared “Objects and functional programming“.
Mathias Brandewinder blogged “CSV Type Provider, now with more awesome“.
Max Hirschhorn wrote about “Enhancing the F# developer experience with MongoDB“
Sergey Tihon posted “MSR-SPLAT Overview for F#“.

P.S. 100 Most Influential Books According to Stack Overflow.

That’s all for now. Have a great week.

Previous F# Weekly edition – #34

MSR-SPLAT Overview for F# (.NET NLP)

01/09/2013 (updated 25/02/2021) | Machine Learning and NLP | 1 Comment

A few weeks ago, Microsoft Research announced an NLP toolkit called MSR SPLAT. It is time to play with it and take a look at what it can do.

Statistical Parsing and Linguistic Analysis Toolkit is a linguistic analysis toolkit. Its main goal is to allow easy access to the linguistic analysis tools produced by the Natural Language Processing group at Microsoft Research. The tools include both traditional linguistic analysis tools such as part-of-speech taggers and parsers, and more recent developments, such as sentiment analysis (identifying whether a particular piece of text has positive or negative sentiment towards its focus).

SPLAT has a nice Silverlight DEMO app that lets you try all available functionality.

SPLAT also has WCF and RESTful endpoints, but to use them you need to request an access key (please email Pallavi Choudhury). For more details, please read the overview article “MSR SPLAT, a language analysis toolkit”.

Important links:

MSR SPLAT official project page.
Silverlight DEMO app deployed to Windows Azure.
Article: “MSR SPLAT, a language analysis toolkit“

Test Drive

I received my GUID along with an example of calling the JSON service from C#, which you can find below.

// Requires: using System; using System.IO; using System.Net;
private static void CallSplatJsonService()
{
    string language = "en";
    string input = "I live in Seattle";
    string analyzerList = "POS_tags,Tokens";
    string appId = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX";

    // Escape the input so that spaces and special characters survive in the query string.
    string requestAnalyze = String.Format("http://msrsplat.cloudapp.net/SplatServiceJson.svc/Analyze?language={0}&analyzers={1}&appId={2}&json=x&input={3}",
        language, analyzerList, appId, Uri.EscapeDataString(input));

    var request = WebRequest.Create(requestAnalyze);
    request.ContentType = "application/json; charset=utf-8";
    request.Method = "GET";

    // Read the whole JSON response and print it.
    using(Stream s = request.GetResponse().GetResponseStream())
    {
        using(StreamReader sr = new StreamReader(s))
        {
            var jsonData = sr.ReadToEnd();
            Console.WriteLine(jsonData);
        }
    }
}
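For F# callers, the same request URL can be assembled with a small helper. This is only a sketch: buildAnalyzeUrl is a hypothetical name, not part of SPLAT, and it assumes the service applies ordinary URL decoding to query values. The endpoint and the parameter names are taken from the C# example above.

```fsharp
open System

// Hypothetical helper: builds the Analyze URL for the JSON endpoint.
// Every value is percent-encoded; the commas between analyzer names stay literal.
let buildAnalyzeUrl (language: string) (analyzers: string list) (appId: string) (input: string) =
    let esc (s: string) = Uri.EscapeDataString s
    let analyzerList = analyzers |> List.map esc |> String.concat ","
    sprintf "http://msrsplat.cloudapp.net/SplatServiceJson.svc/Analyze?language=%s&analyzers=%s&appId=%s&json=x&input=%s"
        (esc language) analyzerList (esc appId) (esc input)

// Placeholder appId, as in the C# example.
let url =
    buildAnalyzeUrl "en" [ "POS_tags"; "Tokens" ] "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" "I live in Seattle"

printfn "%s" url
```

Escaping matters most for the input text: a sentence containing & or # would otherwise be cut off or read as extra query parameters.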

In the following samples, I use the WCF endpoint, since the WsdlService Type Provider dramatically simplifies access to the service.

#r "FSharp.Data.TypeProviders.dll"
#r "System.ServiceModel.dll"
#r "System.Runtime.Serialization.dll"
open System
open Microsoft.FSharp.Data.TypeProviders

type MSRSPLAT = WsdlService<"http://msrsplat.cloudapp.net/SplatService.svc?wsdl">
let splat = MSRSPLAT.GetBasicHttpBinding_ISplatService()

In the first call, we ask SPLAT for the list of supported languages with splat.Languages(), which returns [|"en"; "bg"|] (English and Bulgarian). The mystical Bulgaria… I do not know why, but NLP guys like Bulgaria. There is something special about it for NLP :).

The Stanford NLP Group has relocated in Bulgaria … temporarily.

— Stanford NLP Group (@stanfordnlp) August 3, 2013

The next call is splat.Analyzers("en"), which returns the list of all analyzers available for English (all of them are also available in the DEMO app):

“Base Forms-LexToDeriv-DerivFormsC#”
“Chunker-SpecializedChunks-ChunkerC++”
“Constituency_Forest-PennTreebank3-SplitMerge”
“Constituency_Tree-PennTreebank3-SplitMerge”
“Constituency_Tree_Score-Score-SplitMerge”
“CoRef-PennTreebank3-UsingMentionsAndHeadFinder”
“Dependency_Tree-PennTreebank3-ConvertFromConstTree”
“Katakana_Transliterator-Katakana_to_English-Perceptron”
“Lemmas-LexToLemma-LemmatizerC#”
“Named_Entities-CONLL-CRF”
“POS_Tags-PennTreebank3-cmm”
“Semantic_Roles-PropBank-kristout”
“Semantic_Roles_Scores-PropBank-kristout”
“Sentiment-PosNeg-MaxEntClassifier”
“Stemmer-PorterStemmer-PorterStemmerC#”
“Tokens-PennTreebank3-regexes”
“Triples-SimpleTriples-ExtractFromDeptree”

This is the list of full names of the analyzers that are currently available. The part of an analyzer’s name that you have to pass to the service to perform the corresponding analysis is the first segment, before the first hyphen (for example, POS_Tags or Tokens). To perform an analysis, you need an access GUID, which you pass as the email argument of the splat.Analyze method. The parameter name is probably a typo, but that is how it is. Let’s call all analyzers on one of our favorite sentences, “All your types are belong to us”, and look at the result.

let appId = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"

let analyzers = String.Join(",", splat.Analyzers("en")
                |> Array.map (fun s -> s.Split([|'-'|]).[0]))
let text = "All your types are belong to us"
let bag = splat.Analyze("en", analyzers, text, appId)

bag.Analyses

The result is

[|"[["0-all","1-you","2-types","3-are","4-belong","5-to","6-us"]]";
  "["[NP All your types] [VP are] [VP belong] [PP to] [NP us] \u000a"]";
  "["@@All your types are belong to us\u000d\u000a0\u0009G_DT..."]";
  "["(TOP (S (NP (PDT All) (PRP$ your) (NNS types)) (VP (VBP are) (VP (VB belong) (PP (TO to) (NP (PRP us)))))))"]";
  "[-2.2476917857427452]";
  "[[{"LengthInTokens":3,"Sentence":0,"StartTokenOffset":0}],[{"LengthInTokens":1,"Sentence":0,"StartTokenOffset":1}],[{"LengthInTokens":1,"Sentence":0,"StartTokenOffset":6}]]";
  "[[{"Parent":3,"Tag":"PDT","Word":"All"},{"Parent":3,"Tag":"PRP$","Word":"your"},{"Parent":4,"Tag":"NNS","Word":"types"},{"Parent":0,"Tag":"VBP","Word":"are"},{"Parent":4,"Tag":"VB","Word":"belong"},{"Parent":5,"Tag":"TO","Word":"to"},{"Parent":6,"Tag":"PRP","Word":"us"}]]";
  "["14.50%: アリオータイプサレベロングタス","13.27%: オールユアタイプサレベロングタス","13.26%: アルユアタイプサレベロングタス","13.26%: アリオールタイプサレベロングタス","11.34%: アリオウルタイプサレベロングタス","7.81%: アルルユアタイプサレベロングタス","7.10%: アリアウータイプサレベロングタス","6.60%: アリアウルタイプサレベロングタス","6.46%: アリーオータイプサレベロングタス","6.40%: アリオータイプサリベロングタス"]";
  "[["All","your","type","are","belong","to","us"]]";
  "[{"Len":0,"Offset":0,"Tokens":[]}]";
  "[["DT","PRP$","NNS","VBP","IN","TO","PRP"]]";
  "[["4-4\/belong[A1=0-2\/All_your_types, A1=5-6\/to_us]"]]";
  "[[-0.33393750773577313]]";
  "{"Classification":"pos","Probability":0.59141720028208355}";
  "[["All","your","type","ar","belong","to","us"]]";
  "[{"Len":31,"Offset":0,"Tokens":[{"Len":3,"NormalizedToken":"All","Offset":0,"RawToken":"All"},{"Len":4,"NormalizedToken":"your","Offset":4,"RawToken":"your"},{"Len":5,"NormalizedToken":"types","Offset":9,"RawToken":"types"},{"Len":3,"NormalizedToken":"are","Offset":15,"RawToken":"are"},{"Len":6,"NormalizedToken":"belong","Offset":19,"RawToken":"belong"},{"Len":2,"NormalizedToken":"to","Offset":26,"RawToken":"to"},{"Len":2,"NormalizedToken":"us","Offset":29,"RawToken":"us"}]}]";
  "[["are_belong_to(types, us)"]]"|]

As you can see, the service returns the result as string[]. All result strings are human-readable and formatted according to “NLP standards”, but some of them are really hard to parse programmatically. FSharp.Data and its JSON Type Provider can help with the strings that contain valid JSON.

For example, if you need to use the “Sentiment-PosNeg-MaxEntClassifier” analyzer in a strongly typed way, you can do it as follows:

#r @"..\packages\FSharp.Data.1.1.9\lib\net40\FSharp.Data.dll"
open FSharp.Data

type SentimentsProvider = JsonProvider<""" {"Classification":"pos","Probability":0.59141720028208355} """>

let bag2 = splat.Analyze("en", "Sentiment", "I love F#.", appId)
let sentiments = SentimentsProvider.Parse(bag2.Analyses.[0])

printfn "Class:'%s' Probability:'%M'"
    (sentiments.Classification) (sentiments.Probability)

For analyzers like “Constituency_Tree-PennTreebank3-SplitMerge”, you need to write a custom parser that processes the bracket expression (“(TOP (S (NP (PDT All) (PRP$ your) (NNS types)) (VP (VBP are) (VP (VB belong) (PP (TO to) (NP (PRP us)))))))”) and builds a tree for you. If you are too lazy to do it yourself (and you should be), you can download SilverlightSplatDemo.xap and decompile its source code: all the parsers are already implemented there for the DEMO app. But this approach is not as easy as it should be.
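For reference, a parser for this bracket format does not have to be large. The following is a minimal sketch, not the parser from the DEMO app: the Tree type and the functions tokenizeBrackets, parseTree and words are hypothetical names introduced here, and the code assumes well-formed input in which every leaf has the shape (TAG word).

```fsharp
// Hypothetical tree type for bracketed constituency output.
type Tree =
    | Leaf of string * string      // tag, word
    | Node of string * Tree list   // tag, children

// Split "(TOP (S ...))" into "(", ")" and symbol tokens.
let tokenizeBrackets (s: string) =
    s.Replace("(", " ( ").Replace(")", " ) ")
     .Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries)
    |> List.ofArray

let private isSymbol t = t <> "(" && t <> ")"

// Parse one bracketed node; return it together with the unread tokens.
let rec parseNode tokens =
    match tokens with
    | "(" :: tag :: word :: ")" :: rest when isSymbol tag && isSymbol word ->
        Leaf(tag, word), rest
    | "(" :: tag :: rest when isSymbol tag ->
        let children, rest' = parseChildren rest []
        Node(tag, children), rest'
    | _ -> failwith "expected '(' followed by a tag"

// Parse sibling nodes until the closing bracket of the parent.
and parseChildren tokens acc =
    match tokens with
    | ")" :: rest -> List.rev acc, rest
    | [] -> failwith "unbalanced brackets"
    | _ ->
        let child, rest = parseNode tokens
        parseChildren rest (child :: acc)

let parseTree (s: string) =
    match parseNode (tokenizeBrackets s) with
    | tree, [] -> tree
    | _ -> failwith "unexpected tokens after the tree"

// Collect the words of the sentence, left to right.
let rec words tree =
    match tree with
    | Leaf(_, w) -> [ w ]
    | Node(_, children) -> List.collect words children

let tree = parseTree "(TOP (S (NP (PDT All) (PRP$ your) (NNS types)) (VP (VBP are) (VP (VB belong) (PP (TO to) (NP (PRP us)))))))"
printfn "%A" (words tree)
```

Calling words on the parsed tree gives back the seven words of the sentence in order, which is a quick sanity check that the brackets were matched correctly.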

Summary

MSR SPLAT looks like a really powerful and promising toolkit. I hope that it keeps growing.

My only wish is an improved API. I think it should be possible to use the services in a strongly typed way. The easiest way is to add an option to get all results as JSON, without any cnf forms and so on. It could also be achieved by changing the WCF service to expose analysis results in a typed way instead of string[].

F# Weekly #34 2013

26/08/2013 | F# Weekly | 1 Comment

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

New Xamarin.Android 4.8 was released with F# Support!
Full suite of F# project options in Xamarin Studio on Mac OS X: create apps, unit tests, target OpenGL.
TsunamiIDE announced Tsunami Chrome Extension.
PowerShellTypeProvider with SharePoint 2013 support is available on Nuget.
Minsk F# User Group was registered.
Foq 1.1 was released.
The {m}brace team has been selected to present at the PLOS 2013 workshop!
Increased demand for F#-devs in the UK during 2013.
Vote if you are interested in F# covariance/contravariance.
Taha Hachana presented “Navigation using the Location Object“.
Taha Hachana presented “HTML5 charting with TsunamiIDE“.
New open opportunities were added to Type Providers lists.

Videos/Presentations

“Dependent Type Providers” by David Raymond Christiansen.

Blogs

Anton Kropp wrote “Coding Dojo: a gentle introduction to Machine Learning with F# review“.
Onorio Catenacci shared “F# Tip Of The Week (Week of August 19, 2013)“.
Steve Gilham posted “F# asynchrony and Task -> Task<unit> conversion“.
Matthew Adams blogged “Dealing with Repetitive Tasks – Recursion in F#“.
Lou Franco wrote about “The 0-Liner“.
Thomas Jaskula posted “From OOP to FP : About dependency injection and higher-order functions“.
Samuel Neff wrote “Machine Learning with F# and C# side-by-side“.

That’s all for now. Have a great week.

Previous F# Weekly edition – #33

F# Weekly #33 2013

19/08/2013 | F# Weekly | 1 Comment

Welcome to F# Weekly,

A roundup of F# content from this past week:

News

New {m}brace version was released with cloud disposables and brand new documentation.
Fantomas extension v0.7.0 is available in VS gallery with new features and various improvements.
F# User Groups Map was updated.
MonoDevelop 4.0.12 was released.
Over 180 people were registered to an F# meetup in Boston!

Videos/Presentations

“Generative Art Hands On with F#” by Phillip Trelford. (code samples)
“Introduction To Stateful Monads In F#” by James Litsios.

Blogs

Jack Fox posted “Gaining FsCheck Fluency through Transparency“.
Onorio Catenacci wrote “F# Tip Of The Week (Week of August 12, 2013)“.
Richard Minerich blogged “All Machine Learning Platforms are Terrible (but some less so)“.
Neil Danson shared “F# and Monogame Part 4 – Content Pipeline“.
Thomas Jaskula posted “My 2 weeks trip from OOP to FP with F#“.
Anton Kropp wrote “F# class getter fun“.
Faisal Waris published “FAKE Script for ClickOnce Packaging of F# Apps“.
Gary Evans blogged “RunKeeper Visualizations“.
Phillip Trelford posted “Generative Art“.
Phillip Trelford posted “The Kids Are Alright“.
Dawid Kowalski blogged “With F# up to speed – month later“.

That’s all for now. Have a great week.

Previous F# Weekly edition – #32
