F# Weekly #45, 2012

Welcome to F# Weekly,

Enjoy F# reading list from this past week.

Blogs & Tutorials

Resources & Books

Upcoming events

That’s all for now.  Have a great week and remember you can message me on twitter (@sergey_tihon) with any F# news or use hashtag #fsharpweekly.

 

Previous edition of the F# Weekly – #44

Seven SharePoint Videos from //build/ 2012 Conference

Personalized Search on the Largest Flash Sale Site in America

Today  posted a presentation about the example of using Solr in the sale site.

“This talk provides a tour of how Apache Solr is used to power search for America’s largest flash sale site, http://www.gilt.com. We show how to address the challenges of providing listings for fast moving inventory in a search space personalized for each of our members. The solution, built on Play Framework comprises less than 4,000 lines of code, and provides response times of 40ms on average.”

“F# Weekly” under the hood

Under the F# Weekly news preparation lies a simple F# script.

This script uses Twitterizer2  – one of the simplest Twitter client libraries for .NET. Source code ia available on GitHub, binaries are available through NuGet.

Script logic is relatively simple. First of all, collect a list of queries for Twitter.

    let tweets = ["#fsharp";"#fsharpx";"@dsyme";"#websharper";"#fsharpweekly"]

Then make a call to the Twitter Search API for each query, concatenate the results for last week and sort all tweets by creation date.

                |> List.map (getTweets (DateTime.Now - TimeSpan.FromDays(7.0)))
                |> List.concat
                |> List.sortBy (fun t -> t.CreatedDate)

Then leave only ‘en’ news and filter out tweets without links and RT leaving only first occurrence of each unique link.

                |> List.filter (fun t -> t.Language = "en")
                |> filterUniqLinks

Also in the source code below you can find console printing method for results verification and html printing method for further manual results review.

Feel free to use it in your social researches.

#r "Twitterizer2.dll"

open Twitterizer
open Twitterizer.Entities
open System
open System.Net
open System.Text.RegularExpressions;

let getTweets (sinceDate:DateTime) query =
    let rec collect pageNum =
        let options = SearchOptions(NumberPerPage = 100, SinceDate = sinceDate, PageNumber = pageNum);
        printfn "Loading %d-%d" (pageNum*options.NumberPerPage) ((pageNum+1)*options.NumberPerPage)
        let result = TwitterSearch.Search(query, options);
        if (result.Result <> RequestResult.Success || result.ResponseObject.Count = 0)
            then List.empty
            else result.ResponseObject |> List.ofSeq |> List.append (collect (pageNum+1))
        collect 1 |> List.rev

let urlRegexp = Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&amp;\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?", RegexOptions.IgnoreCase);

let filterUniqLinks (tweets: TwitterSearchResult list) =
    let hash = new System.Collections.Generic.HashSet<string>();
    tweets |> List.fold
        (fun acc t ->
            let mathces = urlRegexp.Matches(t.Text)
            if (mathces.Count = 0) then acc
            else let urls =
                   [0 .. (mathces.Count-1)]
                       |> List.map (fun i -> mathces.[i].Value)
                       |> List.filter (fun url -> not(hash.Contains(url)))
                 if (List.isEmpty urls) then acc
                 else urls |> List.iter(fun url -> hash.Add(url) |> ignore)
                      t :: acc)
        [] |> List.rev

let printTweets (tweets: TwitterSearchResult list) =
    tweets |> List.iter (fun t ->
        printfn "%15s : %s : %s" t.FromUserScreenName (t.CreatedDate.ToShortDateString()) t.Text)

let tweets = ["#fsharp";"#fsharpx";"@dsyme";"#websharper";"#fsharpweekly"]
                |> List.map (getTweets (DateTime.Now - TimeSpan.FromDays(7.0)))
                |> List.concat
                |> List.sortBy (fun t -> t.CreatedDate)
                |> List.filter (fun t -> t.Language = "en")
                |> filterUniqLinks
printfn "Tweets count : %d" tweets.Length
printTweets tweets

let printTweetsInHtml filename (tweets: TwitterSearchResult list) =
    let formatTweet (text:string) =
        let matches = urlRegexp.Matches(text)
        seq {0 .. (matches.Count-1)}
            |> Seq.fold (
                fun (t:string) i ->
                    let url = matches.[i].Value
                    t.Replace(url, (sprintf "<a href=\"%s\" target=\"_blank\">%s</a>" url url)))
                text
    let rows =
      tweets
        |> List.mapi (fun i t ->
            let id = (tweets.Length - i)
            let text = formatTweet(t.Text)
            sprintf "<table id=\"%d\"><tr><td rowspan=\"2\" width=\"30\">%d</td><td rowspan=\"2\" width=\"80\"><a href=\"javascript:remove('%d')\">Remove</a><td rowspan=\"2\"><a href=\"https://twitter.com/%s\" target=\"_blank\"><img src=\"%s\"/></a></td><td><b>%s</b></td></tr><tr><td>Created : %s <br></td></tr></table>"
                     id id id t.FromUserScreenName t.ProfileImageLocation text (t.CreatedDate.ToString()))
        |> List.fold (fun s r -> s+"&nbsp;"+r) ""
    let html = sprintf "<html><head><script>function remove(id){return (elem=document.getElementById(id)).parentNode.removeChild(elem);}</script></head><body>%s</body></html>" rows
    System.IO.File.WriteAllText(filename, html)

printTweetsInHtml "d:\\tweets.html" tweets

10 things I hate about Git

Exactly the same emotions

Steve Bennett blogs

Git is the source code version control system that is rapidly becoming the standard for open source projects. It has a powerful distributed model which allows advanced users to do tricky things with branches, and rewriting history. What a pity that it’s so hard to learn, has such an unpleasant command line interface, and treats its users with such utter contempt.

1. Complex information model

The information model is complicated – and you need to know all of it. As a point of reference, consider Subversion: you have files, a working directory, a repository, versions, branches, and tags. That’s pretty much everything you need to know. In fact, branches are tags, and files you already know about, so you really need to learn three new things. Versions are linear, with the odd merge. Now Git: you have files, a working tree, an index, a local repository, a remote repository, remotes…

View original post 2,004 more words

F# Weekly #44, 2012

Friends don’t let friends use null.
Don Syme

Welcome to F# Weekly,

A roundup of F# content from this past week:

Blogs & Tutorials

Resources & Books

F# Snippets

News from the past

 That’s all for now.  Have a great week and remember you can message me on twitter (@sergey_tihon) with any F# news or use hashtag #fsharpweekly.

P.S. Please share with me hashtags that you use for F# projects and F# related technologies.

Previous edition of the F# Weekly – #43

F# Weekly #43, 2012

A brief summary of what happened in the blogs last week:

P.S. Thanks to Don Syme and Richard Minerich for idea.

OData Type Provider with SharePoint 2010

‘Type Providers’ is an extremely cool F# feature that was introduced with F 3.0 and shipped with VS 2012 and .NET 4.5. You can find Type providers’ explanation  here and details about OData type provider here.

It maybe not a news, but OData Type Provider works pretty well with SharePoint 2010.

SharePoint 2010 has a OData Service. You can find mode details about this service at the MSDN article Query SharePoint Foundation with ADO.NET Data Services

One special thing that distinguishes SharePoint service from the others is that you should be authenticated. Just set your credentials into created data content before using it. You can find full code sample below.


#r "FSharp.Data.TypeProviders.dll"
#r "System.Data.Services.Client.dll"

open Microsoft.FSharp.Data.TypeProviders
open System.Net

type sharepoint = ODataService<"http://server_name/_vti_bin/listdata.svc">
let web = sharepoint.GetDataContext() in
    web.Credentials <-
        (NetworkCredential("user_name", "password", "domain") :> ICredentials)

(web.Documents)
    |> Seq.iter (fun item -> printfn "%A : %s" item.Modified item.Name)