Introducing Clippit, get your slides out of PPTX.

TL;DR

Clippit is a .NET Standard 2.0 library that lets you easily and efficiently extract all slides from a PPTX presentation into separate one-slide presentations, or compose slides back together into one presentation.

Why?

PowerPoint is still the most popular way to present information. Sales and marketing people regularly produce new presentations, and when they work on a new one they often reuse slides from previous ones. Here is the gap: to compose a presentation you need slides that can be reused, but the result of your work is a presentation, and you are generally not interested in “slide management”.

One of my projects is an enterprise search solution that helps people find Office documents across different enterprise storage systems. One of its cool features is that we let users find the particular relevant slide in a presentation rather than a huge pptx with something relevant on slide 57.

How it was done before

Back in the day, Microsoft PowerPoint allowed us to “Publish slides” (save each individual slide into a separate file). I am absolutely sure this button was present in PowerPoint 2013, and as far as I know it was removed in the PowerPoint 365 and 2019 versions.


When this feature was in the box, you could use Microsoft.Office.Interop.PowerPoint.dll to start an instance of PowerPoint and communicate with it using COM Interop.

using Microsoft.Office.Core;
using Microsoft.Office.Interop.PowerPoint;

public void PublishSlides(string sourceFileName, string resultDirectory)
{
    Application ppApp = null;
    try
    {
        ppApp = new Application
        {
            DisplayAlerts = PpAlertLevel.ppAlertsNone
        };
        const MsoTriState msoFalse = MsoTriState.msoFalse;

        // Open the presentation read-only, untitled, without a window
        var presentation = ppApp.Presentations.Open(sourceFileName,
            msoFalse, msoFalse, msoFalse);

        // Save every slide as a separate file in resultDirectory
        presentation.PublishSlides(resultDirectory, true, true);
        presentation.Close();
    }
    finally
    {
        ppApp?.Quit();
    }
}

But server-side automation of Office has never been recommended by Microsoft. You could never reliably scale such automation, for example by starting multiple instances to work with different documents (you have to think about the active window, and any of the instances may stop responding to your commands). There is no guarantee that the File->Open command will return control to your program: if a file is password-protected, for example, Office will show a popup and ask for the password, and so on.

That was hard, but doable. PowerPoint guarantees that published slides are valid PowerPoint documents that users can open and preview. So it was worth going through the ceremony of single-threaded automation with retries, timeouts and process kills (for when you never get control back).

Over time it became clear that this was a dead end: we would either have to keep an old version of PowerPoint on some VM and never touch it, or find a better way.

The History

A Windows-only solution that requires MS Office installed on the machine and uses COM interop is not something you expect from a modern .NET solution.

Ideally it should be a .NET Standard library on NuGet that is platform-agnostic and able to solve your task anywhere, as fast as possible. But a few months ago there was nothing like that on NuGet.

If you have ever worked with Office documents from C#, you know that there is the Open XML SDK, which opens an Office document, deserializes its internals into an object model, lets you modify it and then saves it back. But the Open XML API is low-level, and you need to know a lot about OpenXml internals to correctly extract slides, with their images, embeddings, layouts, masters and cross-references, into a new presentation.
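
To give a sense of how low-level this is, here is a minimal sketch (the file path is a placeholder) that merely enumerates the slide parts of a presentation with the Open XML SDK; correctly copying a slide means cloning every referenced layout, master, image and other part into the target package:

using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;

class OpenXmlTour
{
    static void Main()
    {
        // "deck.pptx" is a placeholder path used for illustration only;
        // false = open read-only.
        using var doc = PresentationDocument.Open("deck.pptx", false);

        // Every slide is a separate part with its own web of relationships.
        foreach (var slidePart in doc.PresentationPart.SlideParts)
        {
            Console.WriteLine($"Slide part: {slidePart.Uri}");
            Console.WriteLine($"  layout: {slidePart.SlideLayoutPart?.Uri}");
            Console.WriteLine($"  images: {slidePart.ImageParts.Count()}");
        }
    }
}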

If you google further, you will find the “Open-Xml-PowerTools” project, developed by Microsoft since 2015 but never officially released on NuGet. Currently this project is archived by the OfficeDev team, and the most actively maintained fork belongs to EricWhiteDev (no NuGet.org feed at this time).

Open-Xml-PowerTools has a feature called PresentationBuilder that was created for a similar purpose: composing slide ranges from multiple presentations into one presentation. After playing with this library, I realized that it does a great job but does not fully satisfy my requirements:

  • Resource usage is not efficient: the same streams are opened multiple times and are not always properly disposed.
  • The library is much slower than it could be with proper resource management and less GC pressure.
  • It generates slides with a total size much larger than the original presentation, because it copies all layouts when only one is needed.
  • It does not properly name image parts inside a slide, corrupts file extensions and does not support SVG.

It was a great starting point, but I realized that I could improve it: not only fix all the mentioned issues and improve performance 6x, but also add support for new image types, properly extract slide titles and convert them into presentation titles, propagate the modification date and erase metadata that does not belong to the slide.

How it is done today

So today I am ready to present the new library, Clippit, which is technically a fork of the most recent version of EricWhiteDev/Open-Xml-PowerTools, extended and improved for one particular use case: extracting slides from a presentation and composing them back together efficiently.

All classes were moved to the Clippit namespace, so you can load it side by side with any version of Open-Xml-PowerTools if you already use it.

The library is already available on NuGet. It is written in C# 8.0 (with nullable reference types), compiled for .NET Standard 2.0 and tested with .NET Core 3.1. It works perfectly on macOS/Linux and is battle-tested on hundreds of real-world PowerPoint presentations.

The new API is quite simple and easy to use:

using System.IO;
using System.Linq;
using Clippit;

var presentation = new PmlDocument(sourceFile);

// Publish each slide as a separate one-slide presentation
var slides = PresentationBuilder.PublishSlides(presentation).ToList();

// Save slides into files
foreach (var slide in slides)
{
    var targetPath = Path.Combine(targetDir, Path.GetFileName(slide.FileName));
    slide.SaveAs(targetPath);
}

// Compose slides back into one presentation
var sources = slides.Select(x => new SlideSource(x, keepMaster: true)).ToList();
PresentationBuilder.BuildPresentation(sources)
    .SaveAs(newFileName);
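
Note that PublishSlides does not write anything to disk by itself: you choose whether to save each returned slide with SaveAs, as above, or pass the objects straight back into BuildPresentation.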

P.S. Stay tuned, there will be more OpenXml goodness.

SharePoint 2013: Content Enrichment and performance of 3rd party services

There are a lot of cases where people write a Content Enrichment service for SharePoint 2013 to integrate it with some 3rd party tagging/enrichment API. Such integration very often leads to huge crawl time degradation; the same happened in my case too.

The first thought was to measure the time that CPES (Content Processing Enrichment Service) spends on the calls. I wrapped all my calls to the external system in timers and ran a Full Crawl again. The results were awful: calls were on average 10 to 30 times slower from CPES than from a POC console app. The code was the same and of course perfect, so probably my 3rd party service was simply not able to handle the incredible load generated by my super-fast SharePoint Farm.

If you are in the same situation, with a fast SharePoint, an awesome CPES implementation and a bad 3rd party service, you should probably try the async integration approach described in “Advanced Content Enrichment in SharePoint 2013 Search” by Richard diZerega.

But what if the problem is in my source code… what if the execution environment affects the execution time of REST calls… Some time later, I found a nice post, “Quick Ways to Boost Performance and Scalability of ASP.NET, WCF and Desktop Clients”, by Omar Al Zabir. The article describes the notion of the connection manager:

By default, system.net has two concurrent connections per IP allowed. This means on a webserver, if it’s calling WCF service on another server or making any outbound call to a particular server, it will only be able to make two concurrent calls. When you have a distributed application and your web servers need to make frequent service calls to another server, this becomes the greatest bottleneck

WHOA! That means it does not matter how fast our SharePoint Search service is: our CPES executes no more than two calls to the 3rd party service at a time and queues the rest. Let’s fix this unpleasant injustice in web.config:

 <system.net>
   <connectionManagement>
     <add address="*" maxconnection="128"/>
   </connectionManagement>
 </system.net>
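
If editing web.config is inconvenient, the same limit can also be raised from code at service start-up through ServicePointManager. Here is a minimal sketch under that assumption; the helper name is mine, and the value 128 simply mirrors the config above:

using System.Net;

public static class ConnectionTuning
{
    // Call once during start-up (e.g. before the service host opens),
    // before the first outbound request is made.
    public static void RaiseOutboundConnectionLimit()
    {
        // Equivalent to <add address="*" maxconnection="128"/> above:
        // allow up to 128 concurrent connections per remote host.
        ServicePointManager.DefaultConnectionLimit = 128;
    }
}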

Now the performance degradation should be fixed, and if your 3rd party service is good enough your crawl time will be reasonable; otherwise you need to go back to the advice from Richard diZerega.

SharePoint 2013: Content Enrichment for Large Files

There are a couple of guides on how to write Content Enrichment services for SharePoint 2013. One of them is the official MSDN article “How to: Use the Content Enrichment web service callout for SharePoint Server”.

This article advises two configuration steps to adjust the maximum size of documents that will be processed by CPES (Content Processing Enrichment Service):

  1. Modify web.config to accept messages up to 8 MB, and configure readerQuotas to be a sufficiently large value.
    <bindings>
     <basicHttpBinding>
       <!-- The service will accept a maximum blob of 8 MB. -->
       <binding maxReceivedMessageSize = "8388608">
       <readerQuotas maxDepth="32"
         maxStringContentLength="2147483647"
         maxArrayLength="2147483647" 
         maxBytesPerRead="2147483647" 
         maxNameTableCharCount="2147483647" /> 
       <security mode="None" />
       </binding>
     </basicHttpBinding>
    </bindings>
    
  2. Modify the SPEnterpriseSearchContentEnrichmentConfiguration.
    $ssa = Get-SPEnterpriseSearchServiceApplication
    $config = New-SPEnterpriseSearchContentEnrichmentConfiguration
    $config.Endpoint = "http://Site_URL/ContentEnrichmentService.svc"
    $config.InputProperties = "Author", "Filename"
    $config.OutputProperties = "Author"
    $config.SendRawData = $True
    $config.MaxRawDataSize = 8192
    Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa -ContentEnrichmentConfiguration $config
    

The concept is generally good, but what if you need to process files larger than 8 MB? Let’s try to increase this limit to 300 MB, for example (ideally, this limit should be no less than the maximum file size allowed in your web applications).

Let’s change both values and run a full crawl of a SharePoint site. After that, if you are lucky, you will see something like this in your “Error Breakdown”:

[Screenshot: crawl “Error Breakdown” report]

WAT? Something went wrong, but what was it… Let’s investigate the ULS logs on the machine with the Search Service. After a couple of unforgettable minutes of reading ULS logs, I found the following error message:

[Microsoft.CrawlerFlow-cb9134ec-91c6-4bac-89f9-a0cc9fe1e481] Microsoft.Ceres.Evaluation.Engine.ErrorHandling.HandleExceptionHelper : Evaluation failure detected: Operator : ContentEnrichment Operator type : ContentEnrichmentClient Error id : 3206 Correlation id : 60ef1afd-038a-4f64-8230-2b2493923f80 Partition id : 0c37852b-34d0-418e-91c6-2ac25af4be5b Message : Failed to send the item to the content processing enrichment service. 49691C90-7E17-101A-A91C-08002B2ECDA9:#9: https://mysite.com/MyDoc.pptx id : ssic://780174 System.ServiceModel.EndpointNotFoundException: There was no endpoint listening
at http://MyServer:8081/ContentProcessingEnrichmentService.svc that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. —> System.Net.WebException: The remote server returned an error: (404) Not Found.
at System.Net.HttpWebRequest.GetResponse()
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout) –

WAT? “The remote server returned an error: (404) Not Found”. Does the service not exist sometimes? How can that be? Let’s check the IIS logs (on the machine where your CPES is installed). The path to the IIS logs should look similar to this: c:\inetpub\logs\LogFiles\W3SVC3\.

[Screenshot: IIS log entries showing 404.13 responses from CPES]

It is true: CPES sometimes returns a 404.13 status. Let’s google what this status code means.

404.13 – Content length too large. The request contains a Content-Length header. The value of the Content-Length header is larger than the limit that is allowed for the server.

It seems that IIS is not yet ready to receive our 300 MB files. There is one more parameter in the web.config that should be tweaked to handle really large files: maxAllowedContentLength (the default value is 30000000, which is ~30 MB). Let’s change it in web.config:

 <system.webServer>
   <security>
     <requestFiltering>
       <requestLimits maxAllowedContentLength="314572800" />
     </requestFiltering>
   </security>
 </system.webServer>
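
For reference, 314572800 bytes is exactly 300 MB (300 × 1024 × 1024), the same limit we raised maxReceivedMessageSize and MaxRawDataSize to earlier.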

Recrawl your content once again and, voila, the strange errors are gone! Enjoy your content enrichment!

Sharepoint 2013 Search Ranking and Relevancy Part 1: Let’s compare to FS14

Search Unleashed

I’m very happy to do some “guest” blogging for my good friend Leo and continue diving into various search-related topics.  In this and upcoming posts, I’d like to jump right into something that interests me very much, and that is taking a look at what makes some documents more relevant than others as well as what factors influence rank score calculations.

Since Sharepoint 2013 is already out, I’d like to touch upon a question that comes up often when someone is considering moving from FAST ESP or FAST for Sharepoint 2010 to Sharepoint 2013: “So how are rank scores calculated in Sharepoint 2013 Search as opposed to previous FAST versions?”

In upcoming posts, I will go more into “internals” of the current Sharepoint 2013 ranking model as well as introduce the basics of relevancy calculation concepts that apply across many search engines and are not necessarily specific to FAST…


FAST Search Server 2010 for SharePoint Versions

Talbott Crowell's Software Development Blog

Here is a table that contains a comprehensive list of FAST Search Server 2010 for SharePoint versions including RTM, cumulative updates (CU’s), and hotfixes. Please let me know if you find any errors or have a version not listed here by using the comments.

Build          | Release          | Component          | Information | Source (Link to Download)
14.0.4763.1000 | RTM              | FAST Search Server |             | Mark van Dijk
14.0.5128.5001 | October 2010 CU  | FAST Search Server | KB2449730   | Mark van Dijk
14.0.5136.5000 | February 2011 CU | FAST Search Server | KB2504136   | Mark van Dijk
14.0.6029.1000 | Service Pack 1   | FAST Search Server | KB2460039   | Todd Klindt
14.0.6109.5000 | August 2011 CU   | FAST Search Server | KB2553040   | Todd Klindt
14.0.6117.5002 | February 2012 CU | FAST Search Server | KB2597131   | Todd Klindt
14.0.6120.5000 | April 2012 CU    | FAST Search Server | KB2598329   | Todd Klindt
14.0.6126.5000 | August 2012 CU   | FAST Search Server | KB2687489   | Mark van Dijk
14.0.6129.5000 | October 2012 CU  | FAST Search Server | KB2760395   | Todd…


Selective crawling in SharePoint 2010 (with F# & Selenium)

SharePoint Search Service Applications have two modes for crawling content:

  • Full Crawl, which re-crawls all documents from a Content Source.
  • Incremental Crawl, which crawls only documents modified since the previous crawl.

But this is really not enough if you are working on search-driven apps. (You can read more about SharePoint crawling in Brian Pendergrass’s post “SP2010 Search *Explained: Crawling”.)

Search applications are a special kind of application that forces you to be iterative. Generally, you work with a large amount of data, and you cannot afford to run a full crawl often, because it is a slow process. There is another reason why it is slow: more intelligent search requires more indexing time. We cannot push the computation to query time, because that directly affects user satisfaction; crawling time is the only place for intelligence.

Custom document processing pipeline stages are a bit tricky. In a corpus of hundreds of thousands or millions of documents, you will generally find some that failed on your custom stage or were processed in a wrong way. This may happen for any reason (wrong URL format, corrupted file, locked document, lost connection, unusual encoding, too large a file, memory issue, BSOD on the crawling node, power outage and even a bug in the source code 🙂 ). Assume you were lucky enough to find the documents where your customizations go wrong, and even to fix them. The question is how to test your latest changes. Do you want to wait a few days to check whether it works on these files or not? I think not… You probably want the ability to re-crawl selected items and verify your changes.

Incremental crawl does not solve the problem: it is really hard to find all the files that you want to re-crawl and modify them somehow, and sometimes modification is not possible at all. What to do in such a situation?

Search Service Applications have a UI for high-level monitoring of index health (see the picture below). There you can check the crawl status of a document by URL and even re-crawl an individual item.

[Screenshot: re-crawling an individual item from the crawl log UI]

SharePoint does not provide an API to do this from code; all we have is a single ASP.NET form in Central Administration. If you research further and catch the call with Fiddler, you can find the code that processes the request. You can decompile the SharePoint assemblies and discover that a mysterious SQL Server stored procedure is called to add your document to the processing queue (read more about that in Mikael Svenson’s answer on the FAST Search for SharePoint forum).

Ahh… It is already hard enough, just pain and no fun. Even if we find where to get, or how to calculate, all the parameters for the stored procedure, it does not solve all our problems: we also need a way to collect the URLs of all the buggy documents that we want to re-crawl. It is possible to do so using the SharePoint web services; I have already posted about that (see “F# and FAST Search for SharePoint 2010”). If you like this approach, please continue the research. I am tired here.

Canopy magic

Why should I go so deep into SharePoint internals for such a ‘simple’ task? Actually, we can automate this task through the UI. We have Canopy, a good F# wrapper around Selenium for UI automation. All we need is to write a few lines of code that start a browser, open the page and click some buttons many times. For sure, this solution has some disadvantages:

  1. You should be a bit familiar with Selenium, but this one is easy to fix.
  2. It will be slow. It works for hundreds of documents, maybe thousands, but no more. (I think that if you need to re-crawl millions of documents, you can just run a full crawl.)

This approach also has some benefits:

  1. It is easy to code and to use.
  2. It is flexible.
  3. It solves another problem: you can use Canopy to grab document URLs directly from the search results page or any other page.

All you need to start with Canopy is to download the NuGet package and a web driver for your favorite browser (Chrome WebDriver, IE WebDriver). The next steps are pretty straightforward: reference three assemblies, configure the web driver location if it differs from the default ‘c:\’, and start the browser:

#r @"..\packages\Selenium.Support.2.33.0\lib\net40\WebDriver.Support.dll"
#r @"..\packages\Selenium.WebDriver.2.33.0\lib\net40\WebDriver.dll"
#r @"..\packages\canopy.0.7.7\lib\canopy.dll"

open canopy

configuration.chromeDir <- @"d:\"
start chrome

Be careful: Selenium, Canopy and the web drivers are under intensive development, and the newest versions may differ from those mentioned above. Now we are ready to automate the behavior, but there is a little trick. To show the menu, we need to click on the area marked red on the screenshot below, but we should not touch the link inside this area. To click on an element at a specific position, we need to use Selenium’s advanced user interaction capabilities.

[Screenshot: crawl log entry with the clickable area marked in red]

let sendToReCrawl url =
    let encode (s:string) = s.Replace(" ","%20")
    try
        let encodedUrl = encode url
        click "#ctl00_PlaceHolderMain_UseAsExactMatch" // Select "Exact Match"
        "#ctl00_PlaceHolderMain_UrlSearchTextBox" << encodedUrl
        click "#ctl00_PlaceHolderMain_ButtonFilter" // Click "Search" Button

        elements "#ctl00_PlaceHolderMain_UrlLogSummaryGridView tr .ms-unselectedtitle"
        |> Seq.iter (fun result ->
            OpenQA.Selenium.Interactions.Actions(browser)
                  .MoveToElement(result, result.Size.Width-7, 7)
                  .Click().Perform() |> ignore
            sleep 0.05
            match someElement "#mp1_0_2_Anchor" with
            | Some(el) -> click el
            | _ -> failwith "Menu item not found."
        )
    with
    | ex -> printfn "%s" ex.Message

let recrawlDocuments logViewerUrl pageUrls =
    url logViewerUrl // Open LogViewer page
    click "#ctl00_PlaceHolderMain_RadioButton1" // Select "Url or Host name"
    pageUrls |> Seq.iteri (fun i x ->
        printfn "Processing item #%d" i;
        sendToReCrawl x)

That is all; the rest should be easy to understand. Here, CSS selectors are used to specify the elements to interact with.

Another interesting part is grabbing URLs from the search results page. It can be useful and it is easy to automate, so let’s do it.

let grabSearchResults pageUrl =
    url pageUrl
    let rec collectUrls() =
        let urls =
            elements ".srch-Title3 a"
            |> List.map (fun el -> el.GetAttribute("href"))
        printfn "Loaded '%d' urls" (urls.Length)
        match someElement "#SRP_NextImg" with
        | None -> urls
        | Some(el) ->
            click el
            urls @ (collectUrls())
    collectUrls()

Finally, we are ready to execute all this. We need to specify two URLs: the first one points to the search results page from which we grab document URLs, and the second one to the LogViewer page of your Search Service Application in Central Administration (do not forget to replace them in the sample below). Almost all SharePoint web applications require authentication; you can pass your login and password directly in the URL, as done in the sample below.

grabSearchResults "http://LOGIN:PASSWORD@SEARVER_NAME/Pages/results.aspx?dupid=1025426827030739029&start1=1"
|> recrawlDocuments "http://LOGIN:PASSWORD@SEARVER_NAME:CA_POST/_admin/search/logviewer.aspx?appid={5095676a-12ec-4c68-a3aa-5b82677ca9e0}"

Microsoft Patents related to SharePoint 2013 Search

Extremely useful to understand what is under the hood of SharePoint 2013 Search.

Insights into search black magic

While investigating relevancy calculation in the new SharePoint 2013, I did some research, and it turned out that there are a lot of publicly available patents which cover search in SharePoint; most of them were done in the scope of Microsoft Research programs, judging by the names of the inventors. Although there is no direct evidence that it was implemented exactly as described in the patents, I created an Excel spreadsheet which mimics the logic described in the patents with actual values from SharePoint, and the numbers match!

I hope the most curious of you will find it helpful to deep-dive into Enterprise Search relevancy and better understand what happens behind the curtain.

Enterprise relevancy ranking using a neural network

Internal ranking model representation schema

Techniques to perform relative ranking for search results

Ranking and providing search results based in part on a number of click-through features

Document length as…


Explain rank in Sharepoint 2013 Search


Insights into search black magic

The default ranking model in Sharepoint 2013 is completely different from what we’ve seen in FS4SP and is definitely a step forward compared to SP2010. In a nutshell, it uses multiple features (taking into account query terms and their proximity, document click-through, authority, anchor text, URL depth etc.), which are mixed with the help of a neural network as a final step. Details can be found in patent claim http://www.google.com/patents/US8296292.

Hopefully there is a way to bring more light into this black magic. I’ve modified the default Display Template and added an “Explain Rank” link to each item. This link redirects the user to the ExplainRank page, which is hosted under the {search_center_url}/_layouts/15/ folder.

[Screenshot: “star wars” search results with the Explain Rank link added to each item]

I used

  • d=ctx.CurrentItem.Path
  • q=ctx.DataProvider.$2_3.k (I found this hidden property by trial and error)
  • another option is to extract the value for q= from the QueryText attribute of ctx.ListData.Properties.SerializedQuery, whose value was <Query Culture="en-US" EnableStemming="True" EnablePhonetic="False" EnableNicknames="False" IgnoreAllNoiseQuery="True" SummaryLength="180" MaxSnippetLength="180" DesiredSnippetLength="90" KeywordInclusion="0" QueryText="star wars" QueryTemplate="{searchboxquery}" TrimDuplicates="True"…


F# and FAST Search for SharePoint 2010

If you are a SharePoint developer, an Enterprise Search developer, or an employee of a large corporation with global search over a private internal infrastructure, then you may be interested in search automation. Deployment of FAST Search Server 2010 for SharePoint (F4SP) is out of scope for this post (you can follow the TechNet F4SP Deployment Guide if you need it).

F# 3.0 comes with a feature called “type providers” that helps simplify your daily routine. In the case of WCF, the Wsdl type provider allows us to automate proxy generation. Note that F# 3.0 runs only on .NET 4.0 and later, while the SharePoint 2010 server side runs exclusively on .NET 3.5 64-bit. Let’s see how these work together.

Connecting to the web service

First, we create an empty F# script file.

#r "System.ServiceModel.dll"
#r "FSharp.Data.TypeProviders.dll"
#r "System.Runtime.Serialization.dll"

open System
open System.Net
open System.Security
open System.ServiceModel
open Microsoft.FSharp.Data.TypeProviders

[<Literal>]
let SearchServiceWsdl = "https://SharePoint2010WebAppUrl/_vti_bin/search.asmx?WSDL"
type SharePointSearch = Microsoft.FSharp.Data.TypeProviders.WsdlService<SearchServiceWsdl>

At this point, the type provider creates the proxy classes in the background. The only thing left for us is to configure access security. The following code was tested on two SharePoint 2010 farms with NTLM authentication, over both HTTP and HTTPS.

let getSharePointSearchService() =
    let binding = new BasicHttpBinding()
    binding.MaxReceivedMessageSize <- 10000000L
    binding.Security.Transport.ClientCredentialType <- HttpClientCredentialType.Ntlm
    binding.Security.Mode <- if (SearchServiceWsdl.StartsWith("https"))
                                 then BasicHttpSecurityMode.Transport
                                 else BasicHttpSecurityMode.TransportCredentialOnly

    let serviceUrl = SearchServiceWsdl.Remove(SearchServiceWsdl.LastIndexOf('?'))
    let service = new SharePointSearch.ServiceTypes.
                        QueryServiceSoapClient(binding, EndpointAddress(serviceUrl))
    //If server located in another domain then we may authenticate manually
    //service.ClientCredentials.Windows.ClientCredential
    //  <- (Net.NetworkCredential("User_Name", "Password"))
    service.ClientCredentials.Windows.AllowedImpersonationLevel
        <- System.Security.Principal.TokenImpersonationLevel.Delegation;
    service

let searchByQueryXml queryXml =
    use searchService = getSharePointSearchService()
    let results = searchService.QueryEx(queryXml)
    let rows = results.Tables.["RelevantResults"].Rows
    [for i in 0..rows.Count-1 do
        yield (rows.[i].ItemArray) |> Array.map (sprintf "%O")]

Building search query XML

To query F4SP we use the same web service as for the built-in SharePoint 2010 search, but with slightly different query XML. The last thing we need to do is build the query. You can find the query XML syntax in the Microsoft.Search.Query schema documentation, but it is hard to work with using only the official documentation. There is a very useful CodePlex project called “FAST Search for Sharepoint MOSS 2010 Query Tool” which provides a user-friendly query builder interface.

[Screenshot: FAST Search for Sharepoint MOSS 2010 Query Tool]

FAST Query Language (FQL) Syntax

FAST has its own query syntax (FQL) that can be used directly through the SharePoint Search Web Service.

let getFQLQueryXml (fqlString:string) =
  """<QueryPacket Revision="1000">
       <Query>
         <Context>
           <QueryText language="en-US" type="FQL">{0}</QueryText>
         </Context>
         <SupportedFormats Format="urn:Microsoft.Search.Response.Document.Document" />
         <ResultProvider>FASTSearch</ResultProvider>
         <Range>
           <StartAt>1</StartAt>
           <Count>5</Count>
         </Range>
         <EnableStemming>false</EnableStemming>
         <EnableSpellCheck>Off</EnableSpellCheck>
         <IncludeSpecialTermsResults>false</IncludeSpecialTermsResults>
         <IncludeRelevantResults>true</IncludeRelevantResults>
         <ImplicitAndBehavior>false</ImplicitAndBehavior>
         <TrimDuplicates>true</TrimDuplicates>
         <Properties>
           <Property name="Url" />
           <Property name="Write" />
           <Property name="Size" />
         </Properties>
       </Query>
     </QueryPacket>"""
  |> (fun queryTemplate -> String.Format(queryTemplate,fqlString))

let fqlQueryResults =
  """and(string("Functional Programming", annotation_class="user", mode="phrase"),
         or("fileextension":string("ppt", mode="phrase"),
            "fileextension":string("pptx", mode="phrase")))
     AND filter(and(isdocument:1))"""
  |> getFQLQueryXml |> searchByQueryXml

Keyword Query Syntax

FAST also supports native SharePoint Keyword Query Syntax.

let getKeywordQueryXml (keywordString:string) =
  """<QueryPacket Revision="1000">
       <Query>
         <Context>
           <QueryText language="en-US" type="STRING">{0}</QueryText>
         </Context>
         <SupportedFormats Format="urn:Microsoft.Search.Response.Document.Document" />
         <ResultProvider>FASTSearch</ResultProvider>
         <Range>
           <StartAt>1</StartAt>
           <Count>5</Count>
         </Range>
         <EnableStemming>false</EnableStemming>
         <EnableSpellCheck>Off</EnableSpellCheck>
         <IncludeSpecialTermsResults>false</IncludeSpecialTermsResults>
         <IncludeRelevantResults>true</IncludeRelevantResults>
         <ImplicitAndBehavior>false</ImplicitAndBehavior>
         <TrimDuplicates>true</TrimDuplicates>
         <Properties>
           <Property name="Url" />
           <Property name="Write" />
           <Property name="Size" />
         </Properties>
       </Query>
     </QueryPacket>"""
  |> (fun queryTemplate -> String.Format(queryTemplate,keywordString))

let simpleKeywordQueryResults =
  """"Functional Programming" scope:"Documents" (fileextension:"PPT" OR fileextension:"PPTX")"""
  |> getKeywordQueryXml |> searchByQueryXml

Query Syntax Summary

One of the principal differences between the two syntaxes is that a Keyword Query needs to be converted into FQL on the SharePoint side. Keyword syntax also supports scope conditions, which are converted into FQL filters: for example, scope:"Documents" will be translated into filter(and(isdocument:1)) (when the Documents scope exists in the SharePoint Query Service Application). Unfortunately, we cannot specify a SharePoint scope in an FQL query.