Friday, August 28, 2009

Stemming in Search

Stemming is the process of reducing a word to its root by removing a common pattern such as a suffix. For example, 'Search', 'Searches' and 'Searching' share the same root, so if a user searches with any of these words, content containing the keyword 'search' should be included in the results. So when a user searches with the keyword 'Searching', how do you ensure that content containing 'search' is included? You can do so by applying stemming to the user's search text. If the user searches with 'Searching', the stemming process removes the 'ing' from 'searching' and you are left with 'search'. You can then use the keyword 'search' to look up content in your system.
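To make the idea concrete, here is a minimal sketch of suffix stripping in C#. It is only a toy and not the Porter algorithm: the class name NaiveStemmer, the suffix list and the three-character minimum are all assumptions made for illustration, and a real stemming library handles many more cases.

```csharp
using System;
using System.Linq;

// Hypothetical toy stemmer for illustration only; NOT the Porter algorithm.
static class NaiveStemmer
{
    // Assumed suffix list, tried in order so that "es" is checked before "s".
    static readonly string[] Suffixes = { "ing", "es", "s" };

    public static string Stem(string word)
    {
        string lower = word.ToLowerInvariant();
        foreach (string suffix in Suffixes)
        {
            // Strip the suffix only if at least three characters remain,
            // so short words such as "is" or "sing" are left alone.
            if (lower.EndsWith(suffix, StringComparison.Ordinal)
                && lower.Length - suffix.Length >= 3)
            {
                return lower.Substring(0, lower.Length - suffix.Length);
            }
        }
        return lower;
    }

    // Splits the user's search text on spaces and stems each word.
    public static string[] StemQuery(string query)
    {
        return query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(Stem)
                    .ToArray();
    }
}
```

With this sketch, NaiveStemmer.Stem("Searching"), NaiveStemmer.Stem("Searches") and NaiveStemmer.Stem("Search") all return "search", so any of the three inputs can be matched against content indexed under that keyword.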

The process of applying stemming is shown in the following diagram:

[Image: stemming process diagram]

So we can use a stemming library to parse the user's search text and get the root of each word the user provided. We can then search with the stemmed keywords. From the user's perspective, this increases the chance of getting relevant results. An open source stemming library (the Porter stemmer) is available from the following link:

http://tartarus.org/~martin/PorterStemmer/

Saturday, August 1, 2009

XSLT Debugging

XSLT is a great way of transforming XML documents. But most developers who work with XSLT don't know that it can be debugged in Visual Studio. The Visual Studio debugger supports breakpoints, stepping through the XSLT as it executes, watching variables and so on. There are two ways to debug XSLT in Visual Studio.

First Approach

In this approach you don't need to write any code. You only need your XSLT and XML files ready.

1. Open your XSLT file in Visual Studio.

2. In the properties of the XSLT file you will find the Input property. Select the XML file you want to transform with this XSLT.

3. Start debugging.

 

 

Second Approach

In this approach you need to write a few lines of code to debug the XSLT. To debug XSLT from code you use the XslCompiledTransform class, as the following code block does. The XslCompiledTransform constructor takes a Boolean argument: pass true to enable debugging and false to disable it. The code below is commented and should be easy to follow. You need two files, the source (XML) and the stylesheet (XSLT). You then specify the name of the output file, which will hold the result of applying the stylesheet to the XML. In the code below I have built each full path by combining the application's base directory with "Xmls/filename".

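// Namespaces needed at the top of the file.
using System;
using System.IO;
using System.Xml.Xsl;

string sourceFile = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Xmls/XMLFile.xml");
string stylesheet = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Xmls/XSLTFile.xslt");
string outputFile = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Xmls/output.xml");

// Enable XSLT debugging.
XslCompiledTransform xslt = new XslCompiledTransform(true);

// Compile the style sheet.
xslt.Load(stylesheet);

// Execute the XSLT transform. FileMode.Create overwrites the output of an earlier run
// (FileMode.Append would add a second document to the same file), and the using block
// closes the stream once the transform finishes.
using (FileStream outputStream = new FileStream(outputFile, FileMode.Create))
{
    xslt.Transform(sourceFile, null, outputStream);
}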

Now, to debug, all you need to do is put a breakpoint on the line where the transform takes place (xslt.Transform.....). When the breakpoint is hit, press F11 to step into the XSLT.

 

Some More Information

While debugging you can inspect the current node by typing "self::node()" in the Immediate window or a Watch window. You can also get the value of a particular node by writing any XPath query, such as "./FirstName/text()", which gives you the first name relative to the current node. Here text() is a node test that selects the text node children of the element, which gives you the element's text value.
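If you want to try these XPath expressions outside the debugger, the following is a small sketch that evaluates them from C# with XmlDocument. The sample XML, the element names People and Person, and the value "Ada" are made up for illustration; only the two XPath expressions come from the text above.

```csharp
using System;
using System.Xml;

class XPathDemo
{
    static void Main()
    {
        // Hypothetical sample document, assumed for this example.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<People><Person><FirstName>Ada</FirstName></Person></People>");

        // Treat the Person element as the "current node".
        XmlNode current = doc.SelectSingleNode("/People/Person");

        // self::node() selects the current node itself.
        XmlNode self = current.SelectSingleNode("self::node()");
        Console.WriteLine(self.Name);

        // ./FirstName/text() selects the text node under FirstName,
        // relative to the current node.
        XmlNode firstName = current.SelectSingleNode("./FirstName/text()");
        Console.WriteLine(firstName.Value);
    }
}
```

Here the first line printed is the name of the current element (Person) and the second is the text of its FirstName child (Ada), which mirrors what the Watch window shows for the same expressions.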