I have recently updated my search extensions project to support ranked search results. A user can now search for a term within a property and also order the results by relevance, measured by the number of hits.
Full source code can be found here: https://github.com/ninjanye/searchextensions
The SearchExtensions NuGet package is also available by running the following:
PM> Install-Package NinjaNye.SearchExtensions
The Goal
The thought behind a ranked search is to let users easily search their data collections and determine which results are more relevant than others.
How to use it
A ranked search is called in the same way as a regular search:
var result = queryableData.RankedSearch(x => x.Property, "searchTerm");
This produces the following SQL when used with a SQL data provider. Notice that all the searching and ranking is done in SQL (not in memory). The divisor 10 is the length of the search term, "searchTerm":
SELECT
[Project1].[C1] AS [C1],
[Project1].[Property] AS [Property]
...
FROM ( SELECT
[Extent1].[Property] AS [Property],
...
(( CAST(LEN([Extent1].[Property]) AS int)) -
( CAST(LEN(REPLACE([Extent1].[Property], N'searchTerm', N'')) AS int)))
/ 10 AS [C1]
FROM [dbo].[Table] AS [Extent1]
WHERE [Extent1].[Property] LIKE N'%searchTerm%'
) AS [Project1]
How it was built (Expression Trees)
So here is the implementation. Firstly, to represent a ranked result I have the following interface:
public interface IRanked<out T>
{
int Hits { get; }
T Item { get; }
}
… with the following concrete class:
internal class Ranked<T> : IRanked<T>
{
public int Hits { get; set; }
public T Item { get; set; }
}
The RankedSearch extension method
public static class RankedSearchExtensions
{
public static IQueryable<IRanked<T>> RankedSearch<T>(this IQueryable<T> source,
Expression<Func<T, string>> stringProperty,
string searchTerm)
{
var parameterExpression = stringProperty.Parameters[0];
var hitCountExpression = CalculateHitCount(stringProperty, searchTerm);
var rankedInitExpression = ConstructRankedResult<T>(hitCountExpression,
parameterExpression);
var selectExpression =
Expression.Lambda<Func<T, Ranked<T>>>(rankedInitExpression, parameterExpression);
return source.Search(stringProperty, searchTerm)
.Select(selectExpression);
}
The first thing this method does is call CalculateHitCount, which creates an expression that counts the number of times the search term occurs. I count occurrences using only string lengths and Replace so that the expression can be translated by query providers, SQL in particular.
Note: Always write down the code you are trying to build to help visualize the expression tree:
x => (x.Name.Length - x.Name.Replace([searchTerm], "").Length) / [searchTerm].Length
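To make the arithmetic concrete, here is a minimal sketch of the same calculation as an ordinary method. The `HitCounter` class and `CountHits` name are hypothetical and not part of SearchExtensions, and the empty-input guard is an addition for this sketch only.

```csharp
using System;

public static class HitCounter
{
    // Hypothetical helper for illustration; not part of SearchExtensions.
    public static int CountHits(string text, string searchTerm)
    {
        // Guard added for this sketch: string.Replace throws on an empty
        // search term, and dividing by its length would divide by zero.
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(searchTerm))
        {
            return 0;
        }

        // Number of characters removed by deleting every occurrence of the term...
        int removedChars = text.Length - text.Replace(searchTerm, "").Length;

        // ...divided by the term length gives the number of occurrences.
        return removedChars / searchTerm.Length;
    }
}
```

For example, `HitCounter.CountHits("banana", "an")` returns 2: removing both occurrences of "an" leaves "ba", so (6 - 2) / 2 = 2. Note that in memory `string.Replace` is case-sensitive, whereas SQL Server's REPLACE follows the collation of its input, so counts may differ between providers.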
In terms of building the above as an expression tree, this was accomplished as follows:
private static BinaryExpression CalculateHitCount<T>(Expression<Func<T, string>> stringProperty,
string searchTerm)
{
Expression searchTermExpression = Expression.Constant(searchTerm);
// Store term length to work out how many search terms were found
Expression searchTermLengthExpression = Expression.Constant(searchTerm.Length);
// Empty string expression to replace search terms with
Expression emptyStringExpression = Expression.Constant("");
PropertyInfo stringLengthProperty = typeof (string).GetProperty("Length");
//Calculate the length of property
var lengthExpression = Expression.Property(stringProperty.Body, stringLengthProperty);
// Replace searchTerm with empty string in property
MethodInfo replaceMethod = typeof(string).GetMethod("Replace",
new[] {typeof (string), typeof (string)});
var replaceExpression = Expression.Call(stringProperty.Body, replaceMethod,
searchTermExpression, emptyStringExpression);
// Calculate length of replaced string
var replacedLengthExpression = Expression.Property(replaceExpression, stringLengthProperty);
// Calculate the difference between the property and the replaced property
var charDiffExpression = Expression.Subtract(lengthExpression, replacedLengthExpression);
// Divide the character difference by the number of characters in the
// search term to get the amount of occurrences
return Expression.Divide(charDiffExpression, searchTermLengthExpression);
}
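One way to sanity-check a tree like this is to wrap it in a lambda, compile it and run it against in-memory data. The sketch below rebuilds the same hit-count expression in compact form so that it stands alone. `HitCountDemo`, `CompileHitCount` and the `Person` type are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this example.
public class Person
{
    public string Name { get; set; }
}

public static class HitCountDemo
{
    // Hypothetical helper: builds the hit-count tree and compiles it to a delegate.
    public static Func<T, int> CompileHitCount<T>(
        Expression<Func<T, string>> stringProperty, string searchTerm)
    {
        var lengthProperty = typeof(string).GetProperty("Length");
        var replaceMethod = typeof(string).GetMethod("Replace",
            new[] { typeof(string), typeof(string) });

        // property.Length
        var length = Expression.Property(stringProperty.Body, lengthProperty);
        // property.Replace(searchTerm, "").Length
        var replaced = Expression.Call(stringProperty.Body, replaceMethod,
            Expression.Constant(searchTerm), Expression.Constant(""));
        var replacedLength = Expression.Property(replaced, lengthProperty);
        // (property.Length - replaced.Length) / searchTerm.Length
        var hits = Expression.Divide(
            Expression.Subtract(length, replacedLength),
            Expression.Constant(searchTerm.Length));

        // Reuse the caller's parameter so the body and the lambda agree on "x".
        return Expression.Lambda<Func<T, int>>(hits, stringProperty.Parameters[0])
                         .Compile();
    }
}
```

Calling `HitCountDemo.CompileHitCount<Person>(p => p.Name, "an")` yields a delegate that returns 2 for a person named "banana". Compiling only exercises the in-memory behaviour; it does not prove that a given query provider can translate the tree.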
The second part of a RankedSearch is to initialize a Ranked search result that holds the hit count alongside the original item. We already have the hit count expression from the method above. We now need to build an expression tree that uses the hit count to construct a ranked result.
The equivalent lambda I want to build is as follows:
x => new Ranked<T>{ Hits = [hitCountExpression], Item = x}
This is represented as the following expression tree. It is fairly simple, as it is simply initializing our ranked result:
private static Expression ConstructRankedResult<T>(Expression hitCountExpression,
ParameterExpression parameterExpression)
{
var rankedType = typeof (Ranked<T>);
// Construct the object
var rankedCtor = Expression.New(rankedType);
// Assign hitCount to Hits property
var hitProperty = rankedType.GetProperty("Hits");
var hitValueAssignment = Expression.Bind(hitProperty, hitCountExpression);
//Assign record to Item property
var itemProperty = rankedType.GetProperty("Item");
var itemValueAssignment = Expression.Bind(itemProperty, parameterExpression);
// Initialize Ranked object with property assignments
return Expression.MemberInit(rankedCtor, hitValueAssignment, itemValueAssignment);
}
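Because each result carries its hit count, the caller decides how to order. The sketch below shows that idea against an in-memory list, using a plain lambda as a stand-in for RankedSearch so that it runs without the package. `RankedItem<T>`, `RankedOrderingDemo` and `MostRelevantFirst` are hypothetical names, not library types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Local stand-in that mirrors the shape of Ranked<T> above.
public class RankedItem<T>
{
    public int Hits { get; set; }
    public T Item { get; set; }
}

public static class RankedOrderingDemo
{
    // Hypothetical in-memory equivalent of a ranked search followed by ordering.
    // Assumes searchTerm is non-empty.
    public static List<string> MostRelevantFirst(IEnumerable<string> names,
                                                 string searchTerm)
    {
        return names
            // Keep only items that contain the term (the "search" step).
            .Where(n => n.Contains(searchTerm))
            // Project each match into a ranked result (the "rank" step).
            .Select(n => new RankedItem<string>
            {
                Hits = (n.Length - n.Replace(searchTerm, "").Length) / searchTerm.Length,
                Item = n
            })
            // Ordering is left to the caller; here, most hits first.
            .OrderByDescending(r => r.Hits)
            .Select(r => r.Item)
            .ToList();
    }
}
```

Given "an apple", "banana" and "cherry" with the term "an", this returns "banana" (2 hits) followed by "an apple" (1 hit). "cherry" is filtered out.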
Get in touch
I’m not entirely happy with the method name RankedSearch, as it suggests the result is ordered by default. This is not the case: the user can order the results as they see fit. RankedSearch simply provides an occurrence (hit) count of the search term. If you have a suggestion for a better method name, please get in touch via the comments below, Twitter, or by emailing me using the link in the header.
I am currently implementing the RankedSearch feature for use with multiple properties and multiple search terms (a future post, no doubt). If you have any ideas for future features or enhancements then, again, please get in touch using the normal channels.