SearchExtensions: Multiple property search support with ranking in c#
The updates are coming thick and fast. I am pleased to say that I have extended NinjaNye.SearchExtensions to include support for searching with ranked results for both IQueryable
and IEnumerable
.
The SearchExtensions nuget package is available by running the following in your package manage console
<p class="nuget-badge"><code>PM> Install-Package NinjaNye.SearchExtensions</code></p>Performing a ranked search
Performing a ranked search imitates the current Search
functionality but instead returns an IQueryable<IRanked<T>>
// Retrieve ranked search results where FirstName or LastName contains "john"
var result = context.Users.RankedSearch("john", u => u.FirstName, u => u.LastName);
So what is IRanked<T>
. Well, It is simply as follows:
public interface IRanked<out T>
{
int Hits { get; }
T Item { get; }
}
Meaning given the follwing users...
<table class="table table-condensed table-striped table-bordered" > <thead> <tr> <th>Id</th><th>FirstName</th><th>MiddleNames</th><th>LastName</th> </tr> </thead> <tbody> <tr><td>53</td><td>John</td><td></td><td>Nye</td></tr> <tr><td>54</td><td>Johnny</td><td>Jimmy</td><td>Johnson</td></tr> <tr><td>55</td><td>Jimmy</td><td>James</td><td>Smith</td></tr> </tbody> </table>The above RankedSearch()
query return the following data:
Notice it strips out Jimmy Smith since there are no matches, and instead of simply returning the data as Search
function would, a RankedSearch()
builds a new IRanked result which includes data about the search, specifically the amount of times a search term was hit.
Order a ranked search by most relevant
The returned IRanked<T>
result is still IQueryable
meaning you can the sort your ranked search results as you wish. The following example is fairly typical in that we are ordering our results by the most Hits.
// Retrieve ranked search results where FirstName or LastName contains "john"
var result = context.Users.RankedSearch("john", u => u.FirstName, u => u.LastName)
.OrderByDescending(r => r.Hits);
The SQL produced (when using a sql provider)
In order to count the number of occurrences (or hits) a term appears in a way that can translate to SQL, I have constructed an expression tree equivalent to the following Lambda for each property and search term
[property].Length - ([property].Replace([searchTerm], "").Length) / [searchTerm].Length
For our user search example this translates to:
([user.FirstName].Length - ([user.FirstName].Replace("john", "").Length) / "john".Length)
+ // Add the two hit counts together for each property
([user.LastName].Length - ([user.LastName].Replace("john", "").Length) / "john".Length)
This, coupled with some null checks produces SQL similar to the following
SELECT
[Project1].[Id] AS [Id],
[Project1].[C1] AS [C1],
[Project1].[FirstName] AS [FirstName],
[Project1].[MiddleNames] AS [MiddleNames],
[Project1].[LastName] AS [LastName],
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[MiddleNames] AS [MiddleNames],
[Extent1].[LastName] AS [LastName],
((( CAST(LEN(CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END) AS int)) - ( CAST(LEN(REPLACE(CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END, N'john', N'')) AS int))) / 4)
+
((( CAST(LEN(CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END) AS int)) - ( CAST(LEN(REPLACE(CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END, N'john', N'')) AS int))) / 4) AS [C1]
FROM [dbo].[Users] AS [Extent1]
WHERE (CASE WHEN ([Extent1].[FirstName] IS NULL) THEN N'' ELSE [Extent1].[FirstName] END LIKE N'%john%') OR (CASE WHEN ([Extent1].[LastName] IS NULL) THEN N'' ELSE [Extent1].[LastName] END LIKE N'%john%')
) AS [Project1]
ORDER BY [Project1].[C1] DESC
That is all pretty difficult to read but the important part is that as part of the query you can see the following sql for each property and each search term
((( CAST(LEN( -- Get length of property
CASE WHEN ([Extent1].[FirstName] IS NULL) -- Check property for null value
THEN N'' -- Substitute null for empty string
ELSE [Extent1].[FirstName] -- Use property value
END) AS int))
- -- Minus the property without search term
( CAST(LEN( -- Get the length of the replaced property
REPLACE( -- Replace search term in property
CASE WHEN ([Extent1].[FirstName] IS NULL) --Null check repeated
THEN N''
ELSE [Extent1].[FirstName]
END, N'john', N'')) AS int)))
/ 4) -- Divide diff by the search term length
Hopefully this helps to explain the latest feature of the SearchExtensions project. I look forward to hearing your comments.