Re-ranking Cognitive Search results with Machine Learning for better search relevance

This article is contributed. See the original author and article here.

Are you looking for ways to fine-tune your search relevance? Some developers build a custom ranking model to re-rank the results returned by Azure Cognitive Search, which lets them use application-specific context in that model. To support this, Azure Cognitive Search is introducing a new query parameter called featuresMode. When this parameter is set, the response contains the information used to compute the search score of each retrieved document, which can be used to train a re-ranking model with a machine learning approach.

We have created a new sample and tutorial that walks you through the learning-to-rank process end to end, with steps for designing, training, testing, and consuming a ranking model. The tutorial shows you how to extract features using the featuresMode parameter and train a ranking model that improves search relevance as measured by the offline NDCG metric.
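For readers new to the metric, the following is a minimal Python sketch of how NDCG can be computed for a single query. It assumes the common exponential-gain, log2-discount formulation; the tutorial may use a different variant, and the relevance grades below are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for relevance grades listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    k = len(relevances) if k is None else k
    ideal = sorted(relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical judgment grades (0 = irrelevant, 3 = highly relevant),
# listed in the order each ranker returned the documents.
baseline = [1, 3, 0, 2]
reranked = [3, 2, 1, 0]
print(ndcg(baseline), ndcg(reranked))
```

A re-ranker that moves the highly relevant documents toward the top raises the score toward 1.0, which is the value of a perfectly ordered list.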

For customers who are less familiar with machine learning, a learn-to-rank method re-ranks the top results using a machine learning model. The re-ranking process can incorporate clickthrough data or domain expertise as a reflection of what is truly relevant to users. The following table describes the components of the learn-to-rank method used in the tutorial.

Legend	Description
Data	The articles and search statistics that reside in Azure Blob storage.
Search Index	Azure Cognitive Search ingests the data into a search index.
Re-ranker	Queries against the index produce scores and scoring features that are used to train a machine learning model based on labels derived from clickthrough data. After the model is trained, you can use it to re-rank your documents.
Judgement labels	To train the machine learning model, you need labeled data that signals which documents are most relevant for each query. One way to get it is to collect clickthrough data to see which documents are most popular. Another is to have human judges label the most relevant documents.
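As an illustration of the clickthrough option, here is a naive Python sketch that turns a click log into graded labels per query. The function name, the log format, and the proportional grading rule are all assumptions made for this example; they are not taken from the tutorial, and raw click counts are usually biased toward results shown in higher positions.

```python
from collections import Counter

def labels_from_clicks(click_log, max_grade=3):
    """Map (query, doc_id) click counts to graded relevance labels.

    The most-clicked document for each query gets max_grade; the others
    are scaled proportionally and rounded to the nearest grade.
    """
    counts = Counter(click_log)
    top_per_query = {}
    for (query, _doc_id), n in counts.items():
        top_per_query[query] = max(top_per_query.get(query, 0), n)
    return {pair: round(max_grade * n / top_per_query[pair[0]])
            for pair, n in counts.items()}

# Hypothetical click log: one (query, clicked document) pair per click.
clicks = ([(".net core", "doc1")] * 6
          + [(".net core", "doc2")] * 2
          + [("blob storage", "doc3")])
labels = labels_from_clicks(clicks)
print(labels)
```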

The featuresMode parameter is currently in preview and can be accessed through the Azure Cognitive Search REST APIs.

Sample Request

POST https://[service name].search.windows.net/indexes/[index name]/docs/search?api-version=[api-version]
Content-Type: application/json
api-key: [admin or query key]

Request Body

{
    "search": ".net core",
    "featuresMode": "enabled",
    "select": "title_en_us, description_en_us",
    "searchFields": "body_en_us,description_en_us,title_en_us,apiNames,urlPath,searchTerms, keyPhrases_en_us",
    "scoringStatistics": "global"
}
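The same request can be issued from code. The sketch below uses only the Python standard library and mirrors the sample above; the service name, index name, API version, and key are placeholders you would replace with your own values, and the helper names are made up for this example.

```python
import json
import urllib.request

def build_search_request(service, index, api_version, api_key, search_text):
    """Build a POST request that asks the service to return scoring features."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"/docs/search?api-version={api_version}")
    body = {
        "search": search_text,
        "featuresMode": "enabled",      # adds @search.features to each hit
        "scoringStatistics": "global",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )

def run_search(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Placeholder values; nothing is sent until run_search(req) is called.
req = build_search_request("my-service", "my-index", "PREVIEW-API-VERSION",
                           "my-query-key", ".net core")
print(req.full_url)
```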

Sample Response

{
    "value": [
        {
            "@search.score": document_score (if a text query was provided),
            "@search.highlights": {
                field_name: [ subset of text, … ],
                …
            },
            "@search.features": {
                "field_name_1": {
                    "uniqueTokenMatches": 1.0,
                    "similarityScore": 0.29541412,
                    "termFrequency": 2
                },
                "field_name_2": {
                    "uniqueTokenMatches": 3.0,
                    "similarityScore": 1.75345345,
                    "termFrequency": 6
                },
                …
            }
        },
        …
    ]
}
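To feed these values into a model, each hit is typically flattened into a fixed-length numeric vector. The Python sketch below shows one way to do that and then re-rank hits with a scoring function. It assumes a field with no matching terms may be absent from @search.features and fills zeros in that case; the field list, the sample hits, and the hand-written scoring function standing in for a trained model are hypothetical.

```python
STATS = ("uniqueTokenMatches", "similarityScore", "termFrequency")

def flatten_features(hit, fields):
    """Return [search score, then the three stats for each field, in order]."""
    features = hit.get("@search.features", {})
    vector = [float(hit.get("@search.score", 0.0))]
    for field in fields:
        field_stats = features.get(field, {})  # assumed missing when nothing matched
        vector.extend(float(field_stats.get(stat, 0.0)) for stat in STATS)
    return vector

def rerank(hits, fields, score_fn):
    """Sort hits by score_fn applied to their feature vectors, best first."""
    return sorted(hits, key=lambda h: score_fn(flatten_features(h, fields)),
                  reverse=True)

fields = ["title_en_us", "body_en_us"]
hits = [
    {"id": "A", "@search.score": 1.5,
     "@search.features": {"title_en_us": {"uniqueTokenMatches": 1.0,
                                          "similarityScore": 0.2,
                                          "termFrequency": 1}}},
    {"id": "B", "@search.score": 1.2,
     "@search.features": {"title_en_us": {"uniqueTokenMatches": 2.0,
                                          "similarityScore": 0.9,
                                          "termFrequency": 3}}},
]

# Stand-in for a trained model: search score plus twice the title similarity.
def toy_score(vector):
    return vector[0] + 2.0 * vector[2]

print([h["id"] for h in rerank(hits, fields, toy_score)])
```

In a real pipeline, toy_score would be replaced by the prediction of the trained ranking model, and the same flatten_features function would be used at training and query time so the vector layout stays consistent.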

If you are interested in this new capability, contact us at azuresearchrelevance@microsoft.com.

References

Search Ranking Tutorial GitHub
FeaturesMode REST API Reference

