Are you looking for ways to fine-tune the relevance of your search results? Developers sometimes build a custom ranking model to re-rank the results returned by Azure Cognitive Search, which lets them use application-specific context as part of that model. To support this, Azure Cognitive Search is introducing a new query parameter called featuresMode. When this parameter is set, the response includes the information used to compute the search score of each retrieved document, which can be used to train a re-ranking model with a machine learning approach.
We have created a new sample and tutorial that walks you through the learning-to-rank process end to end, with steps for designing, training, testing, and consuming a ranking model. The tutorial shows you how to extract features using the featuresMode parameter and train a ranking model to improve overall search relevance as measured by the offline NDCG metric.
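For readers who have not worked with NDCG before, the following is a minimal sketch of how the metric can be computed for one query. It assumes graded relevance labels (for example 0 to 3) listed in the order the ranker returned the documents, and it uses the common exponential-gain formulation; the tutorial may compute the metric differently, and the function names here are illustrative only.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results.

    Uses the exponential gain (2^rel - 1) and a log2 position discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so the top result is divided by log2(2) = 1
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(relevances, k) / ideal
```

A ranking that already places the most relevant documents first scores 1.0; averaging the per-query values over a held-out set of queries gives an offline measure that can be compared before and after re-ranking.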
For customers who are less familiar with machine learning, a learn-to-rank method re-ranks top results based on a machine learning model. The re-ranking process can incorporate clickthrough data or domain expertise as a reflection of what is truly relevant to users. The components of the learn-to-rank method used in the tutorial are described below.
Legend | Description |
Data | The articles and search statistics that reside in Azure Blob storage. |
Search Index | Azure Cognitive Search ingests the data into a search index. |
Re-ranker | Queries against the index produce scores and scoring features that are used to train a machine learning model based on labels derived from clickthrough data. After the model is trained, you can use it to re-rank your documents. |
Judgement labels | To train the machine learning model, you need to have labeled data that contains signal for what documents are most relevant for different queries. One way to do this is to collect clickthrough data to understand which documents are most popular. Another mechanism may be to find human judges to label the most relevant documents. |
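As one illustration of how clickthrough data might be turned into judgement labels, the sketch below scales each document's click count for a query against the most-clicked document for that same query. The log format (pairs of query and document ID), the 0 to 3 grading scale, and the function name are assumptions made for this example, not the tutorial's actual labeling scheme.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def labels_from_clicks(
    click_log: Iterable[Tuple[str, str]], max_grade: int = 3
) -> Dict[Tuple[str, str], int]:
    """Map (query, doc_id) click events to graded relevance labels.

    Each document's grade is its click count relative to the most-clicked
    document for the same query, scaled to the range 0..max_grade.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for query, doc_id in click_log:
        counts[query][doc_id] += 1

    labels: Dict[Tuple[str, str], int] = {}
    for query, doc_counts in counts.items():
        top = max(doc_counts.values())
        for doc_id, n in doc_counts.items():
            labels[(query, doc_id)] = round(max_grade * n / top)
    return labels
```

Raw click counts are biased toward documents that were already ranked highly, so labels produced this way are a starting point rather than ground truth; human judgements, where available, can be merged into the same dictionary.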
The featuresMode parameter is currently in preview and can be accessed through the Azure Cognitive Search REST APIs.
Sample Request
POST https://[service name].search.windows.net/indexes/[index name]/docs/search?api-version=[api-version]
Content-Type: application/json
api-key: [admin or query key]
Request Body
{
  "search": ".net core",
  "featuresMode": "enabled",
  "select": "title_en_us, description_en_us",
  "searchFields": "body_en_us,description_en_us,title_en_us,apiNames,urlPath,searchTerms, keyPhrases_en_us",
  "scoringStatistics": "global"
}
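The same request can be assembled from Python using only the standard library. This is a sketch under a few assumptions: the function name and its parameters are illustrative, the service name, index name, API version, and key are supplied by the caller (the preview API version is not stated here, so it is left as a parameter), and the field names are the ones from the sample request above.

```python
import json
import urllib.request


def build_search_request(
    service_name: str,
    index_name: str,
    api_version: str,
    api_key: str,
    search_text: str,
) -> urllib.request.Request:
    """Build (but do not send) a search request with featuresMode enabled."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/search?api-version={api_version}"
    )
    body = {
        "search": search_text,
        "featuresMode": "enabled",
        "select": "title_en_us, description_en_us",
        "searchFields": "body_en_us,description_en_us,title_en_us,"
                        "apiNames,urlPath,searchTerms,keyPhrases_en_us",
        "scoringStatistics": "global",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`. Keeping construction separate from sending makes the request easy to inspect or log before any network call is made.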
Sample Response
{
  "value": [
    {
      "@search.score": document_score (if a text query was provided),
      "@search.highlights": {
        field_name: [ subset of text, … ],
        …
      },
      "@search.features": {
        "field_name_1": {
          "uniqueTokenMatches": 1.0,
          "similarityScore": 0.29541412,
          "termFrequency": 2
        },
        "field_name_2": {
          "uniqueTokenMatches": 3.0,
          "similarityScore": 1.75345345,
          "termFrequency": 6
        },
        …
      },
      …
    },
    …
  ]
}
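To feed these values into a ranking model, each result is typically flattened into a fixed-length numeric vector. The sketch below shows one way to do that for a single entry of the "value" array. It assumes the three per-field features shown in the sample response, and it assumes that a searched field with no match may be missing from "@search.features", in which case zeros are used. The function name and the feature ordering are illustrative choices, not part of the API.

```python
from typing import Any, Dict, List, Sequence

# Per-field features shown in the sample response above.
FEATURE_NAMES = ("uniqueTokenMatches", "similarityScore", "termFrequency")


def flatten_features(document: Dict[str, Any], fields: Sequence[str]) -> List[float]:
    """Turn one search result into a fixed-length feature vector.

    Layout: [search score, then the three features for each field in `fields`].
    Fields absent from "@search.features" contribute zeros.
    """
    features = document.get("@search.features", {})
    vector = [float(document.get("@search.score", 0.0))]
    for field in fields:
        field_features = features.get(field, {})
        for name in FEATURE_NAMES:
            vector.append(float(field_features.get(name, 0.0)))
    return vector
```

Passing the same `fields` list for every document keeps the columns aligned across results, so the vectors can be stacked into a training matrix alongside the judgement labels.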
If you are interested in this new capability, contact us at azuresearchrelevance@microsoft.com.
References
Search Ranking Tutorial Github
FeaturesMode REST API Reference