This article is contributed. See the original author and article here.



 


Results from Azure Cognitive Search, and the Azure OpenAI output that is grounded in them, can be restricted with Azure Entra (Microsoft Entra ID) security groups. Organizations can limit what a user retrieves from a search index, and therefore what an OpenAI-based chatbot can tell that user, based on the user's group membership. This ensures that users only have access to the data within the scope of their job responsibilities. Because the groups are managed centrally in Entra, access to indexed data follows the same identities the organization already uses to secure its other Azure services.


 


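Azure OpenAI Service is being used to create more interactive and intelligent chatbots. A key use case is having the OpenAI service respond to user requests using your own data. In that pattern the chatbot answers from documents retrieved by Azure Cognitive Search, so restricting the search results also restricts what the chatbot can reveal.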


 


Why filter search results from Azure Cognitive Search


 


Cognitive Search is a search engine that catalogues all the documents, databases, and other content you provide it. However, there may be situations where you want one index over a large amount of data, but you don't want every user in a healthcare organization to have access to everything in it, for example:


 



  • Protected Health Information (PHI) data

  • HR data

  • Classified data


For these situations, you need to adjust the search results based on the user's identity. Medical professionals such as doctors, nurses, and other health care workers should have access to PHI data, while people who are not involved or not authorized should not see it.


 


Azure Cognitive Search supports this use case with security filters. When you query the index, a security filter lets you pass extra information about the user (here, the security groups the user belongs to) so that the results are restricted to only the data that user can access.
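
To make the matching rule concrete, here is a minimal in-memory sketch of what a security filter does: a document is returned only if at least one of its group IDs is also one of the user's group IDs. It does not call Azure at all; `SecuredFile` and `SecurityTrimmingSketch` are hypothetical names invented for this illustration, and the case-sensitive comparison is an assumption about how you store the IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for an indexed document (not an Azure SDK type).
public record SecuredFile(string FileId, string FileName, string[] GroupIds);

public static class SecurityTrimmingSketch
{
    // Keeps only the files whose GroupIds share at least one value with the
    // user's group IDs. This mirrors the intent of a filter such as
    // group_ids/any(g:search.in(g, '...')) but runs entirely in memory.
    public static List<SecuredFile> Trim(
        IEnumerable<SecuredFile> files,
        IEnumerable<string> userGroupIds)
    {
        // Assumption: IDs are compared exactly (case-sensitive).
        var allowed = new HashSet<string>(userGroupIds, StringComparer.Ordinal);
        return files.Where(f => f.GroupIds.Any(allowed.Contains)).ToList();
    }
}
```

For example, given one file tagged with `entra_security_group_id1`, one tagged with `entra_security_group_id1` and `entra_security_group_id2`, and one tagged with `entra_security_group_id3`, a user who belongs only to `entra_security_group_id1` gets the first two files and never sees the third.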


There are three steps required to implement security filtering:


 



  • Create an index that includes a field for security filtering (such as Azure Entra security group IDs)

  • Include which Azure Entra security group IDs are allowed to see the data when each document is first indexed

  • Include the list of Azure Entra security group IDs that the user is a part of so the security filtering can be applied on each query


Create an index that includes a field for security filtering


 


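To support security filtering, the Cognitive Search index must include a field that holds the IDs allowed to see each document. This field should be filterable, so it can be used in a query filter, and not retrievable, so the list of groups is never returned to the caller.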


 


 Example REST API call


 

POST https://[search service].search.windows.net/indexes?api-version=2023-10-01-preview

{

     "name": "securedfiles", 

     "fields": [

         {"name": "file_id", "type": "Edm.String", "key": true, "searchable": false },

         {"name": "file_name", "type": "Edm.String", "searchable": true },

         ...

         {"name": "group_ids", "type": "Collection(Edm.String)", "filterable": true, "retrievable": false }

     ]

 }

 


Example C#


 

var index = new SearchIndex(options.SearchIndexName)

{

    Fields =

    {

        new SimpleField("file_id", SearchFieldDataType.String) { IsKey = true, ... },

        new SimpleField("file_name", SearchFieldDataType.String) { ... },

        ...

        new SimpleField("group_ids", SearchFieldDataType.Collection(SearchFieldDataType.String))

            { IsFilterable = true, IsHidden = true },

    },

    ...

};
await indexClient.CreateIndexAsync(index);

 


Include which Azure Entra security group IDs are allowed to see the data when each document is first indexed


 


Each time a new document is uploaded & indexed, you need to include the list of Azure Entra security group IDs that are allowed to have this document in their search results. These Azure Entra security group IDs are GUIDs.
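
Because the filter compares IDs as plain strings, a stray space or a different letter case in an indexed ID is enough to hide a document from the people who should see it. The sketch below shows one way to clean the IDs before they are written to the `group_ids` field. `GroupIdNormalizer` is a hypothetical helper, not part of the Azure SDK, and the lowercase hyphenated format is an assumption; what matters is that the same normalization is applied to the IDs you index and the IDs you put in the query filter.

```csharp
using System;
using System.Collections.Generic;

public static class GroupIdNormalizer
{
    // Trims each raw ID, rejects anything that is not a GUID, and returns the
    // distinct IDs in lowercase hyphenated form (Guid "D" format).
    public static string[] Normalize(IEnumerable<string> rawGroupIds)
    {
        var result = new List<string>();
        foreach (var raw in rawGroupIds)
        {
            if (!Guid.TryParse(raw?.Trim(), out var id))
            {
                throw new ArgumentException(
                    $"Not a valid group ID (GUID expected): '{raw}'");
            }

            var text = id.ToString("D");
            if (!result.Contains(text))
            {
                result.Add(text); // skip duplicates, keep first-seen order
            }
        }
        return result.ToArray();
    }
}
```

Failing loudly on a malformed ID is a deliberate choice here: a document indexed with an unusable ID would otherwise be silently invisible to everyone.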


 


Example REST API call


 

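POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2023-10-01-preview

{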

    "value": [

        {

            "@search.action": "upload",

            "file_id": "1",

            "file_name": "secured_file_a",

            "file_description": "File access is restricted to the medical professionals, such as doctors, nurses",

            "group_ids": ["entra_security_group_id1"]

        },

        {

            "@search.action": "upload",

            "file_id": "2",

            "file_name": "secured_file_b",

            "file_description": "File access is restricted to the medical professionals, such as doctors, nurses, and other health care workers.",

            "group_ids": ["entra_security_group_id1", "entra_security_group_id2"]

        },

        {

            "@search.action": "upload",

            "file_id": "3",

            "file_name": "secured_file_c",

            "file_description": "File access is restricted to third parties and law enforcement",

            "group_ids": ["entra_security_group_id3", "entra_security_group_id5"]

        }

    ]

}

 


Example C#


 

var searchClient = await GetSearchClientAsync(options);

// Collect one action per section; each document carries the group IDs allowed to see it.
var batch = new IndexDocumentsBatch<SearchDocument>();
foreach (var section in sections)
{
    batch.Actions.Add(new IndexDocumentsAction<SearchDocument>(
        IndexActionType.MergeOrUpload,
        new SearchDocument
        {
            ["file_id"] = section.Id,
            ["file_name"] = section.SourceFile,
            ["group_ids"] = section.GroupIds
        }
    ));
}

// Send the batch once, after all actions have been added.
IndexDocumentsResult result = await searchClient.IndexDocumentsAsync(batch);
...

 


Include the list of Azure Entra security group IDs that the user is a part of on each query


 


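For every query, pass the Azure Entra security group IDs that the user belongs to (and that are relevant to this application) as a filter on the security field. The filter is written as an OData expression: search.in checks each value in group_ids against a delimited list of the user's group IDs, where commas and spaces are the default delimiters.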


 


Example REST API call


 

POST https://[service name].search.windows.net/indexes/securedfiles/docs/search?api-version=2023-10-01-preview

Content-Type: application/json 

api-key: [admin or query key]

{

   "filter": "group_ids/any(g:search.in(g, 'entra_security_group_id1, entra_security_group_id2'))"

}

 


Example C#


 

...

// Build the OData filter from the group IDs in the signed-in user's "groups" claims.
var filter = $"group_ids/any(g:search.in(g, '{string.Join(", ", user.Claims.Where(x => x.Type == "groups").Select(x => x.Value))}'))";



 SearchOptions searchOption = new SearchOptions

 {

     Filter = filter,

     QueryType = SearchQueryType.Semantic,

     QueryLanguage = "en-us",

     QuerySpeller = "lexicon",

     SemanticConfigurationName = "default",

     Size = top,

     QueryCaption = useSemanticCaptions ? QueryCaptionType.Extractive : QueryCaptionType.None,

 };



var searchResultResponse = await searchClient.SearchAsync(query, searchOption, cancellationToken);
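
The filter string used in the examples above is assembled by plain string joining, which leaves two cases open: a user with no group claims, and an ID that contains a quote or the delimiter. The following is a minimal sketch of a helper that covers both. `SecurityFilterBuilder` and `TryBuild` are hypothetical names, not part of the Azure SDK, and the choice to pass `','` explicitly as the only delimiter is an assumption made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SecurityFilterBuilder
{
    // Produces a filter such as: group_ids/any(g:search.in(g, 'id1,id2', ','))
    // Returns false when the user has no usable group IDs, so the caller can
    // return an empty result instead of running an unfiltered query.
    public static bool TryBuild(
        IEnumerable<string> userGroupIds,
        out string filter,
        string fieldName = "group_ids")
    {
        var ids = userGroupIds
            .Select(id => id.Trim())
            .Where(id => id.Length > 0)
            .Distinct()
            .ToList();

        if (ids.Count == 0)
        {
            filter = string.Empty;
            return false;
        }

        // The comma is the list delimiter, so it must not appear inside an ID.
        if (ids.Any(id => id.IndexOf(',') >= 0))
        {
            throw new ArgumentException("Group IDs must not contain ','.");
        }

        // OData string literals escape a single quote by doubling it.
        var list = string.Join(",", ids).Replace("'", "''");
        filter = $"{fieldName}/any(g:search.in(g, '{list}', ','))";
        return true;
    }
}
```

In the query code, the returned string would be assigned to `SearchOptions.Filter`; when `TryBuild` returns false, the safest behavior is usually to skip the search and show the user no documents.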

 


My GitHub repository contains an example implementation (with security filtering using Azure Entra security groups).
