Azure OpenAI Insights

The use of Azure OpenAI within businesses is no longer in question. However, concerns such as cost and performance optimization, especially at scale, still tend to temper the ambitions of many customers.

Azure Monitor addresses these questions through the analysis of collected data. Today, I would like to discuss an Azure Workbook created by Microsoft teams. This workbook aims to monitor the usage of Azure OpenAI. It is available here: Azure-OpenAI-Insights: Workbook for Azure OpenAI service.

Azure Monitor provides the option to import workbooks into the workspace, and the necessary steps are detailed in the repository. The remainder of this article presents the monitoring the workbook implements.

Overview

The “Overview” tab provides detailed information about the deployed resources, broken down by subscription, resource group, location, type, public or private network access…

This can be an interesting way of rationalizing usage and therefore costs within your company. Do you really need resources deployed in different locations? What cognitive services are associated with your OpenAI resources? …

Data governance can also be based on some of these metrics. Does your security policy allow access via public networks? Is data transit to other regions desirable or even permitted?

As you can see in the previous image, I deployed Azure OpenAI resources in three different locations. The dashboard also indicates the presence of other cognitive service resources (Vision, Speech, and Form Recognizer, renamed Document Intelligence). The interface also provides a detailed list of resources (image below).

Monitoring

The second tab is called “Monitor” and allows the user to explore Azure OpenAI resource metrics. The interface surfaces the main metrics (see Monitoring Azure OpenAI Service) such as the number of requests, processed tokens (inference or prompt), processed fine-tuning training hours, and provisioned-managed utilization.
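For readers who want to query these metrics directly rather than through the workbook, a query like the following can be run against the Log Analytics workspace. This is only a sketch: the `AzureMetrics` table and its `MetricName`/`Total` columns are standard Azure Monitor, but the specific metric names below are assumptions about the Azure OpenAI metric set and may differ from what your resource emits.

```kusto
// Hourly token consumption, split by metric name.
// Assumed metric names -- check the "Metrics" blade of your
// resource for the exact identifiers exposed in your region.
AzureMetrics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where MetricName in ("ProcessedPromptTokens", "GeneratedTokens")
| summarize TotalTokens = sum(Total) by MetricName, bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```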

The screenshots above show the number of requests and token consumption. In my case, as the resource was dedicated to this article, there were very few requests. I’ve attached the screenshot shared by the Microsoft teams for comparison.

The proposed metrics are useful for tracking the adoption of a Gen-AI tool by your employees (number of requests, token usage), and we can even go so far as to imagine comparative studies between different company teams, different countries (based on different locations), or different models (or model versions).

These metrics can also be useful for cost analysis: whether we’re talking about token consumption, provisioned capacity utilization or the number of training hours associated with a model.
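As a quick illustration of that kind of cost analysis, token counts read from the workbook can be turned into a back-of-the-envelope estimate. The per-1K-token rates below are placeholders, not actual Azure pricing; substitute the rates for your region and model from the Azure pricing page.

```python
# Back-of-the-envelope cost estimate from token metrics.
# The per-1K-token prices used in the example are illustrative
# placeholders, NOT current Azure pricing.

def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price_per_1k: float,
                  completion_price_per_1k: float) -> float:
    """Estimated cost in your billing currency, given token counts
    and per-1,000-token prices."""
    return (prompt_tokens / 1000) * prompt_price_per_1k \
         + (completion_tokens / 1000) * completion_price_per_1k

# Example: 1.2M prompt tokens and 300K completion tokens at
# hypothetical rates of 0.01 and 0.03 per 1K tokens.
cost = estimate_cost(1_200_000, 300_000, 0.01, 0.03)
print(f"Estimated cost: {cost:.2f}")  # Estimated cost: 21.00
```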

Insights

The third tab, called “Insights”, enables users to analyse the Azure OpenAI resource logs. It requires a diagnostic setting to be enabled on the resource so that logs are sent to the Log Analytics workspace. The screenshots below illustrate various possible aggregations of the log information: model, average call duration, API operation name, IP address…

The first image shows the number of deployed models by OpenAI model type (gpt-4, gpt-35-turbo…). The second shows the same thing, but by API operation name (chat, completion, embeddings, image generation…) and by deployment. The third shows the average duration of calls by IP address. And finally, the last one shows all the logs.
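An aggregation like the third one can be reproduced with a short Kusto query. This is a sketch only: `AzureDiagnostics` is the standard destination table for diagnostic logs, but the `DurationMs` and `CallerIPAddress` columns and the `RequestResponse` category are assumptions about the Azure OpenAI log schema; inspect a few raw log rows in your workspace to confirm the exact names.

```kusto
// Average call duration per caller IP, in the spirit of the
// workbook's "Insights" tab. Column and category names are
// assumed -- verify them against your own log entries.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| summarize AvgDurationMs = avg(DurationMs) by CallerIPAddress
| order by AvgDurationMs desc
```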

Conclusion

This workbook is an interesting first step towards monitoring your OpenAI resources. You'll be able to observe token usage and consumption effortlessly. It is possible to build your own dashboards using the metrics provided by Azure, but this workbook has the advantage of requiring no expertise in setting up log-based analytics: the Kusto (KQL) queries that exploit the logs are written for you, so you can focus on monitoring your business.