ADC

AI gateway - Observability

May 22, 2026

Contributed by:

NetScaler AI gateway collects AI specific metrics and logs and exposes them to Splunk by default. A sample Splunk dashboard can be downloaded from the Citrix Download website to visualize the metrics and logs exported by the AI gateway.

Entity: server_svc_cfg

Metric Name	Description
si_tot_llm_input_tokens	Total Number of input tokens processed by the server
si_tot_llm_output_tokens	Total Number of output tokens processed by the server
si_tot_llm_tokens	Total Number of tokens (input + output)
si_cur_llm_tpm	Number of total (input + output) tokens per frequency interval
si_err_llm_token_limit_hit_on_server	Number of times the token limit reached on the server
si_cur_llm_latency	Token latency for this server
si_llm_tokenspermin	Configured value of token limit for the server
si_err_llm_token_limit	Number of times the token limit reached for the service in NetScaler

Entity: vserver_lb

Metric Name	Description
vsvr_llm_apptype	Configured Large Language Model (LLM) app type for the virtual server (Currently Azure OpenAI)
vsvr_err_llm_unsupported_request	Error counter when NetScaler receives an unsupported request
si_tot_llm_input_tokens	Total number of input tokens processed by the load balancing virtual server
si_tot_llm_output_tokens	Total number of output tokens processed by the load balancing virtual server
si_tot_llm_tokens	Total number of tokens (input + output) processed by the load balancing virtual server

Entity: vserver_cs

Metric Name	Description
si_tot_llm_input_tokens	Total number of input tokens processed by the content switching virtual server
si_tot_llm_output_tokens	Total number of output tokens processed by the content switching virtual server
si_tot_llm_tokens	Total number of tokens (input + output) processed by the content switching virtual server
si_cur_llm_tpm	Number of total (input + output) tokens per frequency interval
vsvr_llm_apptype	Configured Large Language Model (LLM) app type for the virtual server (Currently Azure OpenAI)

Entity: cs_pol

Metric Name	Description
pcb_hits	Number of hits on the policy on this binding.
pcb_undef_hits	Number of undef hits on the policy on this binding.

Note:

These counters are not exported by default and need to be added in the schema file. For the analytics time series profile using the schema, run -metrics DISABLED followed by -metrics ENABLED to refresh any change in schema.json.

Refer to the NetScaler observability integrations on sending metrics :

Web Insight records

The following fields are exported as part of Web Insight records if there are rate limit alerts.

JSON Field Name	Description
rate_limit_identifier_name	Configured name of `ns limitidentifer`.
rate_limit_selector_stream_name	Stream name based on selector expressions for which rate-limiting was applied
rate_limit_mode	Configured Rate limit mode
rate_limit_threshold	Configured Rate limiting threshold per stream.
rate_limit_value	Value at which rate-limiting was applied.

Note:

These fields are not exported by default and need to be added in the data format file. If the data format file is changed then use the update analytics profile <profile name> -data FormatFile <filename> command to ensure that the analytics profile is using the updated data format file.

Set the log_all_json_field attribute in the NetScaler CPX YAML file to send all the JSON fields for insights. If the log_all_json_field attribute is not set, then the data format file in the NetScaler CPX must be updated manually for the relevant fields, which is not recommended for the NetScaler CPX form factor.

The rate-limiting logs can be sent to Splunk. For information on sending logs to Splunk, see Export transaction logs directly from NetScaler to Splunk.

Usage tracking

Usage tracking allows you to track the input and output tokens or requests based on criteria such as team, user, application. NetScaler expects that the AI application sends the attributes such as the userid or teamid in HTTP header (such as X-user-id or X-org-id). This feature uses processed insights for tracking.

JSON Field Name	Description
observationPointId	An identifier of an Observation Point that is unique per Observation Domain.
nsPartitionId	An identifier of the NetScaler partition exporting the records.
stream_usecase	Stream Insights use case.
stream_sess_name	Stream Insights Stream session name.
stream_iden_name	Stream Insights Stream identifier name.
Requests	Number of requests consumed in the stream.
Bandwidth	Bandwidth used in the stream.
Connections	Number of active connections in the streams.
Resptime	Average response time.
Tokens	Number of input and output tokens consumed for LLM traffic in the stream.
stream_sort_key	Sort Identifier for the Top N results (Example: REQUESTS, TOKENS).
Timestamp	Timestamp of the export.

Here is a sample configuration where the tokens are being tracked per user and the user-id is sent in X-user-id HTTP header.

Create a Stream selector. In this step, the statistics are aggregated for the user id.

add stream selector <stream selector name> <rule>
<!--NeedCopy-->

Example:

add stream selector user_header "HTTP.REQ.HEADER(\"X-user-id\")"
<!--NeedCopy-->

Create a stream identifier.
```
add stream identifier <stream identifier name> <stream selector name> -interval <interval in mins> -logInterval <log interval in minutes> -logLimit <log limit> -sort TOKENS -trackTransactions TOKENS

```
Example:
```
add stream identifier si_gpt41_user_token testheader -interval 10 -logInterval 10 -logLimit 20 -sort TOKENS -trackTransactions TOKENS

```
In this configuration:
- Interval: Number of minutes of data to use when calculating session statistics (number of requests, number of tokens). The interval is a moving window that keeps the most recently collected data. Older data is discarded at regular intervals.
- logInterval: Time interval in minutes for logging the collected objects. The log interval must be greater than or equal to the interval of the stream identifier.
- logLimit: Maximum number of objects to be logged in the log interval.
Create a collector service for Splunk.
```
add service <collector> <splunk-server-ip-address> <protocol> <port>

```
Example:
```
add service splunk_service 10.102.34.155 HTTP 8088

```
In this configuration:
- ip-address: Splunk server IP address.
- collector-name: Name of the collector.
- protocol: Specify the protocol as HTTP or SSL.
- port: Port number.

Create analytics profile of type stream analytics and enable topN.

add analytics profile <profile-name> -type <insight> -collectors <collector-name> -analyticsAuthToken "<auth-scheme> <authorization-parameters>" -analyticsEndpointContentType "application/json" -analyticsEndpointUrl <endpoint-url> -topn ENABLED
<!--NeedCopy-->

Example:

add analytics profile topn_stream_profile -type streaminsight -topn ENABLED -analyticsAuthToken "Splunk 0471e73f-ee4b-44c3-90db-2461341d7b24" -analyticsEndpointUrl "/services/collector/event" -analyticsEndpointContentType "application/json" -collector splunk -dataFormatFile splunk_new1.txt
<!--NeedCopy-->

Bind analytics profile to stream identifiers.

bind stream identifier <stream identifier name> -analyticsProfile <analytics profile name>
<!--NeedCopy-->

Example:

bind stream identifier si_gpt41_user_token -analyticsProfile topn_stream_profile
<!--NeedCopy-->

Create a responder policy to collect stats for the given identifier.

add responder policy pol_collect_gpt41_user_token 'analytics.stream("si_gpt41_user_token").COLLECT_STATS' NOOP
<!--NeedCopy-->

Bind the responder policy to the target AI gateway virtual server for which the traffic must be analyzed by the identifier. To enable the same stream identifier to process traffic from multiple virtual servers, bind the responder policy to all the virtual servers.

bind lb <LBVserver Name> -policyName <Responder Policy Name> -priority 1 -gotoPriorityExpression NEXT -type REQUEST
<!--NeedCopy-->

Example:

bind lb vserver gpt-4.1 -policyName pol_collect_gpt41_user_token -priority 220 -gotoPriorityExpression NEXT -type REQUEST
<!--NeedCopy-->

The official version of this product documentation is in English. Any non-English version is solely provided for your convenience and may include machine-translated content. For more information, please refer to the Machine Translation Disclaimer on Cloud Software Group home.

Was this helpful

NetScaler Secure Deployment Guide

AI gateway - Observability

May 22, 2026

Contributed by:

May 22, 2026

Contributed by:

AI gateway - Observability

Entity: server_svc_cfg

Entity: vserver_lb

Entity: vserver_cs

Entity: cs_pol

Web Insight records

Usage tracking

In this article