AI gateway - Observability

NetScaler AI gateway collects AI-specific metrics and logs and exposes them to Splunk by default. To visualize the metrics and logs exported by the AI gateway, download the sample Splunk dashboard from the Citrix Download website.

Entity: server_svc_cfg

Metric Name                           Description
si_tot_llm_input_tokens               Total number of input tokens processed by the server
si_tot_llm_output_tokens              Total number of output tokens processed by the server
si_tot_llm_tokens                     Total number of tokens (input + output)
si_cur_llm_tpm                        Number of total (input + output) tokens per frequency interval
si_err_llm_token_limit_hit_on_server  Number of times the token limit was reached on the server
si_cur_llm_latency                    Token latency for this server
si_llm_tokenspermin                   Configured value of the token limit for the server
si_err_llm_token_limit                Number of times the token limit was reached for the service in NetScaler
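
As an illustration of how these counters relate to each other, the following Python sketch computes how much of the configured token limit a server is currently using. It assumes the metrics for one server have already been collected into a dictionary keyed by metric name, and that si_cur_llm_tpm and si_llm_tokenspermin are both expressed in tokens per minute. The helper name token_budget_used and all sample values are hypothetical and are not part of NetScaler.

```python
def token_budget_used(metrics: dict) -> float:
    """Return the fraction of the configured token limit currently in use.

    Assumes `metrics` maps metric names to numeric values for one server.
    """
    limit = metrics["si_llm_tokenspermin"]
    if limit <= 0:
        # Assumption: a non-positive value means there is no limit to compare against.
        return 0.0
    return metrics["si_cur_llm_tpm"] / limit


# Hypothetical sample values, for illustration only.
sample = {
    "si_tot_llm_input_tokens": 1200,
    "si_tot_llm_output_tokens": 800,
    "si_tot_llm_tokens": 2000,
    "si_cur_llm_tpm": 450,
    "si_llm_tokenspermin": 1000,
}

# The total token counter is described as input + output tokens.
total_is_consistent = (
    sample["si_tot_llm_input_tokens"] + sample["si_tot_llm_output_tokens"]
    == sample["si_tot_llm_tokens"]
)

usage = token_budget_used(sample)
print(f"token budget used: {usage:.0%}")
```

A ratio close to 1.0 would suggest that the server is near its configured limit, which is the situation the two token limit error counters record.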

Entity: vserver_lb

Metric Name                       Description
vsvr_llm_apptype                  Configured Large Language Model (LLM) app type for the virtual server (currently Azure OpenAI)
vsvr_err_llm_unsupported_request  Error counter when NetScaler receives an unsupported request
si_tot_llm_input_tokens           Total number of input tokens processed by the load balancing virtual server
si_tot_llm_output_tokens          Total number of output tokens processed by the load balancing virtual server
si_tot_llm_tokens                 Total number of tokens (input + output) processed by the load balancing virtual server

Note:

These counters are not exported by default and must be added to the schema file. For an analytics time series profile that uses the schema, set -metrics to DISABLED and then to ENABLED so that any change in schema.json takes effect.
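
Before refreshing the profile, it can help to confirm that the counters were actually added to the schema file. The following Python sketch checks the file contents as plain text and makes no assumption about the internal layout of schema.json beyond counter names appearing as quoted JSON strings. The helper name missing_counters and the sample text are hypothetical.

```python
# Counter names taken from the server_svc_cfg and vserver_lb tables above.
LLM_COUNTERS = [
    "si_tot_llm_input_tokens",
    "si_tot_llm_output_tokens",
    "si_tot_llm_tokens",
    "si_cur_llm_tpm",
    "si_err_llm_token_limit_hit_on_server",
    "si_cur_llm_latency",
    "si_llm_tokenspermin",
    "si_err_llm_token_limit",
    "vsvr_llm_apptype",
    "vsvr_err_llm_unsupported_request",
]


def missing_counters(schema_text: str, counters=LLM_COUNTERS) -> list:
    """Return the counters that do not appear as quoted strings in the text.

    Matching on the quoted name avoids false positives where one counter
    name is a prefix of another.
    """
    return [name for name in counters if f'"{name}"' not in schema_text]


# Placeholder text for illustration only; this is NOT the real schema layout.
sample_text = '["si_tot_llm_tokens", "si_cur_llm_tpm"]'
missing = missing_counters(sample_text)
print("counters still to add:", missing)
```

To check a real file, read its contents into a string first and pass that string to missing_counters.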

For information on sending metrics, refer to the NetScaler observability integrations.

Web Insight records

The following fields are exported as part of Web Insight records if there are rate limit alerts.

JSON Field Name                  Description
rate_limit_identifier_name       Configured name of the ns limitIdentifier
rate_limit_selector_stream_name  Stream name, based on selector expressions, for which rate limiting was applied
rate_limit_mode                  Configured rate limit mode
rate_limit_threshold             Configured rate limiting threshold per stream
rate_limit_value                 Value at which rate limiting was applied
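
A consumer of these records might pull out only the rate limit fields listed above. The following Python sketch assumes each Web Insight record arrives as a single JSON object. The helper name extract_rate_limit_fields and every value in the sample record are hypothetical placeholders, not actual NetScaler output.

```python
import json

# Field names taken from the table above.
RATE_LIMIT_FIELDS = (
    "rate_limit_identifier_name",
    "rate_limit_selector_stream_name",
    "rate_limit_mode",
    "rate_limit_threshold",
    "rate_limit_value",
)


def extract_rate_limit_fields(record_json: str) -> dict:
    """Return only the rate limit fields that are present in a record."""
    record = json.loads(record_json)
    return {key: record[key] for key in RATE_LIMIT_FIELDS if key in record}


# Hypothetical record; all values are placeholders for illustration.
sample_record = json.dumps({
    "rate_limit_identifier_name": "llm_limit_id",
    "rate_limit_selector_stream_name": "client_a",
    "rate_limit_mode": "REQUEST_RATE",
    "rate_limit_threshold": 100,
    "rate_limit_value": 101,
    "some_other_field": "ignored",
})

alert = extract_rate_limit_fields(sample_record)
print(alert)
```

Because the fields are exported only when there is a rate limit alert, an empty result for a record means that no rate limiting was reported for that transaction.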

Note:

These fields are not exported by default and must be added to the data format file. If you change the data format file, run the update analytics profile <profile name> -dataFormatFile <filename> command so that the analytics profile uses the updated data format file.

The rate-limiting logs can be sent to Splunk. For information on sending logs to Splunk, see Export transaction logs directly from NetScaler to Splunk.
