Monitoring (COS)

The Canonical Observability Stack (COS) is a set of tools that facilitates gathering, processing, visualizing, and setting up alerts on telemetry signals generated by workloads in and outside of Juju.

The OpenSearch charm integrates with COS to connect to Grafana and Prometheus, which provides monitoring, alert rules, and log features.

See: How to enable monitoring via COS and Grafana

Summary

Metrics

Prometheus metrics are exposed by an OpenSearch plugin that is installed automatically: the Prometheus Exporter Plugin for OpenSearch.
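As a rough illustration of what the exporter output looks like to a consumer, the sketch below parses text in the Prometheus exposition format into a dictionary. The metric names in the sample are hypothetical placeholders, not names taken from the plugin, and the parser assumes samples carry no trailing timestamp.

```python
from typing import Dict


def parse_exposition(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Assumption: each sample line is "<series> <value>" with no timestamp.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Hypothetical sample; real metric names come from the plugin itself.
SAMPLE = """\
# HELP example_cluster_status Hypothetical cluster status metric
# TYPE example_cluster_status gauge
example_cluster_status{cluster="demo"} 0.0
example_heap_used_percent{cluster="demo",node="node-0"} 42.0
"""

if __name__ == "__main__":
    for series, value in parse_exposition(SAMPLE).items():
        print(series, value)
```

In a COS deployment Prometheus performs this scraping itself; a parser like this is only useful for ad hoc inspection of the exporter output.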

The meaning of the collected metrics is described in the plugin's upstream documentation.

Alert Rules

The charm deploys a pre-configured set of Prometheus alert rules by default.

For the latest default alert rules, check the alert definitions in the repository's prometheus_alerts.yaml file.

Default alert rules

| Alert | Severity | Notes |
| --- | --- | --- |
| OpenSearchScrapeFailed | critical | Triggered when the Prometheus scrape fails. |
| OpenSearchClusterRed | critical | Triggered when the health status of the cluster is red, meaning that primary shards are not allocated. |
| OpenSearchClusterYellowTemp | warning | Triggered when shards are still reallocating or initializing. |
| OpenSearchClusterYellow | warning | Triggered when some replica shards are unassigned. Might require scaling the application to host all shards. |
| OpenSearchWriteRequestsRejectionJumps | warning | Triggered when the write request rejection rate exceeds 5%. Might indicate that the node cannot keep up with the indexing speed. |
| OpenSearchNodeDiskLowWatermarkReached | warning | Triggered when disks reach 85% of their capacity. |
| OpenSearchNodeDiskHighWatermarkReached | high | Triggered when disks reach 90% of their capacity. |
| OpenSearchJVMHeapUseHigh | alert | Triggered when the JVM heap usage in a node reaches 75%. |
| OpenSearchHostSystemCPUHigh | alert | Triggered when system CPU usage in a node reaches 90%. |
| OpenSearchProcessCPUHigh | alert | Triggered when process CPU usage in a node reaches 90%. |
| OpenSearchThrottling | warning | Triggered when a cluster is throttling. Might indicate that it is necessary to review the indexing request rate or the index lifecycle, or to scale the application. |
| OpenSearchThrottlingTooLong | critical | Triggered when a cluster has been throttling constantly for at least 20 minutes. Might indicate that it is necessary to review the indexing request rate or the index lifecycle, or to scale the application. |
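To make the threshold logic concrete, here is a minimal sketch of how the two disk watermark rules relate to each other. It only mirrors the 85% and 90% thresholds listed above; the actual rules are Prometheus expressions defined in prometheus_alerts.yaml and may include conditions not modelled here. The function name `firing_disk_alerts` is hypothetical.

```python
from typing import List

# Thresholds taken from the default alert rules table.
LOW_WATERMARK_PERCENT = 85.0
HIGH_WATERMARK_PERCENT = 90.0


def firing_disk_alerts(used_percent: float) -> List[str]:
    """Return the disk alerts whose threshold is reached.

    The rules are independent, so a disk at or above the high
    watermark also satisfies the low watermark rule.
    """
    alerts: List[str] = []
    if used_percent >= LOW_WATERMARK_PERCENT:
        alerts.append("OpenSearchNodeDiskLowWatermarkReached")
    if used_percent >= HIGH_WATERMARK_PERCENT:
        alerts.append("OpenSearchNodeDiskHighWatermarkReached")
    return alerts


if __name__ == "__main__":
    for usage in (70.0, 87.5, 93.0):
        print(usage, firing_disk_alerts(usage))
```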

Logs

All logs from the OpenSearch payload are available in the Grafana web interface under Home > Explore.

To view OpenSearch logs, set the Label filters field to juju_application = opensearch, select an operation (for example, Line contains), and run the query.
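The label filter and the Line contains operation correspond to a LogQL query, which can also be typed directly in the code view of Explore. The sketch below builds such a query string; the helper name `build_logql` is hypothetical, and it assumes the application name is a plain Juju name that needs no escaping.

```python
def build_logql(application: str, contains: str = "") -> str:
    """Build a LogQL query selecting logs of one Juju application.

    If `contains` is given, add a "line contains" filter (|=).
    """
    selector = '{juju_application="%s"}' % application
    if not contains:
        return selector
    # Escape backslashes and double quotes for the quoted LogQL string.
    escaped = contains.replace("\\", "\\\\").replace('"', '\\"')
    return f'{selector} |= "{escaped}"'


if __name__ == "__main__":
    print(build_logql("opensearch"))
    print(build_logql("opensearch", "ERROR"))
```

Using the `juju_application` label as the selector keeps the query scoped to the OpenSearch units, and the line filter narrows the result to matching log lines.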

See also: How to connect to the Grafana web interface

Last updated 9 months ago.