
Prometheus

Prometheus acts as a collector by "scraping" (collecting) metrics from different "targets" (endpoints or exporters). This guide walks you through setting up a basic Prometheus server to start collecting metrics. We assume a Linux-like environment for the example commands.

Collector Setup

Step 1: Download Prometheus

  1. Go to the official Prometheus downloads page.
  2. Download the latest stable version of Prometheus for your operating system (e.g., prometheus-2.XX.X.linux-amd64.tar.gz).
  3. Extract the downloaded archive into a directory of your choice. For example, if you downloaded it to ~/downloads:
    cd ~/downloads
    tar xvfz prometheus-*.tar.gz
    # This will create a directory like 'prometheus-2.XX.X.linux-amd64'
    # Let's navigate into that directory:
    cd prometheus-*.linux-amd64
    # Optionally record the absolute path of this directory for later reference:
    # export PROMETHEUS_HOME=$(pwd)
    # For example: /home/your_user/downloads/prometheus-2.XX.X.linux-amd64
     This directory contains the prometheus binary and an example configuration file, prometheus.yml, which you will replace or edit in the next step.

Step 2: Create and Configure prometheus.yml

The prometheus.yml file is the heart of Prometheus's configuration. This is where you define where and how often Prometheus should collect metrics.

If you followed the extraction steps and cd'd into the Prometheus directory (e.g., /home/your_user/downloads/prometheus-2.XX.X.linux-amd64/), you'll create or edit the prometheus.yml file in that location. For a more permanent setup, you might place it in a dedicated configuration directory like /etc/prometheus/prometheus.yml.

Below is a basic prometheus.yml file. If you are creating the file for the first time or replacing the example one, use this as its initial content. Save it at the path you chose above (e.g., $(pwd)/prometheus.yml if you are in the Prometheus directory), and replace <udmg-server-ip> with the address of your UDMG Server.

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'udmg'
    metrics_path: /metrics
    static_configs:
      - targets: ['<udmg-server-ip>:7070']
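
Once the file is saved, it can be validated and loaded. The following is a minimal sketch, assuming it runs from the extracted Prometheus directory, which ships the promtool binary alongside prometheus. The function name check_and_start is a hypothetical helper, not part of Prometheus.

```shell
# Hypothetical helper: validate the configuration, then start the server.
# Assumes the current directory is the extracted Prometheus directory.
check_and_start() {
  config="${1:-prometheus.yml}"
  if [ ! -x ./promtool ] || [ ! -x ./prometheus ]; then
    echo "prometheus/promtool not found in $(pwd)" >&2
    return 1
  fi
  # Stop early if the configuration does not parse.
  ./promtool check config "$config" || return 1
  # Runs in the foreground; stop it with Ctrl+C.
  ./prometheus --config.file="$config"
}
```

Running check_and_start prometheus.yml should start the server in the foreground. Because the 'prometheus' job above scrapes localhost:9090, the web UI is expected at http://localhost:9090, where the Targets page shows whether the 'udmg' job is being scraped.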

Metrics

Go Runtime Metrics

  • go_gc_duration_seconds:

    • Description: A summary of the time taken by garbage collection cycles. Garbage collection is the process of freeing up memory that is no longer in use by the Go program.
    • Type: Summary. A summary exposes a set of quantiles (here 0, 0.25, 0.5, 0.75, and 1), the total sum of the observed values, and the count of observations.
    • Breakdown:
      • go_gc_duration_seconds{quantile="0"}: The minimum GC duration (70.807 microseconds).
      • go_gc_duration_seconds{quantile="0.25"}: The 25th percentile GC duration (89.407 microseconds).
      • go_gc_duration_seconds{quantile="0.5"}: The median GC duration (125.611 microseconds).
      • go_gc_duration_seconds{quantile="0.75"}: The 75th percentile GC duration (241.92 microseconds).
      • go_gc_duration_seconds{quantile="1"}: The maximum GC duration (368.83 microseconds).
      • go_gc_duration_seconds_sum: The total time spent in GC (0.003035851 seconds).
      • go_gc_duration_seconds_count: The number of GC cycles (18).
    • Use: High values or increasing trends indicate potential garbage collection issues, which can impact application performance. The quantiles help understand the distribution of GC times.
  • go_gc_gogc_percent:

    • Description: The heap size target percentage. A collection is triggered when newly allocated heap memory reaches this percentage of the live heap remaining after the previous collection. Defaults to 100. Configured by the GOGC environment variable.
    • Type: Gauge. A gauge represents a single numerical value that can arbitrarily increase and decrease.
    • Value: 100.
    • Use: Indicates how aggressively the Go runtime is trying to manage memory. Lower values trigger more frequent GC.
  • go_gc_gomemlimit_bytes:

    • Description: The soft memory limit for the Go runtime. Configured by the GOMEMLIMIT environment variable.
    • Type: Gauge
    • Value: 9.223372036854776e+18 (math.MaxInt64, approximately 9.2 exabytes, which means no limit is set).
    • Use: Shows the configured soft limit. The runtime collects garbage more aggressively as memory usage approaches it.
  • go_goroutines:

    • Description: The number of goroutines that currently exist (goroutines are lightweight threads managed by the Go runtime).
    • Type: Gauge
    • Value: 37
    • Use: A high or rapidly increasing number of goroutines can indicate a potential concurrency issue, such as a goroutine leak.
  • go_info:

    • Description: Information about the Go environment.
    • Type: Gauge
    • Value: 1, with a label version="go1.23.8".
    • Use: Provides the Go version. Useful for debugging and understanding the environment.
  • go_memstats_alloc_bytes:

    • Description: The number of bytes of heap memory currently in use.
    • Type: Gauge
    • Value: 5.236488e+06 (approximately 5.2 MB)
    • Use: Shows how much memory the application is actively using.
  • go_memstats_alloc_bytes_total:

    • Description: The total number of bytes of heap memory allocated since the program started.
    • Type: Counter. A counter is a metric that only increases.
    • Value: 3.2523192e+07 (approximately 32.5 MB)
    • Use: Useful for calculating the memory allocation rate.
  • go_memstats_buck_hash_sys_bytes:

    • Description: Bytes used by the profiling bucket hash table.
    • Type: Gauge
    • Value: 1.474131e+06
    • Use: Related to memory used for profiling.
  • go_memstats_frees_total:

    • Description: Total number of heap object frees.
    • Type: Counter
    • Value: 323214
    • Use: Tracks how many heap objects have been freed.
  • go_memstats_gc_sys_bytes:

    • Description: Bytes used for garbage collection system metadata.
    • Type: Gauge
    • Value: 3.453336e+06
    • Use: Memory overhead related to GC.
  • go_memstats_heap_alloc_bytes:

    • Description: Same as go_memstats_alloc_bytes.
    • Type: Gauge
    • Value: 5.236488e+06
    • Use: Redundant with go_memstats_alloc_bytes.
  • go_memstats_heap_idle_bytes:

    • Description: Heap memory that is waiting to be used.
    • Type: Gauge
    • Value: 7.675904e+06
    • Use: Indicates available heap memory.
  • go_memstats_heap_inuse_bytes:

    • Description: Heap memory that is currently in use.
    • Type: Gauge
    • Value: 8.15104e+06
    • Use: Memory actively being used in the heap.
  • go_memstats_heap_objects:

    • Description: Number of allocated heap objects.
    • Type: Gauge
    • Value: 19118
    • Use: Number of objects currently in the heap.
  • go_memstats_heap_released_bytes:

    • Description: Heap memory released to the OS.
    • Type: Gauge
    • Value: 6.995968e+06
    • Use: How much memory has been returned to the operating system.
  • go_memstats_heap_sys_bytes:

    • Description: Heap memory obtained from the system.
    • Type: Gauge
    • Value: 1.5826944e+07
    • Use: Total heap memory acquired.
  • go_memstats_last_gc_time_seconds:

    • Description: Unix timestamp (seconds since January 1, 1970 UTC) of the last garbage collection.
    • Type: Gauge
    • Value: 1.7467308985491297e+09
    • Use: Timestamp of the last GC.
  • go_memstats_mallocs_total:

    • Description: Total number of heap objects allocated.
    • Type: Counter
    • Value: 342332
    • Use: Tracks total allocations.
  • go_memstats_mcache_inuse_bytes:

    • Description: Bytes in use by mcache structures.
    • Type: Gauge
    • Value: 4800
    • Use: Memory used by mcache.
  • go_memstats_mcache_sys_bytes:

    • Description: Bytes used for mcache structures obtained from system.
    • Type: Gauge
    • Value: 15600
    • Use: Memory used by mcache from the system.
  • go_memstats_mspan_inuse_bytes:

    • Description: Bytes in use by mspan structures.
    • Type: Gauge
    • Value: 154720
    • Use: Memory used by mspan.
  • go_memstats_mspan_sys_bytes:

    • Description: Bytes used for mspan structures obtained from system.
    • Type: Gauge
    • Value: 195840
    • Use: Memory used by mspan from the system.
  • go_memstats_next_gc_bytes:

    • Description: Heap size when the next garbage collection will take place.
    • Type: Gauge
    • Value: 1.0816336e+07
    • Use: Shows the heap size at which the next GC cycle will be triggered.
  • go_memstats_other_sys_bytes:

    • Description: Bytes used for other system allocations.
    • Type: Gauge
    • Value: 977709
    • Use: Miscellaneous system memory usage.
  • go_memstats_stack_inuse_bytes:

    • Description: Bytes obtained from system for stack allocator in non-CGO environments.
    • Type: Gauge
    • Value: 950272
    • Use: Stack memory usage (non-CGO).
  • go_memstats_stack_sys_bytes:

    • Description: Bytes obtained from system for stack allocator.
    • Type: Gauge
    • Value: 950272
    • Use: Total stack memory usage.
  • go_memstats_sys_bytes:

    • Description: Total bytes obtained from the system.
    • Type: Gauge
    • Value: 2.2893832e+07
    • Use: Overall system memory consumption by the Go runtime.
  • go_sched_gomaxprocs_threads:

    • Description: The value of GOMAXPROCS, which limits the number of OS threads that can execute Go code simultaneously.
    • Type: Gauge
    • Value: 4
    • Use: Shows the level of parallelism configured for the Go runtime.
  • go_sql_idle_connections:

    • Description: The number of idle connections.
    • Type: Gauge
    • Value: 2, with label db_name="udmg-server"
    • Use: Number of database connections in idle state.
  • go_sql_in_use_connections:

    • Description: The number of connections currently in use.
    • Type: Gauge
    • Value: 0, with label db_name="udmg-server"
    • Use: Number of database connections currently active.
  • go_sql_max_idle_closed_total:

    • Description: The total number of connections closed due to SetMaxIdleConns.
    • Type: Counter
    • Value: 0, with label db_name="udmg-server"
    • Use: Count of connections closed because of maximum idle connection setting.
  • go_sql_max_idle_time_closed_total:

    • Description: The total number of connections closed due to SetConnMaxIdleTime.
    • Type: Counter
    • Value: 0, with label db_name="udmg-server"
    • Use: Count of connections closed because of maximum idle time.
  • go_sql_max_lifetime_closed_total:

    • Description: The total number of connections closed due to SetConnMaxLifetime.
    • Type: Counter
    • Value: 0, with label db_name="udmg-server"
    • Use: Count of connections closed because of maximum connection lifetime.
  • go_sql_max_open_connections:

    • Description: Maximum number of open connections to the database.
    • Type: Gauge
    • Value: 100, with label db_name="udmg-server"
    • Use: Maximum allowed database connections.
  • go_sql_open_connections:

    • Description: The number of established connections both in use and idle.
    • Type: Gauge
    • Value: 2, with label db_name="udmg-server"
    • Use: Total number of open database connections.
  • go_sql_wait_count_total:

    • Description: The total number of connections waited for.
    • Type: Counter
    • Value: 0, with label db_name="udmg-server"
    • Use: Count of times the application had to wait for a database connection.
  • go_sql_wait_duration_seconds_total:

    • Description: The total time blocked waiting for a new connection.
    • Type: Counter
    • Value: 0, with label db_name="udmg-server"
    • Use: Total time spent waiting for database connections.
  • go_threads:

    • Description: The number of OS threads created.
    • Type: Gauge
    • Value: 11
    • Use: Number of operating system threads used by the Go process.
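
Several of the values above are easier to interpret when combined. The sketch below uses awk on a few sample lines copied from this list, written in the Prometheus text format. The helper names sample_metrics and derive_go_stats are hypothetical. On a live system the same input could come from the UDMG metrics endpoint configured earlier, for example with curl -s http://<udmg-server-ip>:7070/metrics.

```shell
# Sample values copied from the metric list above, in Prometheus text format.
sample_metrics() {
  cat <<'EOF'
go_gc_duration_seconds_sum 0.003035851
go_gc_duration_seconds_count 18
go_memstats_mallocs_total 342332
go_memstats_frees_total 323214
go_memstats_heap_inuse_bytes 8.15104e+06
go_memstats_heap_sys_bytes 1.5826944e+07
EOF
}

# Hypothetical helper: print a few derived values from metrics on stdin.
# Only handles unlabeled "name value" lines such as the sample above.
derive_go_stats() {
  awk '
    { v[$1] = $2 + 0 }   # metric name -> numeric value
    END {
      # Mean GC pause in microseconds: total GC time / number of cycles.
      printf "mean_gc_pause_us %.1f\n", v["go_gc_duration_seconds_sum"] / v["go_gc_duration_seconds_count"] * 1e6
      # Live heap objects: allocations minus frees.
      printf "live_heap_objects %d\n", v["go_memstats_mallocs_total"] - v["go_memstats_frees_total"]
      # Share of the heap obtained from the OS that is in use.
      printf "heap_inuse_percent %.1f\n", v["go_memstats_heap_inuse_bytes"] / v["go_memstats_heap_sys_bytes"] * 100
    }'
}

sample_metrics | derive_go_stats
```

With the sample values, the live object count comes out to 19118, which matches go_memstats_heap_objects, and the mean GC pause is roughly 169 microseconds, between the median and the 75th percentile reported by the summary.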

HTTP Metrics

  • promhttp_metric_handler_requests_in_flight:

    • Description: The current number of scrapes (Prometheus requests) being served.
    • Type: Gauge
    • Value: 1
    • Use: Indicates the concurrency of Prometheus scrapes.
  • promhttp_metric_handler_requests_total:

    • Description: The total number of scrapes, broken down by HTTP status code.
    • Type: Counter
    • Breakdown:
      • promhttp_metric_handler_requests_total{code="200"}: 78 (Successful scrapes)
      • promhttp_metric_handler_requests_total{code="404"}: 79 (Scrapes that resulted in a 404 error)
      • promhttp_metric_handler_requests_total{code="500"}: 0 (Server error)
      • promhttp_metric_handler_requests_total{code="503"}: 0
    • Use: Shows the rate of Prometheus scrapes and their success/failure status. A high number of 404s might indicate misconfiguration.
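
To put the status code breakdown in perspective, the share of successful scrapes can be computed from the counters. This is a minimal sketch using awk on the sample values above. The helper name scrape_success_percent is hypothetical, and it works on cumulative totals rather than on a rate over time.

```shell
# Hypothetical helper: percentage of scrapes answered with HTTP 200,
# computed from Prometheus text-format lines on stdin.
scrape_success_percent() {
  awk '
    index($1, "promhttp_metric_handler_requests_total{") == 1 {
      total += $2
      if (index($1, "code=\"200\"") > 0) ok += $2
    }
    END {
      if (total > 0) { printf "%.1f\n", ok / total * 100 } else { print "n/a" }
    }'
}

# Sample counters copied from the breakdown above.
scrape_success_percent <<'EOF'
promhttp_metric_handler_requests_total{code="200"} 78
promhttp_metric_handler_requests_total{code="404"} 79
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
EOF
```

For the sample, 78 of 157 requests succeeded, or about 49.7 percent, which is the kind of ratio that would justify checking the scrape configuration.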

UDMG Server Metrics

  • udmg_connection_actives_totals:

    • Description: UDMG Server total active connections.
    • Type: Gauge
    • Value: 0
    • Use: Number of currently active connections to the UDMG server.
  • udmg_connection_totals:

    • Description: UDMG Server total connections.
    • Type: Gauge
    • Value: 0
    • Use: Total number of connections to the UDMG server.
  • udmg_memory_host_free_percent:

    • Description: UDMG Server host memory free percentage.
    • Type: Gauge
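    • Value: 3.18078976e+08 (this sample is far above 100, so it appears to be a byte count, approximately 318 MB, rather than a percentage)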
    • Use: Percentage of free memory on the host.
  • udmg_memory_host_total:

    • Description: UDMG Server host memory total.
    • Type: Gauge
    • Value: 4.03789824e+09
    • Use: Total host memory.
  • udmg_memory_host_used_percent:

    • Description: UDMG Server host memory used percentage.
    • Type: Gauge
    • Value: 42.239061081440234
    • Use: Percentage of memory in use on the host.
  • udmg_storage_disk_used_percent:

    • Description: UDMG Server work directory used percentage.
    • Type: Gauge
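    • Value: 2.7302465536e+10 (this sample is far above 100, so it appears to be a byte count, approximately 27.3 GB, rather than a percentage)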
    • Use: Disk space usage of the UDMG server's working directory.
  • udmg_transfers_actives:

    • Description: UDMG Server active transfers.
    • Type: Gauge
    • Value: 0
    • Use: Number of currently active data transfers.
  • udmg_transfers_bytes_in_totals:

    • Description: UDMG Server incoming transfer bytes.
    • Type: Gauge
    • Value: 0
    • Use: Total bytes transferred in.
  • udmg_transfers_bytes_out_totals:

    • Description: UDMG Server outgoing transfer bytes.
    • Type: Gauge
    • Value: 0
    • Use: Total bytes transferred out.
  • udmg_transfers_error_totals:

    • Description: UDMG Server error transfers.
    • Type: Gauge
    • Value: 0
    • Use: Number of transfers that resulted in an error.
  • udmg_transfers_totals:

    • Description: UDMG Server total transfers.
    • Type: Gauge
    • Value: 0
    • Use: Total number of transfers.
  • udmg_version_info:

    • Description: UDMG Server version information.
    • Type: Gauge
    • Value: 1, with labels:
      • branch="HEAD"
      • build="1825"
      • commit="7ed4d58fb"
      • date="2025-05-08T18:38:52+00:00"
      • number="2.99.0.1"
    • Use: Provides detailed version information about the UDMG server.
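
The UDMG gauges lend themselves to simple threshold checks. The sketch below reads metrics in the Prometheus text format from standard input and reports on udmg_memory_host_used_percent. The helper name check_memory_used and the default threshold of 90 are assumptions for illustration, not UDMG recommendations.

```shell
# Hypothetical helper: compare udmg_memory_host_used_percent with a threshold.
# Exit status: 0 = OK, 1 = above threshold, 3 = metric not found.
check_memory_used() {
  threshold="${1:-90}"   # assumed default, adjust to your environment
  awk -v limit="$threshold" '
    $1 == "udmg_memory_host_used_percent" {
      found = 1
      if ($2 + 0 > limit + 0) {
        printf "WARN memory used %.1f%%\n", $2
        bad = 1
      } else {
        printf "OK memory used %.1f%%\n", $2
      }
    }
    END {
      if (!found) {
        print "UNKNOWN metric missing"
        exit 3
      }
      exit bad
    }'
}

# Sample value copied from the list above.
printf 'udmg_memory_host_used_percent 42.239061081440234\n' | check_memory_used 90
```

Against a live server, the input could instead come from curl -s http://<udmg-server-ip>:7070/metrics, using the target and metrics path from the scrape configuration.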