diff --git a/README.md b/README.md index e21593710..8cb72e587 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ Name | Description | Enabled by default [license](docs/collector.license.md) | Windows license status | [logical_disk](docs/collector.logical_disk.md) | Logical disks, disk I/O | ✓ [logon](docs/collector.logon.md) | User logon sessions | -[memory](docs/collector.memory.md) | Memory usage metrics | +[memory](docs/collector.memory.md) | Memory usage metrics | ✓ [mscluster](docs/collector.mscluster.md) | MSCluster metrics | [msmq](docs/collector.msmq.md) | MSMQ queues | [mssql](docs/collector.mssql.md) | [SQL Server Performance Objects](https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/use-sql-server-objects#SQLServerPOs) metrics | @@ -214,7 +214,7 @@ If you need to skip TLS verification, you can use the `--config.file.insecure-sk ```yaml collectors: - enabled: cpu,cs,net,service + enabled: cpu,net,service collector: service: services-where: "Name='windows_exporter'" diff --git a/config.yaml b/config.yaml index bbb26da76..f837deff8 100644 --- a/config.yaml +++ b/config.yaml @@ -1,5 +1,5 @@ collectors: - enabled: cpu,cpu_info,cs,exchange,iis,logical_disk,logon,memory,net,os,process,remote_fx,service,system,tcp,time,terminal_services,textfile + enabled: cpu,cpu_info,exchange,iis,logical_disk,logon,memory,net,os,process,remote_fx,service,system,tcp,time,terminal_services,textfile collector: service: services-where: "Name='windows_exporter'" diff --git a/docs/collector.cpu.md b/docs/collector.cpu.md index f5a5d19fd..f031579ba 100644 --- a/docs/collector.cpu.md +++ b/docs/collector.cpu.md @@ -16,26 +16,22 @@ None ## Metrics These metrics are available on all versions of Windows: -Name | Description | Type | Labels ------|-------------|------|------- -`windows_cpu_cstate_seconds_total` | Time spent in low-power idle states | counter | `core`, `state` -`windows_cpu_time_total` | Time that processor spent in different modes (dpc, 
idle, interrupt, privileged, user) | counter | `core`, `mode` -`windows_cpu_interrupts_total` | Total number of received and serviced hardware interrupts | counter | `core` -`windows_cpu_dpcs_total` | Total number of received and serviced deferred procedure calls (DPCs) | counter | `core` - -These metrics are only exposed on Windows Server 2008R2 and later: - -Name | Description | Type | Labels ------|-------------|------|------- -`windows_cpu_clock_interrupts_total` | Total number of received and serviced clock tick interrupts | counter | `core` -`windows_cpu_idle_break_events_total` | Total number of time processor was woken from idle | counter | `core` -`windows_cpu_parking_status` | Parking Status represents whether a processor is parked or not | gauge | `core` -`windows_cpu_core_frequency_mhz` | Core frequency in megahertz | gauge | `core` -`windows_cpu_processor_performance_total` | Processor Performance is the number of CPU cycles executing instructions by each core; it is believed to be similar to the value that the APERF MSR would show, were it exposed | counter | `core` -`windows_cpu_processor_mperf_total` | Processor MPerf Total is proportioanl to the number of TSC ticks each core has accumulated while executing instructions. Due to the manner in which it is presented, it should be scaled by 1e2 to properly line up with Processor Performance Total. As above, it is believed to be closely related to the MPERF MSR. | counter | `core` -`windows_cpu_processor_rtc_total` | RTC total is assumed to represent the 64Hz tick rate in Windows. It is not by itself useful, but can be used with `windows_cpu_processor_utility_total` to more accurately measure CPU utilisation than with `windows_cpu_time_total` | counter | `core` -`windows_cpu_processor_utility_total` | Processor Utility Total is a newer, more accurate measure of CPU utilization, in particular handling modern CPUs with variant CPU frequencies. 
The rate of this counter divided by the rate of `windows_cpu_processor_rtc_total` should provide an accurate view of CPU utilisation on modern systems, as observed in Task Manager. | counter | `core` -`windows_cpu_processor_privileged_utility_total` | Processor Privileged Utility Total, when used in a similar fashion to `windows_cpu_processor_utility_total` will show the portion of CPU utilization which is happening in privileged mode. | counter | `core` +| Name | Description | Type | Labels | +|--------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|-----------------| +| `windows_cpu_logical_processor` | Number of installed logical processors | counter | `core`, `state` | +| `windows_cpu_cstate_seconds_total` | Time spent in low-power idle states | counter | `core`, `state` | +| `windows_cpu_time_total` | Time that processor spent in different modes (dpc, idle, interrupt, privileged, user) | counter | `core`, `mode` | +| `windows_cpu_interrupts_total` | Total number of received and serviced hardware interrupts | counter | `core` | +| `windows_cpu_dpcs_total` | Total number of received and serviced deferred procedure calls (DPCs) | counter | `core` | +| `windows_cpu_clock_interrupts_total` | Total number of received and serviced clock tick interrupts | counter | `core` | +| `windows_cpu_idle_break_events_total` | Total number of time processor was woken from idle | counter | `core` | +| `windows_cpu_parking_status` | Parking Status represents whether a processor is parked or not | gauge | `core` | +| `windows_cpu_core_frequency_mhz` | Core frequency in megahertz | gauge | `core` | +| `windows_cpu_processor_performance_total` | Processor 
Performance is the number of CPU cycles executing instructions by each core; it is believed to be similar to the value that the APERF MSR would show, were it exposed | counter | `core` | +| `windows_cpu_processor_mperf_total` | Processor MPerf Total is proportional to the number of TSC ticks each core has accumulated while executing instructions. Due to the manner in which it is presented, it should be scaled by 1e2 to properly line up with Processor Performance Total. As above, it is believed to be closely related to the MPERF MSR. | counter | `core` | +| `windows_cpu_processor_rtc_total` | RTC total is assumed to represent the 64Hz tick rate in Windows. It is not by itself useful, but can be used with `windows_cpu_processor_utility_total` to more accurately measure CPU utilisation than with `windows_cpu_time_total` | counter | `core` | +| `windows_cpu_processor_utility_total` | Processor Utility Total is a newer, more accurate measure of CPU utilization, in particular handling modern CPUs with variant CPU frequencies. The rate of this counter divided by the rate of `windows_cpu_processor_rtc_total` should provide an accurate view of CPU utilisation on modern systems, as observed in Task Manager. | counter | `core` | +| `windows_cpu_processor_privileged_utility_total` | Processor Privileged Utility Total, when used in a similar fashion to `windows_cpu_processor_utility_total` will show the portion of CPU utilization which is happening in privileged mode. | counter | `core` | ### Example metric Show frequency of host CPU cores diff --git a/docs/collector.cs.md b/docs/collector.cs.md index ffec191b4..33465407f 100644 --- a/docs/collector.cs.md +++ b/docs/collector.cs.md @@ -1,5 +1,9 @@ # cs collector +> [!CAUTION] +> This collector is deprecated and will be removed in a future release. +> See https://github.com/prometheus-community/windows_exporter/pull/1596 for more information. 
+ The cs collector exposes metrics detailing the hardware of the computer system ||| diff --git a/docs/collector.memory.md b/docs/collector.memory.md index 18a7b2686..f77ad7e3a 100644 --- a/docs/collector.memory.md +++ b/docs/collector.memory.md @@ -5,9 +5,9 @@ The memory collector exposes metrics about system memory usage ||| -|- Metric name prefix | `memory` -Data source | Perflib -Classes | `Win32_PerfRawData_PerfOS_Memory` -Enabled by default? | No +Data source | Performance Counters +Classes | - +Enabled by default? | Yes ## Flags @@ -15,46 +15,73 @@ None ## Metrics -Name | Description | Type | Labels ------|-------------|------|------- -`windows_memory_available_bytes` | The amount of physical memory immediately available for allocation to a process or for system use. It is equal to the sum of memory assigned to the standby (cached), free and zero page lists | gauge | None -`windows_memory_cache_bytes` | Number of bytes currently being used by the file system cache | gauge | None -`windows_memory_cache_bytes_peak` | Maximum number of CacheBytes after the system was last restarted | gauge | None -`windows_memory_cache_faults_total` | Number of faults which occur when a page sought in the file system cache is not found there and must be retrieved from elsewhere in memory (soft fault) or from disk (hard fault) | counter | None -`windows_memory_commit_limit` | Amount of virtual memory, in bytes, that can be committed without having to extend the paging file(s) | gauge | None -`windows_memory_committed_bytes` | Amount of committed virtual memory, in bytes | gauge | None -`windows_memory_demand_zero_faults_total` | The number of zeroed pages required to satisfy faults. 
Zeroed pages, pages emptied of previously stored data and filled with zeros, are a security feature of Windows that prevent processes from seeing data stored by earlier processes that used the memory space | counter | None -`windows_memory_free_and_zero_page_list_bytes` | The amount of physical memory, in bytes, that is assigned to the free and zero page lists. This memory does not contain cached data. It is immediately available for allocation to a process or for system use | gauge | None -`windows_memory_free_system_page_table_entries` | Number of page table entries not being used by the system | gauge | None -`windows_memory_modified_page_list_bytes` | The amount of physical memory, in bytes, that is assigned to the modified page list. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. This memory needs to be written out before it will be available for allocation to a process or for system use | gauge | None -`windows_memory_page_faults_total` | Overall rate at which faulted pages are handled by the processor | counter | None -`windows_memory_swap_page_reads_total` | Number of disk page reads (a single read operation reading several pages is still only counted once) | counter | None -`windows_memory_swap_pages_read_total` | Number of pages read across all page reads (ie counting all pages read even if they are read in a single operation) | counter | None -`windows_memory_swap_pages_written_total` | Number of pages written across all page writes (ie counting all pages written even if they are written in a single operation) | counter | None -`windows_memory_swap_page_operations_total` | Total number of swap page read and writes (PagesPersec) | counter | None -`windows_memory_swap_page_writes_total` | Number of disk page writes (a single write operation writing several pages is still only counted once) | counter | None -`windows_memory_pool_nonpaged_allocs_total` | The number of calls to allocate 
space in the nonpaged pool. The nonpaged pool is an area of system memory area for objects that cannot be written to disk, and must remain in physical memory as long as they are allocated | counter | None -`windows_memory_pool_nonpaged_bytes` | Number of bytes in the non-paged pool, an area of the system virtual memory that is used for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated | gauge | None -`windows_memory_pool_paged_allocs_total` | Number of calls to allocate space in the paged pool, regardless of the amount of space allocated in each call | counter | None -`windows_memory_pool_paged_bytes` | Number of bytes in the paged pool | gauge | None -`windows_memory_pool_paged_resident_bytes` | The size, in bytes, of the portion of the paged pool that is currently resident and active in physical memory. The paged pool is an area of the system virtual memory that is used for objects that can be written to disk when they are not being used | gauge | None -`windows_memory_standby_cache_core_bytes` | The amount of physical memory, in bytes, that is assigned to the core standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None -`windows_memory_standby_cache_normal_priority_bytes` | The amount of physical memory, in bytes, that is assigned to the normal priority standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. It is immediately available for allocation to a process or for system use. 
If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None -`windows_memory_standby_cache_reserve_bytes` | The amount of physical memory, in bytes, that is assigned to the reserve standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None -`windows_memory_system_cache_resident_bytes` | The size, in bytes, of the portion of the system file cache which is currently resident and active in physical memory | gauge | None -`windows_memory_system_code_resident_bytes` | The size, in bytes, of the pageable operating system code that is currently resident and active in physical memory. This value is a component of Memory\\System Code Total Bytes. Memory\\System Code Resident Bytes (and Memory\\System Code Total Bytes) does not include code that must remain in physical memory and cannot be written to disk | gauge | None -`windows_memory_system_code_total_bytes` | The size, in bytes, of the pageable operating system code currently mapped into the system virtual address space. This value is calculated by summing the bytes in Ntoskrnl.exe, Hal.dll, the boot drivers, and file systems loaded by Ntldr/osloader. This counter does not include code that must remain in physical memory and cannot be written to disk | gauge | None -`windows_memory_system_driver_resident_bytes` | The size, in bytes, of the pageable physical memory being used by device drivers. It is the working set (physical memory area) of the drivers. 
This value is a component of Memory\\System Driver Total Bytes, which also includes driver memory that has been written to disk. Neither Memory\\System Driver Resident Bytes nor Memory\\System Driver Total Bytes includes memory that cannot be written to disk | gauge | None -`windows_memory_system_driver_total_bytes` | The size, in bytes, of the pageable virtual memory currently being used by device drivers. Pageable memory can be written to disk when it is not being used. It includes both physical memory (Memory\\System Driver Resident Bytes) and code and data paged to disk. It is a component of Memory\\System Code Total Bytes | gauge | None -`windows_memory_transition_faults_total` | Number of faults rate at which page faults are resolved by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk at the time of the page fault. The pages were recovered without additional disk activity. Transition faults are counted in numbers of faults; because only one page is faulted in each operation, it is also equal to the number of pages faulted | counter | None -`windows_memory_transition_pages_repurposed_total` | Transition Pages RePurposed is the rate at which the number of transition cache pages were reused for a different purpose. 
These pages would have otherwise remained in the page cache to provide a (fast) soft fault (instead of retrieving it from backing store) in the event the page was accessed in the future | counter | None -`windows_memory_write_copies_total` | The number of page faults caused by attempting to write that were satisfied by copying the page from elsewhere in physical memory | counter | None +| Name | Description | Type | Labels | +|------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|--------| +| `windows_memory_available_bytes` | The amount of physical memory immediately available for allocation to a process or for system use. 
It is equal to the sum of memory assigned to the standby (cached), free and zero page lists | gauge | None | +| `windows_memory_cache_bytes` | Number of bytes currently being used by the file system cache | gauge | None | +| `windows_memory_cache_bytes_peak` | Maximum number of CacheBytes after the system was last restarted | gauge | None | +| `windows_memory_cache_faults_total` | Number of faults which occur when a page sought in the file system cache is not found there and must be retrieved from elsewhere in memory (soft fault) or from disk (hard fault) | counter | None | +| `windows_memory_commit_limit` | Amount of virtual memory, in bytes, that can be committed without having to extend the paging file(s) | gauge | None | +| `windows_memory_committed_bytes` | Amount of committed virtual memory, in bytes | gauge | None | +| `windows_memory_demand_zero_faults_total` | The number of zeroed pages required to satisfy faults. Zeroed pages, pages emptied of previously stored data and filled with zeros, are a security feature of Windows that prevent processes from seeing data stored by earlier processes that used the memory space | counter | None | +| `windows_memory_free_and_zero_page_list_bytes` | The amount of physical memory, in bytes, that is assigned to the free and zero page lists. This memory does not contain cached data. It is immediately available for allocation to a process or for system use | gauge | None | +| `windows_memory_free_system_page_table_entries` | Number of page table entries not being used by the system | gauge | None | +| `windows_memory_modified_page_list_bytes` | The amount of physical memory, in bytes, that is assigned to the modified page list. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. 
This memory needs to be written out before it will be available for allocation to a process or for system use | gauge | None | +| `windows_memory_page_faults_total` | Overall rate at which faulted pages are handled by the processor | counter | None | +| `windows_memory_swap_page_reads_total` | Number of disk page reads (a single read operation reading several pages is still only counted once) | counter | None | +| `windows_memory_swap_pages_read_total` | Number of pages read across all page reads (ie counting all pages read even if they are read in a single operation) | counter | None | +| `windows_memory_swap_pages_written_total` | Number of pages written across all page writes (ie counting all pages written even if they are written in a single operation) | counter | None | +| `windows_memory_swap_page_operations_total` | Total number of swap page read and writes (PagesPersec) | counter | None | +| `windows_memory_swap_page_writes_total` | Number of disk page writes (a single write operation writing several pages is still only counted once) | counter | None | +| `windows_memory_physical_free_bytes` | Bytes of physical memory currently unused and available | gauge | None | +| `windows_memory_physical_total_bytes` | Total bytes of physical memory available to the operating system. This value does not necessarily indicate the true amount of physical memory, but what is reported to the operating system as available to it | gauge | None | +| `windows_memory_pool_nonpaged_allocs_total` | The number of calls to allocate space in the nonpaged pool. 
The nonpaged pool is an area of system memory area for objects that cannot be written to disk, and must remain in physical memory as long as they are allocated | counter | None | +| `windows_memory_pool_nonpaged_bytes` | Number of bytes in the non-paged pool, an area of the system virtual memory that is used for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated | gauge | None | +| `windows_memory_pool_paged_allocs_total` | Number of calls to allocate space in the paged pool, regardless of the amount of space allocated in each call | counter | None | +| `windows_memory_pool_paged_bytes` | Number of bytes in the paged pool | gauge | None | +| `windows_memory_pool_paged_resident_bytes` | The size, in bytes, of the portion of the paged pool that is currently resident and active in physical memory. The paged pool is an area of the system virtual memory that is used for objects that can be written to disk when they are not being used | gauge | None | +| `windows_memory_process_memory_limit_bytes` | Maximum number of bytes of memory that can be allocated to a process | gauge | None | +| `windows_memory_standby_cache_core_bytes` | The amount of physical memory, in bytes, that is assigned to the core standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None | +| `windows_memory_standby_cache_normal_priority_bytes` | The amount of physical memory, in bytes, that is assigned to the normal priority standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. 
It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None | +| `windows_memory_standby_cache_reserve_bytes` | The amount of physical memory, in bytes, that is assigned to the reserve standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists | gauge | None | +| `windows_memory_system_cache_resident_bytes` | The size, in bytes, of the portion of the system file cache which is currently resident and active in physical memory | gauge | None | +| `windows_memory_system_code_resident_bytes` | The size, in bytes, of the pageable operating system code that is currently resident and active in physical memory. This value is a component of Memory\\System Code Total Bytes. Memory\\System Code Resident Bytes (and Memory\\System Code Total Bytes) does not include code that must remain in physical memory and cannot be written to disk | gauge | None | +| `windows_memory_system_code_total_bytes` | The size, in bytes, of the pageable operating system code currently mapped into the system virtual address space. This value is calculated by summing the bytes in Ntoskrnl.exe, Hal.dll, the boot drivers, and file systems loaded by Ntldr/osloader. This counter does not include code that must remain in physical memory and cannot be written to disk | gauge | None | +| `windows_memory_system_driver_resident_bytes` | The size, in bytes, of the pageable physical memory being used by device drivers. 
It is the working set (physical memory area) of the drivers. This value is a component of Memory\\System Driver Total Bytes, which also includes driver memory that has been written to disk. Neither Memory\\System Driver Resident Bytes nor Memory\\System Driver Total Bytes includes memory that cannot be written to disk | gauge | None | +| `windows_memory_system_driver_total_bytes` | The size, in bytes, of the pageable virtual memory currently being used by device drivers. Pageable memory can be written to disk when it is not being used. It includes both physical memory (Memory\\System Driver Resident Bytes) and code and data paged to disk. It is a component of Memory\\System Code Total Bytes | gauge | None | +| `windows_memory_transition_faults_total` | Number of faults rate at which page faults are resolved by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk at the time of the page fault. The pages were recovered without additional disk activity. Transition faults are counted in numbers of faults; because only one page is faulted in each operation, it is also equal to the number of pages faulted | counter | None | +| `windows_memory_transition_pages_repurposed_total` | Transition Pages RePurposed is the rate at which the number of transition cache pages were reused for a different purpose. 
These pages would have otherwise remained in the page cache to provide a (fast) soft fault (instead of retrieving it from backing store) in the event the page was accessed in the future | counter | None | +| `windows_memory_write_copies_total` | The number of page faults caused by attempting to write that were satisfied by copying the page from elsewhere in physical memory | counter | None | ### Example metric _This collector does not yet have explained examples, we would appreciate your help adding them!_ ## Useful queries -_This collector does not yet have any useful queries added, we would appreciate your help adding them!_ +Show memory usage for instance (%) +``` +100 - 100 * windows_memory_physical_free_bytes{instance="localhost"} / windows_memory_physical_total_bytes +``` ## Alerting examples -_This collector does not yet have alerting examples, we would appreciate your help adding them!_ + +**prometheus.rules** +```yaml +# Alert on hosts that have exhausted all available physical memory +- alert: MemoryExhausted + expr: windows_memory_physical_free_bytes == 0 + for: 10m + labels: + severity: high + annotations: + summary: "Host {{ $labels.instance }} is out of memory" + description: "{{ $labels.instance }} has exhausted all available physical memory" + +# Alert on hosts with greater than 90% memory usage +- alert: MemoryLow + expr: 100 - 100 * windows_memory_physical_free_bytes / windows_memory_physical_total_bytes > 90 + for: 10m + labels: + severity: warning + annotations: + summary: "Memory usage for host {{ $labels.instance }} is greater than 90%" +``` diff --git a/docs/collector.os.md b/docs/collector.os.md index a262cad60..34b3ab0d5 100644 --- a/docs/collector.os.md +++ b/docs/collector.os.md @@ -14,58 +14,26 @@ None ## Metrics -Name | Description | Type | Labels ------|-------------|------|------- -`windows_os_info` | Contains full product name & version in labels. 
Note that the `major_version` for Windows 11 is "10"; a build number greater than 22000 represents Windows 11. | gauge | `product`, `version`, `major_version`, `minor_version`, `build_number` -`windows_os_paging_limit_bytes` | Total number of bytes that can be stored in the operating system paging files. 0 (zero) indicates that there are no paging files | gauge | None -`windows_os_paging_free_bytes` | Number of bytes that can be mapped into the operating system paging files without causing any other pages to be swapped out | gauge | None -`windows_os_physical_memory_free_bytes` | Bytes of physical memory currently unused and available | gauge | None -`windows_os_time` | Current time as reported by the operating system, in [Unix time](https://en.wikipedia.org/wiki/Unix_time). See [time.Unix()](https://golang.org/pkg/time/#Unix) for details | gauge | None -`windows_os_timezone` | Current timezone as reported by the operating system. See [time.Zone()](https://golang.org/pkg/time/#Time.Zone) for details | gauge | `timezone` -`windows_os_processes` | Number of process contexts currently loaded or running on the operating system | gauge | None -`windows_os_processes_limit` | Maximum number of process contexts the operating system can support. The default value set by the provider is 4294967295 (0xFFFFFFFF) | gauge | None -`windows_os_process_memory_limit_bytes` | Maximum number of bytes of memory that can be allocated to a process | gauge | None -`windows_os_users` | Number of user sessions for which the operating system is storing state information currently. For a list of current active logon sessions, see [`logon`](collector.logon.md) | gauge | None -`windows_os_virtual_memory_bytes` | Bytes of virtual memory | gauge | None -`windows_os_visible_memory_bytes` | Total bytes of physical memory available to the operating system. 
This value does not necessarily indicate the true amount of physical memory, but what is reported to the operating system as available to it | gauge | None -`windows_os_virtual_memory_free_bytes` | Bytes of virtual memory currently unused and available | gauge | None +| Name | Description | Type | Labels | +|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------------| +| `windows_os_info` | Contains full product name & version in labels. Note that the `major_version` for Windows 11 is "10"; a build number greater than 22000 represents Windows 11. | gauge | `product`, `version`, `major_version`, `minor_version`, `build_number` | +| `windows_os_paging_limit_bytes` | Total number of bytes that can be stored in the operating system paging files. 0 (zero) indicates that there are no paging files | gauge | None | +| `windows_os_paging_free_bytes` | Number of bytes that can be mapped into the operating system paging files without causing any other pages to be swapped out | gauge | None | + ### Example metric -Show current number of processes -``` -windows_os_processes{instance="localhost"} -``` -## Useful queries -Find all devices not set to UTC timezone ``` -windows_os_timezone{timezone != "UTC"} +# HELP windows_os_hostname Labelled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain +# TYPE windows_os_hostname gauge +windows_os_hostname{domain="",fqdn="PC",hostname="PC"} 1 +# HELP windows_os_info Contains full product name & version in labels. Note that the "major_version" for Windows 11 is \\"10\\"; a build number greater than 22000 represents Windows 11. 
+# TYPE windows_os_info gauge +windows_os_info{build_number="19045",major_version="10",minor_version="0",product="Windows 10 Pro",revision="4842",version="10.0.19045"} 1 ``` -Show memory usage for instance (%) -``` -100 - 100 * windows_os_physical_memory_free_bytes{instance="localhost"} / windows_cs_physical_memory_bytes{instance="localhost"} -``` +## Useful queries +_This collector does not yet have useful queries, we would appreciate your help adding them!_ ## Alerting examples -**prometheus.rules** -```yaml -# Alert on hosts that have exhausted all available physical memory -- alert: MemoryExhausted - expr: windows_os_physical_memory_free_bytes == 0 - for: 10m - labels: - severity: high - annotations: - summary: "Host {{ $labels.instance }} is out of memory" - description: "{{ $labels.instance }} has exhausted all available physical memory" - -# Alert on hosts with greater than 90% memory usage -- alert: MemoryLow - expr: 100 - 100 * windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes > 90 - for: 10m - labels: - severity: warning - annotations: - summary: "Memory usage for host {{ $labels.instance }} is greater than 90%" -``` +_This collector does not yet have alerting examples, we would appreciate your help adding them!_ \ No newline at end of file diff --git a/docs/collector.system.md b/docs/collector.system.md index 61d9b88bc..1a68f1e1a 100644 --- a/docs/collector.system.md +++ b/docs/collector.system.md @@ -5,8 +5,7 @@ The system collector exposes metrics about ... ||| -|- Metric name prefix | `system` -Data source | Perflib -Classes | [`Win32_PerfRawData_PerfOS_System`](https://web.archive.org/web/20050830140516/http://msdn.microsoft.com/library/en-us/wmisdk/wmi/win32_perfrawdata_perfos_system.asp) +Data source | Performance Counters Enabled by default? 
| Yes ## Flags @@ -15,14 +14,18 @@ None ## Metrics -Name | Description | Type | Labels ------|-------------|------|------- -`windows_system_context_switches_total` | Total number of [context switches](https://en.wikipedia.org/wiki/Context_switch) | counter | None -`windows_system_exception_dispatches_total` | Total exceptions dispatched by the system | counter | None -`windows_system_processor_queue_length` | Number of threads in the processor queue. There is a single queue for processor time even on computers with multiple processors. | gauge | None -`windows_system_system_calls_total` | Total combined calls to Windows NT system service routines by all processes running on the computer | counter | None -`windows_system_system_up_time` | Time of last boot of system | gauge | None -`windows_system_threads` | Number of Windows system [threads](https://en.wikipedia.org/wiki/Thread_(computing)) | gauge | None +| Name | Description | Type | Labels | +|---------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|--------| +| `windows_system_context_switches_total` | Total number of [context switches](https://en.wikipedia.org/wiki/Context_switch) | counter | None | +| `windows_system_exception_dispatches_total` | Total exceptions dispatched by the system | counter | None | +| `windows_system_processes` | Number of process contexts currently loaded or running on the operating system | gauge | None | +| `windows_system_process_limit` | The size of the user-mode portion of the virtual address space of the calling process, in bytes. This value depends on the type of process, the type of processor, and the configuration of the operating system. | gauge | None | +| `windows_system_processor_queue_length` | Number of threads in the processor queue. 
There is a single queue for processor time even on computers with multiple processors. | gauge | None | +| `windows_system_system_calls_total` | Total combined calls to Windows NT system service routines by all processes running on the computer | counter | None | +| `windows_system_system_up_time` | Time of last boot of system | gauge | None | +| `windows_system_threads` | Number of Windows system [threads](https://en.wikipedia.org/wiki/Thread_(computing)) | gauge | None | + + ### Example metric Show current number of system threads @@ -30,6 +33,11 @@ Show current number of system threads windows_system_threads{instance="localhost"} ``` +Show current number of processes +``` +windows_system_processes{instance="localhost"} +``` + ## Useful queries Find hosts that have rebooted in the last 24 hours ``` diff --git a/docs/collector.time.md b/docs/collector.time.md index 662745a2a..214d4fa17 100644 --- a/docs/collector.time.md +++ b/docs/collector.time.md @@ -17,14 +17,16 @@ None ## Metrics -| Name | Description | Type | Labels | -|-----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|--------| -| `windows_time_clock_frequency_adjustment_ppb_total` | Total adjustment made to the local system clock frequency by W32Time in parts per billion (PPB) units. 
1 PPB adjustment implies the system clock was adjusted at a rate of 1 nanosecond per second (1 ns/s). The smallest possible adjustment can vary and is expected to be in the order of 100's of PPB. | counter | None | -| `windows_time_computed_time_offset_seconds` | The absolute time offset between the system clock and the chosen time source, as computed by the W32Time service in microseconds. When a new valid sample is available, the computed time is updated with the time offset indicated by the sample. This time is the actual time offset of the local clock. W32Time initiates clock correction by using this offset and updates the computed time in between samples with the remaining time offset that needs to be applied to the local clock. Clock accuracy can be tracked by using this performance counter with a low polling interval (for example, 256 seconds or less) and looking for the counter value to be smaller than the desired clock accuracy limit. | gauge | None | -| `windows_time_ntp_client_time_sources` | Active number of NTP Time sources being used by the client. This is a count of active, distinct IP addresses of time servers that are responding to this client's requests. | gauge | None | -| `windows_time_ntp_round_trip_delay_seconds` | Total roundtrip delay experienced by the NTP client in receiving a response from the server for the most recent request, in seconds. This is the time elapsed on the NTP client between transmitting a request to the NTP server and receiving a valid response from the server. | gauge | None | -| `windows_time_ntp_server_outgoing_responses_total` | Total number of requests responded to by the NTP server. | counter | None | -| `windows_time_ntp_server_incoming_requests_total` | Total number of requests received by the NTP server. 
| counter | None | +| Name | Description | Type | Labels | +|-----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|------------| +| `windows_time_clock_frequency_adjustment_ppb_total` | Total adjustment made to the local system clock frequency by W32Time in parts per billion (PPB) units. 1 PPB adjustment implies the system clock was adjusted at a rate of 1 nanosecond per second (1 ns/s). The smallest possible adjustment can vary and is expected to be in the order of 100's of PPB. | counter | None | +| `windows_time_computed_time_offset_seconds` | The absolute time offset between the system clock and the chosen time source, as computed by the W32Time service in microseconds. When a new valid sample is available, the computed time is updated with the time offset indicated by the sample. This time is the actual time offset of the local clock. W32Time initiates clock correction by using this offset and updates the computed time in between samples with the remaining time offset that needs to be applied to the local clock. Clock accuracy can be tracked by using this performance counter with a low polling interval (for example, 256 seconds or less) and looking for the counter value to be smaller than the desired clock accuracy limit. 
| gauge | None | +| `windows_time_ntp_client_time_sources` | Active number of NTP Time sources being used by the client. This is a count of active, distinct IP addresses of time servers that are responding to this client's requests. | gauge | None | +| `windows_time_ntp_round_trip_delay_seconds` | Total roundtrip delay experienced by the NTP client in receiving a response from the server for the most recent request, in seconds. This is the time elapsed on the NTP client between transmitting a request to the NTP server and receiving a valid response from the server. | gauge | None | +| `windows_time_ntp_server_outgoing_responses_total` | Total number of requests responded to by the NTP server. | counter | None | +| `windows_time_ntp_server_incoming_requests_total` | Total number of requests received by the NTP server. | counter | None | +| `windows_time_current_timestamp_seconds` | Current time as reported by the operating system, in [Unix time](https://en.wikipedia.org/wiki/Unix_time). See [time.Unix()](https://golang.org/pkg/time/#Unix) for details | gauge | None | +| `windows_time_timezone` | Current timezone as reported by the operating system. 
| gauge | `timezone` | ### Example metric _This collector does not yet have explained examples, we would appreciate your help adding them!_ diff --git a/exporter.go b/exporter.go index 493ad40c5..9ba390368 100644 --- a/exporter.go +++ b/exporter.go @@ -223,15 +223,15 @@ func main() { _ = level.Info(logger).Log("msg", fmt.Sprintf("Enabled collectors: %v", strings.Join(enabledCollectorList, ", "))) mux := http.NewServeMux() - mux.HandleFunc(*metricsPath, withConcurrencyLimit(*maxRequests, collectors.BuildServeHTTP(logger, *disableExporterMetrics, *timeoutMargin))) - mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) { + mux.HandleFunc("GET "+*metricsPath, withConcurrencyLimit(*maxRequests, collectors.BuildServeHTTP(logger, *disableExporterMetrics, *timeoutMargin))) + mux.HandleFunc("GET /health", func(w http.ResponseWriter, _ *http.Request) { w.Header().Set("Content-Type", "application/json") _, err := fmt.Fprintln(w, `{"status":"ok"}`) if err != nil { _ = level.Debug(logger).Log("msg", "Failed to write to stream", "err", err) } }) - mux.HandleFunc("/version", func(w http.ResponseWriter, _ *http.Request) { + mux.HandleFunc("GET /version", func(w http.ResponseWriter, _ *http.Request) { // we can't use "version" directly as it is a package, and not an object that // can be serialized. 
err := json.NewEncoder(w).Encode(prometheusVersion{ @@ -248,11 +248,11 @@ func main() { }) if *debugEnabled { - mux.HandleFunc("/debug/pprof/", pprof.Index) - mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline) - mux.HandleFunc("/debug/pprof/profile", pprof.Profile) - mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol) - mux.HandleFunc("/debug/pprof/trace", pprof.Trace) + mux.HandleFunc("GET /debug/pprof/", pprof.Index) + mux.HandleFunc("GET /debug/pprof/cmdline", pprof.Cmdline) + mux.HandleFunc("GET /debug/pprof/profile", pprof.Profile) + mux.HandleFunc("GET /debug/pprof/symbol", pprof.Symbol) + mux.HandleFunc("GET /debug/pprof/trace", pprof.Trace) } _ = level.Info(logger).Log("msg", "Starting windows_exporter", "version", version.Info()) diff --git a/pkg/collector/cpu/cpu.go b/pkg/collector/cpu/cpu.go index 111589028..f4870bdf7 100644 --- a/pkg/collector/cpu/cpu.go +++ b/pkg/collector/cpu/cpu.go @@ -9,7 +9,6 @@ import ( "github.com/go-kit/log" "github.com/prometheus-community/windows_exporter/pkg/perflib" "github.com/prometheus-community/windows_exporter/pkg/types" - "github.com/prometheus-community/windows_exporter/pkg/winversion" "github.com/prometheus/client_golang/prometheus" "github.com/yusufpapurcu/wmi" ) @@ -23,6 +22,7 @@ var ConfigDefaults = Config{} type Collector struct { config Config + logicalProcessors *prometheus.Desc cStateSecondsTotal *prometheus.Desc timeTotal *prometheus.Desc interruptsTotal *prometheus.Desc @@ -59,10 +59,7 @@ func (c *Collector) GetName() string { } func (c *Collector) GetPerfCounter(_ log.Logger) ([]string, error) { - if winversion.WindowsVersionFloat() > 6.05 { - return []string{"Processor Information"}, nil - } - return []string{"Processor"}, nil + return []string{"Processor Information"}, nil } func (c *Collector) Close() error { @@ -70,6 +67,13 @@ func (c *Collector) Close() error { } func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { + c.logicalProcessors = prometheus.NewDesc( + 
prometheus.BuildFQName(types.Namespace, Name, "logical_processor"), + "Total number of logical processors", + nil, + nil, + ) + c.cStateSecondsTotal = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "cstate_seconds_total"), "Time spent in low-power idle state", @@ -95,16 +99,6 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { nil, ) - // For Windows 2008 (version 6.0) or earlier we only have the "Processor" - // class. As of Windows 2008 R2 (version 6.1) the more detailed - // "Processor Information" set is available (although some of the counters - // are added in later versions, so we aren't guaranteed to get all of - // them). - // Value 6.05 was selected to split between Windows versions. - if winversion.WindowsVersionFloat() < 6.05 { - return nil - } - c.cStateSecondsTotal = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "cstate_seconds_total"), "Time spent in low-power idle state", @@ -189,111 +183,8 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { func (c *Collector) Collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { logger = log.With(logger, "collector", Name) - if winversion.WindowsVersionFloat() > 6.05 { - return c.CollectFull(ctx, logger, ch) - } - - return c.CollectBasic(ctx, logger, ch) -} - -type perflibProcessor struct { - Name string - C1Transitions float64 `perflib:"C1 Transitions/sec"` - C2Transitions float64 `perflib:"C2 Transitions/sec"` - C3Transitions float64 `perflib:"C3 Transitions/sec"` - DPCRate float64 `perflib:"DPC Rate"` - DPCsQueued float64 `perflib:"DPCs Queued/sec"` - Interrupts float64 `perflib:"Interrupts/sec"` - PercentC1Time float64 `perflib:"% C1 Time"` - PercentC2Time float64 `perflib:"% C2 Time"` - PercentC3Time float64 `perflib:"% C3 Time"` - PercentDPCTime float64 `perflib:"% DPC Time"` - PercentIdleTime float64 `perflib:"% Idle Time"` - PercentInterruptTime float64 `perflib:"% Interrupt Time"` - PercentPrivilegedTime 
float64 `perflib:"% Privileged Time"` - PercentProcessorTime float64 `perflib:"% Processor Time"` - PercentUserTime float64 `perflib:"% User Time"` -} - -func (c *Collector) CollectBasic(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { - logger = log.With(logger, "collector", Name) - data := make([]perflibProcessor, 0) - err := perflib.UnmarshalObject(ctx.PerfObjects["Processor"], &data, logger) - if err != nil { - return err - } - - for _, cpu := range data { - if strings.Contains(strings.ToLower(cpu.Name), "_total") { - continue - } - core := cpu.Name - - ch <- prometheus.MustNewConstMetric( - c.cStateSecondsTotal, - prometheus.CounterValue, - cpu.PercentC1Time, - core, "c1", - ) - ch <- prometheus.MustNewConstMetric( - c.cStateSecondsTotal, - prometheus.CounterValue, - cpu.PercentC2Time, - core, "c2", - ) - ch <- prometheus.MustNewConstMetric( - c.cStateSecondsTotal, - prometheus.CounterValue, - cpu.PercentC3Time, - core, "c3", - ) - - ch <- prometheus.MustNewConstMetric( - c.timeTotal, - prometheus.CounterValue, - cpu.PercentIdleTime, - core, "idle", - ) - ch <- prometheus.MustNewConstMetric( - c.timeTotal, - prometheus.CounterValue, - cpu.PercentInterruptTime, - core, "interrupt", - ) - ch <- prometheus.MustNewConstMetric( - c.timeTotal, - prometheus.CounterValue, - cpu.PercentDPCTime, - core, "dpc", - ) - ch <- prometheus.MustNewConstMetric( - c.timeTotal, - prometheus.CounterValue, - cpu.PercentPrivilegedTime, - core, "privileged", - ) - ch <- prometheus.MustNewConstMetric( - c.timeTotal, - prometheus.CounterValue, - cpu.PercentUserTime, - core, "user", - ) - - ch <- prometheus.MustNewConstMetric( - c.interruptsTotal, - prometheus.CounterValue, - cpu.Interrupts, - core, - ) - ch <- prometheus.MustNewConstMetric( - c.dpcsTotal, - prometheus.CounterValue, - cpu.DPCsQueued, - core, - ) - } - return nil + return c.CollectFull(ctx, logger, ch) } type perflibProcessorInformation struct { @@ -333,12 +224,16 @@ func (c *Collector) 
CollectFull(ctx *types.ScrapeContext, logger log.Logger, ch return err } + var coreCount float64 + for _, cpu := range data { if strings.Contains(strings.ToLower(cpu.Name), "_total") { continue } core := cpu.Name + coreCount++ + ch <- prometheus.MustNewConstMetric( c.cStateSecondsTotal, prometheus.CounterValue, @@ -459,5 +354,11 @@ func (c *Collector) CollectFull(ctx *types.ScrapeContext, logger log.Logger, ch ) } + ch <- prometheus.MustNewConstMetric( + c.logicalProcessors, + prometheus.GaugeValue, + coreCount, + ) + return nil } diff --git a/pkg/collector/cs/cs.go b/pkg/collector/cs/cs.go index 8e14cce80..2b654fd5f 100644 --- a/pkg/collector/cs/cs.go +++ b/pkg/collector/cs/cs.go @@ -22,9 +22,15 @@ var ConfigDefaults = Config{} type Collector struct { config Config + // physicalMemoryBytes + // Deprecated: Use windows_memory_physical_total_bytes instead physicalMemoryBytes *prometheus.Desc - logicalProcessors *prometheus.Desc - hostname *prometheus.Desc + // logicalProcessors + // Deprecated: Use windows_cpu_logical_processor instead + logicalProcessors *prometheus.Desc + // hostname + // Deprecated: Use windows_os_hostname instead + hostname *prometheus.Desc } func New(config *Config) *Collector { @@ -55,22 +61,28 @@ func (c *Collector) Close() error { return nil } -func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { +func (c *Collector) Build(logger log.Logger, _ *wmi.Client) error { + _ = level.Warn(logger). + Log("msg", "The cs collector is deprecated and will be removed in a future release. "+ + "Logical processors have been moved to cpu collector. "+ + "Physical memory has been moved to memory collector. 
"+ + "Hostname has been moved to os collector.") + c.logicalProcessors = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "logical_processors"), - "ComputerSystem.NumberOfLogicalProcessors", + "Deprecated: Use windows_cpu_logical_processor instead", nil, nil, ) c.physicalMemoryBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "physical_memory_bytes"), - "ComputerSystem.TotalPhysicalMemory", + "Deprecated: Use windows_memory_physical_total_bytes instead", nil, nil, ) c.hostname = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "hostname"), - "Labelled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain", + "Deprecated: Use windows_os_hostname instead", []string{ "hostname", "domain", diff --git a/pkg/collector/memory/memory.go b/pkg/collector/memory/memory.go index 96157f536..932421f26 100644 --- a/pkg/collector/memory/memory.go +++ b/pkg/collector/memory/memory.go @@ -6,9 +6,13 @@ package memory import ( + "errors" + "fmt" + "github.com/alecthomas/kingpin/v2" "github.com/go-kit/log" "github.com/go-kit/log/level" + "github.com/prometheus-community/windows_exporter/pkg/headers/sysinfoapi" "github.com/prometheus-community/windows_exporter/pkg/perflib" "github.com/prometheus-community/windows_exporter/pkg/types" "github.com/prometheus/client_golang/prometheus" @@ -25,6 +29,7 @@ var ConfigDefaults = Config{} type Collector struct { config Config + // Performance metrics availableBytes *prometheus.Desc cacheBytes *prometheus.Desc cacheBytesPeak *prometheus.Desc @@ -57,6 +62,11 @@ type Collector struct { transitionFaultsTotal *prometheus.Desc transitionPagesRepurposedTotal *prometheus.Desc writeCopiesTotal *prometheus.Desc + + // Global memory status + processMemoryLimitBytes *prometheus.Desc + physicalMemoryTotalBytes *prometheus.Desc + physicalMemoryFreeBytes *prometheus.Desc } func New(config *Config) *Collector { @@ -292,6 +302,25 @@ func (c *Collector) Build(_ 
log.Logger, _ *wmi.Client) error { nil, nil, ) + c.processMemoryLimitBytes = prometheus.NewDesc( + prometheus.BuildFQName(types.Namespace, Name, "process_memory_limit_bytes"), + "The size of the user-mode portion of the virtual address space of the calling process, in bytes. This value depends on the type of process, the type of processor, and the configuration of the operating system.", + nil, + nil, + ) + c.physicalMemoryTotalBytes = prometheus.NewDesc( + prometheus.BuildFQName(types.Namespace, Name, "physical_total_bytes"), + "The amount of actual physical memory, in bytes.", + nil, + nil, + ) + c.physicalMemoryFreeBytes = prometheus.NewDesc( + prometheus.BuildFQName(types.Namespace, Name, "physical_free_bytes"), + "The amount of physical memory currently available, in bytes. This is the amount of physical memory that can be immediately reused without having to write its contents to disk first. It is the sum of the size of the standby, free, and zero lists.", + nil, + nil, + ) + return nil } @@ -299,10 +328,46 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { // to the provided prometheus Metric channel. func (c *Collector) Collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { logger = log.With(logger, "collector", Name) - if err := c.collect(ctx, logger, ch); err != nil { + + errs := make([]error, 0, 2) + + if err := c.collectPerformanceData(ctx, logger, ch); err != nil { _ = level.Error(logger).Log("msg", "failed collecting memory metrics", "err", err) - return err + errs = append(errs, err) + } + + if err := c.collectGlobalMemoryStatus(ch); err != nil { + _ = level.Error(logger).Log("msg", "failed collecting memory metrics", "err", err) + errs = append(errs, err) } + + return errors.Join(errs...) 
+} + +func (c *Collector) collectGlobalMemoryStatus(ch chan<- prometheus.Metric) error { + memoryStatusEx, err := sysinfoapi.GlobalMemoryStatusEx() + if err != nil { + return fmt.Errorf("failed to get memory status: %w", err) + } + + ch <- prometheus.MustNewConstMetric( + c.processMemoryLimitBytes, + prometheus.GaugeValue, + float64(memoryStatusEx.TotalVirtual), + ) + + ch <- prometheus.MustNewConstMetric( + c.physicalMemoryTotalBytes, + prometheus.GaugeValue, + float64(memoryStatusEx.TotalPhys), + ) + + ch <- prometheus.MustNewConstMetric( + c.physicalMemoryFreeBytes, + prometheus.GaugeValue, + float64(memoryStatusEx.AvailPhys), + ) + return nil } @@ -343,7 +408,7 @@ type memory struct { WriteCopiesPersec float64 `perflib:"Write Copies/sec"` } -func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { +func (c *Collector) collectPerformanceData(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { logger = log.With(logger, "collector", Name) var dst []memory if err := perflib.UnmarshalObject(ctx.PerfObjects["Memory"], &dst, logger); err != nil { diff --git a/pkg/collector/os/os.go b/pkg/collector/os/os.go index 531010c7d..26c5e2f5d 100644 --- a/pkg/collector/os/os.go +++ b/pkg/collector/os/os.go @@ -35,19 +35,45 @@ var ConfigDefaults = Config{} type Collector struct { config Config - osInformation *prometheus.Desc - pagingFreeBytes *prometheus.Desc - pagingLimitBytes *prometheus.Desc + hostname *prometheus.Desc + osInformation *prometheus.Desc + pagingFreeBytes *prometheus.Desc + pagingLimitBytes *prometheus.Desc + + // processes + // Deprecated: Use windows_system_processes instead. + processes *prometheus.Desc + // processesLimit + // Deprecated: Use windows_system_process_limit instead. + processesLimit *prometheus.Desc + + // users + // Deprecated: Use count(windows_logon_logon_type) instead. 
+ users *prometheus.Desc + + // physicalMemoryFreeBytes + // Deprecated: Use windows_memory_physical_free_bytes instead. physicalMemoryFreeBytes *prometheus.Desc + + // processMemoryLimitBytes + // Deprecated: Use windows_memory_process_memory_limit_bytes instead. processMemoryLimitBytes *prometheus.Desc - processes *prometheus.Desc - processesLimit *prometheus.Desc - time *prometheus.Desc - timezone *prometheus.Desc - users *prometheus.Desc - virtualMemoryBytes *prometheus.Desc - virtualMemoryFreeBytes *prometheus.Desc - visibleMemoryBytes *prometheus.Desc + + // time + // Deprecated: Use windows_time_current_timestamp_seconds instead. + time *prometheus.Desc + // timezone + // Deprecated: Use windows_time_timezone instead. + timezone *prometheus.Desc + // virtualMemoryBytes + // Deprecated: Use windows_memory_commit_limit instead. + virtualMemoryBytes *prometheus.Desc + // virtualMemoryFreeBytes + // Deprecated: Use windows_memory_commit_limit - windows_memory_committed_bytes instead. + virtualMemoryFreeBytes *prometheus.Desc + // visibleMemoryBytes + // Deprecated: Use windows_memory_physical_total_bytes instead. + visibleMemoryBytes *prometheus.Desc } type pagingFileCounter struct { @@ -84,11 +110,43 @@ func (c *Collector) Close() error { return nil } -func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { +func (c *Collector) Build(logger log.Logger, _ *wmi.Client) error { + _ = level.Warn(logger). + Log("msg", "The os collector holds a number of deprecated metrics and will be removed mid 2025. 
"+ + "See https://github.com/prometheus-community/windows_exporter/pull/1596 for more information.") + + workstationInfo, err := netapi32.GetWorkstationInfo() + if err != nil { + return fmt.Errorf("failed to get workstation info: %w", err) + } + + productName, buildNumber, revision, err := c.getWindowsVersion() + if err != nil { + return fmt.Errorf("failed to get Windows version: %w", err) + } + c.osInformation = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "info"), - "OperatingSystem.Caption, OperatingSystem.Version", - []string{"product", "version", "major_version", "minor_version", "build_number", "revision"}, + `Contains full product name & version in labels. Note that the "major_version" for Windows 11 is \"10\"; a build number greater than 22000 represents Windows 11.`, + nil, + prometheus.Labels{ + "product": productName, + "version": fmt.Sprintf("%d.%d.%s", workstationInfo.VersionMajor, workstationInfo.VersionMinor, buildNumber), + "major_version": strconv.FormatUint(uint64(workstationInfo.VersionMajor), 10), + "minor_version": strconv.FormatUint(uint64(workstationInfo.VersionMinor), 10), + "build_number": buildNumber, + "revision": revision, + }, + ) + + c.hostname = prometheus.NewDesc( + prometheus.BuildFQName(types.Namespace, Name, "hostname"), + "Labelled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain", + []string{ + "hostname", + "domain", + "fqdn", + }, nil, ) c.pagingLimitBytes = prometheus.NewDesc( @@ -105,61 +163,61 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { ) c.physicalMemoryFreeBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "physical_memory_free_bytes"), - "OperatingSystem.FreePhysicalMemory", + "Deprecated: Use `windows_memory_physical_free_bytes` instead.", nil, nil, ) c.time = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "time"), - "OperatingSystem.LocalDateTime", + "Deprecated: Use 
windows_time_current_timestamp_seconds instead.", nil, nil, ) c.timezone = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "timezone"), - "OperatingSystem.LocalDateTime", + "Deprecated: Use windows_time_timezone instead.", []string{"timezone"}, nil, ) c.processes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "processes"), - "OperatingSystem.NumberOfProcesses", + "Deprecated: Use `windows_system_processes` instead.", nil, nil, ) c.processesLimit = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "processes_limit"), - "OperatingSystem.MaxNumberOfProcesses", + "Deprecated: Use `windows_system_process_limit` instead.", nil, nil, ) c.processMemoryLimitBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "process_memory_limit_bytes"), - "OperatingSystem.MaxProcessMemorySize", + "Deprecated: Use `windows_memory_process_memory_limit_bytes` instead.", nil, nil, ) c.users = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "users"), - "OperatingSystem.NumberOfUsers", + "Deprecated: Use `count(windows_logon_logon_type)` instead.", nil, nil, ) c.virtualMemoryBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "virtual_memory_bytes"), - "OperatingSystem.TotalVirtualMemorySize", + "Deprecated: Use `windows_memory_commit_limit` instead.", nil, nil, ) c.visibleMemoryBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "visible_memory_bytes"), - "OperatingSystem.TotalVisibleMemorySize", + "Deprecated: Use `windows_memory_physical_total_bytes` instead.", nil, nil, ) c.virtualMemoryFreeBytes = prometheus.NewDesc( prometheus.BuildFQName(types.Namespace, Name, "virtual_memory_free_bytes"), - "OperatingSystem.FreeVirtualMemory", + "Deprecated: Use `windows_memory_commit_limit - windows_memory_committed_bytes` instead.", nil, nil, ) @@ -170,45 +228,81 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error { // to the provided prometheus Metric 
channel. func (c *Collector) Collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error { logger = log.With(logger, "collector", Name) - if err := c.collect(ctx, logger, ch); err != nil { + + errs := make([]error, 0, 5) + + c.collect(ch) + + if err := c.collectHostname(ch); err != nil { _ = level.Error(logger).Log("msg", "failed collecting os metrics", "err", err) - return err + errs = append(errs, err) + } + + if err := c.collectLoggedInUserCount(ch); err != nil { + _ = level.Error(logger).Log("msg", "failed collecting os user count metrics", "err", err) + errs = append(errs, err) + } + + if err := c.collectMemory(ch); err != nil { + _ = level.Error(logger).Log("msg", "failed collecting os memory metrics", "err", err) + errs = append(errs, err) } - return nil -} -// Win32_OperatingSystem docs: -// - https://msdn.microsoft.com/en-us/library/aa394239 - Win32_OperatingSystem class. -type Win32_OperatingSystem struct { - Caption string - FreePhysicalMemory uint64 - FreeSpaceInPagingFiles uint64 - FreeVirtualMemory uint64 - LocalDateTime time.Time - MaxNumberOfProcesses uint32 - MaxProcessMemorySize uint64 - NumberOfProcesses uint32 - NumberOfUsers uint32 - SizeStoredInPagingFiles uint64 - TotalVirtualMemorySize uint64 - TotalVisibleMemorySize uint64 - Version string + if err := c.collectTime(ch); err != nil { + _ = level.Error(logger).Log("msg", "failed collecting os time metrics", "err", err) + errs = append(errs, err) + } + + if err := c.collectPaging(ctx, logger, ch); err != nil { + _ = level.Error(logger).Log("msg", "failed collecting os paging metrics", "err", err) + errs = append(errs, err) + } + + return errors.Join(errs...) 
 }
 
-func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error {
-	logger = log.With(logger, "collector", Name)
-	nwgi, err := netapi32.GetWorkstationInfo()
+func (c *Collector) collectLoggedInUserCount(ch chan<- prometheus.Metric) error {
+	workstationInfo, err := netapi32.GetWorkstationInfo()
 	if err != nil {
 		return err
 	}
 
-	gmse, err := sysinfoapi.GlobalMemoryStatusEx()
+	ch <- prometheus.MustNewConstMetric(
+		c.users,
+		prometheus.GaugeValue,
+		float64(workstationInfo.LoggedOnUsers),
+	)
+
+	return nil
+}
+
+func (c *Collector) collectHostname(ch chan<- prometheus.Metric) error {
+	hostname, err := sysinfoapi.GetComputerName(sysinfoapi.ComputerNameDNSHostname)
+	if err != nil {
+		return err
+	}
+	domain, err := sysinfoapi.GetComputerName(sysinfoapi.ComputerNameDNSDomain)
+	if err != nil {
+		return err
+	}
+	fqdn, err := sysinfoapi.GetComputerName(sysinfoapi.ComputerNameDNSFullyQualified)
 	if err != nil {
 		return err
 	}
 
-	currentTime := time.Now()
+	ch <- prometheus.MustNewConstMetric(
+		c.hostname,
+		prometheus.GaugeValue,
+		1.0,
+		hostname,
+		domain,
+		fqdn,
+	)
+
+	return nil
+}
 
+func (c *Collector) collectTime(ch chan<- prometheus.Metric) error {
 	timeZoneInfo, err := kernel32.GetDynamicTimeZoneInformation()
 	if err != nil {
 		return err
@@ -217,6 +311,62 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 	// timeZoneKeyName contains the english name of the timezone.
 	timezoneName := syscall.UTF16ToString(timeZoneInfo.TimeZoneKeyName[:])
 
+	ch <- prometheus.MustNewConstMetric(
+		c.time,
+		prometheus.GaugeValue,
+		float64(time.Now().Unix()),
+	)
+
+	ch <- prometheus.MustNewConstMetric(
+		c.timezone,
+		prometheus.GaugeValue,
+		1.0,
+		timezoneName,
+	)
+
+	return nil
+}
+
+func (c *Collector) collectMemory(ch chan<- prometheus.Metric) error {
+	memoryStatusEx, err := sysinfoapi.GlobalMemoryStatusEx()
+	if err != nil {
+		return err
+	}
+
+	ch <- prometheus.MustNewConstMetric(
+		c.physicalMemoryFreeBytes,
+		prometheus.GaugeValue,
+		float64(memoryStatusEx.AvailPhys),
+	)
+
+	ch <- prometheus.MustNewConstMetric(
+		c.virtualMemoryFreeBytes,
+		prometheus.GaugeValue,
+		float64(memoryStatusEx.AvailPageFile),
+	)
+
+	ch <- prometheus.MustNewConstMetric(
+		c.virtualMemoryBytes,
+		prometheus.GaugeValue,
+		float64(memoryStatusEx.TotalPageFile),
+	)
+
+	ch <- prometheus.MustNewConstMetric(
+		c.visibleMemoryBytes,
+		prometheus.GaugeValue,
+		float64(memoryStatusEx.TotalPhys),
+	)
+
+	ch <- prometheus.MustNewConstMetric(
+		c.processMemoryLimitBytes,
+		prometheus.GaugeValue,
+		float64(memoryStatusEx.TotalVirtual),
+	)
+
+	return nil
+}
+
+func (c *Collector) collectPaging(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error {
 	// Get total allocation of paging files across all disks.
 	memManKey, err := registry.OpenKey(registry.LOCAL_MACHINE, `SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management`, registry.QUERY_VALUE)
 	if err != nil {
@@ -239,38 +389,13 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		}
 	}
 
-	// Get build number and product name from registry
-	ntKey, err := registry.OpenKey(registry.LOCAL_MACHINE, `SOFTWARE\Microsoft\Windows NT\CurrentVersion`, registry.QUERY_VALUE)
-	if err != nil {
-		return err
-	}
-
-	defer ntKey.Close()
-
-	pn, _, err := ntKey.GetStringValue("ProductName")
-	if err != nil {
-		return err
-	}
-
-	bn, _, err := ntKey.GetStringValue("CurrentBuildNumber")
-	if err != nil {
-		return err
-	}
-
-	revision, _, err := ntKey.GetIntegerValue("UBR")
-	if errors.Is(err, registry.ErrNotExist) {
-		revision = 0
-	} else if err != nil {
-		return err
-	}
-
 	gpi, err := psapi.GetPerformanceInfo()
 	if err != nil {
 		return err
 	}
 
 	pfc := make([]pagingFileCounter, 0)
-	if err := perflib.UnmarshalObject(ctx.PerfObjects["Paging File"], &pfc, logger); err != nil {
+	if err = perflib.UnmarshalObject(ctx.PerfObjects["Paging File"], &pfc, logger); err != nil {
 		return err
 	}
 
@@ -283,41 +408,10 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		pfbRaw += pageFile.Usage
 	}
 
-	// Subtract from total page file allocation on disk.
-	pfb := fsipf - (pfbRaw * float64(gpi.PageSize))
-
-	ch <- prometheus.MustNewConstMetric(
-		c.osInformation,
-		prometheus.GaugeValue,
-		1.0,
-		"Microsoft "+pn, // Caption
-		fmt.Sprintf("%d.%d.%s", nwgi.VersionMajor, nwgi.VersionMinor, bn), // Version
-		strconv.FormatUint(uint64(nwgi.VersionMajor), 10), // Major Version
-		strconv.FormatUint(uint64(nwgi.VersionMinor), 10), // Minor Version
-		bn, // Build number
-		strconv.FormatUint(revision, 10), // Revision
-	)
-
-	ch <- prometheus.MustNewConstMetric(
-		c.physicalMemoryFreeBytes,
-		prometheus.GaugeValue,
-		float64(gmse.AvailPhys),
-	)
-
-	ch <- prometheus.MustNewConstMetric(
-		c.time,
-		prometheus.GaugeValue,
-		float64(currentTime.Unix()),
-	)
-
-	ch <- prometheus.MustNewConstMetric(
-		c.timezone,
-		prometheus.GaugeValue,
-		1.0,
-		timezoneName,
-	)
-
 	if pagingErr == nil {
+		// Subtract from total page file allocation on disk.
+		pfb := fsipf - (pfbRaw * float64(gpi.PageSize))
+
 		ch <- prometheus.MustNewConstMetric(
 			c.pagingFreeBytes,
 			prometheus.GaugeValue,
@@ -332,10 +426,21 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 	} else {
 		_ = level.Debug(logger).Log("msg", "Could not find HKLM:\\SYSTEM\\CurrentControlSet\\Control\\Session Manager\\Memory Management key. windows_os_paging_free_bytes and windows_os_paging_limit_bytes will be omitted.")
 	}
+
 	ch <- prometheus.MustNewConstMetric(
-		c.virtualMemoryFreeBytes,
+		c.processes,
+		prometheus.GaugeValue,
+		float64(gpi.ProcessCount),
+	)
+
+	return nil
+}
+
+func (c *Collector) collect(ch chan<- prometheus.Metric) {
+	ch <- prometheus.MustNewConstMetric(
+		c.osInformation,
 		prometheus.GaugeValue,
-		float64(gmse.AvailPageFile),
+		1.0,
 	)
 
 	// Windows has no defined limit, and is based off available resources. This currently isn't calculated by WMI and is set to default value.
@@ -346,36 +451,33 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		prometheus.GaugeValue,
 		float64(4294967295),
 	)
+}
 
-	ch <- prometheus.MustNewConstMetric(
-		c.processMemoryLimitBytes,
-		prometheus.GaugeValue,
-		float64(gmse.TotalVirtual),
-	)
+func (c *Collector) getWindowsVersion() (string, string, string, error) {
+	// Get build number and product name from registry
+	ntKey, err := registry.OpenKey(registry.LOCAL_MACHINE, `SOFTWARE\Microsoft\Windows NT\CurrentVersion`, registry.QUERY_VALUE)
+	if err != nil {
+		return "", "", "", fmt.Errorf("failed to open registry key: %w", err)
+	}
 
-	ch <- prometheus.MustNewConstMetric(
-		c.processes,
-		prometheus.GaugeValue,
-		float64(gpi.ProcessCount),
-	)
+	defer ntKey.Close()
 
-	ch <- prometheus.MustNewConstMetric(
-		c.users,
-		prometheus.GaugeValue,
-		float64(nwgi.LoggedOnUsers),
-	)
+	productName, _, err := ntKey.GetStringValue("ProductName")
+	if err != nil {
+		return "", "", "", err
+	}
 
-	ch <- prometheus.MustNewConstMetric(
-		c.virtualMemoryBytes,
-		prometheus.GaugeValue,
-		float64(gmse.TotalPageFile),
-	)
+	buildNumber, _, err := ntKey.GetStringValue("CurrentBuildNumber")
+	if err != nil {
+		return "", "", "", err
+	}
 
-	ch <- prometheus.MustNewConstMetric(
-		c.visibleMemoryBytes,
-		prometheus.GaugeValue,
-		float64(gmse.TotalPhys),
-	)
+	revision, _, err := ntKey.GetIntegerValue("UBR")
+	if errors.Is(err, registry.ErrNotExist) {
+		revision = 0
+	} else if err != nil {
+		return "", "", "", err
+	}
 
-	return nil
+	return productName, buildNumber, strconv.FormatUint(revision, 10), nil
 }
diff --git a/pkg/collector/system/system.go b/pkg/collector/system/system.go
index 15620d53a..5c06abc98 100644
--- a/pkg/collector/system/system.go
+++ b/pkg/collector/system/system.go
@@ -3,6 +3,8 @@
 package system
 
 import (
+	"errors"
+
 	"github.com/alecthomas/kingpin/v2"
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
@@ -25,6 +27,8 @@ type Collector struct {
 	contextSwitchesTotal *prometheus.Desc
 	exceptionDispatchesTotal *prometheus.Desc
 	processorQueueLength *prometheus.Desc
+	processes *prometheus.Desc
+	processesLimit *prometheus.Desc
 	systemCallsTotal *prometheus.Desc
 	systemUpTime *prometheus.Desc
 	threads *prometheus.Desc
@@ -71,6 +75,19 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error {
 		nil,
 		nil,
 	)
+	c.processes = prometheus.NewDesc(
+		prometheus.BuildFQName(types.Namespace, Name, "processes"),
+		"Current number of processes (WMI source: PerfOS_System.Processes)",
+		nil,
+		nil,
+	)
+	c.processesLimit = prometheus.NewDesc(
+		prometheus.BuildFQName(types.Namespace, Name, "processes_limit"),
+		"Maximum number of processes.",
+		nil,
+		nil,
+	)
+
 	c.processorQueueLength = prometheus.NewDesc(
 		prometheus.BuildFQName(types.Namespace, Name, "processor_queue_length"),
 		"Length of processor queue (WMI source: PerfOS_System.ProcessorQueueLength)",
@@ -117,6 +134,7 @@ type system struct {
 	ProcessorQueueLength float64 `perflib:"Processor Queue Length"`
 	SystemCallsPersec float64 `perflib:"System Calls/sec"`
 	SystemUpTime float64 `perflib:"System Up Time"`
+	Processes float64 `perflib:"Processes"`
 	Threads float64 `perflib:"Threads"`
 }
 
@@ -127,6 +145,10 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		return err
 	}
 
+	if len(dst) == 0 {
+		return errors.New("no data returned from Performance Counter")
+	}
+
 	ch <- prometheus.MustNewConstMetric(
 		c.contextSwitchesTotal,
 		prometheus.CounterValue,
@@ -142,6 +164,11 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		prometheus.GaugeValue,
 		dst[0].ProcessorQueueLength,
 	)
+	ch <- prometheus.MustNewConstMetric(
+		c.processes,
+		prometheus.GaugeValue,
+		dst[0].Processes,
+	)
 	ch <- prometheus.MustNewConstMetric(
 		c.systemCallsTotal,
 		prometheus.CounterValue,
@@ -157,5 +184,15 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		prometheus.GaugeValue,
 		dst[0].Threads,
 	)
+
+	// Windows has no defined limit, and is based off available resources. This currently isn't calculated by WMI and is set to default value.
+	// https://techcommunity.microsoft.com/t5/windows-blog-archive/pushing-the-limits-of-windows-processes-and-threads/ba-p/723824
+	// https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/win32-operatingsystem
+	ch <- prometheus.MustNewConstMetric(
+		c.processesLimit,
+		prometheus.GaugeValue,
+		float64(4294967295),
+	)
+
 	return nil
 }
diff --git a/pkg/collector/textfile/textfile_test_test.go b/pkg/collector/textfile/textfile_test_test.go
index 246aa56d3..b5de5c3a2 100644
--- a/pkg/collector/textfile/textfile_test_test.go
+++ b/pkg/collector/textfile/textfile_test_test.go
@@ -16,9 +16,8 @@ import (
 
 var baseDir = "../../../tools/textfile-test"
 
+//nolint:paralleltest
 func TestMultipleDirectories(t *testing.T) {
-	t.Parallel()
-
 	logger := log.NewLogfmtLogger(os.Stdout)
 	testDir := baseDir + "/multiple-dirs"
 	testDirs := fmt.Sprintf("%[1]s/dir1,%[1]s/dir2,%[1]s/dir3", testDir)
@@ -60,9 +59,8 @@ func TestMultipleDirectories(t *testing.T) {
 	}
 }
 
+//nolint:paralleltest
 func TestDuplicateFileName(t *testing.T) {
-	t.Parallel()
-
 	logger := log.NewLogfmtLogger(os.Stdout)
 	testDir := baseDir + "/duplicate-filename"
 	textFileCollector := textfile.New(&textfile.Config{
diff --git a/pkg/collector/time/time.go b/pkg/collector/time/time.go
index 6a8cce7fc..a15ab378e 100644
--- a/pkg/collector/time/time.go
+++ b/pkg/collector/time/time.go
@@ -4,10 +4,13 @@ package time
 
 import (
 	"errors"
+	"syscall"
+	"time"
 
 	"github.com/alecthomas/kingpin/v2"
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
+	"github.com/prometheus-community/windows_exporter/pkg/headers/kernel32"
 	"github.com/prometheus-community/windows_exporter/pkg/perflib"
 	"github.com/prometheus-community/windows_exporter/pkg/types"
 	"github.com/prometheus-community/windows_exporter/pkg/winversion"
@@ -25,6 +28,8 @@ var ConfigDefaults = Config{}
 type Collector struct {
 	config Config
 
+	currentTime *prometheus.Desc
+	timezone *prometheus.Desc
 	clockFrequencyAdjustmentPPBTotal *prometheus.Desc
 	computedTimeOffset *prometheus.Desc
 	ntpClientTimeSourceCount *prometheus.Desc
@@ -66,6 +71,18 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error {
 		return errors.New("windows version older than Server 2016 detected. The time collector will not run and should be disabled via CLI flags or configuration file")
 	}
 
+	c.currentTime = prometheus.NewDesc(
+		prometheus.BuildFQName(types.Namespace, Name, "current_timestamp_seconds"),
+		"OperatingSystem.LocalDateTime",
+		nil,
+		nil,
+	)
+	c.timezone = prometheus.NewDesc(
+		prometheus.BuildFQName(types.Namespace, Name, "timezone"),
+		"OperatingSystem.LocalDateTime",
+		[]string{"timezone"},
+		nil,
+	)
 	c.clockFrequencyAdjustmentPPBTotal = prometheus.NewDesc(
 		prometheus.BuildFQName(types.Namespace, Name, "clock_frequency_adjustment_ppb_total"),
 		"Total adjustment made to the local system clock frequency by W32Time in Parts Per Billion (PPB) units.",
@@ -109,11 +126,20 @@ func (c *Collector) Build(_ log.Logger, _ *wmi.Client) error {
 // to the provided prometheus Metric channel.
 func (c *Collector) Collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error {
 	logger = log.With(logger, "collector", Name)
-	if err := c.collect(ctx, logger, ch); err != nil {
+
+	errs := make([]error, 0, 2)
+
+	if err := c.collectTime(ch); err != nil {
 		_ = level.Error(logger).Log("msg", "failed collecting time metrics", "err", err)
-		return err
+		errs = append(errs, err)
 	}
-	return nil
+
+	if err := c.collectNTP(ctx, logger, ch); err != nil {
+		_ = level.Error(logger).Log("msg", "failed collecting time ntp metrics", "err", err)
+		errs = append(errs, err)
+	}
+
+	return errors.Join(errs...)
 }
 
 // Perflib "Windows Time Service".
@@ -121,18 +147,47 @@ type windowsTime struct {
 	ClockFrequencyAdjustmentPPBTotal float64 `perflib:"Clock Frequency Adjustment (ppb)"`
 	ComputedTimeOffset float64 `perflib:"Computed Time Offset"`
 	NTPClientTimeSourceCount float64 `perflib:"NTP Client Time Source Count"`
-	NTPRoundtripDelay float64 `perflib:"NTP Roundtrip Delay"`
+	NTPRoundTripDelay float64 `perflib:"NTP Roundtrip Delay"`
 	NTPServerIncomingRequestsTotal float64 `perflib:"NTP Server Incoming Requests"`
 	NTPServerOutgoingResponsesTotal float64 `perflib:"NTP Server Outgoing Responses"`
 }
 
-func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error {
+func (c *Collector) collectTime(ch chan<- prometheus.Metric) error {
+	ch <- prometheus.MustNewConstMetric(
+		c.currentTime,
+		prometheus.GaugeValue,
+		float64(time.Now().Unix()),
+	)
+
+	timeZoneInfo, err := kernel32.GetDynamicTimeZoneInformation()
+	if err != nil {
+		return err
+	}
+
+	// timeZoneKeyName contains the english name of the timezone.
+	timezoneName := syscall.UTF16ToString(timeZoneInfo.TimeZoneKeyName[:])
+
+	ch <- prometheus.MustNewConstMetric(
+		c.timezone,
+		prometheus.GaugeValue,
+		1.0,
+		timezoneName,
+	)
+
+	return nil
+}
+
+func (c *Collector) collectNTP(ctx *types.ScrapeContext, logger log.Logger, ch chan<- prometheus.Metric) error {
 	logger = log.With(logger, "collector", Name)
 	var dst []windowsTime // Single-instance class, array is required but will have single entry.
 	if err := perflib.UnmarshalObject(ctx.PerfObjects["Windows Time Service"], &dst, logger); err != nil {
 		return err
 	}
 
+	if len(dst) == 0 {
+		return errors.New("no data returned for Windows Time Service")
+	}
+
 	ch <- prometheus.MustNewConstMetric(
 		c.clockFrequencyAdjustmentPPBTotal,
 		prometheus.CounterValue,
@@ -151,7 +206,7 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 	ch <- prometheus.MustNewConstMetric(
 		c.ntpRoundTripDelay,
 		prometheus.GaugeValue,
-		dst[0].NTPRoundtripDelay/1000000, // microseconds -> seconds
+		dst[0].NTPRoundTripDelay/1000000, // microseconds -> seconds
 	)
 	ch <- prometheus.MustNewConstMetric(
 		c.ntpServerIncomingRequestsTotal,
@@ -163,5 +218,6 @@ func (c *Collector) collect(ctx *types.ScrapeContext, logger log.Logger, ch chan
 		prometheus.CounterValue,
 		dst[0].NTPServerOutgoingResponsesTotal,
 	)
+
 	return nil
 }
diff --git a/pkg/types/const.go b/pkg/types/const.go
index eca2931c3..e108970c6 100644
--- a/pkg/types/const.go
+++ b/pkg/types/const.go
@@ -3,7 +3,7 @@
 package types
 
 const (
-	DefaultCollectors = "cpu,cs,logical_disk,physical_disk,net,os,service,system"
+	DefaultCollectors = "cpu,cs,memory,logical_disk,physical_disk,net,os,service,system"
 	DefaultCollectorsPlaceholder = "[defaults]"
 	Namespace = "windows"
 )
diff --git a/tools/e2e-output.txt b/tools/e2e-output.txt
index 084d0808b..861beab71 100644
--- a/tools/e2e-output.txt
+++ b/tools/e2e-output.txt
@@ -1,4 +1,4 @@
-# HELP test_alpha_total Some random metric.
+# HELP test_alpha_total Some random metric.
 # TYPE test_alpha_total counter
 test_alpha_total 42
 # HELP windows_cpu_clock_interrupts_total Total number of received and serviced clock tick interrupts
@@ -27,25 +27,27 @@ test_alpha_total 42
 # TYPE windows_cpu_info_thread gauge
 # HELP windows_cpu_interrupts_total Total number of received and serviced hardware interrupts
 # TYPE windows_cpu_interrupts_total counter
+# HELP windows_cpu_logical_processor Total number of logical processors
+# TYPE windows_cpu_logical_processor gauge
 # HELP windows_cpu_parking_status Parking Status represents whether a processor is parked or not
 # TYPE windows_cpu_parking_status gauge
-# HELP windows_cpu_processor_performance_total Processor Performance is the average performance of the processor while it is executing instructions, as a percentage of the nominal performance of the processor. On some processors, Processor Performance may exceed 100%
-# TYPE windows_cpu_processor_performance_total counter
-# HELP windows_cpu_time_total Time that processor spent in different modes (dpc, idle, interrupt, privileged, user)
-# TYPE windows_cpu_time_total counter
 # HELP windows_cpu_processor_mperf_total Processor MPerf is the number of TSC ticks incremented while executing instructions
 # TYPE windows_cpu_processor_mperf_total counter
+# HELP windows_cpu_processor_performance_total Processor Performance is the average performance of the processor while it is executing instructions, as a percentage of the nominal performance of the processor. On some processors, Processor Performance may exceed 100%
+# TYPE windows_cpu_processor_performance_total counter
 # HELP windows_cpu_processor_privileged_utility_total Processor Privileged Utility represents is the amount of time the core has spent executing instructions inside the kernel
 # TYPE windows_cpu_processor_privileged_utility_total counter
 # HELP windows_cpu_processor_rtc_total Processor RTC represents the number of RTC ticks made since the system booted. It should consistently be 64e6, and can be used to properly derive Processor Utility Rate
 # TYPE windows_cpu_processor_rtc_total counter
 # HELP windows_cpu_processor_utility_total Processor Utility represents is the amount of time the core spends executing instructions
 # TYPE windows_cpu_processor_utility_total counter
-# HELP windows_cs_hostname Labelled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain
+# HELP windows_cpu_time_total Time that processor spent in different modes (dpc, idle, interrupt, privileged, user)
+# TYPE windows_cpu_time_total counter
+# HELP windows_cs_hostname Deprecated: Use windows_os_hostname instead
 # TYPE windows_cs_hostname gauge
-# HELP windows_cs_logical_processors ComputerSystem.NumberOfLogicalProcessors
+# HELP windows_cs_logical_processors Deprecated: Use windows_cpu_logical_processor instead
 # TYPE windows_cs_logical_processors gauge
-# HELP windows_cs_physical_memory_bytes ComputerSystem.TotalPhysicalMemory
+# HELP windows_cs_physical_memory_bytes Deprecated: Use windows_physical_memory_total_bytes instead
 # TYPE windows_cs_physical_memory_bytes gauge
 # HELP windows_exporter_collector_duration_seconds windows_exporter: Duration of a collection.
# TYPE windows_exporter_collector_duration_seconds gauge @@ -55,9 +57,10 @@ windows_exporter_collector_success{collector="cpu"} 1 windows_exporter_collector_success{collector="cpu_info"} 1 windows_exporter_collector_success{collector="cs"} 1 windows_exporter_collector_success{collector="logical_disk"} 1 -windows_exporter_collector_success{collector="physical_disk"} 1 +windows_exporter_collector_success{collector="memory"} 1 windows_exporter_collector_success{collector="net"} 1 windows_exporter_collector_success{collector="os"} 1 +windows_exporter_collector_success{collector="physical_disk"} 1 windows_exporter_collector_success{collector="process"} 1 windows_exporter_collector_success{collector="scheduled_task"} 1 windows_exporter_collector_success{collector="service"} 1 @@ -69,9 +72,10 @@ windows_exporter_collector_timeout{collector="cpu"} 0 windows_exporter_collector_timeout{collector="cpu_info"} 0 windows_exporter_collector_timeout{collector="cs"} 0 windows_exporter_collector_timeout{collector="logical_disk"} 0 -windows_exporter_collector_timeout{collector="physical_disk"} 0 +windows_exporter_collector_timeout{collector="memory"} 0 windows_exporter_collector_timeout{collector="net"} 0 windows_exporter_collector_timeout{collector="os"} 0 +windows_exporter_collector_timeout{collector="physical_disk"} 0 windows_exporter_collector_timeout{collector="process"} 0 windows_exporter_collector_timeout{collector="scheduled_task"} 0 windows_exporter_collector_timeout{collector="service"} 0 @@ -79,6 +83,10 @@ windows_exporter_collector_timeout{collector="system"} 0 windows_exporter_collector_timeout{collector="textfile"} 0 # HELP windows_exporter_perflib_snapshot_duration_seconds Duration of perflib snapshot capture # TYPE windows_exporter_perflib_snapshot_duration_seconds gauge +# HELP windows_logical_disk_avg_read_requests_queued Average number of read requests that were queued for the selected disk during the sample interval (LogicalDisk.AvgDiskReadQueueLength) +# TYPE 
windows_logical_disk_avg_read_requests_queued gauge +# HELP windows_logical_disk_avg_write_requests_queued Average number of write requests that were queued for the selected disk during the sample interval (LogicalDisk.AvgDiskWriteQueueLength) +# TYPE windows_logical_disk_avg_write_requests_queued gauge # HELP windows_logical_disk_free_bytes Free space in bytes, updates every 10-15 min (LogicalDisk.PercentFreeSpace) # TYPE windows_logical_disk_free_bytes gauge # HELP windows_logical_disk_idle_seconds_total Seconds that the disk was idle (LogicalDisk.PercentIdleTime) @@ -97,10 +105,6 @@ windows_exporter_collector_timeout{collector="textfile"} 0 # TYPE windows_logical_disk_reads_total counter # HELP windows_logical_disk_requests_queued The number of requests queued to the disk (LogicalDisk.CurrentDiskQueueLength) # TYPE windows_logical_disk_requests_queued gauge -# HELP windows_logical_disk_avg_read_requests_queued Average number of read requests that were queued for the selected disk during the sample interval (LogicalDisk.AvgDiskReadQueueLength) -# TYPE windows_logical_disk_avg_read_requests_queued gauge -# HELP windows_logical_disk_avg_write_requests_queued Average number of write requests that were queued for the selected disk during the sample interval (LogicalDisk.AvgDiskWriteQueueLength) -# TYPE windows_logical_disk_avg_write_requests_queued gauge # HELP windows_logical_disk_size_bytes Total space in bytes, updates every 10-15 min (LogicalDisk.PercentFreeSpace_Base) # TYPE windows_logical_disk_size_bytes gauge # HELP windows_logical_disk_split_ios_total The number of I/Os to the disk were split into multiple I/Os (LogicalDisk.SplitIOPerSec) @@ -113,40 +117,86 @@ windows_exporter_collector_timeout{collector="textfile"} 0 # TYPE windows_logical_disk_write_seconds_total counter # HELP windows_logical_disk_writes_total The number of write operations on the disk (LogicalDisk.DiskWritesPerSec) # TYPE windows_logical_disk_writes_total counter -# HELP 
windows_physical_disk_idle_seconds_total Seconds that the disk was idle (PhysicalDisk.PercentIdleTime) -# TYPE windows_physical_disk_idle_seconds_total counter -# HELP windows_physical_disk_read_bytes_total The number of bytes transferred from the disk during read operations (PhysicalDisk.DiskReadBytesPerSec) -# TYPE windows_physical_disk_read_bytes_total counter -# HELP windows_physical_disk_read_latency_seconds_total Shows the average time, in seconds, of a read operation from the disk (PhysicalDisk.AvgDiskSecPerRead) -# TYPE windows_physical_disk_read_latency_seconds_total counter -# HELP windows_physical_disk_read_seconds_total Seconds that the disk was busy servicing read requests (PhysicalDisk.PercentDiskReadTime) -# TYPE windows_physical_disk_read_seconds_total counter -# HELP windows_physical_disk_read_write_latency_seconds_total Shows the time, in seconds, of the average disk transfer (PhysicalDisk.AvgDiskSecPerTransfer) -# TYPE windows_physical_disk_read_write_latency_seconds_total counter -# HELP windows_physical_disk_reads_total The number of read operations on the disk (PhysicalDisk.DiskReadsPerSec) -# TYPE windows_physical_disk_reads_total counter -# HELP windows_physical_disk_requests_queued The number of requests queued to the disk (PhysicalDisk.CurrentDiskQueueLength) -# TYPE windows_physical_disk_requests_queued gauge -# HELP windows_physical_disk_split_ios_total The number of I/Os to the disk were split into multiple I/Os (PhysicalDisk.SplitIOPerSec) -# TYPE windows_physical_disk_split_ios_total counter -# HELP windows_physical_disk_write_bytes_total The number of bytes transferred to the disk during write operations (PhysicalDisk.DiskWriteBytesPerSec) -# TYPE windows_physical_disk_write_bytes_total counter -# HELP windows_physical_disk_write_latency_seconds_total Shows the average time, in seconds, of a write operation to the disk (PhysicalDisk.AvgDiskSecPerWrite) -# TYPE windows_physical_disk_write_latency_seconds_total counter -# HELP 
windows_physical_disk_write_seconds_total Seconds that the disk was busy servicing write requests (PhysicalDisk.PercentDiskWriteTime) -# TYPE windows_physical_disk_write_seconds_total counter -# HELP windows_physical_disk_writes_total The number of write operations on the disk (PhysicalDisk.DiskWritesPerSec) -# TYPE windows_physical_disk_writes_total counter +# HELP windows_memory_available_bytes The amount of physical memory immediately available for allocation to a process or for system use. It is equal to the sum of memory assigned to the standby (cached), free and zero page lists (AvailableBytes) +# TYPE windows_memory_available_bytes gauge +# HELP windows_memory_cache_bytes (CacheBytes) +# TYPE windows_memory_cache_bytes gauge +# HELP windows_memory_cache_bytes_peak (CacheBytesPeak) +# TYPE windows_memory_cache_bytes_peak gauge +# HELP windows_memory_cache_faults_total Number of faults which occur when a page sought in the file system cache is not found there and must be retrieved from elsewhere in memory (soft fault) or from disk (hard fault) (Cache Faults/sec) +# TYPE windows_memory_cache_faults_total counter +# HELP windows_memory_commit_limit (CommitLimit) +# TYPE windows_memory_commit_limit gauge +# HELP windows_memory_committed_bytes (CommittedBytes) +# TYPE windows_memory_committed_bytes gauge +# HELP windows_memory_demand_zero_faults_total The number of zeroed pages required to satisfy faults. Zeroed pages, pages emptied of previously stored data and filled with zeros, are a security feature of Windows that prevent processes from seeing data stored by earlier processes that used the memory space (Demand Zero Faults/sec) +# TYPE windows_memory_demand_zero_faults_total counter +# HELP windows_memory_free_and_zero_page_list_bytes The amount of physical memory, in bytes, that is assigned to the free and zero page lists. This memory does not contain cached data. 
It is immediately available for allocation to a process or for system use (FreeAndZeroPageListBytes) +# TYPE windows_memory_free_and_zero_page_list_bytes gauge +# HELP windows_memory_free_system_page_table_entries (FreeSystemPageTableEntries) +# TYPE windows_memory_free_system_page_table_entries gauge +# HELP windows_memory_modified_page_list_bytes The amount of physical memory, in bytes, that is assigned to the modified page list. This memory contains cached data and code that is not actively in use by processes, the system and the system cache (ModifiedPageListBytes) +# TYPE windows_memory_modified_page_list_bytes gauge +# HELP windows_memory_page_faults_total Overall rate at which faulted pages are handled by the processor (Page Faults/sec) +# TYPE windows_memory_page_faults_total counter +# HELP windows_memory_physical_free_bytes The amount of physical memory currently available, in bytes. This is the amount of physical memory that can be immediately reused without having to write its contents to disk first. It is the sum of the size of the standby, free, and zero lists. +# TYPE windows_memory_physical_free_bytes gauge +# HELP windows_memory_physical_total_bytes The amount of actual physical memory, in bytes. +# TYPE windows_memory_physical_total_bytes gauge +# HELP windows_memory_pool_nonpaged_allocs_total The number of calls to allocate space in the nonpaged pool. 
The nonpaged pool is an area of system memory area for objects that cannot be written to disk, and must remain in physical memory as long as they are allocated (PoolNonpagedAllocs) +# TYPE windows_memory_pool_nonpaged_allocs_total gauge +# HELP windows_memory_pool_nonpaged_bytes Number of bytes in the non-paged pool, an area of the system virtual memory that is used for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated (PoolNonpagedBytes) +# TYPE windows_memory_pool_nonpaged_bytes gauge +# HELP windows_memory_pool_paged_allocs_total Number of calls to allocate space in the paged pool, regardless of the amount of space allocated in each call (PoolPagedAllocs) +# TYPE windows_memory_pool_paged_allocs_total counter +# HELP windows_memory_pool_paged_bytes (PoolPagedBytes) +# TYPE windows_memory_pool_paged_bytes gauge +# HELP windows_memory_pool_paged_resident_bytes The size, in bytes, of the portion of the paged pool that is currently resident and active in physical memory. The paged pool is an area of the system virtual memory that is used for objects that can be written to disk when they are not being used (PoolPagedResidentBytes) +# TYPE windows_memory_pool_paged_resident_bytes gauge +# HELP windows_memory_process_memory_limit_bytes The size of the user-mode portion of the virtual address space of the calling process, in bytes. This value depends on the type of process, the type of processor, and the configuration of the operating system. +# TYPE windows_memory_process_memory_limit_bytes gauge +# HELP windows_memory_standby_cache_core_bytes The amount of physical memory, in bytes, that is assigned to the core standby cache page lists. 
This memory contains cached data and code that is not actively in use by processes, the system and the system cache (StandbyCacheCoreBytes) +# TYPE windows_memory_standby_cache_core_bytes gauge +# HELP windows_memory_standby_cache_normal_priority_bytes The amount of physical memory, in bytes, that is assigned to the normal priority standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache (StandbyCacheNormalPriorityBytes) +# TYPE windows_memory_standby_cache_normal_priority_bytes gauge +# HELP windows_memory_standby_cache_reserve_bytes The amount of physical memory, in bytes, that is assigned to the reserve standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system and the system cache (StandbyCacheReserveBytes) +# TYPE windows_memory_standby_cache_reserve_bytes gauge +# HELP windows_memory_swap_page_operations_total Total number of swap page read and writes (PagesPersec) +# TYPE windows_memory_swap_page_operations_total counter +# HELP windows_memory_swap_page_reads_total Number of disk page reads (a single read operation reading several pages is still only counted once) (PageReadsPersec) +# TYPE windows_memory_swap_page_reads_total counter +# HELP windows_memory_swap_page_writes_total Number of disk page writes (a single write operation writing several pages is still only counted once) (PageWritesPersec) +# TYPE windows_memory_swap_page_writes_total counter +# HELP windows_memory_swap_pages_read_total Number of pages read across all page reads (ie counting all pages read even if they are read in a single operation) (PagesInputPersec) +# TYPE windows_memory_swap_pages_read_total counter +# HELP windows_memory_swap_pages_written_total Number of pages written across all page writes (ie counting all pages written even if they are written in a single operation) (PagesOutputPersec) +# TYPE 
windows_memory_swap_pages_written_total counter
+# HELP windows_memory_system_cache_resident_bytes The size, in bytes, of the portion of the system file cache which is currently resident and active in physical memory (SystemCacheResidentBytes)
+# TYPE windows_memory_system_cache_resident_bytes gauge
+# HELP windows_memory_system_code_resident_bytes The size, in bytes, of the pageable operating system code that is currently resident and active in physical memory (SystemCodeResidentBytes)
+# TYPE windows_memory_system_code_resident_bytes gauge
+# HELP windows_memory_system_code_total_bytes The size, in bytes, of the pageable operating system code currently mapped into the system virtual address space (SystemCodeTotalBytes)
+# TYPE windows_memory_system_code_total_bytes gauge
+# HELP windows_memory_system_driver_resident_bytes The size, in bytes, of the pageable physical memory being used by device drivers. It is the working set (physical memory area) of the drivers (SystemDriverResidentBytes)
+# TYPE windows_memory_system_driver_resident_bytes gauge
+# HELP windows_memory_system_driver_total_bytes The size, in bytes, of the pageable virtual memory currently being used by device drivers.
Pageable memory can be written to disk when it is not being used (SystemDriverTotalBytes)
+# TYPE windows_memory_system_driver_total_bytes gauge
+# HELP windows_memory_transition_faults_total Number of faults rate at which page faults are resolved by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk at the time of the page fault (TransitionFaultsPersec)
+# TYPE windows_memory_transition_faults_total counter
+# HELP windows_memory_transition_pages_repurposed_total Transition Pages RePurposed is the rate at which the number of transition cache pages were reused for a different purpose (TransitionPagesRePurposedPersec)
+# TYPE windows_memory_transition_pages_repurposed_total counter
+# HELP windows_memory_write_copies_total The number of page faults caused by attempting to write that were satisfied by copying the page from elsewhere in physical memory (WriteCopiesPersec)
+# TYPE windows_memory_write_copies_total counter
 # HELP windows_net_bytes_received_total (Network.BytesReceivedPerSec)
 # TYPE windows_net_bytes_received_total counter
 # HELP windows_net_bytes_sent_total (Network.BytesSentPerSec)
 # TYPE windows_net_bytes_sent_total counter
 # HELP windows_net_bytes_total (Network.BytesTotalPerSec)
 # TYPE windows_net_bytes_total counter
-# HELP windows_net_output_queue_length_packets (Network.OutputQueueLength)
-# TYPE windows_net_output_queue_length_packets gauge
 # HELP windows_net_current_bandwidth_bytes (Network.CurrentBandwidth)
 # TYPE windows_net_current_bandwidth_bytes gauge
+# HELP windows_net_output_queue_length_packets (Network.OutputQueueLength)
+# TYPE windows_net_output_queue_length_packets gauge
 # HELP windows_net_packets_outbound_discarded_total (Network.PacketsOutboundDiscarded)
 # TYPE windows_net_packets_outbound_discarded_total counter
 # HELP windows_net_packets_outbound_errors_total (Network.PacketsOutboundErrors)
@@ -163,32 +213,58 @@
windows_exporter_collector_timeout{collector="textfile"} 0
 # TYPE windows_net_packets_sent_total counter
 # HELP windows_net_packets_total (Network.PacketsPerSec)
 # TYPE windows_net_packets_total counter
-# HELP windows_os_info OperatingSystem.Caption, OperatingSystem.Version
+# HELP windows_os_hostname Labelled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain
+# TYPE windows_os_hostname gauge
+# HELP windows_os_info Contains full product name & version in labels. Note that the "major_version" for Windows 11 is \\"10\\"; a build number greater than 22000 represents Windows 11.
 # TYPE windows_os_info gauge
 # HELP windows_os_paging_free_bytes OperatingSystem.FreeSpaceInPagingFiles
 # TYPE windows_os_paging_free_bytes gauge
 # HELP windows_os_paging_limit_bytes OperatingSystem.SizeStoredInPagingFiles
 # TYPE windows_os_paging_limit_bytes gauge
-# HELP windows_os_physical_memory_free_bytes OperatingSystem.FreePhysicalMemory
+# HELP windows_os_physical_memory_free_bytes Deprecated: Use `windows_memory_physical_free_bytes` instead.
 # TYPE windows_os_physical_memory_free_bytes gauge
-# HELP windows_os_process_memory_limit_bytes OperatingSystem.MaxProcessMemorySize
+# HELP windows_os_process_memory_limit_bytes Deprecated: Use `windows_memory_process_memory_limit_bytes` instead.
 # TYPE windows_os_process_memory_limit_bytes gauge
-# HELP windows_os_processes OperatingSystem.NumberOfProcesses
+# HELP windows_os_processes Deprecated: Use `windows_system_processes` instead.
 # TYPE windows_os_processes gauge
-# HELP windows_os_processes_limit OperatingSystem.MaxNumberOfProcesses
+# HELP windows_os_processes_limit Deprecated: Use `windows_system_process_limit` instead.
 # TYPE windows_os_processes_limit gauge
-# HELP windows_os_time OperatingSystem.LocalDateTime
+# HELP windows_os_time Deprecated: Use windows_time_current_timestamp_seconds instead.
 # TYPE windows_os_time gauge
-# HELP windows_os_timezone OperatingSystem.LocalDateTime
+# HELP windows_os_timezone Deprecated: Use windows_time_timezone instead.
 # TYPE windows_os_timezone gauge
-# HELP windows_os_users OperatingSystem.NumberOfUsers
+# HELP windows_os_users Deprecated: Use `count(windows_logon_logon_type)` instead.
 # TYPE windows_os_users gauge
-# HELP windows_os_virtual_memory_bytes OperatingSystem.TotalVirtualMemorySize
+# HELP windows_os_virtual_memory_bytes Deprecated: Use `windows_memory_commit_limit` instead.
 # TYPE windows_os_virtual_memory_bytes gauge
-# HELP windows_os_virtual_memory_free_bytes OperatingSystem.FreeVirtualMemory
+# HELP windows_os_virtual_memory_free_bytes Deprecated: Use `windows_memory_commit_limit - windows_memory_committed_bytes` instead.
 # TYPE windows_os_virtual_memory_free_bytes gauge
-# HELP windows_os_visible_memory_bytes OperatingSystem.TotalVisibleMemorySize
+# HELP windows_os_visible_memory_bytes Deprecated: Use `windows_memory_physical_total_bytes` instead.
 # TYPE windows_os_visible_memory_bytes gauge
+# HELP windows_physical_disk_idle_seconds_total Seconds that the disk was idle (PhysicalDisk.PercentIdleTime)
+# TYPE windows_physical_disk_idle_seconds_total counter
+# HELP windows_physical_disk_read_bytes_total The number of bytes transferred from the disk during read operations (PhysicalDisk.DiskReadBytesPerSec)
+# TYPE windows_physical_disk_read_bytes_total counter
+# HELP windows_physical_disk_read_latency_seconds_total Shows the average time, in seconds, of a read operation from the disk (PhysicalDisk.AvgDiskSecPerRead)
+# TYPE windows_physical_disk_read_latency_seconds_total counter
+# HELP windows_physical_disk_read_seconds_total Seconds that the disk was busy servicing read requests (PhysicalDisk.PercentDiskReadTime)
+# TYPE windows_physical_disk_read_seconds_total counter
+# HELP windows_physical_disk_read_write_latency_seconds_total Shows the time, in seconds, of the average disk transfer (PhysicalDisk.AvgDiskSecPerTransfer)
+# TYPE windows_physical_disk_read_write_latency_seconds_total counter
+# HELP windows_physical_disk_reads_total The number of read operations on the disk (PhysicalDisk.DiskReadsPerSec)
+# TYPE windows_physical_disk_reads_total counter
+# HELP windows_physical_disk_requests_queued The number of requests queued to the disk (PhysicalDisk.CurrentDiskQueueLength)
+# TYPE windows_physical_disk_requests_queued gauge
+# HELP windows_physical_disk_split_ios_total The number of I/Os to the disk were split into multiple I/Os (PhysicalDisk.SplitIOPerSec)
+# TYPE windows_physical_disk_split_ios_total counter
+# HELP windows_physical_disk_write_bytes_total The number of bytes transferred to the disk during write operations (PhysicalDisk.DiskWriteBytesPerSec)
+# TYPE windows_physical_disk_write_bytes_total counter
+# HELP windows_physical_disk_write_latency_seconds_total Shows the average time, in seconds, of a write operation to the disk (PhysicalDisk.AvgDiskSecPerWrite)
+# TYPE
windows_physical_disk_write_latency_seconds_total counter
+# HELP windows_physical_disk_write_seconds_total Seconds that the disk was busy servicing write requests (PhysicalDisk.PercentDiskWriteTime)
+# TYPE windows_physical_disk_write_seconds_total counter
+# HELP windows_physical_disk_writes_total The number of write operations on the disk (PhysicalDisk.DiskWritesPerSec)
+# TYPE windows_physical_disk_writes_total counter
 # HELP windows_scheduled_task_state The current state of a scheduled task
 # TYPE windows_scheduled_task_state gauge
 windows_scheduled_task_state{state="disabled",task="/Microsoft/Windows/Maintenance/WinSAT"} 1
@@ -208,6 +284,10 @@ windows_scheduled_task_state{state="unknown",task="/Microsoft/Windows/Maintenanc
 # TYPE windows_system_context_switches_total counter
 # HELP windows_system_exception_dispatches_total Total number of exceptions dispatched (WMI source: PerfOS_System.ExceptionDispatchesPersec)
 # TYPE windows_system_exception_dispatches_total counter
+# HELP windows_system_processes Current number of processes (WMI source: PerfOS_System.Processes)
+# TYPE windows_system_processes gauge
+# HELP windows_system_processes_limit Maximum number of processes.
+# TYPE windows_system_processes_limit gauge
 # HELP windows_system_processor_queue_length Length of processor queue (WMI source: PerfOS_System.ProcessorQueueLength)
 # TYPE windows_system_processor_queue_length gauge
 # HELP windows_system_system_calls_total Total number of system calls (WMI source: PerfOS_System.SystemCallsPersec)
diff --git a/tools/end-to-end-test.ps1 b/tools/end-to-end-test.ps1
index d1597ff55..022de403e 100644
--- a/tools/end-to-end-test.ps1
+++ b/tools/end-to-end-test.ps1
@@ -1,15 +1,15 @@
 $ErrorActionPreference = 'Stop'
 Set-StrictMode -Version 3

-if (-not (Test-Path -Path '.\windows_exporter.exe')) {
-    Write-Output ".\windows_exporter.exe not found.
Consider running \`go build\` first"
-}
-
 # cd to location of script
 $script_path = $MyInvocation.MyCommand.Path
 $working_dir = Split-Path $script_path
 Push-Location $working_dir

+if (-not (Test-Path -Path '..\windows_exporter.exe')) {
+    Write-Output "..\windows_exporter.exe not found. Consider running \`go build\` first"
+}
+
 $temp_dir = Join-Path $env:TEMP $(New-Guid) | ForEach-Object { mkdir $_ }

 # Create temporary directory for textfile collector
@@ -18,7 +18,7 @@ mkdir $textfile_dir | Out-Null
 Copy-Item 'e2e-textfile.prom' -Destination "$($textfile_dir)/e2e-textfile.prom"

 # Omit dynamic collector information that will change after each run
-$skip_re = "^(go_|windows_exporter_build_info|windows_exporter_collector_duration_seconds|windows_exporter_perflib_snapshot_duration_seconds|process_|windows_textfile_mtime_seconds|windows_cpu|windows_cs|windows_logical_disk|windows_physical_disk|windows_net|windows_os|windows_process|windows_service|windows_system|windows_textfile_mtime_seconds)"
+$skip_re = "^(go_|windows_exporter_build_info|windows_exporter_collector_duration_seconds|windows_exporter_perflib_snapshot_duration_seconds|process_|windows_textfile_mtime_seconds|windows_cpu|windows_cs|windows_logical_disk|windows_physical_disk|windows_memory|windows_net|windows_os|windows_process|windows_service|windows_system|windows_textfile_mtime_seconds)"

 # Start process in background, awaiting HTTP requests.
 # Use default collectors, port and address: http://localhost:9182/metrics
@@ -53,7 +53,17 @@ try {
 }
 # Response output must be split and saved as UTF-8.
 $response.content -split "[`r`n]"| Select-String -NotMatch $skip_re | Set-Content -Encoding utf8 "$($temp_dir)/e2e-output.txt"
-Stop-Process -Id $exporter_proc.Id
+try {
+    Stop-Process -Id $exporter_proc.Id
+} catch {
+    Write-Host "STDOUT"
+    Get-Content "$($temp_dir)/windows_exporter.log"
+    Write-Host "STDERR"
+    Get-Content "$($temp_dir)/windows_exporter_error.log"
+
+    throw $_
+}
+
 $output_diff = Compare-Object (Get-Content 'e2e-output.txt') (Get-Content "$($temp_dir)/e2e-output.txt")

 # Fail if differences in output are detected
@@ -65,5 +75,7 @@ if (-not ($null -eq $output_diff)) {
     Write-Host "STDERR"
     Get-Content "$($temp_dir)/windows_exporter_error.log"

+    (Get-Content "$($temp_dir)/e2e-output.txt") | Set-Content -Encoding utf8 "e2e-output.txt"
+
     exit 1
 }
diff --git a/tools/promtool.ps1 b/tools/promtool.ps1
index 89d7ba2ff..948153d94 100644
--- a/tools/promtool.ps1
+++ b/tools/promtool.ps1
@@ -110,7 +110,8 @@ for ($i=1; $i -le 5; $i++) {
 }

 # Omit metrics from client_golang library; we're not responsible for these
-$skip_re = "^[#]?\s*(HELP|TYPE)?\s*go_"
+# windows_memory_pool_nonpaged_allocs_total is wrong for years. It's not a gauge, but a counter.
+$skip_re = "^([#]?\s*(HELP|TYPE)?\s*go_|windows_memory_pool_nonpaged_allocs_total)"

 # Need to remove carriage returns, as promtool expects LF line endings
 $output = ((Invoke-WebRequest -UseBasicParsing -URI http://127.0.0.1:9183/metrics).Content) -Split "`r?`n" | Select-String -NotMatch $skip_re | Join-String -Separator "`n"