Monitoring systems collect and display metrics from monitored devices, such as servers or network devices, and generate alarms when problems are detected.
Tools
Linux includes several tools for collecting, observing and monitoring system metrics and performance, such as sysstat, atop, dstat and vmstat. Some of them include recording and alerting capabilities, while others only provide visualization.
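On Linux, tools of this kind generally obtain their figures from the /proc filesystem. As a minimal sketch of that collection step, the following Python function parses text in the format of /proc/meminfo. The function name `parse_meminfo` is hypothetical and only illustrates the idea; it is not how any of the tools above is implemented.

```python
from pathlib import Path
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value}.

    Values are returned as the kernel prints them (most are in kB).
    """
    metrics = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        metrics[name.strip()] = int(fields[0])
    return metrics


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")  # only present on Linux
    if meminfo.exists():
        info = parse_meminfo(meminfo.read_text())
        print(f"MemTotal: {info['MemTotal']} kB, MemFree: {info['MemFree']} kB")
```

A recording tool would call such a function at a fixed interval and store the results, while a visualization tool would only display the latest values.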
Monitoring solutions
Zabbix and Nagios have historically been used to monitor multiple servers or devices. Other tools include Icinga (a Nagios fork), Prometheus and Netdata.
Alerting capabilities
Netdata supports email alerts and, in more recent versions, Slack notifications.
Prometheus Alertmanager supports several notification methods, such as email and webhooks.
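Whatever the notification channel, alerting usually comes down to comparing collected metrics against rules. The sketch below shows that idea in plain Python. It is not the Netdata or Alertmanager configuration format, and the `Rule` type, the `evaluate` function and the metric names are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Rule(NamedTuple):
    metric: str       # name of the metric to check
    threshold: float  # alert when the value exceeds this
    message: str      # text to send to the notification channel


def evaluate(rules: List[Rule], metrics: Dict[str, float]) -> List[str]:
    """Return one message per rule whose threshold is exceeded."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            alerts.append(f"{rule.message} ({rule.metric}={value})")
    return alerts


if __name__ == "__main__":
    rules = [
        Rule("cpu_iowait_percent", 10, "High I/O wait"),
        Rule("disk_used_percent", 90, "Disk almost full"),
    ]
    sample = {"cpu_iowait_percent": 16, "disk_used_percent": 42}
    for alert in evaluate(rules, sample):
        # A real system would hand this to email, Slack or a webhook.
        print(alert)
```

Keeping rule evaluation separate from delivery makes it possible to add a new notification method without touching the rules.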
vmstat
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
10  2 837532 459608  64052 23976592    1    2   992  1714    1    1 23  5 70  2  0
 1  0 837532 421692  64052 24014288    0    0 33444  2280 20629 33647 38  5 55  2  0
 2  1 837532 496892  64012 23937976    0   32 56464  2224 19104 35788 24  4 70  3  0
 7  0 837532 435928  64020 23999028    0    0 55584  2272 22717 37604 32  5 60  2  0
10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0
 8  3 837532 447948  63984 23986276    0    0 28788 20560 27664 42733 39  7 41 13  0
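To feed vmstat output like the above into a monitoring script, each sample line can be turned into a dictionary keyed by column name. The following is a minimal sketch; `parse_vmstat` and `SAMPLE` are hypothetical names, and the sample reuses two rows from the output above.

```python
from typing import Dict, List

SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
10  2 837532 459608  64052 23976592    1    2   992  1714    1    1 23  5 70  2  0
10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0
"""


def parse_vmstat(output: str) -> List[Dict[str, int]]:
    """Turn vmstat text output into one dict per sample line."""
    lines = [ln.split() for ln in output.strip().splitlines()]
    # The column names are on the line that starts with "r b".
    header_idx = next(i for i, f in enumerate(lines) if f[:2] == ["r", "b"])
    names = lines[header_idx]
    samples = []
    for fields in lines[header_idx + 1:]:
        # Skip repeated headers and anything that is not a data row.
        if len(fields) == len(names) and fields[0].isdigit():
            samples.append(dict(zip(names, map(int, fields))))
    return samples


if __name__ == "__main__":
    for sample in parse_vmstat(SAMPLE):
        if sample["wa"] > 10:  # CPU time spent waiting for I/O, in percent
            print(f"High I/O wait: wa={sample['wa']} bo={sample['bo']}")
```

In the full output, the `wa` column rises to 16 in the sample where `bo` (blocks written out) jumps to 270348, which suggests the CPU is waiting on a burst of disk writes.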
Activities
- Review Wikipedia's list of software monitors: https://en.wikipedia.org/wiki/System_monitor#List_of_software_monitors
- Review the Wikiversity articles covering system, software or network monitors: monit, Nagios, Prometheus (software), netdata, sar and Zabbix
- Identify key differences between network monitoring, system monitoring[1] and application performance monitoring (APM)[2].
- Implement a solution to detect disk array failures: System administration/ProLiant/Remove a disk from your redundant storage array and review OS logs
- Monitoring disk space:
- Monitor your RAID devices:
  - Software RAIDs: mdadm
  - Hardware RAIDs: HPE Array Controllers
See also
- Prometheus monitoring, Zabbix
- ElasticSearch
- Grafana
- CompTIA Computer Networks/Monitoring and Troubleshooting
- Linux server administration/Performance and Troubleshooting
- Configuration management
References
This article is issued from Wikiversity. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.