Commit Graph

79 Commits (ff4784ca2b0164f1aa9c7e91bb34e01c35b048f9)

Author SHA1 Message Date
Michael Hähnel b10f7c3361 Feature/dev 1122 3 years ago
Michael Hähnel fb1ead8a1a DEV-1060 Adjust Prometheus setup for DEMO MPMX (metrics/alerts) 3 years ago
Ketelsen, Sven 91303a458d DEV-1042: added new stage for demo mpmx 3 years ago
Michael Haehnel 83193d70cb NOTICKET: Silence DB Restore test alerts 3 years ago
friedrich goerz 573cde02e2 DEV-1011: inc. threshold to avoid senseless false positives 3 years ago
Hoan To 0c390415c9 DEV-580: Added prom2teams alert and receiver for email 3 years ago
Görz, Friedrich 3905dff581 DEV-471: added push metrics part to restre playbook 3 years ago
Michael Hähnel 8374ae0d2a DEV-880 Configure Prometheus high_load Alert instance specific 3 years ago
Görz, Friedrich 96da6ef83f Feature/dev 962 es clsuter activehards alert 3 years ago
Ketelsen, Sven e4a391be7f DEV-873 added custom node exporter polling for EXT stage 3 years ago
friedrich goerz 45eb3c0f7f NOTICKET: abolishing nightly false positive alerts 3 years ago
Ketelsen, Sven 7c8d548e4d DEV-719 added prometheus polling for ext-bdev-mpmexec-02-connect 3 years ago
Görz, Friedrich e1d05f5e81 DEV-721: exclude restore-servers from patchday - avoiding broken... 3 years ago
friedrich goerz bb0354e085 DEV-709: fixing timezone for all dashabords 3 years ago
friedrich goerz 81beaf71ac DEV-709: added needed k8s-related dashboards 3 years ago
Görz, Friedrich 982ec72f28 DEV-695: fixing buggy firewall stuff 3 years ago
Sven Ketelsen 42d8398349 DEV-664 bugfix use server specific domain 3 years ago
Görz, Friedrich fe97fbbab5 Bug/dev 659 pgdatadir nospace 3 years ago
friedrich goerz e23813f9d1 NOTICKET: but metrics missing since Nov2021 - needs to be fixed ;) 3 years ago
Michael Hähnel 87a286dd60 DEV-624 New alert for failed db backups 3 years ago
Ketelsen, Sven db57bcb7ca DEV-579 add basic auth to prometheus stack 3 years ago
Görz, Friedrich 24e5cbf3d9 DEV-616: increased vol_count to mitigate disk size problem 3 years ago
Hoan To 98c5f39c85 DEV-579: added prometheus basic auth 3 years ago
Ketelsen, Sven f47c5dc345 DEV-578 investigation for hetzner api rate limits 3 years ago
Ketelsen, Sven ac7285bbcf DEV-572: alertmanager metrics 3 years ago
friedrich goerz 659943ccc5 DEV-563: bugfixed hetzner rate limit alert 3 years ago
Ketelsen, Sven 35dbd3cad1 DEV-569: extended stage overview dashboard 3 years ago
friedrich goerz 9e6f28c62a DEV-563: added hetzner dashboard + svennes dashboard + refactoring alert for hetzner_api_rate_limit 3 years ago
Görz, Friedrich 01c972771b Rollout main=>qa 13.09.2022 3 years ago
friedrich goerz 5367c9929e DEV-539: increased timerange; bugfixed broken silencing for patchday 3 years ago
Görz, Friedrich ffb3aa2122 DEV-543: integrated DO-blackbox VM into DEV-patchday + increased threshold for... 3 years ago
Hoan To a0ff9a5d8e added elasticsearch health check rule 3 years ago
friedrich goerz 1558548682 DEV-517: added alerting for DO API usage 3 years ago
Görz, Friedrich 1c5b1c44dd DEV-391: fix merge problems + fixing linter problems 4 years ago
Görz, Friedrich 6c6dd5c1ae DEV-442: added threshold for pg_repl_lag to avoid false positives on DEV-stage 4 years ago
Michael Hähnel ff9c0d94a1 Extended Monitoring/Alerting for PostgreSQL 4 years ago
friedrich goerz 8c8722851f DEV-386: added alert to get notification in case of ssh root login 4 years ago
Görz, Friedrich f0eab6d3ae DEv-421: refactored installation for postgres-exporter + installed newer... 4 years ago
Görz, Friedrich a2fa12ef40 DEV-396: changed diskspace alert from predictive to alert of current usage 4 years ago
friedrich goerz a834b13ded DEV-378: increased allowed pending time for some alerts 4 years ago
Görz, Friedrich ea2ef949c9 DEV-360: rollout k8s on prodnso 4 years ago
friedrich goerz 46e021d22c DEV-327: added several stuff for new prodnso-stage + bugfixing and improving other stuff 4 years ago
Sven Ketelsen d314e164c7 bugfix: disabled blackbox exporter for connect management - current config didn't works with 302 to login page 4 years ago
Sven Ketelsen df0e320743 bugfix: fixed connect url for blackbox exporter 4 years ago
Sven Ketelsen 43a4dccc3f chore: removed unnecessary ip lookup 4 years ago
Görz, Friedrich 9f9a192432 DEV-269: added stuff to federate k8s-internal prometheus metrics 4 years ago
Görz, Friedrich 5bdff07d1b DEV-253: digitalocean stuff - add droplet but not idempotentgit branch git branch plz check 4 years ago
Sven Ketelsen d780336dad bugfix: wrong port for postgres exporter - monitor_port_system > monitor_port_postgres 4 years ago
friedrich goerz 3766911cc5 DEV-241: added monitoring stuff for redis 4 years ago
Sven Ketelsen bd13643e30 feat: prometheus now uses stage_server_infos (auto discover task) 4 years ago