Zabbix 4.0 and beyond - kampan.snt.skkampan.snt.sk/zabbix2018/pdf/Zabbix 4.0 and beyond... · The...

Post on 20-May-2020

The Universal Open Source Enterprise Level Monitoring Solution

Zabbix 4.0 and beyond

What we may expect in the future

Alexei Vladishev

Founder and CEO of Zabbix

Twitter: @avladishev

Email: alex@zabbix.com

Zabbix is a universal open source enterprise level monitoring solution

Zabbix Team

3.0 LTS 3.2 3.4 4.0 LTS

Where are we?

Released:

Item pre-processing

Dependent items

Maps and dashboards

Remote commands by Proxies

Elastic Search

3.0 LTS 3.2 3.4 4.0 LTS

Where are we?

Released: 3.0 LTS, 3.2, 3.4

Under development: 4.0 LTS

A few major improvements of Zabbix 4.0 and why they are important

1. Making problems independent

Problems and events

3.x

Triggers: {HOST.NAME} has just been restarted

Problems: *No problem name*

Slow: problem and event names are calculated on the fly

Problems and events

4.0

Triggers: {HOST.NAME} has just been restarted

Problems: Name: “Linux006 has just been restarted”

Fast: names are displayed exactly as stored in “problems” and “events”
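The difference between the two slides can be sketched in a few lines of Python. This is only an illustration of the idea, not Zabbix code: the `Problem` class, the function names, and the single supported macro are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    trigger_name: str  # template, may contain macros such as {HOST.NAME}
    host_name: str
    name: str = ""     # resolved name, filled in once at creation (4.0 style)


def expand_macros(template: str, host_name: str) -> str:
    """Resolve the only macro this sketch knows about."""
    return template.replace("{HOST.NAME}", host_name)


def create_problem(trigger_name: str, host_name: str) -> Problem:
    # 4.0 style: resolve the name once, when the problem is generated.
    resolved = expand_macros(trigger_name, host_name)
    return Problem(trigger_name, host_name, resolved)


def display_name_3x(problem: Problem) -> str:
    # 3.x style: nothing is stored, so every page view repeats the expansion.
    return expand_macros(problem.trigger_name, problem.host_name)


def display_name_40(problem: Problem) -> str:
    # 4.0 style: show the stored value as is.
    return problem.name


problem = create_problem("{HOST.NAME} has just been restarted", "Linux006")
print(display_name_40(problem))
```

Both functions return the same text; the 4.0 variant simply avoids repeating the expansion on every display.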

2. Better integration options

Real-time export of history and events

Zabbix Server exports JSON to a history file, a trends file and an events file

ExportDir=/var/log/zabbix
ExportFileSize=100M
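A consumer of the exported files might look like the sketch below. It assumes the files contain one JSON object per line and that history records carry `host`, `name`, `clock` and `value` fields; the exact file names and record layout should be checked against the documentation of the Zabbix version in use. The sample file written here is made up for the demonstration.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def read_export(path: Path) -> Iterator[dict]:
    """Yield one record per non-empty line of a newline-delimited JSON file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def latest_values(records: Iterable[dict]) -> dict:
    """Keep the most recent value per (host, item name) pair."""
    latest = {}
    for record in records:
        key = (record["host"], record["name"])
        if key not in latest or record["clock"] >= latest[key]["clock"]:
            latest[key] = record
    return {key: record["value"] for key, record in latest.items()}


# Hypothetical sample data standing in for a real export file.
sample = [
    {"host": "Linux006", "name": "CPU load", "clock": 1527000000, "value": 0.42},
    {"host": "Linux006", "name": "CPU load", "clock": 1527000060, "value": 0.57},
]

with tempfile.TemporaryDirectory() as export_dir:
    export_file = Path(export_dir) / "history-sample.ndjson"
    lines = "\n".join(json.dumps(record) for record in sample) + "\n"
    export_file.write_text(lines, encoding="utf-8")
    values = latest_values(read_export(export_file))

print(values)
```

Reading line by line keeps memory use flat, which matters when files are allowed to grow up to the configured ExportFileSize.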

3. Better workflow

Acknowledgements

3.x: Monitoring -> Problems -> Ack

Message is mandatory

No way to put a message only

No way just to close a problem

Advanced problem workflow

4.0: Monitoring -> Problems -> Update

Message is optional

Operations are optional:

- ACK

- Change severity (!)

- Close problem
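The same optional operations can be combined in a single API call. The sketch below only builds the JSON-RPC request for `event.acknowledge`; it does not send it. The action bit values (1 close, 2 acknowledge, 4 add message, 8 change severity) are given as I understand the 4.0 API and should be verified against the API reference. The helper name, the event ID and the token are placeholders.

```python
import json

# Action bits, assumed from the Zabbix 4.0 event.acknowledge documentation.
ACTION_CLOSE = 1
ACTION_ACK = 2
ACTION_MESSAGE = 4
ACTION_SEVERITY = 8


def build_update_request(event_ids, auth_token, *, acknowledge=False,
                         close=False, message=None, severity=None,
                         request_id=1):
    """Build a JSON-RPC body in which every operation is optional."""
    action = 0
    params = {"eventids": list(event_ids)}
    if close:
        action |= ACTION_CLOSE
    if acknowledge:
        action |= ACTION_ACK
    if message is not None:
        action |= ACTION_MESSAGE
        params["message"] = message
    if severity is not None:
        action |= ACTION_SEVERITY
        params["severity"] = severity
    params["action"] = action
    return {
        "jsonrpc": "2.0",
        "method": "event.acknowledge",
        "params": params,
        "auth": auth_token,
        "id": request_id,
    }


# Change severity only: no message and no acknowledgement.
request = build_update_request(["20427"], "placeholder-token", severity=4)
print(json.dumps(request, indent=2))
```

A message-only update, which the 3.x slide lists as impossible, would be `build_update_request(ids, token, message="Investigating")`.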

4. New ways of monitoring

HTTP item type

Some use cases

• Monitoring the content of a web application

• Getting data out of APIs that are based on JSON/XML

• Access to HTTP header fields, for example:

Server: Apache/2.4.1 (Unix)

Typical HTTP processing

HTTP check -> Pre-processing -> History

HTTP data processing

Data formats: TEXT, HTML, JSON, XML, BINARY

Pre-processing: XPath, JSONPath, Regex
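The pre-processing step can be pictured with the Python standard library. This is a sketch of the three extraction styles named above, not of how Zabbix implements them: `json_value` walks a list of keys instead of evaluating a real JSONPath expression, `xml_value` relies on the limited XPath subset of `xml.etree.ElementTree`, and all sample payloads are invented.

```python
import json
import re
import xml.etree.ElementTree as ET


def json_value(body: str, path: list):
    """Walk a parsed JSON document along keys and indexes (JSONPath stand-in)."""
    node = json.loads(body)
    for step in path:
        node = node[step]
    return node


def xml_value(body: str, xpath: str) -> str:
    """Return the text of the first element matching a simple XPath."""
    element = ET.fromstring(body).find(xpath)
    if element is None:
        raise ValueError(f"no element matches {xpath!r}")
    return element.text


def header_value(raw_headers: str, name: str) -> str:
    """Extract one header field from a raw header block with a regex."""
    pattern = rf"^{re.escape(name)}:\s*(.+?)\s*$"
    match = re.search(pattern, raw_headers, re.MULTILINE | re.IGNORECASE)
    if match is None:
        raise ValueError(f"header {name!r} not found")
    return match.group(1)


# Invented responses standing in for the output of an HTTP check.
json_body = '{"data": {"queue": [3, 5]}}'
xml_body = "<server><status><cpu>17</cpu></status></server>"
headers = "HTTP/1.1 200 OK\r\nServer: Apache/2.4.1 (Unix)\r\nContent-Type: text/html\r\n"

print(json_value(json_body, ["data", "queue", 1]))
print(xml_value(xml_body, "./status/cpu"))
print(header_value(headers, "Server"))
```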

5. Better interface

UI getting simpler

No Monitoring -> Triggers anymore; use Monitoring -> Problems

New widgets

and more!

We create a universal self-service monitoring platform delivering business value

Self service

Getting most business value out of collected data

Give access to everyone: finance, analytics, sales, support, developers, customers, etc

Requires best user experience

Security and flexible user roles are important

Extreme flexibility

Collect any data

Pre-process and transform collected data in any way

Modules and webhooks for extending Zabbix

Choice of: OS, HW, database, programming languages

More platforms

Official packages for more hardware and cloud platforms

Modularity

3.x

4.x = 3.x + independent modules

Single pane of glass

Central place to see and control monitoring of the whole infrastructure

Central management of alerting

Event collection from various sources

Observing information from multiple Zabbix Servers

Unified dashboard and alerting

Diagram: events from several sources feed the unified dashboard and alerting

Root cause analysis

• It gives a clear answer to the question “What is the cause of the problem?”

• It provides information about impact and importance

• Reduces recovery time (MTTR)

Root cause analysis

3.4: Trigger dependencies, event correlation

4.x: Automatic and manual relationships between problems; complex event processing (de-duplication, filtering, enrichment incl. AI & machine learning)

Enrichment

Before enrichment:

‣ Server B is not available

Datacenter: Tokyo1 Class: Availability

External systems: AI & ML, CMDB, network topology, service tree

After enrichment:

‣ Server B is not available

Datacenter: Tokyo1 Class: Availability

Location: Rack4,32 Contact: Alexei

Service: Helpdesk HW: HP DL380

GEO: 12.459 34.34
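Enrichment of this kind amounts to merging tags from external lookups into the problem. The sketch below is hypothetical throughout: the `CMDB` dictionary, the `cmdb_lookup` function and the problem layout are assumptions chosen to reproduce the values on the slide, not an existing Zabbix interface.

```python
# Hypothetical CMDB content keyed by host name.
CMDB = {
    "Server B": {
        "Location": "Rack4,32",
        "Contact": "Alexei",
        "Service": "Helpdesk",
        "HW": "HP DL380",
    },
}


def cmdb_lookup(problem: dict) -> dict:
    """Return extra tags for the host of a problem, or nothing if unknown."""
    return CMDB.get(problem["host"], {})


def enrich(problem: dict, sources: list) -> dict:
    """Merge tags from every source without overwriting existing tags."""
    tags = dict(problem["tags"])
    for lookup in sources:
        for key, value in lookup(problem).items():
            tags.setdefault(key, value)
    return {**problem, "tags": tags}


problem = {
    "name": "Server B is not available",
    "host": "Server B",
    "tags": {"Datacenter": "Tokyo1", "Class": "Availability"},
}

enriched = enrich(problem, [cmdb_lookup])
for key, value in enriched["tags"].items():
    print(f"{key}: {value}")
```

Further sources, such as a network topology or service tree lookup, would be added to the `sources` list in the same way.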

Relationship between problems

‣ Server B is not available

‣ Server C is not available

‣ Datacenter Tokyo1 is not available

‣ Network is not available to Tokyo1

‣ Server B is not available

‣ Server C is not available

‣ 197 more problems ….

Datacenter: Tokyo1 Class: Availability

Correlation rules

‣ Datacenter Tokyo1 is not available (2 related problems)

    ‣ Server B is not available

    ‣ Server C is not available
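A correlation rule of the kind shown above could, for example, group problems that share a Datacenter tag under the datacenter-level problem. The following sketch assumes each problem carries a hypothetical `scope` field to tell datacenter-level problems from host-level ones; real correlation rules would be configurable and considerably richer.

```python
def correlate(problems: list):
    """Group host-level problems under a datacenter problem with the same tag."""
    roots = {}
    for problem in problems:
        if problem["scope"] == "datacenter":
            roots[problem["tags"]["Datacenter"]] = {"cause": problem, "related": []}

    unrelated = []
    for problem in problems:
        if problem["scope"] == "datacenter":
            continue
        root = roots.get(problem["tags"].get("Datacenter"))
        if root is not None:
            root["related"].append(problem)
        else:
            unrelated.append(problem)
    return list(roots.values()), unrelated


def summary(group: dict) -> str:
    count = len(group["related"])
    return f"{group['cause']['name']} ({count} related problems)"


tags = {"Datacenter": "Tokyo1", "Class": "Availability"}
problems = [
    {"name": "Server B is not available", "scope": "host", "tags": tags},
    {"name": "Server C is not available", "scope": "host", "tags": tags},
    {"name": "Datacenter Tokyo1 is not available", "scope": "datacenter", "tags": tags},
]

groups, unrelated = correlate(problems)
print(summary(groups[0]))
for related in groups[0]["related"]:
    print("    " + related["name"])
```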

Service as a first class citizen

Service tree (diagram): Our services, Helpdesk, VOIP, Ticket selling system, WEB Server, Java Middleware API, Transaction processing, Oracle Database, Disk space, CPU, Network

Service tree (same diagram) with tag-based linkage to problems, for example:

Service: Oracle

System: Disk

• Much easier maintenance: tag based linkage between problems and services

• Choice of service propagation rules (up, down)

• Visualisation: more widgets to display services (status, SLA)

• Alerting of service status changes

• Use service tree for problem correlation and impact analysis
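Tag-based linkage and upward propagation can be sketched as follows. The tree shape, the tag names and the severity numbers are assumptions for illustration; the slides do not specify how services are nested or how status is calculated. Here a service is affected by any problem that carries all of its tags, and a parent takes the worst status found below it.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    tags: dict = field(default_factory=dict)      # tags that link problems to this service
    children: list = field(default_factory=list)


def matches(service: Service, problem: dict) -> bool:
    """A problem is linked to a service when it carries all of the service's tags."""
    if not service.tags:
        return False
    return all(problem["tags"].get(k) == v for k, v in service.tags.items())


def status(service: Service, problems: list) -> int:
    """Worst severity of linked problems, propagated up from child services."""
    own = [p["severity"] for p in problems if matches(service, p)]
    below = [status(child, problems) for child in service.children]
    return max(own + below, default=0)


# Hypothetical tree: the real hierarchy is not recoverable from the slides.
disk = Service("Disk space", {"Service": "Oracle", "System": "Disk"})
oracle = Service("Oracle Database", {"Service": "Oracle"}, [disk])
tickets = Service("Ticket selling system", children=[oracle])
voip = Service("VOIP", {"Service": "VOIP"})
root = Service("Our services", children=[tickets, voip])

problems = [
    {"name": "Free disk space is low",
     "severity": 4,
     "tags": {"Service": "Oracle", "System": "Disk"}},
]

for service in (disk, oracle, tickets, voip, root):
    print(service.name, status(service, problems))
```

No problem is attached to a service by hand: changing a tag on the problem or on the service is enough to change the linkage.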

Services

Value: Metrics -> Problems -> Services

IT Infrastructure level: more about technology

Business level: about SLA and KPIs

More value for business

Service tree (same diagram), divided into IT Infrastructure and Business levels

A few announcements

New training programs

Program  Title                   Difficulty
ZCU      Certified User          Low
ZCS      Certified Specialist    Medium
ZCP      Certified Professional  High
ZCE      Certified Expert        Very high

Two of the four programs are marked as new.

Thank you!

Some of the used icons made by Freepik from www.flaticon.com