Configuring the MaDDash Server

Note

The primary way most users configure MaDDash is with the pSConfig MaDDash Agent. The agent automatically fills in the values detailed below, and most users never need to touch any of the options listed. If you find yourself in a situation where you do need to make manual configuration changes, this document provides guidance.

The main configuration file for the maddash-server component is located at /etc/maddash/maddash-server/maddash.yaml

The file is in YAML format. In this file you define the checks you want to run and how they are organized. You can also tweak settings for the embedded web server and various other aspects of the software. The configuration file is broken into the following primary sections:

  • groups - In general, this is the section where you define lists of hosts that you want to check. You can define any number of custom groups in this section. When you define grids these groups will be used to define what constitutes the rows and columns. In theory they don’t have to be hosts, but for the perfSONAR checks this is the common case because your dimensions are endpoints of a test.

  • checks - This is where you define how to run each check. For Nagios checks, this means defining the command-line script to run and arguments to pass it.

  • grids - This is where you define how the groups will be applied to the checks. You define the groups that will compose the rows and columns, then define the type of check that will be run.

  • dashboards - This is where you group grids together.

In addition to the sections above, there are also the following optional sections:

  • reports - A set of patterns that can match across one or more cells in individual grids and generate notifications

  • notifications - Here you can define the types of notifications sent (e.g. email) when a pattern defined in the reports is matched.

There are also a number of miscellaneous parameters related to the general operation of the service. Each of these property categories is described in the remainder of this section.

groups

groups are used to define the resources that will compose the rows and columns of a grid. For the perfSONAR checks, this is often used to group the endpoints used in performance tests together (e.g. create a group for all the hosts running OWAMP tests and another for all the hosts running BWCTL tests). The names, members and number of groups are all customizable. groups take the following format:

groups:
    groupName:
        - "member"
        - "member"

All groups go under the groups block. In the example above, groupName can be replaced with any alphanumeric string that does not contain whitespace. For example “myOwampHosts”, “campusHosts”, or any other string that meets the requirements. This name will be used later in the configuration, so it’s important to make note of it. In the above example, you should change member to the name of the host or other resource you want in the list.
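As a minimal sketch of the format above, the following defines two groups. The group names reuse the naming style suggested in the text, and the hostnames are placeholders rather than real hosts:

```yaml
groups:
    # Hosts that take part in OWAMP tests (placeholder hostnames).
    myOwampHosts:
        - "owamp-a.example.net"
        - "owamp-b.example.net"
        - "owamp-c.example.net"
    # Hosts that take part in BWCTL tests (placeholder hostnames).
    myBwctlHosts:
        - "bwctl-a.example.net"
        - "bwctl-b.example.net"
```

Either group name can then be referenced as the rows or columns of a grid.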

groupMembers

groupMembers are used to further describe members of a group. Any custom property may be added, but only the values listed are understood by the default MaDDash interface. There is also a special map property that can be used to dynamically set values based on the value of the opposing row/column. groupMembers take the following format where “member” is the name of the member referenced:

groupMembers:
    - id: "member"
      ...member properties...

Below is the list of well-known properties:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | String | Yes | The group member this is referencing. For example, the hostname or address of a single item in a group list. |
| label | String | No | The value to show when displaying information about the referenced group member. |
| pstoolkiturl | String | No | The URL of the perfSONAR Toolkit web page. |
| map | YAML Object | No | Defines properties that change depending on the opposing row/column. Each key of this object is another group member, and its value is a custom set of parameters that can be accessed in template strings using %row.map.<property> and %col.map.<property> respectively. The reserved key default can be used to define a set of properties that apply to hosts not explicitly listed otherwise. |
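The sketch below shows how these properties might fit together. The hostnames, the Toolkit URL and the custom property testAddress are all hypothetical and chosen only to illustrate the map structure described above:

```yaml
groupMembers:
    - id: "owamp-a.example.net"
      label: "Site A"
      # Placeholder URL; use the address of your own Toolkit page.
      pstoolkiturl: "https://owamp-a.example.net/toolkit"
      map:
          # Applies when the opposing row/column has no entry of its own.
          default:
              testAddress: "owamp-a.example.net"
          # Applies only when the opposing member is owamp-b.example.net.
          "owamp-b.example.net":
              testAddress: "owamp-a-alt.example.net"
    - id: "owamp-b.example.net"
      label: "Site B"
```

With this in place, a template string could refer to %row.map.testAddress or %col.map.testAddress, depending on whether the member appears as a row or a column.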

checks

General Parameters

checks are where you provide instructions as to how results should be obtained. Checks take the following format in the configuration file:

checks:
    checkName :
       ...check-parameters...

The checkName can be any alphanumeric string with no whitespace. It is used to identify the check later in the file, so you should note it. For example you could name it something like “owampLossCheck” or “bwctl100Gbps”. The following parameters are available for the check:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | A human-readable name used for display purposes when describing this check. |
| description | String | Yes | Human-readable text describing the purpose of this check. The description field accepts the template variables %row and %col that will be populated with the current row and column values respectively when the check is applied to a grid. |
| type | String | Yes | The type of check. Currently the software supports net.es.maddash.checks.NagiosCheck, net.es.maddash.checks.PSNagiosCheck, and net.es.maddash.checks.RandomCheck. |
| params | YAML Object | Type dependent | A YAML object containing parameters specific to the check. See the Type-specific Parameters section. |
| checkInterval | int | Yes | How frequently to run the check, in seconds. |
| retryInterval | int | Yes | How frequently to run the check, in seconds, if it encounters a state different from the current state. |
| retryAttempts | int | Yes | The number of consecutive times a new state must be seen before it changes the state of the check. For example, if a check has been OK for many days, but suddenly a critical is seen, then a critical state must be seen 2 more times before the status will change. |
| timeout | int | Yes | The number of seconds to wait for the check to return. If it does not return in this timeframe, the check is set to the UNKNOWN status. |

Type-specific Parameters

Currently the software supports the following types of checks:

  • net.es.maddash.checks.NagiosCheck - This check is performed using a Nagios command. The parameters provided describe how to run that command.

  • net.es.maddash.checks.PSNagiosCheck - This check is a perfSONAR Nagios command. It is an extension of net.es.maddash.checks.NagiosCheck, but includes additional fields to collect information necessary to display graphs from the perfSONAR toolkit.

  • net.es.maddash.checks.RandomCheck - This should only be used for testing. This check returns a random result every time it runs.

NagiosCheck

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| command | String or YAML Object | Yes | The full Nagios command to run on the local system. It accepts the template variables listed in the Nagios Check Template Variables section. It can also be a map where the outer key is the row and the inner key is the column (both can take the special value “default”), allowing for deterministic setting of the command. |
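Putting the general parameters and the NagiosCheck command together, a check definition might look like the sketch below. The script path and its flags are hypothetical and stand in for whatever Nagios-compatible command you actually run; the interval values are arbitrary examples:

```yaml
checks:
    owampLossCheck:
        name: "Loss"
        description: "Packet loss from %row to %col"
        type: "net.es.maddash.checks.NagiosCheck"
        params:
            # Hypothetical script and flags, shown only to illustrate
            # how %row and %col are substituted into the command.
            command: "/opt/example/check_loss.sh --source %row --dest %col"
        checkInterval: 1800   # run every 30 minutes
        retryInterval: 600    # re-run after 10 minutes when the state differs
        retryAttempts: 3
        timeout: 60
```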

PSNagiosCheck

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| command | String or YAML Object | Yes | See NagiosCheck. |
| maUrl | YAML Object | Yes | The URL of the measurement archive where performance data related to this check may be retrieved. This accepts the template variables listed in the Nagios Check Template Variables section. The object has one key that is called default which will be the default URL used for any cell in a grid. The remaining keys are members of groups assigned to the row. If default and a row key are specified, the row key is preferred for that row. The value of each key is a map where the key is a member of a group assigned to the column or you can use the default key to apply the URL to every column in the row. If default is specified and a specific value for a column, the specific value for the column is preferred. See the default configuration file for a full example. |
| graphUrl | String | Yes | A URL where a graph of data related to the check can be retrieved. This accepts the template variables listed in the Nagios Check Template Variables section. |
| metaDataKeyLookup | String | Yes | DEPRECATED A URL where metaDataKeys can be looked up for the data. These are often needed to generate the graph URL. This accepts some of the template variables listed in the Nagios Check Template Variables section. Note: Some variables it cannot accept because it is responsible for generating them. |
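The fragment below is one reading of the maUrl layout described above: a top-level default, plus a per-row map whose keys are columns (or default). It shows only the PSNagiosCheck-specific parameters, omits the deprecated metaDataKeyLookup, and uses placeholder hostnames, URLs and a hypothetical command, so treat it as a sketch rather than a working configuration:

```yaml
checks:
    owampLossCheck:
        type: "net.es.maddash.checks.PSNagiosCheck"
        params:
            # Hypothetical command; %maUrl is filled in from maUrl below.
            command: "/opt/example/check_loss.sh --url %maUrl --source %row --dest %col"
            maUrl:
                # Used for any cell without a more specific entry.
                default: "https://%row/esmond/perfsonar/archive"
                # Overrides for the row owamp-a.example.net.
                "owamp-a.example.net":
                    # Every column in this row, unless listed below.
                    default: "https://archive.example.net/esmond/perfsonar/archive"
                    # This row/column pair only.
                    "owamp-b.example.net": "https://archive-b.example.net/esmond/perfsonar/archive"
            # Placeholder graph URL built from template variables.
            graphUrl: "https://graphs.example.net/graph?url=%maUrl&source=%row&dest=%col"
```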

Nagios Check Template Variables

| Name | Description |
| --- | --- |
| %row | The row in the grid associated with this check at the time it’s run. |
| %col | The column in the grid associated with this check at the time it’s run. |
| %row.<prop> | Custom properties defined in the groupMembers section. |
| %col.<prop> | Custom properties defined in the groupMembers section. |
| %row.map.<prop> | Custom properties defined in the groupMembers section that change depending on opposing row or column. |
| %col.map.<prop> | Custom properties defined in the groupMembers section that change depending on opposing row or column. |
| %maUrl | The URL of the measurement archive. You can’t use this in the maUrl parameters as this is generated from that template. |
| %maKeyF | DEPRECATED A comma-separated list of the metaDataKeys for the forward direction of a test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %maKeyR | DEPRECATED A comma-separated list of the metaDataKeys for the reverse direction of a test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %srcName | DEPRECATED The hostname of the source endpoint of a point-to-point test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %srcIP | DEPRECATED The IP address of the source endpoint of a point-to-point test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %dstName | DEPRECATED The hostname of the destination endpoint of a point-to-point test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %dstIP | DEPRECATED The IP address of the destination endpoint of a point-to-point test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %eventType | DEPRECATED The eventType returned by metaDataKeyLookup of the destination endpoint of a point-to-point test. This cannot be used in metaDataKeyLookup as it is generated after the URL is called. |
| %event.delayBuckets | The string http://ggf.org/ns/nmwg/characteristic/delay/summary/20110317 |
| %event.delay | The string http://ggf.org/ns/nmwg/characteristic/delay/summary/20070921 |
| %event.bandwidth | The string http://ggf.org/ns/nmwg/characteristics/bandwidth/achievable/2.0 |
| %event.iperf | The string http://ggf.org/ns/nmwg/tools/iperf/2.0 |
| %event.utilization | The string http://ggf.org/ns/nmwg/characteristic/utilization/2.0 |

grids

grids associate groups with checks and arrange them in a two-dimensional structure. Grids are arranged as a list of objects with the following parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | A human-readable name of the grid. |
| rows | String | Yes | The name of the group that will compose the rows of the grid. This must match a group name defined in the groups section of the configuration file or an error will be returned. |
| columns | String | Yes | The name of the group that will compose the columns of the grid. This must match a group name defined in the groups section of the configuration file or an error will be returned. |
| checks | List of Strings | Yes | The names of the check elements that need to be run for each row and column. Each element must match a check name defined under the checks section of the configuration or an error will be returned. |
| rowOrder | String | Yes | Specifies how the rows should be ordered. Valid values are alphabetical, which will sort them alphabetically, or group, which will present them exactly in the order they are defined in the group section. |
| colOrder | String | Yes | Specifies how the columns should be ordered. Valid values are alphabetical, which will sort them alphabetically, or group, which will present them exactly in the order they are defined in the group section. |
| excludeSelf | boolean | Yes | If set to 1, then a check will not be run where the value of the current row is equal to the value of the current column. If set to 0, then a check will be run in this case. |
| excludeChecks | YAML Object | No | This excludes individual checks based on the row and column. The structure is a map where the key is the name of the row where you want to exclude a check. It should match a member of the group assigned to the “rows” property of this grid or it can be the special key ‘default’ that matches every row. The value is a list of columns that should not appear in the grid. An item in the list must be a member of the group assigned to the “columns” property of this grid or the special value “all” which removes all columns for a row. A full example is provided in the default configuration file. |
| columnAlgorithm | String | Yes | Determines which checks will be run. Valid values are as follows: all - Run a check between every row and column; afterSelf - Run a check to every host that’s defined after the current row in the ‘rows’ group; beforeSelf - Run a check to every host that’s defined before the current row in the ‘rows’ group. |
| reports | String | No | References the id field of a report in the reports section to match against this grid. |
| statusLabels | YAML object | Yes | Describes what each status means. It is structured as a set of key/value pairs where the key is the status and the value is the description of the status. Valid status values are ok, warning, critical, unknown, notrun, and extra. You do not need to define every status if not all are applicable to your check. |
| statusLabels.extra | YAML object | Yes | Object where you can define custom status labels. Valid keys are value, which is an integer identifying the custom state; shortName, which is a name to label the state; and description, which is text that will appear in the GUI legend. |
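A grid built from these parameters could look like the sketch below. It assumes a group named myOwampHosts and a check named owampLossCheck are defined elsewhere in the file, uses placeholder hostnames and label text, and leaves out statusLabels.extra because its exact layout is best taken from the default configuration file:

```yaml
grids:
    - name: "OWAMP Loss"
      rows: "myOwampHosts"
      columns: "myOwampHosts"
      checks:
          - "owampLossCheck"
      rowOrder: "alphabetical"
      colOrder: "alphabetical"
      # Skip cells where the row and the column are the same host.
      excludeSelf: 1
      columnAlgorithm: "all"
      excludeChecks:
          # For every row, drop the column owamp-c.example.net.
          default:
              - "owamp-c.example.net"
          # For this one row, drop every column.
          "owamp-b.example.net":
              - "all"
      statusLabels:
          ok: "Loss is within the accepted range"
          warning: "Loss is elevated"
          critical: "Loss is severe"
          unknown: "The check could not retrieve data"
          notrun: "The check has not run yet"
```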

dashboards

dashboards group grids together. You define them as a list of YAML objects with the following properties:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | The name you want displayed as the title of the dashboard. |
| grids | List of YAML objects | Yes | The list of grids you want included in this dashboard. Each item in the list has one property name, where you specify the name of the grid. This must map to a name property for one of the defined grids in the configuration file. |
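For illustration, a dashboard that bundles two grids might be written as follows. The dashboard and grid names are placeholders and must match the name property of grids you have actually defined:

```yaml
dashboards:
    - name: "Campus Performance"
      grids:
          # Each entry names one grid from the grids section.
          - name: "OWAMP Loss"
          - name: "BWCTL Throughput"
```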

reports

reports define patterns that match across one or more cells in a single grid. You define them as a list of YAML objects with the following properties:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | String | Yes | The identifier used in grids to reference this rule. |
| rule | YAML Object | Yes | The parent rule object that defines the patterns to match and the “problem” it identifies. |
| rule.type | String | Yes | The type of rule. See Rule Types. |
| rule.selector | YAML object | Type dependent | An object describing what cells to look at when evaluating the rule. Valid when rule.type is rule or siteRule. |
| rule.match | YAML object | Type dependent | An object describing how to determine if this pattern should generate a notification. Valid when rule.type is rule or siteRule. |
| rule.problem | YAML object | Type dependent | An object describing what to do if the rule matches. Valid when rule.type is rule or siteRule. |
| rule.rules | List of YAML objects | Type dependent | List of rule objects to evaluate. Valid when rule.type is forEachSite, matchAll or matchFirst. |
| rule.site | String | Type dependent | Only valid when rule.type is siteRule. This is the name of the row or column to evaluate. |
| rule.selector.type | String | Yes | The type of selector. See Selector Types. |
| rule.selector.rowSite | String | Yes | The name of the row to select when using a selector of type cell. |
| rule.selector.colSite | String | Yes | The name of the column to select when using a selector of type cell. |
| rule.selector.rowIndex | String | Yes | The index of the check to select when selecting a row in a match of type cell or check. |
| rule.selector.colIndex | String | Yes | The index of the check to select when selecting a column in a match of type cell or check. |
| rule.match.type | String | Yes | The type of match. See Match Types. |
| rule.match.status | Integer | Type dependent | For match types status and statusThreshold, the integer value of the status to match. |
| rule.match.statuses | List of Numbers | Type dependent | For match type statusWeightedThreshold, the list of statuses where the index in the list corresponds to the integer value of the status (starting at 0). |
| rule.match.threshold | Number | Type dependent | For match type statusThreshold, the percentage of checks that must have status; for statusWeightedThreshold, the average weight of each selected check. |
| rule.problem.severity | Integer | Yes | The severity of the problem. 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN. |
| rule.problem.category | String | Yes | A string to define the category of the problem, for example CONFIGURATION or PERFORMANCE. |
| rule.problem.message | String | Yes | A message describing the problem (i.e. what it means when the rules match). |
| rule.problem.solutions | List of Strings | No | A list of potential solutions to the problem. |

Rule Types

| Name | Description |
| --- | --- |
| forEachSite | A site in this context is a group member that is in a row and/or column. This type of rule loops through these unique groupMembers and applies the rule object in the rules property. |
| matchAll | All the sub-rules defined in rules must match for the pattern to match. |
| matchFirst | The first sub-rule defined in rules that matches causes this rule to match. |
| siteRule | A rule that only looks at the site specified by a site property. |
| rule | The fundamental building block of the other rule types that selects a set of cells, matches them against a set of criteria and defines the problem a match identifies. |

Selector Types

| Name | Description |
| --- | --- |
| cell | Selects a specific cell for a site given by rowSite and/or colSite. |
| check | Selects an individual check specified by rowIndex and colIndex. |
| column | Selects a column. |
| grid | Selects the entire grid. |
| row | Selects an entire row. |
| site | Selects row and column. |

Match Types

| Name | Description |
| --- | --- |
| status | Match if all the checks selected have the given status. |
| statusThreshold | Match if the percentage of cells specified by threshold have the given status. |
| statusWeightedThreshold | Assign a weight to each status using the statuses array and if the average weight of the selected cells is above threshold generate an alarm. |
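To show how rule, selector, match and problem nest, here is a small sketch of a report. It assumes that match status integers use the same numbering as problem severity (2 for critical, 3 for unknown), which the tables above do not state explicitly; the id, messages and solutions are invented for illustration:

```yaml
reports:
    - id: "exampleGridReport"
      rule:
          # The first sub-rule that matches decides the outcome.
          type: "matchFirst"
          rules:
              # Every check in the grid is unknown (assumed status 3).
              - type: "rule"
                selector:
                    type: "grid"
                match:
                    type: "status"
                    status: 3
                problem:
                    severity: 3
                    category: "CONFIGURATION"
                    message: "Every check in the grid is unknown"
                    solutions:
                        - "Verify that the measurement archive is reachable"
              # Every check in the grid is critical (assumed status 2).
              - type: "rule"
                selector:
                    type: "grid"
                match:
                    type: "status"
                    status: 2
                problem:
                    severity: 2
                    category: "PERFORMANCE"
                    message: "Every check in the grid is critical"
```

A grid opts in to this report by setting its reports property to exampleGridReport.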

notifications

notifications define when and where to send notifications of problems found in reports. You define them as a list of YAML objects with the following properties:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | An identifier that can be any string. Mainly used as a description and may be used by the notification plug-in. |
| type | String | Yes | The type of notification. Current valid values are email or servicenow. |
| schedule | String | Yes | A cron schedule in MIN HOUR DAY-OF-MONTH MONTH DAY-OF-WEEK format. |
| problemReportFrequency | Integer | Yes | Frequency in seconds with which to report the same problem. |
| problemResolveAfter | Integer | Yes | If supported, the length of time to wait after a problem has gone away to mark it as resolved. |
| minimumSeverity | Integer | Yes | The minimum severity a problem must be to generate a notification where 0=OK, 1=Warning, 2=Critical. |
| filters | List of YAML Objects | Yes | A list of filter objects that define the types of problems for which notifications are generated. |
| filters.type | String | Yes | The type of filter. Valid values are dashboard, grid, site, and category. |
| filters.value | String | Yes | The value to match. For example, if this is of type dashboard then this is the name of the dashboard for which notifications should be generated. |
| parameters | YAML Object | Yes | Type-specific parameters. See Email Parameters and ServiceNow Parameters. |

Email Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| mailServer | YAML Object | No | An object describing how to contact your mail server. If not specified, defaults are used. |
| mailServer.address | String | No | The IP or hostname of your mail server. Default is 127.0.0.1. |
| mailServer.port | String | No | The port of your mail server. Default is port 25. |
| mailServer.username | String | No | The username for authenticating to your mail server. Default is none. |
| mailServer.password | String | No | The password for authenticating to your mail server. Default is none. |
| mailServer.useSSL | boolean | No | Indicates whether or not to use SSL when communicating with the mail server. Default is false. |
| from | String | No | The from address of emails sent. |
| replyTo | List of Strings | No | A list of replyTo addresses for emails sent. |
| to | List of Strings | No | A list of addresses where the emails should be sent. |
| cc | List of Strings | No | A list of CC addresses to send copies of the email notifications. |
| bcc | List of Strings | No | A list of BCC addresses to send blind copies of the email notifications. |
| subjectPrefix | String | No | A string to prepend to the email subject. Note that the subject is the name property of the notification definition. |
| format | String | No | The format of the email sent. Valid values are html or text. Default is html. |
| dashboardUrl | String | No | The URL of your dashboard used to add links in emails. If not specified then links will not be included. |
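An email notification assembled from the two tables above might look like this sketch. The addresses, dashboard name and URL are placeholders, the schedule follows the five-field format stated above, and problemResolveAfter is assumed to be in seconds because its unit is not given:

```yaml
notifications:
    - name: "Campus dashboard alerts"
      type: "email"
      # Top of every hour: MIN HOUR DAY-OF-MONTH MONTH DAY-OF-WEEK.
      schedule: "0 * * * *"
      # Report the same problem at most once a day.
      problemReportFrequency: 86400
      # Assumed to be seconds; the unit is not stated in this document.
      problemResolveAfter: 3600
      # Only critical problems (2) or worse.
      minimumSeverity: 2
      filters:
          - type: "dashboard"
            value: "Campus Performance"
      parameters:
          mailServer:
              address: "127.0.0.1"
              port: "25"
              useSSL: false
          from: "maddash@example.net"
          to:
              - "netops@example.net"
          subjectPrefix: "[MaDDash] "
          format: "text"
          dashboardUrl: "https://maddash.example.net/maddash-webui"
```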

ServiceNow Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| instance | String | Yes | Name of the ServiceNow instance. |
| oauthFile | String | No | A YAML file containing a combination of the username, password, clientID, and clientSecret. |
| clientID | String | No | The client ID to authenticate to ServiceNow. |
| clientSecret | String | No | The client secret to authenticate to ServiceNow. |
| username | String | No | The username to authenticate to ServiceNow. |
| password | String | No | The password to authenticate to ServiceNow. |
| recordTable | String | No | The name of the table in which to create the record. |
| recordFields | YAML Object | Yes | A YAML object with the fields to set on creation. This is highly dependent on your ServiceNow setup. The keys are the names of the fields. The values support multiple ServiceNow Template Variables. |
| resolveFields | YAML Object | No | A YAML object with the fields to set on resolve. This is highly dependent on your ServiceNow setup. The keys are the names of the fields. The values support multiple ServiceNow Template Variables. |
| dashboardUrl | String | No | The URL of the dashboard. This is used to generate links to the dashboard in created records. |
| duplicateRules | YAML Object | No | Rules used to determine if a record that already exists in ServiceNow matches a problem and what action to take if it does. |
| duplicateRules.identityFields | List of Strings | No | A list of fields that MUST match in both records for them to be considered equal. |
| duplicateRules.rules | List of YAML Objects | No | A list of rules, evaluated in order until one matches, that determine the action to take when a record is found that matches the given identity fields. |
| duplicateRules.rules[N].equalsFields | YAML Object | No | An object where the key is the field to match and the value is what that field must equal for the given updateFields to be applied to the record. |
| duplicateRules.rules[N].gtFields | YAML Object | No | An object where the key is the field to match and the value is what that field must be greater than for the given updateFields to be applied to the record. |
| duplicateRules.rules[N].ltFields | YAML Object | No | An object where the key is the field to match and the value is what that field must be less than for the given updateFields to be applied to the record. |
| duplicateRules.rules[N].updateFields | YAML Object | No | The fields to update in the record and the values they should be given if this rule matches. The values support multiple ServiceNow Template Variables. |

ServiceNow Template Variables

| Name | Description |
| --- | --- |
| %br | A line break. |
| %problemEntity | The source of the problem - either a grid or a site (i.e. a row and/or column). |
| %gridUrl | The URL of the grid. |
| %gridLink | An HTML link to the grid. |
| %siteName | The name of the site (i.e. row and/or column) that is the source of the alarm. |
| %gridName | Name of the grid that is the source of the alarm. |
| %isGlobal | Boolean indicating whether this affects the entire grid. |
| %severity | The severity of the problem. |
| %category | The problem category. |
| %solutions | A bulleted list of potential solutions. |
| %name | The name of the problem. |
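Because record fields depend heavily on each ServiceNow setup, the following is only a sketch of the parameters block of a servicenow notification. The instance name, file path, table name, field names and the state values are all hypothetical; only the parameter names and template variables come from the tables above:

```yaml
parameters:
    instance: "example"
    # Hypothetical path to a YAML file holding the credentials.
    oauthFile: "/path/to/servicenow-oauth.yaml"
    recordTable: "incident"
    recordFields:
        # Field names below are assumptions about the target table.
        short_description: "%name"
        description: "%category problem on %problemEntity (severity %severity) %br %solutions %br %gridLink"
    resolveFields:
        state: "resolved"
    dashboardUrl: "https://maddash.example.net/maddash-webui"
    duplicateRules:
        # An existing record is the same problem if these fields match.
        identityFields:
            - "short_description"
        rules:
            # If the matching record was already resolved, reopen it.
            - equalsFields:
                  state: "resolved"
              updateFields:
                  state: "open"
```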

General Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| database | String | Yes | The path to the directory where the database is stored. |
| jobThreadPoolSize | Integer | No | The maximum number of checks that can run in parallel. Defaults to 20. |
| jobBatchSize | Integer | No | The maximum number of checks that can be running or waiting to run in memory. Defaults to 250. |
| disableScheduler | Boolean | No | If set to 1 then the server will only run as a REST server and not execute any new checks. Default is 0. |
| skipTableBuild | Boolean | No | If set to 1 then the database tables will not be built and indexes will not be built/rebuilt. The first time you run the server it must be set to 0. After that, you may find that setting it to 1 significantly speeds up boot time. Leaving it set to 0, though, has the advantage of rebuilding indexes on startup, which can improve query performance. |
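These properties sit at the top level of the file. A sketch using the documented defaults, with a placeholder database directory, could be:

```yaml
# Placeholder directory; point this at your own database location.
database: "/var/lib/maddash/"
# Documented defaults, written out explicitly.
jobThreadPoolSize: 20
jobBatchSize: 250
disableScheduler: 0
# Must be 0 on the first run so the tables are created.
skipTableBuild: 0
```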

Web Server Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| serverHost | String | No | The hostname of the interface where the web server should listen. Defaults to localhost. |
| http | YAML Object | Yes (unless https specified) | Parameters related to http. See http properties section. |
| https | YAML Object | Yes (unless http specified) | Parameters related to https. See https properties section. |

http properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| port | Integer | Yes | The port on which the web server should listen for HTTP connections. |
| proxyMode | String | Yes | Reserved for future use. Currently lets the server know whether it is behind a proxy. This may be used in a later implementation to extract headers that forward information related to authentication. |

https properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| port | Integer | Yes | The port on which the web server should listen for HTTPS connections. |
| keystore | String | Yes | The keystore containing the key ‘mykey’ to use as the SSL server certificate. It should also contain any trusted certificates if doing client authentication. |
| keystorePassword | String | Yes | The password to access the keystore. |
| clientAuth | String | Yes | Indicates whether a client to the REST server must have a trusted SSL certificate. Valid values are require, want and off. require means the user MUST have a trusted certificate or the request will be rejected. want means the server will check the certificate if one is presented, but will not reject requests that do not provide one. off means no certificate is required. |
| proxyMode | String | Yes | Reserved for future use. Currently lets the server know whether it is behind a proxy. This may be used in a later implementation to extract headers that forward information related to authentication. |
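As a rough sketch, the web server settings could be combined as shown below. The port numbers, keystore path and password are placeholders, and the proxyMode value "none" is an assumption since the valid values are not listed here. Note that clientAuth is quoted so that YAML parsers do not read off as a boolean:

```yaml
serverHost: "localhost"
http:
    port: 8881
    # Assumed value; valid values are not listed in this document.
    proxyMode: "none"
https:
    port: 8882
    # Hypothetical keystore location and password.
    keystore: "/path/to/keystore.jks"
    keystorePassword: "changeit"
    # Quoted so the parser keeps it as a string.
    clientAuth: "off"
    proxyMode: "none"
```

Only one of http or https is required, so either block can be dropped.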