New ping engine and its settings

06. 06. 2019

Through our Helpdesk, we have received a lot of queries about the way the new ping feature works as well as about the new graphs, so we have decided to provide you with a more detailed description of how it all works.

Why did we come up with this new feature?

In version 5.06, we added a new ping engine and ping graphs to ISPadmin. This changed the overall approach to testing the availability of routers, devices, modems, switches, and clients. The decision was based on our own experience and observation: the old ping engine no longer met the performance requirements, which keep growing with the number of monitored devices and clients. The previous system also offered too few customization options. With the new engine, we wanted to let users tune the system to their own needs.

How is the new engine different?

ISPadmin now processes a queue that is filled with IP addresses from various sources (routers, switches, endpoints, client IPs, modems, …; in short, "everything that has an IP address"). Each IP address is classified by device type, and different ping settings can be applied to each group.

Batches of IP addresses are created in the background and then processed in parallel. As a result, many hosts are tested simultaneously, and the availability data is collected for subsequent graph plotting.
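To make the batching idea more concrete, here is a minimal sketch in Python. It is not ISPadmin's actual code: the function names are hypothetical, threads stand in for the engine's processes, and `fake_probe` is a placeholder that sends no ICMP traffic at all.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List


def make_batches(hosts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Split the queue of IP addresses into batches of at most batch_size."""
    for start in range(0, len(hosts), batch_size):
        yield hosts[start:start + batch_size]


def fake_probe(host: str) -> bool:
    """Placeholder for a real availability test; always reports 'up'."""
    return True


def run_batches(
    hosts: List[str],
    batch_size: int,
    max_processes: int,
    probe: Callable[[str], bool] = fake_probe,
) -> Dict[str, bool]:
    """Test all hosts, running at most max_processes batches at a time."""

    def run_one(batch: List[str]) -> Dict[str, bool]:
        return {host: probe(host) for host in batch}

    results: Dict[str, bool] = {}
    # Threads are used here only for simplicity of the illustration.
    with ThreadPoolExecutor(max_workers=max_processes) as pool:
        for partial in pool.map(run_one, make_batches(hosts, batch_size)):
            results.update(partial)
    return results
```

For example, `run_batches(["10.0.0.1", "10.0.0.2", "10.0.0.3"], batch_size=2, max_processes=2)` splits the three addresses into two batches and tests both batches concurrently.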

All statistical data is stored in the Influx database, which is specifically designed for this purpose.
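As an illustration only, the sketch below formats one test result as a line of InfluxDB line protocol (`measurement,tags fields timestamp`). The measurement name `ping` and all tag and field names are assumptions made for this example; the names ISPadmin actually uses are not documented here.

```python
def escape_tag(value: str) -> str:
    """Escape commas, equals signs, and spaces in a tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(
    host: str,
    group: str,
    rtt_min: float,
    rtt_avg: float,
    rtt_max: float,
    loss_pct: float,
    ts_ns: int,
) -> str:
    """Build one line-protocol record (hypothetical schema)."""
    tags = "host=%s,group=%s" % (escape_tag(host), escape_tag(group))
    fields = "rtt_min=%s,rtt_avg=%s,rtt_max=%s,loss=%s" % (
        float(rtt_min), float(rtt_avg), float(rtt_max), float(loss_pct)
    )
    # The timestamp is given in nanoseconds, the default precision.
    return "ping,%s %s %d" % (tags, fields, ts_ns)
```

Storing the minimum, average, and maximum as separate fields of one record keeps all values of a single test under the same timestamp, which is convenient for plotting.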

How to set the system up?

Individual and system settings

  • Hardware / Settings / Ping settings / Individual
  • Hardware / Settings / Ping settings / System

You can set specific ping behavior for each device (IP) or device group.

The following parameters are available:

  • Active: Enables or disables availability monitoring for the host group. The only exception is the Routers group, which is always active. Changes made here are applied automatically within one minute.
  • Size: Size of the packet in bytes (including the header) that is used to ping an IP address.
  • Timeout: Time (in seconds) after which the system classifies the host as unavailable. Available options: 1 and 2 seconds.
  • Count: Number of ping requests sent per host. This number is used to calculate the average, maximum, and minimum values, which are then displayed in the graph.
  • Interval: Time (in seconds) between individual ping requests sent to a particular host. The number of requests is given by the Count value.
  • Period: Time (in minutes) after which the host availability test is repeated.
  • Expiration: Expiration date of the availability test. This option is only active in the specific device settings. It is not possible to define it in the group settings.
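The way Count and Timeout feed into the graph values can be sketched as follows. This is a simplified illustration under our own assumptions, not the engine's code: each request yields either a round-trip time in milliseconds or `None` when no reply arrived within the Timeout, and the names `PingSummary` and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PingSummary:
    sent: int
    received: int
    loss_pct: float
    rtt_min: Optional[float]
    rtt_avg: Optional[float]
    rtt_max: Optional[float]


def summarize(rtts_ms: List[Optional[float]]) -> PingSummary:
    """Summarize one test; None marks a request that timed out."""
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    loss = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    if not replies:
        # No reply at all: there is nothing to average.
        return PingSummary(sent, 0, loss, None, None, None)
    return PingSummary(
        sent,
        len(replies),
        loss,
        min(replies),
        sum(replies) / len(replies),
        max(replies),
    )
```

With Count set to 4 and one lost request, `summarize([10.0, None, 20.0, 30.0])` reports 25% packet loss and a minimum, average, and maximum of 10, 20, and 30 ms. A higher Count gives smoother statistics but generates more traffic per test.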

Global variables settings

  • Hardware / Settings / Ping settings / System

On the page given above, you can set the following global variables that affect the ping engine:

  • Number of ping requests within a single process: Number of hosts (IP addresses) per batch (process)
  • Number of ping processes: Number of batches (processes) processed simultaneously
  • Process start delay: Delay when starting further processes (in seconds). By setting this item correctly, you can make sure that the system is not overloaded.
  • Influx DB cache size: Cache size for Influx. The given value is fine for 99% of systems. Too low a value would cause frequent deletion and saving of data and a higher system load. Too high a value could overload the HTTP GET request and cause stored data to be lost.

Host testing uses batches of IP addresses (processes), and these batches are processed in parallel. It is important to understand that many IP addresses can be tested at the same time, which means that a large number of ping requests (ICMP) are sent and processed at once. The exact number follows from the number of IP addresses in a batch and the number of concurrently processed batches. The settings of individual host groups and individual hosts also come into play. This complex and performance-intensive mechanism affects both the ping results and the server load.

Example

If you have 200 hosts in a batch and 5 concurrent batches, the system generates 1,000 ping requests at a time. If each test consists of 10 requests with an interval of 0.1 seconds, the system generates 10,000 ping requests within a second.
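The arithmetic of this example can be written down as a small helper. The function names are ours and purely illustrative; the parameters correspond to "Number of ping requests within a single process", "Number of ping processes", and Count.

```python
def hosts_in_flight(batch_size: int, processes: int) -> int:
    """Hosts tested at the same moment: batch size times running batches."""
    return batch_size * processes


def requests_per_round(batch_size: int, processes: int, count: int) -> int:
    """Ping requests sent while one set of concurrent batches is tested."""
    return hosts_in_flight(batch_size, processes) * count


print(hosts_in_flight(200, 5))         # 1000
print(requests_per_round(200, 5, 10))  # 10000
```

Halving either the batch size or the number of processes halves the number of requests in flight, which is why these two settings are the first ones to lower on an overloaded system.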

No more than the specified number of batches (processes) can run at a time (the Number of ping processes setting). For example, if the total number of all hosts (IPs) is 50,000, the ping engine can keep running and loading the system for 2 minutes. This in turn affects the overall stability of the system and the data obtained from the ping output.
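To get a rough feel for how long the engine stays busy, you can count how many successive sets of concurrent batches are needed to cover all hosts. The sketch below assumes, as a simplification, that batches run in strict rounds and that the process start delay is ignored; the function name is hypothetical, and the 200 × 5 settings are borrowed from the example above.

```python
import math


def rounds_needed(total_hosts: int, batch_size: int, processes: int) -> int:
    """Number of successive rounds of concurrent batches for all hosts."""
    return math.ceil(total_hosts / (batch_size * processes))


print(rounds_needed(50_000, 200, 5))  # 50
print(rounds_needed(50_000, 50, 1))   # 1000
```

The second call shows the trade-off: small batches and a single process put less load on the server at any moment, but covering all hosts takes many more rounds.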

Recommendation

If the packet loss is not as expected, use the following minimal settings:

Global variables:

  • Number of ping requests within a single process: 50
  • Number of ping processes: 1
  • Process start delay: 1 s

System settings:

  • Set all groups except Routers to inactive.
  • The Routers group is set as follows:
    • Size: 56 B
    • Timeout: 1 s
    • Count: 3
    • Interval: 0.5 s
    • Period: 1 min (only for testing purposes, then set it back to 5 min)

When the system stabilizes, gradually add host groups and increase the number of processes and the number of hosts per process.

The new availability testing system differs significantly from the previous design. The results in the graphs may therefore differ as well, because of the different processing and, above all, the different system throughput.
