perfSONAR Frequently Asked Questions (FAQs)
Installation Questions
Q: What are the hardware requirements for running the perfSONAR Toolkit?
A: See Hardware Requirements.
Q: Does my machine have to meet the System Requirements?
A: Nothing in the perfSONAR Toolkit prevents a system that falls short of the requirements from starting. However, erroneous or inaccurate behavior is possible if the hardware cannot support the measurement tools, so meeting or exceeding the minimum guidelines is recommended.
Q: The Services screen shows many services in the non-running state when first started, what is wrong?
A: Services should start right away, so services that stay in the non-running state may indicate an installation problem. See Reading Log Files for where to look for more information.
Q: I do not see my service in the directory of services, where is it?
A: Much like DNS, the information that populates the Lookup Service takes time to propagate. Please allow some time (e.g. a few hours) before your service is fully visible. If it still does not appear, look in lsregistrationdaemon.log for errors.
Q: What should I enter for the Communities section of my Administrative Information configuration?
A: The goal of communities is to help identify the types of tests a perfSONAR measurement point runs and where they are run. Think of them as being similar to the labels you assign to photos or music. Some examples of communities one might assign are:
Internet2 - The tests run somehow connect to the Internet2 backbone
LHC (CMS, ATLAS, etc.) - The host is part of the LHC deployment structure
eVLBI - The host is a part of the larger telescope community
MAX - A connector or member of the MAX gigapop
DOE-SC-LAB - US Department of Energy Office of Science Labs
Communities are not required and are currently not strictly defined. Use your best judgement in defining them and try to pick communities that will help others determine if a measurement to your host would be beneficial when configuring their own tests.
Q: I do not think I am a member of a community, should I put anything?
A: Communities are not required, but they allow other individuals and organizations to find and use your services. Assign a community if you think it will allow others to gain insight on whether running a test to your host is beneficial.
Q: How do I disable global registration?
A: The following commands will stop, and disable, this service:
systemctl stop perfsonar-lsregistrationdaemon
systemctl disable perfsonar-lsregistrationdaemon
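To confirm the result afterwards, a check along the following lines may help. This is a minimal sketch: the unit_state helper is a hypothetical name introduced here, and it only relies on the standard systemctl is-enabled and is-active subcommands, falling back to "unknown" where systemctl is unavailable.

```shell
# Hypothetical helper: report whether a unit is enabled and active.
unit_state() {
    unit="$1"
    if ! command -v systemctl >/dev/null 2>&1; then
        # No systemd tooling on this host; nothing to query
        echo "enabled=unknown active=unknown"
        return 0
    fi
    # Both subcommands exit non-zero for disabled/inactive units, so ignore the status
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null) || true
    active=$(systemctl is-active "$unit" 2>/dev/null) || true
    echo "enabled=${enabled:-unknown} active=${active:-unknown}"
}

unit_state perfsonar-lsregistrationdaemon
```

After the two commands above have run, the output would be expected to report the unit as disabled and inactive.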
Q: Which repository addresses will be used to get updates to the perfSONAR software?
A: By default, the perfSONAR repository points at a mirror list hosted by software.internet2.edu, so you must allow access to that host in order to fetch the list. Packages can then be downloaded from any of the listed sites, which include linux.mirrors.es.net, software.internet2.edu, and a few others. In practice it should be enough to open access to software.internet2.edu (for the mirror list) and linux.mirrors.es.net (for the packages), because linux.mirrors.es.net also mirrors all the base CentOS packages.
Q: I am trying to run perfSONAR on low-cost hardware (e.g. Raspberry Pi, etc.). Where should I start?
A: Numerous low-cost hardware platforms have emerged that look attractive for network performance measurement. However, the perfSONAR collaboration neither recommends nor supports the use of perfSONAR on low-end, ARM-based hardware such as the Raspberry Pi. It has been shown that it is difficult to distinguish network issues from host issues on these devices. In particular, we do not recommend these devices for testing throughput. Use of latency-based tools (ping, OWAMP) is possible provided that an accurate clock source is available. For more information, see perfSONAR on Low-cost Hardware.
Q: I am running a small node, and seeing a lot of IO. What is going on?
A: Some users report abnormal I/O activity on their small nodes (e.g. iostat reports long w_await times, sometimes measured in multiple seconds). These periods coincide with test intervals, in particular those of OWAMP tests. Deeper investigation found the cause to be excessive logging: syslogd and systemd-journald process syslog messages from owampd and powstream into /var/log/messages, sometimes at 30-40 messages per second depending on the testing configuration of the host. Because small nodes are based on flash memory, logging should be adjusted as follows:
Keep the journal in memory by editing /etc/systemd/journald.conf and setting Storage=volatile instead of the default Storage=auto.
Limit the maximum memory used for journaling with the RuntimeKeepFree and RuntimeMaxUse options.
Do not restart the journaling service (i.e., do not run systemctl restart systemd-journald). Reboot the OS instead.
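The journald settings described above might look like the following sketch. It stages the options in a scratch file for review rather than editing /etc/systemd/journald.conf directly. The 32M and 64M values and the scratch variable are illustrative assumptions, not project recommendations; pick limits that suit the memory on your node.

```shell
# Sketch only: stage the journald settings in a scratch file for review.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
[Journal]
# Keep the journal in memory instead of writing it to flash storage
Storage=volatile
# Cap the memory the in-memory journal may use (assumed value)
RuntimeMaxUse=32M
# Leave at least this much space free for other users (assumed value)
RuntimeKeepFree=64M
EOF

cat "$scratch"
# After review, merge these lines into /etc/systemd/journald.conf and reboot.
```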
Q: Where can I find more resources regarding timekeeping for VMware Virtual Machines?
A: VMware has two resources worth reading:
Q: How do you upgrade a perfSONAR node from Debian 7 to Debian 9?
A: Because of systemd, upgrading a host running perfSONAR on Debian 7 to Debian 9 is better done in multiple steps, as described below:
Upgrade Debian 7 to Debian 8 (following Debian instructions, here are Jessie upgrade notes for i386 architecture)
Reboot (to get systemd running)
Change perfSONAR repository from perfsonar-wheezy-release to perfsonar-release
Upgrade Debian 8 to Debian 9 (following Debian instructions, here are Stretch upgrade notes for i386 architecture)
Q: Why can’t my Debian/Ubuntu host find ping?
A: Run apt reinstall iputils-ping to fix the issue. The problem was caused by a bug in the paris-traceroute package, which installed a non-standard version of ping that required sudo. That version was removed in perfSONAR 5.0.5, which left some systems without a ping command.
Tool Questions
Q: What is pScheduler and how do I use it?
A: pScheduler is used to schedule network tests on perfSONAR hosts. See What is pScheduler?
Q: What is OWAMP and how do I use it?
A: OWAMP (One-Way Active Measurement Protocol) is a client/server program that was developed to provide delay and jitter measurements between two target computers. At boot time, the perfSONAR Toolkit starts an OWAMP server process and leaves it listening on TCP port 861. This server may then be used by remote clients. Additionally, perfSONAR includes an OWAMP client application that can be used to test to remote instances. For more information on how it fits into perfSONAR overall, see What is perfSONAR?.
Q: What happened to the NDT and NPAD tools?
A: NDT and NPAD depend on web100, which is no longer supported, so they have been dropped from perfSONAR starting with v4.0. If you need similar functionality, we recommend that you use https://www.measurementlab.net/tests/
Q: What happened to the BWCTL tool?
A: BWCTL is no longer included by default with perfSONAR. BWCTL was used to schedule network tests on perfSONAR hosts prior to perfSONAR v4.0 but has been replaced by pScheduler.
Q: How can I set limits to prevent others from overusing my test host? What is the purpose of pscheduler limits?
A: The pscheduler limits system allows you to limit the influence that outside users have on your system. For example, to prevent your machine/network from being saturated with throughput tests, limit the duration and maximum bandwidth available. For more information see Configuring pScheduler Limits.
Q: Can I run both throughput and latency/loss tests on the same interface without interference due to the way pscheduler scheduling works?
A: Currently you cannot guarantee that there will be no interference. pScheduler rtt tests, which execute the ping tool, and OWAMP latency and latencybg tests, which execute owping and powstream respectively, are considered background tasks and can be scheduled in parallel with each other as well as with throughput tests. Given how frequently users prefer to run tools such as ping and owping (and powstream runs constantly), there would be very few test slots available if this were not the case. This does not mean you cannot run these tests on the same interface; it just means some correlation of results may be necessary when debugging. It is recommended, though not required, that you run these tests on a separate interface from throughput tests.
Q: How can I force testing over IPv4 or IPv6 in a pSConfig template?
A: The exact option may vary depending on the test plug-in, but most of the default plug-ins support an ip-version field in a test object’s spec that can be set to 4 or 6.
Q: How do I configure a pSConfig template to pace all TCP traffic to only 5Gbps, so that I don’t use all my site’s bandwidth?
A: Set the bandwidth property in a test object’s spec. It accepts bandwidth as an integer in bits per second.
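Because the value must be a plain integer, it can help to work out the number before editing the template. The sketch below computes 5 Gbps in bits per second and prints a spec fragment containing only the bandwidth property. The variable names are illustrative, and the other fields your test's spec needs are deliberately left out.

```shell
# 5 Gbps expressed as an integer number of bits per second
bandwidth_bps=$((5 * 1000 * 1000 * 1000))

# Build a fragment holding only the bandwidth property; merge it into the
# spec of your own test object alongside its existing fields.
spec_fragment=$(printf '"spec": {\n    "bandwidth": %s\n}' "$bandwidth_bps")
printf '%s\n' "$spec_fragment"
```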
Q: Why do I get such weird results when I test from a 10G connected host to 1G connected host?
A: See https://fasterdata.es.net/performance-testing/troubleshooting/interface-speed-mismatch/
Q: My perfSONAR results show consistent line-rate performance, but a researcher at my site is reporting really poor performance, what gives?
A: perfSONAR is designed to give a “best case scenario” test result for end-to-end testing:
perfSONAR is typically installed on well-provisioned server-class hardware that contains adequate CPU, memory, and NIC support.
The perfSONAR toolkit follows this recommended host tuning: https://fasterdata.es.net/host-tuning/linux/
pScheduler’s throughput tests invoke “memory to memory” test tools.
perfSONAR typically runs short single-streamed TCP tests.
The user of a network may not have a machine that is as tuned as a perfSONAR node, could be using an application that is incorrect for the job of data movement, and may have a bottleneck due to storage performance. Consider all of these factors when working with them to identify performance issues. It is often the case that the ‘network’ may be working fine, but the host and software infrastructure need additional attention.
Host and Network Administration Questions
Q: Where are the relevant logs for perfSONAR services?
A: Please see Reading Log Files for more information.
Q: How do I enable log compression with logrotate.conf?
A: Log files can sometimes grow large and consume disk space even before they are rotated. Logrotate is a system utility that is installed by default and configured to handle log rotation for all installed packages and applications. To enable system-wide log file compression, edit /etc/logrotate.conf and uncomment the compress option. Rotated log files will then be compressed and given a .gz file extension.
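The edit itself could be scripted roughly as follows. This sketch practices on a scratch copy rather than on /etc/logrotate.conf, and the sample contents are an assumption about what a typical default file contains. Check your own file before applying the same sed expression to it.

```shell
# Sketch only: work on a scratch file, not on /etc/logrotate.conf itself.
sample=$(mktemp)
printf '%s\n' 'weekly' 'rotate 4' 'create' '#compress' > "$sample"

# Uncomment the compress directive (matches "#compress" or "# compress")
sed 's/^#[[:space:]]*compress[[:space:]]*$/compress/' "$sample" > "$sample.new"
mv "$sample.new" "$sample"

cat "$sample"
```

Writing to a new file and moving it into place avoids relying on in-place editing flags that differ between sed implementations.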
Q: Can I use a firewall?
A: Please see Firewalls and Security Software.
Q: How many NTP servers do I need, can I select them all?
A: It is recommended that 4 to 5 close and active servers be used. The Select Closest Servers button will help with this decision. Note that some servers may not be available due to routing restrictions (e.g. non-R&E networks vs R&E networks - a common problem for Internet2 and ESnet servers).
Q: When setting up a dual homed host, how can one get individual tests to use one interface or another?
Q: How do I change the MTU for a device?
A: Change the MTU on your perfSONAR host only if the underlying network supports the chosen size. Please work with your local network staff before making this change on any host. You can view the MTU of your network devices by executing the /sbin/ifconfig command. To temporarily change the MTU for a device, run ifconfig with the device name and the new MTU. For example:
ifconfig eth0 mtu 9000 up
To make the change permanent, modify the device’s configuration file. These files are in /etc/sysconfig/network-scripts/ and have names like ifcfg-eth0 for the device eth0 and ifcfg-eth1 for eth1.
For example, you could add the line MTU="9000" for IPv4 or IPV6_MTU="9000" for IPv6 to /etc/sysconfig/network-scripts/ifcfg-eth0. After making the changes, restart the network services by running service network restart as root.
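On hosts where ifconfig is not installed, the current MTU values can also be read from sysfs, and the ip tool from iproute2 can set them. The sketch below assumes a Linux host; list_mtus is a hypothetical helper name, and eth0 in the commented command is only a placeholder device.

```shell
# Hypothetical helper: print "<device> <mtu>" for each network device.
# The optional argument overrides the sysfs directory that is scanned.
list_mtus() {
    base="${1:-/sys/class/net}"
    for f in "$base"/*/mtu; do
        # Skip entries that do not exist or cannot be read
        [ -r "$f" ] || continue
        dev=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

list_mtus

# Temporary change with iproute2, equivalent to the ifconfig example (run as root):
#   ip link set dev eth0 mtu 9000
```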
Q: How can I configure my toolkit web interface to display a private IP?
A: The file resides at /usr/lib/perfsonar/web-ng/etc/web_admin.conf. The config option is allow_internal_addresses. Set it to 1. This affects the GUI display only; your measurements should work using private addresses with no special modification.
Q: How do I change the SSL certificate used by the web server?
A: The toolkit by default generates a self-signed SSL certificate that it configures for use with the Apache web server. Some users may desire to replace this certificate with a certificate signed by a certificate authority (CA).
You may also need to replace the certificate due to a problem sometimes encountered with browsers not accepting the self-signed certificate. You may see an error like the following:
HOST uses an invalid security certificate.
The certificate is not trusted because it is self-signed.
The certificate is only valid for localhost.localdomain
(Error code: sec_error_untrusted_issuer)
You can find instructions for installing a new certificate in Apache here.
Q: I forgot to enable IPv6 in CentOS when I installed the toolkit. How do I enable it?
A: It is recommended that you always enable IPv6 during the CentOS installation portion of the toolkit setup. If you did not enable it, you can do so with the following steps:
Log in to the toolkit as a user capable of running sudo
Run sudo and enter your sudo password
Open the file /etc/modprobe.conf in a text editor and remove the following lines:
alias net-pf-10 off
alias ipv6 off
options ipv6 disable=1
Then restart the host. You can now assign an IPv6 address.
Q: Why is the static IPv6 address I assigned during the net-install process not configured when my host starts up?
A: When you perform the net-install of the toolkit, you will be prompted twice to enter networking information by CentOS. The first time is to define the networking to be used for downloading required packages. The second prompt is later in the installation and defines what will be configured on the host post-installation. It is a known CentOS behavior that IPv6 information entered at the first prompt is not automatically filled-in at the second prompt. This can be confusing because the IPv4 information does get automatically filled-in. If you do not manually enter the IPv6 information a second time, then your host will not have the address configured post-installation. You will have to manually assign the address if this happens.
Q: How do I set up a perfSONAR node to have two interfaces on the same subnet?
A: This can be accomplished by setting the following items in sysctl:
net.ipv4.conf.default.arp_filter = 2
net.ipv4.conf.all.arp_filter = 2
More information available here: http://z-issue.com/wp/linux-rhel-6-centos-6-two-nics-in-the-same-subnet-but-secondary-doesnt-ping/
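One way to keep these settings across reboots is a drop-in file under /etc/sysctl.d/. The sketch below only stages the two lines in a scratch file; the dropin variable and the suggested file name are assumptions introduced for illustration.

```shell
# Sketch only: stage the two settings in a scratch file for review.
dropin=$(mktemp)
cat > "$dropin" <<'EOF'
net.ipv4.conf.default.arp_filter = 2
net.ipv4.conf.all.arp_filter = 2
EOF

cat "$dropin"

# To apply (as root), copy the file into /etc/sysctl.d/ under a name of your
# choosing (for example, the hypothetical 90-arp-filter.conf) and reload:
#   sysctl --system
```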
Q: What TCP congestion control algorithm is used by the perfSONAR Toolkit?
A: The perfSONAR toolkit sets the TCP congestion control algorithm to htcp.
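To see which algorithm a given host is actually using, the kernel exposes the current value under /proc on Linux. A minimal sketch follows; current_cc is a hypothetical helper name, and it prints "unknown" where the file is not readable.

```shell
# Hypothetical helper: print the active TCP congestion control algorithm.
# The optional argument overrides the file that is read.
current_cc() {
    f="${1:-/proc/sys/net/ipv4/tcp_congestion_control}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

echo "TCP congestion control: $(current_cc)"
```

On a toolkit host this would be expected to report htcp.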
Q: How can I add custom rules to my firewall?
Q: Is it possible to change the default port for tool X?
A: Some measurement tools use 2 kinds of ports:
Contact ports, i.e. a well-known location used to contact the daemon to initiate a test
Test ports, i.e. negotiated ports that carry test or control traffic when a test is requested
Test ports are easily configured to run on a specific set of ports, and can be configured to be opened in a site firewall. The daemon is often able to negotiate these at run time. The contact port is well known, and because of that should never be changed to a different value. Doing so severely impacts the ability of the tool to interoperate on a global scale.
As an example, the OWAMP server listens on the registered port 861 (see http://tools.ietf.org/search/rfc4656 section 2). This is the standard port for the application, in the same way that port 80 is the standard port for an HTTP server. While one can run a web server on a port other than 80, it makes the web server less useful because it’s not a standard config. The same is true for OWAMP. The OWAMP protocol is standardized, and has a well-known port - port 861 - associated with it. Running the OWAMP daemon on a non-standard port introduces significant interoperability challenges between deployments.
If you’re going to run a measurement infrastructure inside your own organization, you are of course free to do whatever you want. If you want to integrate with the rest of the world, the measurement tools should be run on the standard port to ensure interoperability.
Q: Why doesn’t the perfSONAR toolkit include the most recent version of vendor X’s driver?
A: We only support the default CentOS device drivers on the toolkit. Check your NIC vendor’s website to see if a newer version of the driver is available for download.
Q: How can I configure yum to automatically update the system?
A: Note that as of version v3.4, this is enabled by default. See Updating perfSONAR.
Q: My host was impacted by a Linux security issue (Shellshock/Heartbleed/etc.). What should I do?
A: Please check the RedHat vulnerability archive or the Debian security list for updates, and upgrade your system as soon as the update is available.
Q: How do I get rid of the “There isn’t a perfSONAR sudo user defined” message?
A: The best option is to add a non-root user to the pssudo group. If you have another method of handling sudo users, comment out the lines in /etc/profile.d/add_psadmin_pssudo.sh. Do not remove the file entirely, just modify it, otherwise it will get restored on update.
Q: Is it possible to use non-Intel SFP+ optics in the Intel X520-SR2 NIC?
A: The ixgbe driver has an option to allow alternative optics:
allow_unsupported_sfp=1
This can be tested using the following commands:
sudo modprobe -r ixgbe
sudo modprobe ixgbe allow_unsupported_sfp=1
Q: How can I tune a Dell server for high throughput and low latency?
A: Dell offers this guide on tuning:
Q: What is PTP?
A: PTP is the Precision Time Protocol, also known as IEEE 1588, a more accurate successor to the Network Time Protocol, which has been used for many years to discipline the clocks in general-purpose computers. Under ideal conditions, PTP can discipline a clock to within a few microseconds of UTC. Compare this with NTP, which typically has an accuracy of about a millisecond when used with clocks on the Internet and 100 microseconds or less when using a stratum-1 clock in a LAN environment.
Q: What is required to use PTP in my network?
A: Unlike NTP, which provides satisfactory operation using software clients and a pool of servers usually on the Internet, running PTP requires specialized equipment:
Clocks. For production-grade service, PTP requires a minimum of two grandmaster clocks. These are dedicated hardware appliances that use the Global Positioning System to recover accurate time and a high-precision oscillator for holdover during periods when GPS is not available. At this writing, base model clocks cost about US$2,500 each.
Network Infrastructure. PTP requires that all network elements between the grandmaster and slaves be capable of functioning as a boundary clock. This is a feature typically found on high-end routers and switches designed for use in low-latency applications.
Network Interface Cards. Interfaces in the slave system require hardware support for the timestamping that makes PTP work accurately. While software-only PTP clients exist, they may suffer inaccuracies induced by the vagaries of running under a general-purpose operating system and provide inaccurate results when testing latency in a LAN environment.
Q: Does perfSONAR support PTP?
A: Not at this time. The prohibitive cost of deploying PTP makes it unlikely to be used widely enough to merit adding support. The current perfSONAR code contains assumptions that the clock is disciplined by NTP and would need to be modified for other protocols.
Q: When trying to migrate from a CentOS 6 to a CentOS 7 host I receive a pg_dump error. How do I fix it?
A: The script that backs up and restores the relevant configuration files and measurement data may fail with a pg_dump error when creating the pScheduler backup. This happens when both PostgreSQL 8 and PostgreSQL 9 are installed, because the pScheduler backup script expects only PostgreSQL 9. You can patch this by editing /usr/libexec/pscheduler/commands/backup:
Remove line:
pg_dump \
Add in this place these three lines:
PG_DUMP=pg_dump
[ -x /usr/pgsql-9.5/bin/pg_dump ] && PG_DUMP=/usr/pgsql-9.5/bin/pg_dump
$PG_DUMP \
Rerun the backup script.
perfSONAR Project Questions
Q: How do I join the perfSONAR Collaboration?
A: Please contact us at perfsonar-lead@internet2.edu.
Q: Where can I ask questions or report bugs?
A: For questions, send email to perfsonar-user at internet2 dot edu. You may also join the mailing list by visiting https://lists.internet2.edu/sympa/info/perfsonar-user.
Report bugs at https://github.com/perfsonar/project/issues.
Q: Which licenses do perfSONAR products use?
A: perfSONAR components are licensed under the Apache 2.0 license.
Q: How does version numbering work for the perfSONAR project?
A: See https://github.com/perfsonar/project/wiki/Versioning if you are interested in learning about our version numbering scheme.