5 Lessons Learned in Hosting a CI Test Lab at UNH-IOL

Today’s open source software landscape features an ever-expanding number of mature projects with development activity spanning multiple organizations. Developers on these projects come from large and small enterprises, educational institutions, and a pool of unaffiliated, open source developers. This diverse group of contributors works together to enable their projects’ core goals or use cases. However, there are challenges with this diversity of projects, developers, and outcomes. How can we reliably meet the needs of all stakeholders? When setting up a continuous integration (CI) testing lab, it’s essential to build a framework that will allow the lab to handle incoming testing requests across diverse hardware and software platforms while providing a high-quality service. Without this, administering a CI testing lab can quickly become a whack-a-mole game of supporting ad-hoc testing requests that disrupt the speed and reliability of testing operations.

As testing administrators at UNH-IOL, we have learned from our challenges in building a CI testing lab for the Data Plane Development Kit (DPDK), an open-source software project for high-speed packet processing. The DPDK developer community has worked with IOL over the years to build an effective and stable testing process. In this blog post, we will use the perspective of this project and the DPDK community to discuss how a CI testing lab can effectively play the community testing role that an open-source project requires.

Lesson #1 – Create A Communication Feedback Loop

To ensure that our testing services are reliable, relevant, and delivered in a timely manner, we maintain a short feedback loop between our testing lab and the developers of the open-source community. In DPDK, we meet with developers weekly, with topics alternating between test suite development, CI lab infrastructure, and maintenance. The classic communication channel for open-source projects, the mailing list, is an effective way of having conversations, and one we also leverage, but we find that holding regular meetings is essential for keeping the CI community aligned with the project's goals, objectives, and tasks. Regular meetings help us stay in sync on work progress and resolve questions through group consensus from all CI testing participants. They keep the CI project vibrant and in alignment with the community it serves.

Lesson #2 – Prioritize Completion of Testing and Re-Testing

Timely automated test reporting and responsive community support are core responsibilities for lab administrators. For a CI testing lab to reduce the burden placed on software developers in an open-source community, the lab must integrate well with the development, review, stage, and merge workflow used in the community. In the DPDK Community Lab, we constantly consider testing capacity when standing up automated testing, because time-consuming, end-to-end functional and performance tests run on our limited inventory of HPC servers. Developers want to know if their patch is failing tests in minutes or hours, not a day or several days later. To provide timely feedback, we closely monitor the testing load on our DUT (device under test) systems to protect against excessive test queues, employ system and service monitoring tools such as Icinga to ensure DUT downtime is immediately flagged and addressed, and have developed a robust automated retesting system that allows us to re-queue dropped test runs in the event of a test infrastructure failure. When we have testing downtime, our team receives chat alerts immediately so we can investigate the issue and trigger retesting within an acceptable timeframe. Such timeliness and reliability allow software developers to spend more of their time on development and less on testing, increasing their productivity. When working well, CI lab test results can act as an effective gating mechanism for patches, taking some of the workload off of project maintainers and allowing them to allocate more time to their other duties.
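
To make the retesting idea concrete, here is a minimal sketch of an automatic re-queue mechanism. All names and structures in it are hypothetical illustrations, not our lab's actual implementation; the point is that runs dropped by infrastructure failures get re-queued transparently instead of surfacing to developers as spurious failures.

```python
"""Minimal sketch of automatic re-queueing of dropped test runs.
All names here are hypothetical, not the lab's actual implementation."""

from collections import deque
from dataclasses import dataclass

MAX_RETRIES = 2  # beyond this, a human investigates instead


@dataclass
class TestRun:
    patch_series: str  # identifier of the patch series under test
    dut: str           # device under test assigned to the run
    attempts: int = 0


def alert_admins(message: str) -> None:
    # In production this would post to the team chat; here we just print.
    print(f"[alert] {message}")


def handle_completion(run: TestRun, infra_failure: bool, queue: deque) -> None:
    """Publish real results; re-queue runs dropped by infrastructure
    failures so developers never see a spurious lab-caused failure."""
    if not infra_failure:
        print(f"[result] publishing results for {run.patch_series}")
    elif run.attempts < MAX_RETRIES:
        run.attempts += 1
        queue.append(run)  # transparent automatic retest
        alert_admins(f"re-queued {run.patch_series} on {run.dut} "
                     f"(attempt {run.attempts})")
    else:
        alert_admins(f"{run.patch_series} on {run.dut} needs a manual retest")


if __name__ == "__main__":
    queue: deque = deque()
    run = TestRun("series-12345", "dut-arm64-01")  # hypothetical identifiers
    handle_completion(run, infra_failure=True, queue=queue)
```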

Lesson #3 – Ensure Tests Can Be Recreated Locally

When standing up CI testing infrastructure, we are frequently tasked with deploying testing across various hardware and software platforms. In the DPDK Community Lab, for instance, we run a broad set of build, unit, and end-to-end functional and performance tests across different CPU architectures, operating systems and Linux distributions, kernel versions, software stacks, and dedicated vendor hardware. It's common for a test case to pass on most of our systems but fail on just one or two combinations of the many inputs above. When this happens, the next step is for the developer community to investigate the failure, which often involves an attempt on their part to recreate the test locally. How can a CI testing lab facilitate this? We distribute the test suite logs as downloadable artifacts to the community, and we also include system information and configuration with the test results we publish. In cases where community members do not have the hardware necessary to recreate the test on their end, we may facilitate remote SSH access to our hardware for that individual, or work with them to run a manual retest according to their direction and share the results with them. For example, during the run-up to DPDK's 24.03 release cycle, one of the major companies involved in DPDK submitted a patch that caused a failure on lab hardware that they could not reproduce at their own lab. Engineers from that company accessed the testbed that reported the failure and discovered that their patch was initializing a capability on one of their NIC models that did not support it. Because our CI testing lab had this hardware, we caught the bug. Their patch was updated and merged into the main branch in time for the 24.03 release. By providing comprehensive system information alongside test reports and making our systems available to the community as needed, we enable the community to engage with our testing more constructively.
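
As an illustration of publishing system context alongside the logs, here is a small sketch of what collecting that information might look like. The field names and output layout are assumptions for illustration, not our exact artifact format; it simply shows the kind of environment details (kernel, architecture, distribution, compiler) that make local reproduction feasible.

```python
"""Illustrative sketch of capturing system context next to test logs so a
failure seen only in the lab can be reproduced locally. Field names and
file layout are assumptions, not the lab's exact artifact format."""

import json
import platform
import subprocess


def collect_system_info() -> dict:
    try:
        # Available on Linux with Python 3.10+.
        distro = platform.freedesktop_os_release().get("PRETTY_NAME", "unknown")
    except OSError:
        distro = "unknown"
    try:
        out = subprocess.run(["cc", "--version"], capture_output=True,
                             text=True).stdout.splitlines()
        compiler = out[0] if out else "unknown"
    except FileNotFoundError:
        compiler = "unknown"
    return {
        "kernel": platform.release(),  # e.g. 6.1.0-18-amd64
        "arch": platform.machine(),    # e.g. x86_64, aarch64
        "distro": distro,
        "compiler": compiler,
    }


if __name__ == "__main__":
    # Written next to the test suite logs before artifacts are published.
    with open("system_info.json", "w") as f:
        json.dump(collect_system_info(), f, indent=2)
```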

Lesson #4 – Automate Publishing of Timely Testing Coverage Reports

Producing code coverage reports reveals to an open-source community which components of their project are being tested and how comprehensive (or not) the testing is. It's all well and good if every CI test passes, but if only half of the available modules in a project have tests written for them, that fact needs to be made clearly visible to the community! When we traveled to DPDK Summit 2023, we heard from community participants that this visibility was needed to understand the current reach of testing and where our efforts should be directed next. Coverage reports clarify to project users which components are well validated through testing, and encourage the community to work with lab maintainers toward improving coverage where it is lacking. For the DPDK Community Lab, we now generate monthly lcov code coverage reports, which we post on our results dashboard.
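
For readers who want to try this locally, the sketch below shows one plausible shape for such a monthly job: build DPDK with coverage instrumentation and render an lcov report. The build directory and paths are assumptions for illustration rather than our exact pipeline; meson's b_coverage option and the standard lcov/genhtml tools do the heavy lifting.

```python
"""Sketch of a monthly coverage-report job for a meson-built project such
as DPDK. Paths are illustrative assumptions, not our exact pipeline;
scheduling (e.g., via cron) is left out for brevity."""

import subprocess


def run(cmd: list) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


# Instrumented build: meson's b_coverage adds the gcov compiler flags.
run(["meson", "setup", "build", "-Db_coverage=true"])
run(["ninja", "-C", "build"])

# Run the test suites so the gcov counters (.gcda files) are produced.
run(["meson", "test", "-C", "build"])

# Collect counters and render the HTML report posted to the dashboard.
run(["lcov", "--capture", "--directory", "build",
     "--output-file", "coverage.info"])
run(["genhtml", "coverage.info", "--output-directory", "coverage-html"])
```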

Lesson #5 – Integrate the Developer Community Into Test Suite Development

One lesson we've learned from DPDK is that test suite development should not happen as a side project handled by a few individuals at the lab. It must directly engage and collaborate with developers from the project's various stakeholders, who are both the primary actors contributing new features and updates and the people with the most project expertise. Using DPDK Ethernet device tests as an example, we work with the community to write tests through a generic DPDK test application, testpmd (Test Poll Mode Driver). This application is supported by all the hardware vendor stakeholders in DPDK, so the entire community is familiar with it and able to work with it when making test suite contributions. It also means the test suites we write can be used across hardware/software platforms, an important "nice to have" in the world of CI testing for open-source projects. When paired with a good communication stream with the community (see Lesson #1), the outcome is a comprehensive body of test coverage.
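
To give a flavor of what hardware-agnostic, testpmd-driven testing looks like, here is a toy sketch that drives testpmd interactively from Python using the third-party pexpect library. The EAL arguments are placeholders for a real testbed, and our actual test framework is considerably richer; the sketch only shows why a single generic application lets one test suite cover many vendors' NICs.

```python
"""Toy sketch of driving testpmd from a Python test. The EAL arguments
are placeholders for a real testbed; the lab's actual framework is far
more elaborate. Requires the third-party pexpect package."""

import pexpect  # pip install pexpect

PROMPT = "testpmd> "

# Launch testpmd in interactive mode on two cores (placeholder arguments).
child = pexpect.spawn("dpdk-testpmd -l 0-1 -n 4 -- -i", timeout=30)
child.expect(PROMPT)

# Every vendor's poll mode driver answers the same generic commands,
# which is what lets one test suite cover many different NICs.
child.sendline("show port info all")
child.expect(PROMPT)
print(child.before.decode())

child.sendline("quit")
child.expect(pexpect.EOF)
```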