Back to blog

Brownout on Update Center (updates.jenkins.io): 6 and 9 September 2024

Damien DUPORTAL
Damien DUPORTAL
September 4, 2024

Summary (TL;DR)

The service https://updates.jenkins.io will switch its implementation to a new system during 1 hour twice:

  • Friday 6 September 2024 from 07:00am UTC until 08:00am UTC

  • Monday 9 September 2024 from 02:00pm UTC until 03:00pm UTC

All Jenkins users are impacted but should not see any functional change.

⚠️ Please, check that your organization respects the advertised DNS TTL or you might be stuck in the brownout longer that expected.

Under the hood, any HTTP request made to this service will be redirected to a mirror close to user locations during these brownouts.

What is the "Update Center"?

Jenkins Update Center is a web server at the core of the Jenkins public infrastructure which distributes the plugins, tool installers, and versions index to all Jenkins servers all across the world.

From the Wizard installation to regular plugin updates, if you run Jenkins then you use this service under the hood.

Today, it serves around 50 Tb of data (outbound bandwidth) each month from a single Virtual Machine on AWS, which costs around $6,000 per month.

In order to sustain this service and improve it, the Jenkins infrastructure team has worked relentlessly during the past years to have a new sustainable implementation for this service.

The new Update Center implementation features a highly available system, which redirects user requests to a download mirror close to their location. There is lots of details and additional information that you can read further about in the following GitHub issue.

Why this "Brownout"?

We have reached the point where our functional/performance tests are meeting our expectation:

  • All of our Jenkins controllers are using the new Update Center implementation.

  • We are using the new updates center (with DNS override) in our own networks in different geo-locations.

  • All of our Jenkins Docker image are using the new system.

  • Our initial performance tests

But as the adage says: "No plan survives first contact".

That’s why we are planning these brownouts of one hour each to:

  • Validate our system beyond the tests we already ran to increase trust.

  • Run a real life (production!) workload against the new system while limiting any unforeseen impact due to the short time window.

How

Both current and new Update Centers are updated at the same time and serve the same index.

Starting 4 weeks ago, the Jenkins infrastructure has been using the new Update Center (https://azure.updates.jenkins.io) with a client-side DNS override (updates.jenkins.io hostname points to this new service).

During the brownouts, we’ll simply switch the DNS entry updates.jenkins.io to this new service and watch for the logs and error rate. At the end of each brownout, we’ll switch DNS back to the normal service and then analyze metrics and logs to see how the system behaved.

  • If no major problem is detected then we’ll plan a 24 hours brownout soon.

  • Otherwise we’ll analyze the discovered problem and plan solutions.

The first brownout covers Asia + Europe "activity" timezones while the second covers US + Europe so we can have 2 distinct set of usages.

Please refer to the helpdesk ticket for more information.

About the author

Damien DUPORTAL

Damien DUPORTAL

Damien is the Jenkins Infrastructure officer and a software engineer at CloudBees working as a Site Reliability Engineer for the Jenkins Infrastructure project. Not only he is a decade-old Hudson/Jenkins user but also an open-source citizen who participates in Updatecli, Asciidoctor, Traefik and many others.