The OS upgrade management is implemented by two components: the controller and the worker application.

The controller and the worker applications communicate over a Volga topic. The controller issues commands that advance the upgrade process, and each worker replies to the relevant commands on the same topic. Both the commands and the replies are JSON messages in a defined format. The following assumptions are made:

- The topic name is `os-upgrade:control`. The worker application should subscribe to this topic using the Volga API.
- The worker consumes from the `unread` position. A command is `ack`ed only after it has been executed and the reply to this command has been sent. In addition to this, a `prepare` command with a `not-after` timestamp in the past should be `ack`ed immediately with no action taken.
- The worker does not need to `ack` irrelevant messages (i.e. messages directed to other hosts or replies) either, but if it does, it must do so with great care to avoid implicitly acking a command that is in progress and has not been completed yet. Keep in mind that acking a Volga message implicitly acks all earlier messages.
- The worker uses the `hosts` or `host` field in the commands received from the controller to distinguish commands relevant to it.
- The controller publishes under the name `os-upgrade-controller`. This helps distinguish controller commands from the worker replies.
- If the deadline given in the `prepare` message (`not-after` field, see below) passes, then the upgrade procedure is aborted and no further messages are issued by the controller.
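
The filtering rules above can be sketched as a small helper. This is only an illustration: it assumes the sender's name is available alongside each payload, and the function and parameter names (`is_relevant_command`, `producer_name`, `my_host`) are hypothetical, not part of the Volga API.

```python
import json

# Name taken from the text above; how it reaches the worker is an assumption.
CONTROLLER_NAME = "os-upgrade-controller"


def is_relevant_command(payload: str, producer_name: str, my_host: str) -> bool:
    """Return True if the message is a controller command addressed to my_host."""
    if producer_name != CONTROLLER_NAME:
        return False  # a worker reply, not a controller command
    try:
        msg = json.loads(payload)
    except json.JSONDecodeError:
        return False  # not a well-formed command
    if not isinstance(msg, dict):
        return False
    if "hosts" in msg:
        # Commands such as prepare address a list of hosts.
        hosts = msg["hosts"]
        return isinstance(hosts, list) and my_host in hosts
    # Commands such as upgrade and reboot address a single host.
    return msg.get("host") == my_host
```

A worker would act only on messages for which this returns `True`, and leave the decision of when to `ack` to the rules listed above.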

The upgrade process is as follows:

The controller issues the `prepare` command:

```json
{
  "action": "prepare",
  "hosts": [ "h01", "h02", "h03" ],
  "not-after": ""
}
```

If a worker receives the command after the time given in the `not-after` field, then it must ignore the command and continue consuming from the topic.

Each worker on a host in the `hosts` list should prepare for the upgrade. The prepare phase is executed in parallel and is expected to be non-destructive, i.e. a host is not expected to be able to enter an inoperable state as a result of this phase. Examples of actions that may be performed in the prepare phase are fetching a list of image versions available for upgrade and downloading the installation files for the packages to be upgraded.

After the prepare phase the worker replies:

```json
{
  "action": "prepare",
  "result": "done"
}
```
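
The prepare exchange can be sketched as follows. The timestamp format of `not-after` is not shown in this document, so the sketch assumes an ISO 8601 timestamp with a UTC offset. The name `handle_prepare` and the `prepare` callback are hypothetical, and using the exception text as the error message is also an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional


def handle_prepare(msg: dict, my_host: str, prepare: Callable[[], None],
                   now: Optional[datetime] = None) -> Optional[str]:
    """Return the JSON reply to publish, or None if the command is skipped."""
    if msg.get("action") != "prepare" or my_host not in msg.get("hosts", []):
        return None  # not a prepare command for this host
    now = now or datetime.now(timezone.utc)
    # Assumption: not-after is an ISO 8601 timestamp with a UTC offset.
    deadline = datetime.fromisoformat(msg["not-after"].replace("Z", "+00:00"))
    if now > deadline:
        return None  # expired command: take no action
    try:
        prepare()  # e.g. download installation files
        result = "done"
    except Exception as exc:
        # Any value other than "done" is read as an error message.
        result = str(exc) or type(exc).__name__
    return json.dumps({"action": "prepare", "result": result})
```

Returning `None` for an expired command leaves the caller free to `ack` it immediately, as the assumptions require.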

When the value of the `result` parameter is `done`, the prepare phase for this worker is considered successful. Any other value is interpreted as an error message, and the prepare phase for this worker is considered failed.

The controller issues the `upgrade` command to a host:

```json
{
  "action": "upgrade",
  "host": "h02"
}
```

The worker executes the upgrade only if it has completed the `prepare` phase, but has not executed an `upgrade` since the last `prepare`, to avoid executing a replay of an older message.

After the `upgrade` phase the worker replies:

```json
{
  "action": "upgrade",
  "result": "done"
}
```
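
One way to guard against replayed `upgrade` commands is a flag that is set by a successful prepare and cleared when an upgrade starts. The sketch below is illustrative only. `UpgradeGuard` and the `upgrade` callback (returning `True` when a reboot is needed) are hypothetical names, and reporting a failure as a free-form error string is an assumption made by analogy with the prepare reply.

```python
import json
from typing import Callable, Optional


class UpgradeGuard:
    """Tracks whether an upgrade is allowed since the last prepare."""

    def __init__(self) -> None:
        self.prepared = False  # True between a successful prepare and an upgrade

    def on_prepare_done(self) -> None:
        self.prepared = True

    def on_upgrade(self, msg: dict, my_host: str,
                   upgrade: Callable[[], bool]) -> Optional[str]:
        """Return the JSON reply to publish, or None if the command is skipped."""
        if msg.get("action") != "upgrade" or msg.get("host") != my_host:
            return None  # not an upgrade command for this host
        if not self.prepared:
            return None  # no prepare since the last upgrade: treat as a replay
        self.prepared = False  # a second upgrade now requires a new prepare
        try:
            needs_reboot = upgrade()
            result = "reboot-required" if needs_reboot else "done"
        except Exception as exc:
            result = str(exc) or type(exc).__name__  # assumed error reporting
        return json.dumps({"action": "upgrade", "result": result})
```

The flag here lives in memory only; a worker that may restart between prepare and upgrade would need to persist it.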

The `result` field can take the following values:

- `done` means that the operation has completed successfully and no further steps are required;
- `reboot-required` means that the operation has completed successfully and the host needs to be rebooted.

The controller issues the `reboot` command to a host:

```json
{
  "action": "reboot",
  "host": "h02"
}
```

The worker replies to the `reboot` message as follows:

```json
{
  "action": "reboot",
  "result": "done"
}
```

When the value of the `result` parameter is `done`, the reboot phase for this worker is considered successful. Any other value is interpreted as an error message, and the reboot phase for this worker is considered failed.

The worker is responsible for detecting the version(s) of the OS or packages running on the system. Whenever it notices that the version(s) have changed (at startup, after the upgrade, or at any other time), it should publish a message on the `os-upgrade:versions` Volga topic. The message is a plain JSON object (with no nested objects or lists) that maps each relevant package to its version. The versions published in this way are stored persistently in the cluster and are available via the API in the `/v1/state/os-upgrade/hosts` list. The worker may read the relevant list entry at startup to compare the currently running version with the stored one, to avoid duplicate messages.
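
The version report can be sketched as a helper that builds the flat JSON object and skips publishing when nothing has changed. The function name and the example package name are hypothetical; how the stored versions are read from the API and how the message is published are left out.

```python
import json
from typing import Mapping, Optional


def versions_message(current: Mapping[str, str],
                     stored: Optional[Mapping[str, str]]) -> Optional[str]:
    """Return the JSON payload to publish, or None if the versions are unchanged."""
    flat = {}
    for package, version in current.items():
        if isinstance(version, (dict, list)):
            # The message must not contain nested objects or lists.
            raise ValueError(f"version of {package!r} must be a plain value")
        flat[package] = str(version)
    if stored is not None and dict(stored) == flat:
        return None  # same as the stored entry: avoid a duplicate message
    return json.dumps(flat, sort_keys=True)
```

At startup the worker would pass the entry it read from the stored list as `stored`; after an upgrade it would pass the versions it last published.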

| Field | Type | Description |
| --- | --- | --- |
| `worker-applications` | Array of objects | List of applications that implement the worker side of the OS upgrade mechanism, i.e. receive commands from the upgrade controller over Volga and perform the OS upgrade. The controller needs to know which hosts are part of the upgrade when it starts. The list of worker applications is used for this purpose: each host where a service is scheduled that belongs to one of the applications on this list is included in the upgrade. Note that the controller expects each host to perform each command only once. It is possible to have multiple services from one or more applications scheduled to the same host; care should be taken to ensure there is no conflict between them and that only one service instance responds to the controller's commands. |
| `maintenance-windows` | Array of objects | |
No Content
Bad Request
Unauthorized
Forbidden
Not Found
Precondition Failed
Service Unavailable (strongbox sealed)

```yaml
worker-applications:
  - name: os-upgrade-debian
maintenance-windows:
  - days-of-week: Friday, Saturday
    start-time: 01:00
    timezone: site-local
    duration: 4h
```
Created
No Content
Bad Request
Unauthorized
Forbidden
Not Found
Precondition Failed
Service Unavailable (strongbox sealed)

| Parameter | Type | Description |
| --- | --- | --- |
| `fields` | string | Retrieve only the requested fields from the resource. See section fields. |
| `validate` | string (enumeration) | Validate the request but do not actually perform the requested operation. |

OK
Not Modified
Bad Request
Unauthorized
Forbidden
Not Found
Precondition Failed
Service Unavailable (strongbox sealed)

| Parameter | Type | Description |
| --- | --- | --- |
| `fields` | string | Retrieve only the requested fields from the resource. See section fields. |
| `site` | string | Send the request to the specified site. |
| `content` | string (enumeration) | Filter descendant nodes in the response. |

OK
Bad Request
Unauthorized
Forbidden
Not Found
Service Unavailable (strongbox sealed)

```yaml
worker-applications:
  - name: os-upgrade-debian
maintenance-windows:
  - days-of-week: Friday, Saturday
    start-time: 01:00
    timezone: site-local
    duration: 4h
status: idle
next-upgrade-in: 1d4h18s
scheduled-workers:
  - host: h01
    application: os-upgrade-debian
  - host: h02
    application: os-upgrade-debian
last-upgrade-info:
  start-time: 2023-03-17T01:00:00Z
  end-time: 2023-03-17T01:24:07Z
  result: completed
  hosts:
    - hostname: h01
      status: upgraded
    - hostname: h02
      status: upgraded
```
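
Duration values such as `4h` and `1d4h18s` in the example can be converted to a `timedelta` for further processing. The sketch below assumes the format is a sequence of integer and unit pairs with the units `d`, `h`, `m` and `s`. Only `d`, `h` and `s` appear in the example, so `m` and the overall grammar are assumptions, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Assumed unit letters; only d, h and s appear in the example above.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as '1d4h18s' into a timedelta."""
    if not re.fullmatch(r"(\d+[dhms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {}
    for amount, unit in re.findall(r"(\d+)([dhms])", text):
        key = _UNITS[unit]
        parts[key] = parts.get(key, 0) + int(amount)
    return timedelta(**parts)
```

For example, `parse_duration("1d4h18s")` equals `timedelta(days=1, hours=4, seconds=18)`.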