Cloud-init published its second release of 2018: version 18.2. Alongside the many notable features in the 18.1 and 18.2 releases, the cloud-init team has been polishing our CLI tooling to make cloud-init easier to inspect and interact with. I will give a rundown of some of the new command-line tools that cloud-init offers and why you might care.
status: What is cloud-init up to?
First, cloud-init status gives simple human-readable or programmatic output for what cloud-init is doing and whether it has finished successfully. It can be used as a sanity check on a machine or in scripts to block until cloud-init has completed successfully.
$ cloud-init status --long
status: running
time: Fri, 30 Mar 2018 04:07:48 +0000
detail:
Running in stage: modules-config

# Cloud-init reports if it is still in progress
root@x12:~# cloud-init status --long
status: running
time: Mon, 02 Apr 2018 22:16:01 +0000
detail:
Running in stage: init

# Error conditions are bubbled up to the CLI
$ cloud-init status
status: error
$ cloud-init status --long
status: error
time: Mon, 02 Apr 2018 20:54:13 +0000
detail:
('ntp', TypeError("argument of type 'NoneType' is not iterable",))
You no longer have to hunt for tracebacks or errors in /var/log/cloud-init.log, nor parse /run/cloud-init/result.json or /run/cloud-init/status.json to find out what stage cloud-init is in. All of these details are now surfaced by cloud-init status.
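For scripts that want to branch on the result, the short output is easy to parse. Here is a minimal sketch, assuming the "status: <value>" line format shown above. The helper name cloud_init_state is hypothetical, and it reads the command output from stdin so you can try it on a machine without cloud-init. The "done" value is not shown in the samples above; it is what I would expect cloud-init to report after a successful run.

```shell
#!/bin/bash
# Print the value of the "status:" line from `cloud-init status` output.
# Hypothetical helper: reads the text on stdin.
cloud_init_state() {
    awk -F': ' '$1 == "status" { print $2; exit }'
}

# Example with captured output; on a real machine, pipe the command instead:
#   cloud-init status --long | cloud_init_state
state=$(printf 'status: error\ntime: Mon, 02 Apr 2018 20:54:13 +0000\n' | cloud_init_state)

case "$state" in
    "done")  echo "cloud-init finished" ;;
    running) echo "cloud-init still in progress" ;;
    error)   echo "cloud-init reported an error" ;;
    *)       echo "unexpected state: $state" ;;
esac
```

With the sample input, the case statement takes the error branch.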
status: blocking until cloud-init is done
There have been frequent questions in the #cloud-init channel about how to make custom scripts or programs block until cloud-init configuration is done. Our suggestion up until now was to deliver your own systemd unit ordered with After=cloud-init.target. Now that we have cloud-init status --wait, simple scripts can block until cloud-init is done. The example below sets up an @reboot cron job which blocks on cloud-init completion, creating /home/ubuntu/post-cloud-init.log the moment cloud-init succeeds:
$ cat > /home/ubuntu/yourscript.sh <<'EOF'
#!/bin/bash
set -e
# Block until cloud-init completes; a non-zero exit status means it failed
if ! cloud-init status --wait > /dev/null 2>&1; then
    echo 'Cloud-init failed'
    exit 1
fi
echo "Cloud-init succeeded at $(date -R)" > /home/ubuntu/post-cloud-init.log
# Make your magic happen here
EOF
$ chmod 755 /home/ubuntu/yourscript.sh
$ crontab -e
# Add the following line in the editor that opens
@reboot /home/ubuntu/yourscript.sh
$ sudo reboot
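If you would rather not block forever when something goes badly wrong, one option is to wrap the wait in the coreutils timeout command. This is only a sketch: wait_for_cloud_init is a hypothetical helper name and the 300 second default is an arbitrary choice, not a cloud-init recommendation.

```shell
#!/bin/bash
# Wait for cloud-init, but give up after a limit given in seconds.
# Returns 0 on success, 124 if the limit expired (the coreutils timeout
# convention), or the exit status of `cloud-init status --wait` otherwise.
wait_for_cloud_init() {
    local limit="${1:-300}"   # hypothetical default, tune for your cloud
    timeout "$limit" cloud-init status --wait > /dev/null 2>&1
}
```

In a script like the one above, you could then write "if ! wait_for_cloud_init 600; then ... fi" in place of the direct cloud-init status --wait call.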
clean: Tidy up your cloud-init instance
The cloud-init clean subcommand removes logs and cloud-init artifacts from the instance. This helps ensure a pristine environment and is great for iterative development. Cloud-init operations are gated by semaphores, which live in /var/lib/cloud/sem and limit how often those actions are performed. Cloud-init also caches what it can so that consumers do not repay discovery costs on every reboot. The CLI now supports removing all semaphores and caches so that the system appears to cloud-init as a fresh machine; the next boot will then re-run all discovery and configuration. Optionally, pass --logs to also remove the /var/log/cloud-init*.log files, or --reboot to reboot the system after the clean.
# --logs removes /var/log/cloud-init*.log files
$ cloud-init clean --reboot --logs
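To see why removing semaphores makes work run again, here is a toy marker-file gate in the same spirit. It is an illustration only and not how cloud-init implements its semaphores; SEM_DIR and run_once are hypothetical names, and the directory is a throwaway temp dir rather than /var/lib/cloud/sem.

```shell
#!/bin/bash
# Illustration only: a marker-file gate, not cloud-init's actual code.
SEM_DIR="${SEM_DIR:-$(mktemp -d)}"

# Run a command unless a marker for <name> already exists, then record it.
run_once() {
    local name="$1"; shift
    local marker="$SEM_DIR/$name"
    if [ -e "$marker" ]; then
        echo "skip $name (semaphore present)"
        return 0
    fi
    "$@" && touch "$marker"
}

run_once configure-ntp echo "configuring ntp"   # runs the command
run_once configure-ntp echo "configuring ntp"   # skipped: marker exists
rm -f "$SEM_DIR"/*                              # the "clean" step of this toy
run_once configure-ntp echo "configuring ntp"   # runs again
```

The first and third calls print "configuring ntp"; the second prints the skip message because the marker is still in place.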
collect-logs: Grab all cloud-init artifacts
For filing bugs against cloud-init, we would like to collect a set of known artifacts and files. We have now added a cloud-init collect-logs command which tars any content useful to someone trying to triage a cloud-init failure. It simplifies filing upstream bugs as there is only one attachment to place on a bug when you send it to us.
For Ubuntu specifically, we also built apport hooks on top of this, so Ubuntu users can simply run ubuntu-bug cloud-init and answer a couple of questions to automatically create a Launchpad bug with the logs attached.
collect-logs grabs the following:
- cloud-init package version information
- journalctl output
- optionally, user-data configuration if --userdata-include is specified
- on Ubuntu: ubuntu-bug cloud-init also prompts you for the cloud you are running on and records it in the bug
# Create cloud-init.tar.gz in your current working dir
$ cloud-init collect-logs
# Alternatively, on Ubuntu
$ ubuntu-bug cloud-init
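Before attaching the tarball to a public bug, it can be worth a quick look at what went into it. A small sketch using plain tar; list_collected_logs is a hypothetical helper name, and it defaults to the cloud-init.tar.gz file name mentioned above.

```shell
#!/bin/bash
# List the files inside a collect-logs tarball without unpacking it.
# Hypothetical helper; pass a path or rely on the default file name.
list_collected_logs() {
    local tarball="${1:-cloud-init.tar.gz}"
    tar -tzf "$tarball"
}
```

Run list_collected_logs in the directory where you ran cloud-init collect-logs, or pass the path to the tarball as the first argument.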
analyze: We care about cloud-init's performance on your cloud so you don't have to
During our release cycles we strive to get cloud-init out of your way so you can get to the important business ahead of you. To do this it is imperative for us to refine and monitor the speed with which cloud-init can get your machine configured and operational.
To assess how performant cloud-init is, Ryan Harper introduced a tool which takes cloud-init's event data and analyzes it to determine how much time cloud-init spends in each of the discovery and configuration stages it runs. The cloud-init analyze tool, similar to systemd-analyze, lets us show the total execution time of each boot and each cloud-init stage, as well as blame the biggest consumers of clock time during a given boot.
Reports from this tool have already given us leverage to avoid costly Python library imports where they are unnecessary and to rework expensive logic in the discovery and configuration process.
show: groups events according to cloud-init stages
$ cloud-init analyze show
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the "@" character.
The time the event takes is printed after the "+" character.

Starting stage: init-local
|`->no cache found @00.00400s +00.00100s
|`->found local data from DataSourceNoCloud @00.00800s +00.09500s
Finished stage: (init-local) 00.15800 seconds

Starting stage: init-network
|`->restored from cache with run check: DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net] @00.77100s +00.00300s
...
Finished stage: (init-network) 00.68900 seconds
...
Total Time: 3.96800 seconds
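If you only want the per-stage totals, the "Finished stage" lines are simple to pull out. A minimal sketch, assuming the line format in the sample output above; stage_times is a hypothetical helper that reads the report on stdin.

```shell
#!/bin/bash
# Print "<stage> <seconds>" for each "Finished stage:" line of
# `cloud-init analyze show` output read from stdin. Hypothetical helper.
stage_times() {
    awk '/Finished stage:/ { gsub(/[()]/, "", $3); printf "%s %.3f\n", $3, $4 }'
}

# Example with two lines taken from the sample report; on a real machine:
#   cloud-init analyze show | stage_times
printf '%s\n' \
    'Finished stage: (init-local) 00.15800 seconds' \
    'Finished stage: (init-network) 00.68900 seconds' | stage_times
# prints:
#   init-local 0.158
#   init-network 0.689
```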
blame: orders the report by most expensive operations
$ cloud-init analyze blame
-- Boot Record 01 --
01.95200s (modules-config/config-snap)
00.76500s (modules-config/config-grub-dpkg)
00.37400s (init-network/config-ssh)
00.23000s (modules-config/config-apt-configure)
00.13400s (init-network/config-users-groups)
00.09500s (init-local/search-NoCloud)
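The blame report also lends itself to a quick check in automation, for example flagging any step slower than some budget. A sketch under the assumption that each entry looks like the "<seconds>s (<stage>/<step>)" lines above; slow_steps and its 0.5 second default are hypothetical.

```shell
#!/bin/bash
# Print blame entries at or above a threshold in seconds (default 0.5).
# Reads `cloud-init analyze blame` output on stdin. Hypothetical helper.
slow_steps() {
    local limit="${1:-0.5}"
    awk -v limit="$limit" '
        $1 ~ /^[0-9.]+s$/ {
            secs = $1
            sub(/s$/, "", secs)          # strip the trailing "s"
            if (secs + 0 >= limit + 0) printf "%.3f %s\n", secs, $2
        }'
}

# Example with lines taken from the sample report; on a real machine:
#   cloud-init analyze blame | slow_steps 0.5
printf '%s\n' \
    '-- Boot Record 01 --' \
    '01.95200s (modules-config/config-snap)' \
    '00.76500s (modules-config/config-grub-dpkg)' \
    '00.37400s (init-network/config-ssh)' | slow_steps 0.5
# prints:
#   1.952 (modules-config/config-snap)
#   0.765 (modules-config/config-grub-dpkg)
```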
Thanks for checking in. Here are some thoughts in the blog bag for one of our next adventures:
- supercharge your cloud-init iterative development using lxd
- using python-boto3 for speedy integration testing on EC2
- making EC2 and OpenStack boots faster
- cloud-init cloud-config schema annotations and why humans stink