Cloud-init

Introduction 

Cloud-init is a widely used tool for customizing cloud instances during the boot process. It enables automatic configuration of virtual machines by applying user-defined settings such as:

Setting hostnames
Creating users and groups
Installing packages
Running scripts

Cloud-init supports multiple data sources and is used on major cloud platforms such as AWS, Azure, and OpenStack. It reads configuration from a metadata service or from configuration files (for example, a user-data file that starts with #cloud-config) and applies it during instance initialization.

For more information, visit the cloud-init documentation.

Cloud-Init Configuration 

Cloud-init reads configuration data from a data source and applies it during the first boot of an instance.

The configuration is usually provided in a file starting with the header #cloud-config, written in YAML format. It supports multiple modules and directives, such as:

users: Create and configure users
packages: Install software packages
runcmd: Run shell commands
write_files: Create files with specified content
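
The four directives above can be combined in a single file. The following is a minimal sketch that writes such a file with a shell here-document and confirms that it starts with the required header. The user name `demo`, the package `htop`, and the file paths are illustrative assumptions, not values taken from this guide.

```shell
#!/bin/sh
# Sketch: write a small cloud-config file that uses all four directives.
# The user, package, and path names below are illustrative placeholders.
workdir=$(mktemp -d)

cat > "$workdir/user-data" <<'EOF'
#cloud-config
users:
  - name: demo
    shell: /bin/bash
packages:
  - htop
write_files:
  - path: /etc/motd
    content: |
      Provisioned by cloud-init
runcmd:
  - echo "first boot finished" > /tmp/first-boot.txt
EOF

# The file is only treated as cloud-config if line 1 is the header.
first_line=$(head -n 1 "$workdir/user-data")
echo "header: $first_line"
```

The quoted `'EOF'` delimiter keeps the shell from expanding anything inside the YAML, which avoids the escaping pitfalls of building the file with `echo`.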

The same #cloud-config file can also carry an autoinstall configuration for unattended Ubuntu installations. In that case the installer settings are nested under the autoinstall key.

Example cloud-init configuration with autoinstall and user-data:

#cloud-config
autoinstall:
  version: 1
  packages:
    - cowsay
  user-data:
    users:
      - name: ciuser
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
    runcmd:
      - echo "Hello from cloud-init!"

Cloud-Init vs Autoinstall 

Cloud-init and autoinstall are both tools used in Ubuntu systems to automate setup, but they serve different purposes and operate at different stages of the provisioning lifecycle.

Cloud-init is used to configure a system after it has been installed, typically during the first boot.
Autoinstall is used to automate the installation process itself, including disk partitioning, user creation, and package selection.

Subiquity 

Subiquity is the modern installer used in Ubuntu Server editions. It replaces the older Debian-based installer and supports autoinstall for fully automated, unattended installations. Subiquity reads configuration from a YAML file embedded in a #cloud-config document and executes the installation accordingly.

Cloud-Init and Autoinstall Interaction 

Autoinstall is implemented by Subiquity, not by cloud-init; cloud-init is the delivery mechanism. During installation, cloud-init runs inside the installer environment, fetches the configuration from the data source, and makes the autoinstall section available to Subiquity, which carries out the installation steps. The user-data section nested under autoinstall is written to the installed system, where cloud-init applies it on first boot.
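
Because one file format serves both stages, it can be useful to check which kind of file you are about to serve. The following is a minimal sketch that looks for a top-level `autoinstall:` key; `has_autoinstall` and the two sample file names are hypothetical, and the check is a plain text match rather than a real YAML parse.

```shell
#!/bin/sh
# Sketch: report whether a user-data file has a top-level "autoinstall:" key.
# has_autoinstall is a hypothetical helper, not a cloud-init or Subiquity command.
has_autoinstall() {
    # A top-level YAML key starts in column 0.
    grep -q '^autoinstall:' "$1"
}

tmp=$(mktemp -d)
printf '#cloud-config\nautoinstall:\n  version: 1\n' > "$tmp/install.yaml"
printf '#cloud-config\npackages:\n  - nginx\n' > "$tmp/firstboot.yaml"

for f in "$tmp/install.yaml" "$tmp/firstboot.yaml"; do
    if has_autoinstall "$f"; then
        echo "$(basename "$f"): installer configuration (autoinstall)"
    else
        echo "$(basename "$f"): first-boot configuration only"
    fi
done
```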

Comparison of Autoinstall and Cloud-init

| Feature | Autoinstall | Cloud-init |
|---|---|---|
| Purpose | Automates OS installation | Configures system post-install |
| Trigger | During installation | On first boot |
| Configuration Format | YAML under `autoinstall` key | YAML with `#cloud-config` header |
| Common Use | Ubuntu Server, cloud images | Cloud VMs, custom boot setups |
| Supported Installer | Subiquity | Cloud-init engine |
| Desktop Support | No (Ubiquity used) | Yes (limited) |

Autoinstall vs Cloud-init Mapping 

The following table maps common autoinstall directives to their cloud-init equivalents, highlighting differences in timing and application:

| Autoinstall Directive | Cloud-init Equivalent | Purpose | Timing |
|---|---|---|---|
| `version` | N/A | Schema version for autoinstall | Install-time only |
| `identity` (hostname, username, password) | `hostname`, `users` | Configure system identity | Autoinstall applies during install; cloud-init applies on first boot |
| `keyboard` | `keyboard` (via `locale` or `keyboard`) | Keyboard layout | Install-time |
| `locale` | `locale` | System locale | Both supported; autoinstall applies earlier |
| `timezone` | `timezone` | System timezone | Both supported |
| `network` | `network` | Netplan config for installer and target | Both supported; autoinstall ensures connectivity during install |
| `storage` | No direct equivalent | Disk partitioning, LVM, ZFS (via Curtin) | Autoinstall only |
| `apt` | `apt` | Mirrors, proxy, geoip | Both supported |
| `packages` | `packages` | Install packages during install | Autoinstall installs in target image; cloud-init installs after first boot |
| `snaps` | `snap` | Install snaps during install | Both supported |
| `updates` | `package_update`, `package_upgrade` | Apply updates during install | Autoinstall applies before reboot |
| `early-commands` | `bootcmd` (similar timing) | Commands before partitioning | Autoinstall runs in the installer environment |
| `late-commands` | `runcmd` (runs on first boot) | Commands after install before reboot | Different timing |
| `user-data` | Full cloud-init schema (embedded) | Embed cloud-init for target system | Runs on first boot |
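
To make the timing difference between `late-commands` and `runcmd` concrete, here is a minimal sketch that writes an autoinstall file using both. The marker file paths and the user name are assumptions for illustration; `curtin in-target` is the usual way for the installer to run a command inside the newly installed system.

```shell
#!/bin/sh
# Sketch: one file, two stages. Marker paths and user name are placeholders.
workdir=$(mktemp -d)

cat > "$workdir/user-data" <<'EOF'
#cloud-config
autoinstall:
  version: 1
  late-commands:
    # Runs in the installer, after installation and before the reboot.
    - curtin in-target --target=/target -- touch /var/log/installed-by-autoinstall
  user-data:
    users:
      - name: ciuser
        shell: /bin/bash
    runcmd:
      # Runs in the installed system, on its first boot.
      - touch /var/log/configured-by-cloud-init
EOF

# Show where each directive sits: late-commands directly under autoinstall,
# runcmd one level deeper, inside the embedded user-data.
grep -nE 'late-commands:|runcmd:' "$workdir/user-data"
```

The indentation is what decides the stage: a key directly under `autoinstall` is read by the installer, while anything under `user-data` is passed through to cloud-init on the installed system.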

Configuration Hierarchy 

The configuration hierarchy in cloud-init can be visualized as follows:

digraph G {
rankdir=TB;
compound=true;
node [shape=box, style=filled, fillcolor=lightgray, fontname="Helvetica"];
edge [dir=none,style=invis]

subgraph cluster_cloud_init{

subgraph cluster_autoinstall{
rankdir=TB;

subgraph cluster_autoinstall_directives{
rankdir=TB;
autoinstalldirectives [label="version:\linteractive-sections:\learly-commands:\l", style=filled, fillcolor=lightblue];
label="autoinstall directives:";
style = rounded;
color = blue;
}
subgraph cluster_userdata{
rankdir=TB;
userdata [label="user-data:\l users:\l", style=filled, fillcolor=lightpink];
label="user-data directives:";
style = rounded;
color = red;
}

label = "autoinstall:";
style = rounded;
color = gray;
}

label = "cloud-init";
style = rounded;
color = black;
}
autoinstalldirectives -> userdata;

}

Figure: Cloud-init Configuration Structure (autoinstall and user-data sections)

Datasources and Provisioning Workflow I 

Important

For more details on cloud-init datasources, refer to the Datasources documentation.

NoCloud Data Source 

The NoCloud data source is a generic method for providing meta-data and user-data to cloud-init. It is ideal for environments without native cloud metadata services, such as bare-metal servers, virtual machines, or custom provisioning systems.

Overview 

The NoCloud data source supports three modes:

NoCloud (local disk): Uses a filesystem (e.g., ISO9660 or VFAT) with a volume label CIDATA containing configuration files.
NoCloud (local image): Uses a mounted filesystem (e.g., ISO, disk image).
NoCloud-Net: Fetches data from a remote HTTP server.

Required Files 

The following files must be present in the data source:

meta-data: Contains instance metadata (hostname, instance-id, etc.).
user-data: Contains cloud-config or shell scripts for provisioning.

Optional files:

vendor-data: Additional configuration from vendor.
network-config: Network configuration in YAML format.

Example: `meta-data`

instance-id: nocloud-instance-001
local-hostname: myserver
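
cloud-init treats a changed `instance-id` as a new instance and runs its per-instance configuration again, so a fresh ID is a simple way to re-apply a seed while testing. The following is a minimal sketch that generates `meta-data` with a new ID on every run; the `nocloud-<timestamp>` naming scheme is an assumption chosen for illustration.

```shell
#!/bin/sh
# Sketch: generate meta-data with an instance-id that changes on every run.
# The "nocloud-<timestamp>" naming scheme is only an illustration.
seed_dir=$(mktemp -d)
host_name="myserver"
instance_id="nocloud-$(date +%Y%m%d%H%M%S)"

# Unquoted EOF so the shell expands the two variables.
cat > "$seed_dir/meta-data" <<EOF
instance-id: $instance_id
local-hostname: $host_name
EOF

cat "$seed_dir/meta-data"
```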

Example: `user-data`

#cloud-config
users:
  - name: testuser
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: users
    shell: /bin/bash
runcmd:
  - echo "Provisioning complete" > /var/log/provision.log

Important

The above example is missing the autoinstall section. For unattended installations, see the Cloud-Init Configuration section.

NoCloud (local disk): Creating a USB Drive labeled CIDATA 

To use a USB drive as the NoCloud data source:

Create configuration files:

mkdir -p /tmp/nocloud
echo "instance-id: nocloud-001" > /tmp/nocloud/meta-data
echo -e "#cloud-config\nruncmd:\n  - echo Hello > /tmp/hello.txt" > /tmp/nocloud/user-data

Create a VFAT filesystem image:

truncate --size 2M seed.img
mkfs.vfat -n CIDATA seed.img

Copy configuration files to the image:

mcopy -oi seed.img /tmp/nocloud/meta-data ::meta-data
mcopy -oi seed.img /tmp/nocloud/user-data ::user-data

Write image to USB drive:

Identify your USB device (e.g., /dev/sdX) and write the image:
```
sudo dd if=seed.img of=/dev/sdX bs=4M status=progress && sync
```

Warning

Ensure /dev/sdX is the correct USB device to avoid data loss.

Boot the target system with the USB drive inserted:

Cloud-init will detect the CIDATA volume and apply the configuration.
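
A mistake in the seed files is cheaper to catch before the image is written than after a failed boot. The following is a minimal sketch of a pre-flight check; `check_seed_dir` is a hypothetical helper, and it only recognizes the two user-data formats mentioned in this guide (cloud-config and shell scripts).

```shell
#!/bin/sh
# Sketch: sanity-check a NoCloud seed directory before building the image.
# check_seed_dir is a hypothetical helper, not part of cloud-init.
check_seed_dir() {
    dir=$1
    for f in meta-data user-data; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    # cloud-config files start with the header; shell scripts start with "#!".
    case "$(head -n 1 "$dir/user-data")" in
        "#cloud-config"|"#!"*) ;;
        *) echo "user-data: unrecognized first line" >&2; return 1 ;;
    esac
    if ! grep -q '^instance-id:' "$dir/meta-data"; then
        echo "meta-data: no instance-id" >&2
        return 1
    fi
    echo "seed directory looks complete: $dir"
}

# Demo with a throwaway directory.
seed=$(mktemp -d)
echo "instance-id: nocloud-001" > "$seed/meta-data"
printf '#cloud-config\nruncmd:\n  - echo Hello > /tmp/hello.txt\n' > "$seed/user-data"
check_seed_dir "$seed"
```

Running `check_seed_dir /tmp/nocloud` before the `mkfs.vfat` step would apply the same check to the directory used in this procedure.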

Alternative: NoCloud (local image): ISO Image 

You can also create an ISO image:

Create ISO or directory with required files:

mkdir -p /tmp/nocloud
echo "instance-id: nocloud-001" > /tmp/nocloud/meta-data
echo -e "#cloud-config\nruncmd:\n  - echo Hello > /tmp/hello.txt" > /tmp/nocloud/user-data

Create ISO image (optional):

genisoimage -output seed.iso -volid cidata -joliet -rock /tmp/nocloud/user-data /tmp/nocloud/meta-data

Attach ISO to VM or mount directory:

For KVM/QEMU:

qemu-system-x86_64 -cdrom seed.iso ...

For cloud-init testing:

sudo cloud-init single --file /tmp/nocloud/user-data --name runcmd --frequency always

Boot the system:

Cloud-init will detect the NoCloud data source and apply the configuration.

NoCloud-Net: Kernel Command Line 

To use NoCloud-Net via HTTP:

ds=nocloud-net;s=http://<your-server>/cloud-init/

Ensure the HTTP server serves meta-data and user-data files at the root of the specified path.
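
The file names are appended to the `s=` value, so ending it with a slash is the safe choice. The following is a minimal sketch that lists the two required URLs for a given seed base. `seed_urls` is a hypothetical helper, `192.0.2.10` is a documentation-only address, and the sketch assumes plain concatenation of base and file name.

```shell
#!/bin/sh
# Sketch: list the URLs a NoCloud-Net client is expected to request.
# seed_urls is a hypothetical helper; it assumes file names are appended
# directly to the seed URL, so it adds a trailing slash when one is missing.
seed_urls() {
    base=$1
    case "$base" in
        */) ;;
        *) base="$base/" ;;
    esac
    for name in meta-data user-data; do
        echo "${base}${name}"
    done
}

seed_urls "http://192.0.2.10/cloud-init"
```

Requesting each printed URL with a tool such as `curl` from another machine is a quick way to confirm the server is reachable before booting the target.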

As an example, to serve the configuration files using a Python HTTP server on port 8080:

Create a directory with configuration files:

mkdir -p ~/cloud-init-data
echo "instance-id: nocloud-net-001" > ~/cloud-init-data/meta-data
echo -e "#cloud-config\nruncmd:\n  - echo Hello from NoCloud-Net > /tmp/hello.txt" > ~/cloud-init-data/user-data

Start Python HTTP server:
```
cd ~/cloud-init-data
python3 -m http.server 8080
```
This will serve files at http://<your-ip>:8080/.

Configure kernel command line on target system:

Add the following to the boot parameters:
```
ds=nocloud-net;s=http://<your-ip>:8080/
```
Replace <your-ip> with the IP address of the server running the Python web server.

Boot the target system:

Cloud-init will fetch meta-data and user-data from the specified URL and apply the configuration.

Using GRUB to Enable Autoinstall with Cloud-Init 

To automate OS installation using cloud-init and avoid manual confirmation prompts, you can modify the GRUB boot parameters to include the autoinstall directive.

This is especially useful when using the NoCloud or NoCloud-Net data sources for unattended installations.

Editing GRUB Kernel Line 

Boot into the installer ISO or PXE environment.
At the GRUB menu, press e to edit the boot entry.
Locate the line starting with linux or linuxefi. It typically looks like:
```
linux /casper/vmlinuz ... quiet ---
```
Insert one of the following before the trailing --- on that line. GRUB treats a semicolon as a command separator, so escape it with a backslash:
```
# For NoCloud with USB
autoinstall

# For NoCloud-Net with HTTP server
autoinstall ds=nocloud-net\;s=http://<your-server>:<port>/
```
Replace <your-server> and <port> with the IP address or hostname and the port of the server hosting your meta-data and user-data files.

Edited GRUB kernel line example:

linux /casper/vmlinuz ... quiet autoinstall ds=nocloud-net\;s=http://192.168.1.100:8080/ ---

Press `Ctrl + X` or `F10` to boot with the modified parameters.

This will trigger the autoinstall process using the provided cloud-init configuration without any user interaction.
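
When the same boot parameters are typed repeatedly, generating them avoids typos. The following is a minimal sketch that prints the arguments for a NoCloud-Net install. `grub_autoinstall_args` is a hypothetical helper, and it emits the semicolon with a backslash in front because GRUB would otherwise treat it as a command separator.

```shell
#!/bin/sh
# Sketch: build the kernel arguments for an unattended NoCloud-Net install.
# grub_autoinstall_args is a hypothetical helper, not an Ubuntu or GRUB tool.
grub_autoinstall_args() {
    server=$1
    port=$2
    # "\\;" in the printf format prints a literal backslash before the semicolon.
    printf 'autoinstall ds=nocloud-net\\;s=http://%s:%s/\n' "$server" "$port"
}

grub_autoinstall_args 192.168.1.100 8080
```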

Important

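On Ubuntu Server 20.04 and later, the autoinstall kernel parameter tells the installer to proceed without asking for confirmation. Without it, the installer pauses and waits for a manual confirmation before it makes changes to the disk.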
Ensure your HTTP server is running and accessible before booting the target system.
Optional: You can also use ds=nocloud;s=/media/usb/ if using a USB drive with a CIDATA label.

Datasources and Provisioning Workflow II - VMs and Cloud Instances 

Cloud-init is widely used to automate the initialization of virtual machines and cloud instances. It reads its configuration from a data source, and each platform provides its own.

Warning

The following examples are simplified for clarity. Refer to the official documentation for detailed setup and security considerations.

Virtual Machines 

See NoCloud Data Source for usage with ISO images or USB drives.

AWS EC2 

AWS uses the EC2 data source, which fetches metadata from the AWS metadata service.

Example: AWS EC2

Launch an EC2 instance with a user-data script:

#cloud-config
packages:
  - nginx
runcmd:
  - systemctl enable nginx
  - systemctl start nginx

Provide user-data via the AWS console or CLI:

aws ec2 run-instances \
  --image-id ami-12345678 \
  --instance-type t2.micro \
  --user-data file://user-data.yaml
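
EC2 limits user data to 16 KB in raw form (before base64 encoding), so it is worth checking the file size before calling `run-instances`. The following is a minimal sketch of such a check; `check_ec2_user_data_size` is a hypothetical helper, not an AWS CLI command.

```shell
#!/bin/sh
# Sketch: refuse user-data files above the 16 KB EC2 limit (raw size).
# check_ec2_user_data_size is a hypothetical helper, not an AWS CLI command.
EC2_USER_DATA_LIMIT=16384

check_ec2_user_data_size() {
    size=$(wc -c < "$1")
    size=$((size + 0))   # normalize padding that some wc implementations add
    if [ "$size" -gt "$EC2_USER_DATA_LIMIT" ]; then
        echo "$1: $size bytes exceeds $EC2_USER_DATA_LIMIT" >&2
        return 1
    fi
    echo "$1: $size bytes, within the limit"
}

tmp=$(mktemp -d)
printf '#cloud-config\npackages:\n  - nginx\n' > "$tmp/user-data.yaml"
check_ec2_user_data_size "$tmp/user-data.yaml"
```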

Azure 

Azure uses the Azure data source, which reads metadata from the Azure Instance Metadata Service (IMDS).

Example: Azure VM

Create a cloud-init config:

#cloud-config
users:
  - name: azureuser
    ssh_authorized_keys:
      - ssh-rsa AAAAB3Nza...

Deploy VM with cloud-init using Azure CLI:

az vm create \
  --resource-group myGroup \
  --name myVM \
  --image UbuntuLTS \
  --custom-data cloud-config.yaml

OpenStack 

OpenStack uses the ConfigDrive or Metadata Service data sources.

Example: Injecting user-data via OpenStack CLI

Create a cloud-config file:

#cloud-config
users:
  - name: openstackuser
    ssh_authorized_keys:
      - ssh-rsa AAAAB3Nza...
runcmd:
  - echo "OpenStack instance initialized" > /tmp/openstack.txt

Boot an instance with user-data:

openstack server create \
  --image ubuntu-22.04 \
  --flavor m1.small \
  --key-name mykey \
  --user-data cloud-config.yaml \
  --network private-net \
  openstack-vm

Cloud-init will automatically detect the OpenStack metadata service or ConfigDrive and apply the configuration.

Google Cloud Platform (GCP)

GCP uses the GCE data source, which reads metadata from the GCP metadata server.

Example: Passing user-data via gcloud

Create a cloud-config file:

#cloud-config
runcmd:
  - echo "GCP instance initialized" > /tmp/gcp.txt

Create a VM with metadata:

gcloud compute instances create gcp-vm \
  --image-family ubuntu-2204-lts \
  --image-project ubuntu-os-cloud \
  --metadata-from-file user-data=cloud-config.yaml

Cloud-init will fetch the user-data from the GCP metadata server and execute it on first boot.

Troubleshooting 

Validate cloud-config:

# Without Annotations (for file named user-data)
cloud-init schema --config-file user-data

# With Annotations (for file named config.yml)
cloud-init schema -c ./config.yml --annotate

View logs:

cat /var/log/cloud-init.log
cat /var/log/cloud-init-output.log
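
The main log is verbose, so a quick count of warnings and errors is often a faster first look than reading it top to bottom. The following is a minimal sketch; `summarize_log` is a hypothetical helper, and the sample lines only imitate the general shape of cloud-init log entries so the example can run anywhere.

```shell
#!/bin/sh
# Sketch: count WARNING and ERROR lines in a cloud-init log.
# summarize_log is a hypothetical helper; the sample lines below are
# made-up entries that only imitate the general log format.
summarize_log() {
    log=$1
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    warnings=$(grep -c 'WARNING' "$log" || true)
    errors=$(grep -c 'ERROR' "$log" || true)
    echo "warnings=$warnings errors=$errors"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
2024-01-01 00:00:01,000 - util.py[DEBUG]: Running command ['true']
2024-01-01 00:00:02,000 - util.py[WARNING]: Failed to run module example
2024-01-01 00:00:03,000 - handlers.py[DEBUG]: finish: modules-final
EOF

summarize_log "$sample"
```

On a provisioned machine the same function can be pointed at /var/log/cloud-init.log, and `cloud-init status --long` reports the overall result of the run.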
