Cloud-init

Introduction

Cloud-init is a widely used tool for customizing cloud instances during the boot process. It enables automatic configuration of virtual machines by applying user-defined settings such as:

  • Setting hostnames

  • Creating users and groups

  • Installing packages

  • Running scripts

Cloud-init supports multiple data sources and is commonly used on major cloud platforms such as AWS, Azure, and OpenStack. It reads configuration from metadata services or configuration files (for example, user-data that begins with the #cloud-config header) and applies it during instance initialization.

For more information, visit the cloud-init documentation.

Cloud-Init Configuration

cloud-init is a powerful tool used to automate the initialization of cloud instances. It reads configuration data from various sources and applies settings during the first boot of a virtual machine.

The configuration is usually provided in a file starting with the header #cloud-config, written in YAML format. It supports multiple modules and directives, such as:

  • users: Create and configure users

  • packages: Install software packages

  • runcmd: Run shell commands

  • write_files: Create files with specified content

Cloud-init can also deliver autoinstall configuration for unattended Ubuntu installations. In that case the installer settings are nested under the autoinstall key.

Example structure:

Example cloud-init configuration with autoinstall and user-data

#cloud-config
autoinstall:
  version: 1
  packages:
    - cowsay
  user-data:
    users:
      - name: ciuser
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
    # runcmd is a cloud-init module, so it belongs under user-data
    runcmd:
      - echo "Hello from cloud-init!"

Cloud-Init vs Autoinstall

Cloud-init and autoinstall are both tools used in Ubuntu systems to automate setup, but they serve different purposes and operate at different stages of the provisioning lifecycle.

  • Cloud-init is used to configure a system after it has been installed, typically during the first boot.

  • Autoinstall is used to automate the installation process itself, including disk partitioning, user creation, and package selection.

Subiquity

Subiquity is the modern installer used in Ubuntu Server editions. It replaces the older debian-installer (d-i) and supports autoinstall for fully automated, unattended installations. Subiquity reads its configuration from the autoinstall section of a YAML #cloud-config document and executes the installation accordingly.

Cloud-Init and Autoinstall Interaction

Autoinstall is a feature of Subiquity rather than a cloud-init module. During installation, cloud-init runs in the installer environment, retrieves the configuration, and hands the autoinstall section to Subiquity, which carries out the installation steps. After installation, cloud-init on the installed system applies the user-data section on first boot.

Comparison of Autoinstall and Cloud-init

Feature              | Autoinstall                 | Cloud-init
---------------------+-----------------------------+-------------------------------
Purpose              | Automates OS installation   | Configures system post-install
Trigger              | During installation         | On first boot
Configuration Format | YAML under autoinstall key  | YAML with #cloud-config header
Common Use           | Ubuntu Server, cloud images | Cloud VMs, custom boot setups
Supported Installer  | Subiquity                   | Cloud-init engine
Desktop Support      | No (Ubiquity used)          | Yes (limited)

Autoinstall vs Cloud-init Mapping

The following table maps common autoinstall directives to their cloud-init equivalents, highlighting differences in timing and application:

Autoinstall Directive                   | Cloud-init Equivalent             | Purpose                                  | Timing
----------------------------------------+-----------------------------------+------------------------------------------+---------------------------------------------------------------------------
version                                 | N/A                               | Schema version for autoinstall           | Install-time only
identity (hostname, username, password) | hostname, users                   | Configure system identity                | Autoinstall applies during install; Cloud-init applies on first boot
keyboard                                | keyboard (via locale or keyboard) | Keyboard layout                          | Install-time
locale                                  | locale                            | System locale                            | Both supported; autoinstall applies earlier
timezone                                | timezone                          | System timezone                          | Both supported
network                                 | network                           | Netplan config for installer and target  | Both supported; autoinstall ensures connectivity during install
storage                                 | No direct equivalent              | Disk partitioning, LVM, ZFS (via Curtin) | Autoinstall only
apt                                     | apt                               | Mirrors, proxy, geoip                    | Both supported
packages                                | packages                          | Install packages during install          | Autoinstall installs in target image; cloud-init installs after first boot
snaps                                   | snap                              | Install snaps during install             | Both supported
updates                                 | package_update, package_upgrade   | Apply updates during install             | Autoinstall applies before reboot
early-commands                          | bootcmd (similar timing)          | Commands before partitioning             | Autoinstall runs in the installer environment
late-commands                           | runcmd (runs on first boot)       | Commands after install before reboot     | Different timing
user-data                               | Full cloud-init schema (embedded) | Embed cloud-init for target system       | Runs on first boot

Configuration Hierarchy

The configuration hierarchy in cloud-init can be visualized as follows:

digraph G {
    rankdir=TB;
    compound=true;
    node [shape=box, style=filled, fillcolor=lightgray, fontname="Helvetica"];
    edge [dir=none,style=invis]

    subgraph cluster_cloud_init{

        subgraph cluster_autoinstall{
            rankdir=TB;

            subgraph cluster_autoinstall_directives{
                rankdir=TB;
                autoinstalldirectives [label="version:\linteractive-sections:\learly-commands:\l", style=filled, fillcolor=lightblue];
                label="autoinstall directives:";
                style = rounded;
                color = blue;
            }
            subgraph cluster_userdata{
                rankdir=TB;
                userdata [label="user-data:\l    users:\l", style=filled, fillcolor=lightpink];
                label="user-data directives:";
                style = rounded;
                color = red;
            }

            label = "autoinstall:";
            style = rounded;
            color = gray;
        }

        label = "cloud-init";
        style = rounded;
        color = black;
    }
    autoinstalldirectives -> userdata;


}

Cloud-init Configuration Structure (autoinstall and user-data sections)
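
The nesting shown in the figure can also be expressed in code. The following is a minimal sketch, assuming the #cloud-config document has already been parsed into a Python dictionary; the function name split_autoinstall and the example values are hypothetical and only illustrate which keys are consumed by the installer and which are left for first boot.

```python
from typing import Any


def split_autoinstall(config: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate install-time directives from the embedded first-boot user-data."""
    autoinstall = config.get("autoinstall", {})
    # Every key under autoinstall except user-data is an installer directive.
    install_time = {key: value for key, value in autoinstall.items() if key != "user-data"}
    # The nested user-data block is applied by cloud-init on the installed system.
    first_boot = autoinstall.get("user-data", {})
    return install_time, first_boot


# Hypothetical parsed document mirroring the structure in the figure.
example = {
    "autoinstall": {
        "version": 1,
        "early-commands": ["echo starting"],
        "user-data": {"users": [{"name": "ciuser"}]},
    }
}

install_time, first_boot = split_autoinstall(example)
print("install-time keys:", sorted(install_time))
print("first-boot config:", first_boot)
```

Keeping the two parts separate makes it easier to see that anything placed outside user-data is never seen by cloud-init on the installed system.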

Datasources and Provisioning Workflow I

Important

For more details on cloud-init datasources, refer to the Datasources documentation.

NoCloud Data Source

The NoCloud data source is a generic method for providing meta-data and user-data to cloud-init. It is ideal for environments without native cloud metadata services, such as bare-metal servers, virtual machines, or custom provisioning systems.

Overview

The NoCloud data source supports three modes:

  • NoCloud (local disk): Uses a filesystem (e.g., ISO9660 or VFAT) with a volume label CIDATA containing configuration files.

  • NoCloud (local image): Uses a mounted filesystem (e.g., ISO, disk image).

  • NoCloud-Net: Fetches data from a remote HTTP server.

Required Files

The following files must be present in the data source:

  • meta-data: Contains instance metadata (hostname, instance-id, etc.).

  • user-data: Contains cloud-config or shell scripts for provisioning.

Optional files:

  • vendor-data: Additional configuration from vendor.

  • network-config: Network configuration in YAML format.

Example: meta-data

instance-id: nocloud-instance-001
local-hostname: myserver

Example: user-data

#cloud-config
users:
  - name: testuser
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: users
    shell: /bin/bash
runcmd:
  - echo "Provisioning complete" > /var/log/provision.log

Important

The above example is missing the autoinstall section. For unattended installations, see the Cloud-Init Configuration section.
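
The required files can also be generated and checked from a script. This is a rough sketch using only the Python standard library; the helper names write_seed and missing_seed_files are hypothetical, and the file contents simply mirror the meta-data and user-data examples in this section.

```python
from pathlib import Path

# File names the NoCloud data source expects, as listed under "Required Files".
REQUIRED_FILES = ("meta-data", "user-data")


def write_seed(directory: str, instance_id: str, hostname: str, user_data: str) -> Path:
    """Write meta-data and user-data into a seed directory and return its path."""
    seed = Path(directory)
    seed.mkdir(parents=True, exist_ok=True)
    meta_data = f"instance-id: {instance_id}\nlocal-hostname: {hostname}\n"
    (seed / "meta-data").write_text(meta_data)
    (seed / "user-data").write_text(user_data)
    return seed


def missing_seed_files(directory: str) -> list[str]:
    """Return the required file names that are absent from the seed directory."""
    seed = Path(directory)
    return [name for name in REQUIRED_FILES if not (seed / name).is_file()]
```

A call such as write_seed("/tmp/nocloud", "nocloud-instance-001", "myserver", "#cloud-config\n") followed by missing_seed_files("/tmp/nocloud") gives a quick check before the directory is packed into an image.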

NoCloud (local disk): Creating a USB Drive labeled CIDATA

To use a USB drive as the NoCloud data source:

  1. Create configuration files:

    mkdir -p /tmp/nocloud
    echo "instance-id: nocloud-001" > /tmp/nocloud/meta-data
    # printf interprets \n in every shell, unlike echo -e
    printf '#cloud-config\nruncmd:\n  - echo Hello > /tmp/hello.txt\n' > /tmp/nocloud/user-data
    
  2. Create a VFAT filesystem image:

    truncate --size 2M seed.img
    mkfs.vfat -n CIDATA seed.img
    
  3. Copy configuration files to the image:

    mcopy -oi seed.img /tmp/nocloud/meta-data ::meta-data
    mcopy -oi seed.img /tmp/nocloud/user-data ::user-data
    
  4. Write image to USB drive:

    Identify your USB device (e.g., /dev/sdX) and write the image:

    sudo dd if=seed.img of=/dev/sdX bs=4M status=progress && sync
    

Warning

Ensure /dev/sdX is the correct USB device to avoid data loss.

  5. Boot the target system with the USB drive inserted:

    Cloud-init will detect the CIDATA volume and apply the configuration.

Alternative: NoCloud (local image): ISO Image

You can also create an ISO image:

  1. Create a directory with the required files:

    mkdir -p /tmp/nocloud
    echo "instance-id: nocloud-001" > /tmp/nocloud/meta-data
    # printf interprets \n in every shell, unlike echo -e
    printf '#cloud-config\nruncmd:\n  - echo Hello > /tmp/hello.txt\n' > /tmp/nocloud/user-data
    
  2. Create ISO image (optional):

    genisoimage -output seed.iso -volid cidata -joliet -rock /tmp/nocloud/user-data /tmp/nocloud/meta-data
    
  3. Attach ISO to VM or mount directory:

    • For KVM/QEMU:

      qemu-system-x86_64 -cdrom seed.iso ...
      
    • For cloud-init testing:

      sudo cloud-init single --file /tmp/nocloud/user-data --name runcmd --frequency always
      
  4. Boot the system:

    Cloud-init will detect the NoCloud data source and apply the configuration.

See also

cloud-localds - Utility to create NoCloud seed images.

NoCloud-Net: Kernel Command Line

To use NoCloud-Net via HTTP:

ds=nocloud-net;s=http://<your-server>/cloud-init/

Ensure the HTTP server serves meta-data and user-data files at the root of the specified path.
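
The structure of the ds= parameter can be illustrated with a small parser. This is a sketch for illustration only, not cloud-init's own parsing code; the function names are hypothetical, and it assumes that the file names are appended directly to the seed URL, which is why a trailing slash is added when missing.

```python
def parse_nocloud_arg(arg: str) -> dict[str, str]:
    """Split a value such as 'ds=nocloud-net;s=http://host/' into its parts."""
    if not arg.startswith("ds="):
        raise ValueError("expected a parameter starting with 'ds='")
    parts = arg[len("ds="):].split(";")
    result = {"datasource": parts[0]}
    for part in parts[1:]:
        # Each remaining part is key=value; split on the first '=' only.
        key, separator, value = part.partition("=")
        if separator:
            result[key] = value
    return result


def seed_file_urls(seed_url: str) -> list[str]:
    """Return the URLs at which meta-data and user-data would be requested."""
    if not seed_url.endswith("/"):
        seed_url += "/"
    return [seed_url + name for name in ("meta-data", "user-data")]


parsed = parse_nocloud_arg("ds=nocloud-net;s=http://192.168.1.100:8080/cloud-init/")
print(parsed["datasource"])
print(seed_file_urls(parsed["s"]))
```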

As an example, to serve the configuration files using a Python HTTP server on port 8080:

  1. Create a directory with configuration files:

    mkdir -p ~/cloud-init-data
    echo "instance-id: nocloud-net-001" > ~/cloud-init-data/meta-data
    # printf is used because plain echo does not expand \n in bash
    printf '#cloud-config\nruncmd:\n  - echo Hello from NoCloud-Net > /tmp/hello.txt\n' > ~/cloud-init-data/user-data
    
  2. Start Python HTTP server:

    cd ~/cloud-init-data
    python3 -m http.server 8080
    

    This will serve files at http://<your-ip>:8080/.

  3. Configure kernel command line on target system:

    Add the following to the boot parameters:

    ds=nocloud-net;s=http://<your-ip>:8080/
    

    Replace <your-ip> with the IP address of the server running the Python web server.

  4. Boot the target system:

    Cloud-init will fetch meta-data and user-data from the specified URL and apply the configuration.
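
Before booting the target, it can help to confirm that both files are actually reachable over HTTP. The following is a minimal sketch using the standard library; check_seed_url is a hypothetical helper, and its opener parameter is an assumption added so the check can be exercised without a network.

```python
import urllib.error
import urllib.request
from typing import Callable


def check_seed_url(
    base_url: str,
    names: tuple[str, ...] = ("meta-data", "user-data"),
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> dict[str, bool]:
    """Report, per file name, whether base_url + name answers with HTTP 200."""
    if not base_url.endswith("/"):
        base_url += "/"
    status = {}
    for name in names:
        try:
            with opener(base_url + name, timeout=timeout) as response:
                status[name] = response.status == 200
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, 404 and similar all count as unreachable.
            status[name] = False
    return status
```

For the Python web server started above, check_seed_url("http://<your-ip>:8080/") should report True for both names once the server is running.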

Using GRUB to Enable Autoinstall with Cloud-Init

To automate OS installation using cloud-init and avoid manual confirmation prompts, you can modify the GRUB boot parameters to include the autoinstall directive.

This is especially useful when using the NoCloud or NoCloud-Net data sources for unattended installations.

Editing GRUB Kernel Line

  1. Boot into the installer ISO or PXE environment.

  2. At the GRUB menu, press e to edit the boot entry.

  3. Locate the line starting with linux or linuxefi. It typically looks like:

    linux /casper/vmlinuz ... quiet ---
    
  4. Add one of the following before the trailing ---. GRUB treats an unescaped semicolon as a command separator, so escape it with a backslash:

    # For NoCloud with USB
    autoinstall

    # For NoCloud-Net with HTTP server
    autoinstall ds=nocloud-net\;s=http://<your-server>:<port>/
    

    Replace <your-server> and <port> with the IP address or hostname and the port of the server hosting your meta-data and user-data files.

  5. Edited GRUB kernel line example:

    linux /casper/vmlinuz ... quiet autoinstall ds=nocloud-net\;s=http://192.168.1.100:8080/ ---
    
  6. Press Ctrl + X or F10 to boot with the modified parameters.

This will trigger the autoinstall process using the provided cloud-init configuration without any user interaction.

Important

  • The autoinstall keyword is required for Ubuntu Server 20.04+ and other cloud-init enabled installers to bypass confirmation.

  • Ensure your HTTP server is running and accessible before booting the target system.

  • Optional: You can also use ds=nocloud\;s=/media/usb/ if the seed files are on a mounted USB drive. A drive labeled CIDATA is detected without this parameter.
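
The boot parameters can be assembled programmatically, which avoids typing mistakes at the GRUB prompt. This is a small sketch; the function name nocloud_net_params and its arguments are hypothetical. It assumes that GRUB interprets an unescaped semicolon as a command separator, so the semicolon is escaped with a backslash when the result is meant for GRUB.

```python
def nocloud_net_params(host: str, port: int = 80, path: str = "/", for_grub: bool = True) -> str:
    """Build the 'autoinstall ds=nocloud-net;s=...' kernel parameters."""
    # Normalize the path so the seed URL starts and ends with a slash.
    if not path.startswith("/"):
        path = "/" + path
    if not path.endswith("/"):
        path += "/"
    datasource = f"ds=nocloud-net;s=http://{host}:{port}{path}"
    if for_grub:
        # Escape the semicolon so GRUB does not end the command there.
        datasource = datasource.replace(";", "\\;")
    return f"autoinstall {datasource}"


print(nocloud_net_params("192.168.1.100", 8080))
```

Passing for_grub=False yields the unescaped form, which is what boot loaders or hypervisor options that do not treat the semicolon specially would expect.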

Datasources and Provisioning Workflow II - VMs and Cloud Instances

Cloud-init automates the initialization of virtual machines and cloud instances across platforms and integrates natively with many cloud providers. It reads its configuration from a data source, and the data source varies by platform.

Warning

The following examples are simplified for clarity. Refer to the official documentation for detailed setup and security considerations.

Virtual Machines

See NoCloud Data Source for usage with ISO images or USB drives.

AWS EC2

AWS uses the EC2 data source, which fetches metadata from the AWS metadata service.

Example: AWS EC2

  1. Create a cloud-config file (user-data.yaml in this example) to pass as user-data:

    #cloud-config
    packages:
      - nginx
    runcmd:
      - systemctl enable nginx
      - systemctl start nginx
    
  2. Provide user-data via the AWS console or CLI:

    aws ec2 run-instances \
      --image-id ami-12345678 \
      --instance-type t2.micro \
      --user-data file://user-data.yaml
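
A quick pre-flight check on the user-data file can catch two common mistakes before an instance is launched. The sketch below is illustrative only; check_user_data is a hypothetical helper, and the 16 KB default reflects the raw user-data size limit that AWS documents for EC2, which should be confirmed against current AWS documentation.

```python
# Assumed EC2 limit for raw (not base64-encoded) user-data, in bytes.
EC2_USER_DATA_LIMIT = 16 * 1024


def check_user_data(text: str, limit: int = EC2_USER_DATA_LIMIT) -> list[str]:
    """Return a list of problems found in a cloud-config user-data document."""
    problems = []
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # cloud-init only treats the file as cloud-config if this header comes first.
    if first_line.strip() != "#cloud-config":
        problems.append("first line is not #cloud-config")
    size = len(text.encode("utf-8"))
    if size > limit:
        problems.append(f"{size} bytes exceeds the {limit}-byte limit")
    return problems


print(check_user_data("#cloud-config\npackages:\n  - nginx\n"))
```

An empty list means neither problem was detected; it does not replace schema validation with cloud-init itself.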
    

Azure

Azure uses the Azure data source, which reads metadata from the Azure Instance Metadata Service (IMDS).

Example: Azure VM

  1. Create a cloud-init config:

    #cloud-config
    users:
      - name: azureuser
        ssh_authorized_keys:
          - ssh-rsa AAAAB3Nza...
    
  2. Deploy VM with cloud-init using Azure CLI:

    az vm create \
      --resource-group myGroup \
      --name myVM \
      --image Ubuntu2204 \
      --custom-data cloud-config.yaml

OpenStack

OpenStack uses the ConfigDrive or Metadata Service data sources.

Example: Injecting user-data via OpenStack CLI

  1. Create a cloud-config file:

    #cloud-config
    users:
      - name: openstackuser
        ssh_authorized_keys:
          - ssh-rsa AAAAB3Nza...
    runcmd:
      - echo "OpenStack instance initialized" > /tmp/openstack.txt
    
  2. Boot an instance with user-data:

    openstack server create \
      --image ubuntu-22.04 \
      --flavor m1.small \
      --key-name mykey \
      --user-data cloud-config.yaml \
      --network private-net \
      openstack-vm
    

Cloud-init will automatically detect the OpenStack metadata service or ConfigDrive and apply the configuration.

Google Cloud Platform (GCP)

GCP uses the GCE data source, which reads metadata from the GCP metadata server.

Example: Passing user-data via gcloud

  1. Create a cloud-config file:

    #cloud-config
    runcmd:
      - echo "GCP instance initialized" > /tmp/gcp.txt
    
  2. Create a VM with metadata:

    gcloud compute instances create gcp-vm \
      --image-family ubuntu-2204-lts \
      --image-project ubuntu-os-cloud \
      --metadata-from-file user-data=cloud-config.yaml
    

Cloud-init will fetch the user-data from the GCP metadata server and execute it on first boot.

Troubleshooting

  • Validate cloud-config:

    # Without annotations (for a file named user-data)
    cloud-init schema --config-file user-data

    # With annotations (for a file named config.yml)
    cloud-init schema -c ./config.yml --annotate
    
  • View logs:

    cat /var/log/cloud-init.log
    cat /var/log/cloud-init-output.log
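
Long logs are easier to review when filtered for likely problems. The sketch below assumes that cloud-init log lines carry the level in square brackets (for example [WARNING]) and that Python tracebacks appear verbatim; find_problems is a hypothetical helper, and the log format should be checked against the logs on your own system.

```python
import re

# Lines with a warning-or-worse level, or the start of a Python traceback.
PROBLEM_PATTERN = re.compile(r"\[(WARNING|ERROR|CRITICAL)\]|Traceback")


def find_problems(log_text: str) -> list[str]:
    """Return the log lines that look like warnings, errors, or tracebacks."""
    return [line for line in log_text.splitlines() if PROBLEM_PATTERN.search(line)]


# Hypothetical excerpt in the assumed log format.
sample = (
    "2024-01-01 00:00:00,000 - util.py[DEBUG]: Reading from /proc/uptime\n"
    "2024-01-01 00:00:01,000 - util.py[WARNING]: Failed to run module runcmd\n"
    "Traceback (most recent call last):\n"
)
for line in find_problems(sample):
    print(line)
```

On a real system the input would be the contents of /var/log/cloud-init.log, read with the permissions that file requires.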