Control: tags -1 + upstream
Control: forwarded -1 https://github.com/canonical/cloud-init/issues/6205
On Sat, May 03, 2025 at 12:25:01PM +0330, Zar VPN wrote:
> This is a critical issue, as it prevents users from booting and
> configuring instances in modern IPv6-only cloud environments using the
> official Debian cloud image.

I can reproduce this issue, but I don't think it is limited to Debian.
The problem seems to lie either in cloud-init itself or in Python's HTTP
client libraries (urllib3 and/or requests).

cloudinit/sources/DataSourceOpenStack.py defines a function
wait_for_metadata_service(). This contains the default list of IMDS
endpoints:

DEF_MD_URLS = [
    "http://[fe80::a9fe:a9fe%25{iface}]".format(
        iface=self.distro.fallback_interface
    ),
    "http://169.254.169.254",
]
urls = self.ds_cfg.get("metadata_urls", DEF_MD_URLS)

It constructs a list of URLs to probe when looking for a functioning
IMDS endpoint by appending the "openstack" path to each default
endpoint, or to the endpoints given as metadata_urls in the
configuration if that key is set:

for url in urls:
    md_url = url_helper.combine_url(url, "openstack")
    md_urls.append(md_url)

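For illustration, here is a standalone sketch of the URL list this step produces. It does not import cloud-init: combine() is a hypothetical stand-in for url_helper.combine_url (assumed here to be a plain slash-join), build_md_urls() is a made-up name, and the interface name is simply the one that appears in the reporter's logs.

```python
def combine(base, *parts):
    """Hypothetical stand-in for url_helper.combine_url: slash-join the parts."""
    url = base.rstrip("/")
    for part in parts:
        url += "/" + str(part).strip("/")
    return url


def build_md_urls(ds_cfg, iface):
    """Mirror the quoted snippet: take metadata_urls from ds_cfg, else defaults."""
    def_md_urls = [
        # "%25" is the percent-encoded "%" that introduces the zone ID.
        "http://[fe80::a9fe:a9fe%25{iface}]".format(iface=iface),
        "http://169.254.169.254",
    ]
    urls = ds_cfg.get("metadata_urls", def_md_urls)
    return [combine(url, "openstack") for url in urls]


if __name__ == "__main__":
    # Assumed interface name, taken from the logs in this report.
    for md_url in build_md_urls({}, "enp3s0"):
        print(md_url)
```

With an empty configuration and the interface enp3s0, this yields the same two URLs that show up in the reporter's log.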
It then probes those endpoints:

avail_url, _response = url_helper.wait_for_url(
    urls=md_urls,
    max_wait=url_params.max_wait_seconds,
    timeout=url_params.timeout_seconds,
    connect_synchronously=False,
)

However, it does not seem to be able to probe a link-local endpoint
successfully at all. We can test this ourselves by constructing a
simplified test case:

noahm@foo:~$ cat /tmp/t.py
#!/usr/bin/python3
from cloudinit import url_helper
url="http://[fe80::a9fe:a9fe%enp0s1]"
md_url = url_helper.combine_url(url, "openstack")
md_urls=[md_url]
print(url_helper.wait_for_url(md_urls, max_wait=5, timeout=1))
noahm@foo:~$ python3 /tmp/t.py
(False, None)

Both the server logs and tcpdump show no request is ever issued to the
given URL.

But if we change that to use a globally scoped address, it works:

noahm@foo:~$ cat /tmp/t.py
#!/usr/bin/python3
from cloudinit import url_helper
# url="http://[fe80::a9fe:a9fe%enp0s1]"
url="http://[fd00:80db:0:5:34e5:8aff:fec5:b9bf]"
md_url = url_helper.combine_url(url, "openstack")
md_urls=[md_url]
print(url_helper.wait_for_url(md_urls, max_wait=5, timeout=1))
noahm@foo:~$ python3 /tmp/t.py
('http://[fd00:80db:0:5:34e5:8aff:fec5:b9bf]/openstack', b'<!doctype html>\n<html>\n<head>\n <title>untitled</title>\n</head>\n<body>\n</body>\n</html>\n')

And to be sure, the server does reply to queries on link-local
addresses:

noahm@foo:~$ curl -v 'http://[fe80::a9fe:a9fe%enp0s1]/openstack'
* Trying [fe80::a9fe:a9fe]:80...
* Connected to fe80::a9fe:a9fe (fe80::a9fe:a9fe) port 80
* using HTTP/1.x
> GET /openstack HTTP/1.1
> Host: [fe80::a9fe:a9fe]
> User-Agent: curl/8.13.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: nginx/1.22.1
< Date: Sun, 04 May 2025 02:43:32 GMT
< Content-Type: text/html
< Content-Length: 169
< Location: http://[fe80::a9fe:a9fe]/openstack/
< Connection: keep-alive
<
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx/1.22.1</center>
</body>
</html>
* Connection #0 to host fe80::a9fe:a9fe left intact

We can also see evidence suggesting that something is wrong in
cloud-init from the logs you provided:

2025-04-30 09:59:03,739 - url_helper.py[DEBUG]: [0/1] open 'http://[fe80::a9fe:a9fe%25enp3s0]/openstack' with {'url': 'http://[fe80::a9fe:a9fe%25enp3s0]/openstack', 'stream': False, 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/25.1.1'}} configuration
2025-04-30 09:59:03,893 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'stream': False, 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/25.1.1'}} configuration
2025-04-30 09:59:03,895 - url_helper.py[DEBUG]: Exception(s) [UrlError('HTTPConnectionPool(host=\'fe80::a9fe:a9fe%25enp3s0\', port=80): Max retries exceeded with url: /openstack (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fb44b82fe00>: Failed to resolve \'fe80::a9fe:a9fe%25enp3s0\' ([Errno -2] Name or service not known)"))'), UrlError("HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb44b6d9810>: Failed to establish a new connection: [Errno 101] Network is unreachable'))")] during request to http://169.254.169.254/openstack, raising last exception
2025-04-30 09:59:03,895 - url_helper.py[DEBUG]: Calling 'http://169.254.169.254/openstack' failed [0/-1s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb44b6d9810>: Failed to establish a new connection: [Errno 101] Network is unreachable'))]
2025-04-30 09:59:03,895 - DataSourceOpenStack.py[DEBUG]: Giving up on OpenStack md from ['http://[fe80::a9fe:a9fe%25enp3s0]/openstack', 'http://169.254.169.254/openstack'] after 0 seconds
2025-04-30 09:59:03,895 - log_util.py[WARNING]: No active metadata service found
2025-04-30 09:59:03,895 - log_util.py[DEBUG]: No active metadata service found

Note in particular this:

Exception(s) [UrlError('HTTPConnectionPool(host=\'fe80::a9fe:a9fe%25enp3s0\', port=80): Max retries exceeded with url: /openstack (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fb44b82fe00>: Failed to resolve \'fe80::a9fe:a9fe%25enp3s0\' ([Errno -2] Name or service not known)"))')

There shouldn't be any name resolution involved here at all. My guess
is that something is not recognizing the scoped link-local address as an
IP address and is instead treating it as a hostname that needs to be
resolved in DNS, which is obviously going to fail. I haven't looked
deeply enough to determine whether the fault lies in cloud-init or in a
lower-level HTTP client.

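One way to narrow this down without cloud-init in the picture is to look at the host component of the URL before it reaches any resolver. The sketch below uses only the Python standard library. split_zone() is a hypothetical helper, and treating a leading "25" in the zone as the RFC 6874 percent-encoding of "%" is a heuristic assumed for illustration (an interface whose name really starts with "25" would be misread). It shows that the host parsed from the URL still carries the literal "%25", matching the string in the NameResolutionError, while the same address with a bare "%" parses as a scoped IPv6 address. It does not show which layer is at fault.

```python
import ipaddress
from urllib.parse import urlsplit


def split_zone(url):
    """Hypothetical helper: return (address, zone) for a bracketed IPv6 host."""
    host = urlsplit(url).hostname  # brackets stripped, "%25" left undecoded
    addr, _, zone = host.partition("%")
    # Assumption: a zone starting with "25" came from the RFC 6874 "%25" form.
    if zone.startswith("25"):
        zone = zone[2:]
    return addr, zone


if __name__ == "__main__":
    url = "http://[fe80::a9fe:a9fe%25enp3s0]/openstack"
    # Host exactly as the URL encodes it, including the literal "%25".
    print(urlsplit(url).hostname)
    addr, zone = split_zone(url)
    # Scoped addresses are accepted by the ipaddress module in Python >= 3.9.
    scoped = ipaddress.ip_address(f"{addr}%{zone}")
    print(scoped, scoped.scope_id, scoped.is_link_local)
```

Passing the decoded form to socket.getaddrinfo() on an affected instance would show whether the resolver accepts it. That step depends on the named interface existing, so it is left out of the sketch.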
noah