Hi and happy new year.
When we create apps on Gentoo, they easily become incompatible with
older Gentoo systems in production, where unattended remote world
updates are risky. This is due to new glibc, OpenSSL 3, etc.
So, what we've thought of so far is:
(1) Keeping outdated developer boxes around and compiling there. We
would freeze portage against an accidental emerge sync by creating a
git branch in /var/db/repos/gentoo. This feels hacky and requires an
increasing number of developer VMs. And sometimes we are hit by a
silent incompatibility we were not aware of.
(2) Using Ubuntu LTS for production and Gentoo for development is
hit by subtle libjpeg incompatibilities and such.
(3) Distributing apps as VMs or Docker: Even those tools advance and
become incompatible, right? And they are not suitable for smaller
ARM devices.
(4) Flatpak: No experience, does it work well?
(5) Inventing a full-fledged OTA Gentoo OS updater and distributing
it together with the apps... Nah.
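Such a freeze is only a couple of git commands. A minimal sketch
(branch and tag names are invented, and it assumes ::gentoo uses the
git sync-type):

```shell
# Pin /var/db/repos/gentoo at its current state on a named branch,
# so an accidental `emerge --sync` is easy to spot and undo.
cd /var/db/repos/gentoo
git switch -c frozen-2023-01      # hypothetical branch name
git tag frozen-2023-01-base       # mark the frozen state for diffing later
# Belt and braces: set `auto-sync = no` for this repo in
# /etc/portage/repos.conf/gentoo.conf so portage will not sync it at all.
```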
Hm... Comments welcome.
Thanks
On 2 Jan 2023, at 12:48, m1027 <m1027@posteo.net> wrote:
> Hi and happy new year.
>
> When we create apps on Gentoo, they easily become incompatible with
> older Gentoo systems in production, where unattended remote world
> updates are risky. This is due to new glibc, OpenSSL 3, etc.
I'd really suggest just using stable in production and a mix
for developers so you can catch any problems beforehand.
We try to be quite conservative about things like OpenSSL 3,
glibc updates, etc.
On Mon, Jan 2, 2023 at 4:48 AM m1027 <m1027@posteo.net> wrote:
> Hi and happy new year.
>
> When we create apps on Gentoo, they easily become incompatible with
> older Gentoo systems in production, where unattended remote world
> updates are risky. This is due to new glibc, OpenSSL 3, etc.
I wrote a very long reply, but I've removed most of it: I basically
have a few questions, and then some comments:
I don't quite grasp your problem statement, so I will repeat what I
think it is and you can confirm / deny.
- Your devs build using gentoo synced against some recent tree, they
have recent packages, and they build some software that you deploy to
prod.
- Your prod machines are running gentoo synced against some recent
tree, but not upgraded (maybe only glsa-check runs) and so they are
running 'old' packages because you are afraid to update them[0]
- Your software builds OK in dev, but when you deploy it in prod it
breaks, because prod is really old and your dev environments are
using packages that are too new.
My main feedback here is:
- Your "build" environment should be like prod. You said you didn't
want to build "developer VMs" but I am unsure why. For example, I run
Ubuntu and I do all my Gentoo development (admittedly very little
these days) in a systemd-nspawn container, and I have a few shell
scripts to mount everything and set it up (so it has a tree snapshot,
some git repos, some writable space etc.)
- Your "prod" environment is too risky to upgrade, and you have
difficulty crafting builds that run in every prod environment. I think
this is fixable by making a build environment more like the prod
environment.
The challenge here is that if you have not done that (kept the
copies of ebuilds around, the distfiles, etc.) it can be challenging
to "recreate" the existing older prod environments.
But if you do the above thing (where devs build in a container)
and you can make that container like the prod environments, then you
can enable devs to build for the prod environment (in a container on
their local machine) and get the outcome you want.
- Understand that not upgrading prod is like, to use a finance term,
picking up pennies in front of a steamroller. It's a great strategy,
but eventually you will actually *need* to upgrade something. Maybe
for a critical security issue, maybe for a feature. Having a build
environment that matches prod is good practice, you should do it, but
you should also really schedule maintenance for these prod nodes to
get them upgraded. (For physical machines, I've often seen businesses
just eat the risk and assume the machine will physically fail before
the steamroller comes, but this is less true with virtualized
environments that have longer real lifetimes.)
> So, what we've thought of so far is:
>
> (1) Keeping outdated developer boxes around and compiling there. We
> would freeze portage against an accidental emerge sync by creating a
> git branch in /var/db/repos/gentoo. This feels hacky and requires an
> increasing number of developer VMs. And sometimes we are hit by a
> silent incompatibility we were not aware of.
In general when you build binaries for some target, you should build
on that target when possible. To me, this is the crux of your issue
(that you do not) and one of the main causes of your pain.
You will need to figure out a way to either:
- Upgrade the older environments to new packages.
- Build in copies of the older environments.
I actually expect the second one to take 1-2 sprints (so like 1 engineer month?)
- One sprint to make some scripts that makes a new production 'container'
- One sprint to sort of integrate that container into your dev
workflow, so devs build in the container instead of what they build in
now.
It might be more or less daunting depending on how many distinct
(unique?) prod environments you have (how many containers will you
actually need for good build coverage?), how experienced in Gentoo
your developers are, and how many artifacts from prod you have.
- A few crazy ideas are like:
- Snapshot an existing prod machine, strip it of machine-specific
bits, and use that as your container.
- Use quickpkg to generate a bunch of bin pkgs from a prod machine,
use that to bootstrap a container.
- Probably some other exciting ideas on the list ;)
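The quickpkg idea, sketched (qlist comes from
app-portage/portage-utils; paths and the target root are invented):

```shell
# On a prod machine: turn every installed package into a binary
# package (binpkgs land in PKGDIR, typically /var/cache/binpkgs).
quickpkg --include-config=y $(qlist -IC)

# On the build host: unpack a stage3 of similar vintage, copy the
# binpkgs into its PKGDIR, then install from binpkgs only (-K), e.g.:
#   emerge --root=/var/lib/machines/prod-build -K -1 <same package list>
```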
> (2) Using Ubuntu LTS for production and Gentoo for development is
> hit by subtle libjpeg incompatibilities and such.
I would advise, if possible, to make dev and prod as similar as
possible[1]. I'd be curious what blockers you think there are to
this pattern.

Remember that "dev" is not "whatever your devs are using" but is
ideally some maintained environment; segmented from their daily
driver computer (somehow).
> (3) Distributing apps as VMs or Docker: Even those tools advance
> and become incompatible, right? And not suitable for smaller ARM
> devices.
I think if your apps are small and self-contained and easily rebuilt,
your (3) and (4) can be workable.
If you need 1000 dependencies at runtime, your containers are going to
be expensive to build, expensive to maintain, you are gonna have to
build them often (for security issues), it can be challenging to
support incremental builds and incremental updates...you generally
want a clearer problem statement to adopt this pain. Two problem
statements that might be worth it are below ;)
If you told me you had 100 different production environments, or
needed to support 12 different OSes, I'd tell you to use containers
(or similar).

If you told me you didn't control your production environment
(because users installed the software wherever) I'd tell you to use
containers (or similar).
> (4) Flatpak: No experience, does it work well?
Flatpak is conceptually similar to your (3). I know you are basically
asking "does it work" and the answer is "probably", but see the other
questions for (3). I suspect it's less about "does it work" and more
about "is some container deployment thing really a great idea."
Peter's comment about basically running your own fork of gentoo.git
and sort of 'importing the updates' is workable. Google did this for
debian testing (called project Rodete)[2]. I can't say it's a
particularly cheap solution (significant automation and testing
required) but I think as long as you are keeping up (I would advise
never falling more than 365d behind time.now() in your fork) then I
think it provides some benefits.
- You control when you take updates.
- You want to stay "close" to time.now() in the tree, since a
rolling distro is how things are tested.
- This buys you 365d or so to fix any problem you find.
- It nominally requires that you test against ::gentoo and
::your-gentoo-fork, so you find problems in ::gentoo before they are
pulled into your fork, giving you a heads up that you need to put work
in.
[0] FWIW this is basically what #gentoo-infra does on our boxes and
it's terrible and I would not recommend it to most people in the
modern era. Upgrade your stuff regularly.
[1] When I was at Google we had a hilarious outage because someone
switched login managers (gdm vs kdm) and kdm had a different default
umask somehow? Anyway it resulted in a critical component having the
wrong permissions and it caused a massive outage (luckily we had
sufficient redundancy that it was not user visible) but it was one of
the scariest outages I had ever seen. I was in charge of investigating
(being on the dev OS team at the time) and it was definitely very
difficult to figure out "what changed" to produce the bad build. We
stopped building on developer workstations soon after, FWIW.
[2] https://cloud.google.com/blog/topics/developers-practitioners/how-google-got-to-rolling-linux-releases-for-desktops
Many thanks for your detailed thoughts and for sharing the rich
experience on this! See below:
antarus:
> On Mon, Jan 2, 2023 at 4:48 AM m1027 <m1027@posteo.net> wrote:
>> Hi and happy new year.
>>
>> When we create apps on Gentoo, they easily become incompatible with
>> older Gentoo systems in production, where unattended remote world
>> updates are risky. This is due to new glibc, OpenSSL 3, etc.
>
> I wrote a very long reply, but I've removed most of it: I basically
> have a few questions, and then some comments:
>
> I don't quite grasp your problem statement, so I will repeat what I
> think it is and you can confirm / deny.
>
> - Your devs build using gentoo synced against some recent tree, they
> have recent packages, and they build some software that you deploy
> to prod.
Yes.
> - Your prod machines are running gentoo synced against some recent
> tree, but not upgraded (maybe only glsa-check runs) and so they are
> running 'old' packages because you are afraid to update them[0]
Well, we did sync (without updating packages) in the past, but today
we even fear to sync against recent trees. Without going into details,
as a rule of thumb, weekly or monthly sync + package updates work
near to perfect. (It's cool to see what a good job emerge does on our
own internal production systems.) Updating systems older than 12
months or so may, however, be a huge task. And too risky for remote
production systems of customers.
> - Your software builds OK in dev, but when you deploy it in prod
> it breaks, because prod is really old and your dev environments
> are using packages that are too new.
Exactly.
> My main feedback here is:
>
> - Your "build" environment should be like prod. You said you didn't
> want to build "developer VMs" but I am unsure why. For example, I
> run Ubuntu and I do all my Gentoo development (admittedly very
> little these days) in a systemd-nspawn container, and I have a few
> shell scripts to mount everything and set it up (so it has a tree
> snapshot, some git repos, some writable space etc.)
Okay, yes. That is way (1) I mentioned in my OP. It works indeed but
has the mentioned drawbacks: VMs and maintenance pile up, and for
each developer. And you never know when the moment has come to
create a new VM. But yes, it seems to me one of the ways to go:
*Before* creating a production system you need to freeze portage,
create dev VMs, and prevent updates on the VMs, too. (Freezing aka
not updating has many disadvantages, of course.)
> - Your "prod" environment is too risky to upgrade, and you have
> difficulty crafting builds that run in every prod environment. I
> think this is fixable by making a build environment more like the
> prod environment.
>
> The challenge here is that if you have not done that (kept the
> copies of ebuilds around, the distfiles, etc.) it can be challenging
> to "recreate" the existing older prod environments.
>
> But if you do the above thing (where devs build in a container)
> and you can make that container like the prod environments, then
> you can enable devs to build for the prod environment (in a
> container on their local machine) and get the outcome you want.
Not sure I got your point here. But yes, it comes down to what was
said above.
> - Understand that not upgrading prod is like, to use a finance
> term, picking up pennies in front of a steamroller. It's a great
> strategy, but eventually you will actually *need* to upgrade
> something. Maybe for a critical security issue, maybe for a feature.
> Having a build environment that matches prod is good practice, you
> should do it, but you should also really schedule maintenance for
> these prod nodes to get them upgraded. (For physical machines, I've
> often seen businesses just eat the risk and assume the machine will
> physically fail before the steamroller comes, but this is less true
> with virtualized environments that have longer real lifetimes.)
Yes, haha, I agree. And yes, I totally ignored backporting security
here, as well as the need that we might *require* a dependent
package upgrade (e.g. to fix a known memory leak). I left that out
for simplicity only.
>> So, what we've thought of so far is:
>>
>> (1) Keeping outdated developer boxes around and compiling there.
>> We would freeze portage against an accidental emerge sync by
>> creating a git branch in /var/db/repos/gentoo. This feels hacky
>> and requires an increasing number of developer VMs. And sometimes
>> we are hit by a silent incompatibility we were not aware of.
> In general when you build binaries for some target, you should
> build on that target when possible. To me, this is the crux of your
> issue (that you do not) and one of the main causes of your pain.
>
> You will need to figure out a way to either:
> - Upgrade the older environments to new packages.
> - Build in copies of the older environments.
>
> I actually expect the second one to take 1-2 sprints (so like 1
> engineer month?)
> - One sprint to make some scripts that makes a new production
> 'container'
> - One sprint to sort of integrate that container into your dev
> workflow, so devs build in the container instead of what they build
> in now.
>
> It might be more or less daunting depending on how many distinct
> (unique?) prod environments you have (how many containers will you
> actually need for good build coverage?), how experienced in Gentoo
> your developers are, and how many artifacts from prod you have.
>
> - A few crazy ideas are like:
> - Snapshot an existing prod machine, strip it of machine-specific
> bits, and use that as your container.
> - Use quickpkg to generate a bunch of bin pkgs from a prod machine,
> use that to bootstrap a container.
> - Probably some other exciting ideas on the list ;)
Thanks for the enthusiasm on it. ;-) Well:

We cannot build (develop) on that exact target. Imagine hardware
being sold to customers. They just want/need a software update of
our app.

And, unfortunately, we don't have hardware clones of all the
different customers' hardware on our side to build, test etc.

So, we come back to the question how to have a solid LTS-like
software OS / stack onto which newly compiled developer apps can be
distributed and just work. And all this in Gentoo. :-)
>> (2) Using Ubuntu LTS for production and Gentoo for development is
>> hit by subtle libjpeg incompatibilities and such.
>
> I would advise, if possible, to make dev and prod as similar as
> possible[1]. I'd be curious what blockers you think there are to
> this pattern.
>
> Remember that "dev" is not "whatever your devs are using" but is
> ideally some maintained environment; segmented from their daily
> driver computer (somehow).
That is again VMs per "release" and per dev, right? See above "way
(1)".
>> (3) Distributing apps as VMs or Docker: Even those tools advance
>> and become incompatible, right? And not suitable for smaller ARM
>> devices.
>
> I think if your apps are small and self-contained and easily
> rebuilt, your (3) and (4) can be workable.
>
> If you need 1000 dependencies at runtime, your containers are going
> to be expensive to build, expensive to maintain, you are gonna have
> to build them often (for security issues), it can be challenging to
> support incremental builds and incremental updates... you generally
> want a clearer problem statement to adopt this pain. Two problem
> statements that might be worth it are below ;)
>
> If you told me you had 100 different production environments, or
> needed to support 12 different OSes, I'd tell you to use containers
> (or similar).
>
> If you told me you didn't control your production environment
> (because users installed the software wherever) I'd tell you to use
> containers (or similar).
>
>> (4) Flatpak: No experience, does it work well?
>
> Flatpak is conceptually similar to your (3). I know you are
> basically asking "does it work" and the answer is "probably", but
> see the other questions for (3). I suspect it's less about "does it
> work" and more about "is some container deployment thing really a
> great idea."
Well, thanks for your comments on containers and Flatpak. It's
motivating to investigate that further.

Admittedly, we've been sticking to natively built apps for reasons
that might not be relevant these days. (Hardware-bound apps, bus
systems etc., performance reasons on IoT-like devices, no real
experience with lean containers yet, only QEMU.)
> Peter's comment about basically running your own fork of gentoo.git
> and sort of 'importing the updates' is workable. Google did this
> for debian testing (called project Rodete)[2]. I can't say it's a
> particularly cheap solution (significant automation and testing
> required) but I think as long as you are keeping up (I would advise
> never falling more than 365d behind time.now() in your fork) then I
> think it provides some benefits.
> - You control when you take updates.
> - You want to stay "close" to time.now() in the tree, since a
> rolling distro is how things are tested.
> - This buys you 365d or so to fix any problem you find.
> - It nominally requires that you test against ::gentoo and
> ::your-gentoo-fork, so you find problems in ::gentoo before they
> are pulled into your fork, giving you a heads up that you need to
> put work in.
I haven't commented on Peter yet, but yes, I'll have a look at what
he added. Something tells me that distributing apps in a container
might be the cheaper way for us. We'll see.
> [0] FWIW this is basically what #gentoo-infra does on our boxes and
> it's terrible and I would not recommend it to most people in the
> modern era. Upgrade your stuff regularly.
>
> [1] When I was at Google we had a hilarious outage because someone
> switched login managers (gdm vs kdm) and kdm had a different
> default umask somehow? Anyway it resulted in a critical component
> having the wrong permissions and it caused a massive outage
> (luckily we had sufficient redundancy that it was not user visible)
> but it was one of the scariest outages I had ever seen. I was in
> charge of investigating (being on the dev OS team at the time) and
> it was definitely very difficult to figure out "what changed" to
> produce the bad build. We stopped building on developer
> workstations soon after, FWIW.
>
> [2] https://cloud.google.com/blog/topics/developers-practitioners/how-google-got-to-rolling-linux-releases-for-desktops
Thanks for sharing! Very interesting insights.

To sum up: You described interesting ways to create and control our
own releases of Gentoo. So production and developer systems could be
aligned on that. The effort depends.

Another way is containers.
Peter Stuge wrote:
> Essentially you will be maintaining a private fork of gentoo.git.
>
> If this seems too heavy handed then you can just as well do the
> reverse: Maintain an overlay repo with the packages you care to
> control in the state you care to have them, set that in the
> catalyst stage4.spec portage_overlay, and add unwanted package
> versions in gentoo.git to the package.mask directory used by
> catalyst.
>
> This may sound complicated but it isn't bad at all.
>
> For total control also make your own profile, e.g. based on
> embedded, but that's not per se necessary, only if the standard
> profiles have too many conflicts with what you want in @system.
>
> catalyst will rebuild @system according to the spec file, but with
> too much difference that just becomes annoying and feels like more
> trouble than a controlled profile.
>
> This approach falls somewhere between your options (1) and (5).
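For illustration, a stage4 spec along those lines might contain
fragments like these (values are invented; check the key names
against the catalyst documentation for the version in use):

```
target: stage4
subarch: amd64
version_stamp: 2023.01.02
profile: default/linux/amd64/17.1
snapshot_treeish: 2023.01.02
source_subpath: default/stage3-amd64-2023.01.02
portage_overlay: /var/db/repos/our-overlay
portage_confdir: /etc/catalyst/portage    # holds the package.mask/ entries
stage4/packages: our-category/our-app
```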
I am not complaining here. Hey, we are on rolling release. Some of
you may even know individual solutions to work around each of them.
However, we may get into trouble when distributing newly compiled
apps (built on new Gentoo systems) to older Gentoo systems. And we
don't know in advance. I am looking for the best way to avoid that.
Wow, wasn't aware of catalyst at all. What a beast in terms of control.
(FYI: I enjoyed the links on catalyst you sent me directly.
Unfortunately I cannot answer you directly due to the default
TLS guarantee
While being able to build exact environments with catalyst, I wonder
how it could help fix the issue of my original post.
Whenever we need to deliver an updated app to customers whose OS is
too old (but updating it is too risky), we could either

a) keep equally outdated dev build OSes around forever (oh no!), or

b) ship our newly built app in a container (leaving the lovely
native world); and both ignore the fact that customers actually wish
maintenance of the entire OS, too.

So, ideally, there is c): In a hypothetical case we would prepare an
entire OS incl. our app (maybe via catalyst?) which would require
a bootloader-like mini-OS on the customer's side, to receive updates
over the internet, switch the OS at boot time, and fall back.
I had a) in mind. Why "oh no!"? I didn't mean forever.
catalyst building specific, "old-like" OSes, on which you build new
(binpkg) versions of your app to be installable by emerge on old
customer OSes, is admittedly a lot of work, at least initially.
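Mechanically, that binpkg route might look like this sketch (the
package atom and binhost URL are invented):

```shell
# Inside the catalyst-built, prod-vintage chroot: build the app as a
# binary package without installing it (-B / --buildpkgonly).
emerge --buildpkgonly our-category/our-app

# On the customer box: point portage at our binhost, e.g. in
# /etc/portage/binrepos.conf:
#   [our-binhost]
#   sync-uri = https://updates.example.com/binpkgs
# then install from binary packages only, never compiling locally:
emerge --ask --getbinpkgonly our-category/our-app
```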
frederik.pfautsch:
>>> So, ideally, there is c): In a hypothetical case we would prepare
>>> an entire OS incl. our app (maybe via catalyst?) which would
>>> require a bootloader-like mini-OS on the customer's side, to
>>> receive updates over the internet, switch the OS at boot time,
>>> and fall back.