RFC: Consider retiring the PI1s ARMv6 (downgrading support to "experimental") #1677

Closed
refack opened this issue Jan 31, 2019 · 63 comments

@refack
Contributor

refack commented Jan 31, 2019

PI1 load
image

PI2 for comparison:
image

@refack
Contributor Author

refack commented Jan 31, 2019

/CC @rvagg @nodejs/platform-arm

@refack refack added infra ci-public platform:arm ci-change PSA of configuration changes labels Jan 31, 2019
@refack refack changed the title RFC: Retire PI1s RFC: Consider retiring the PI1s Jan 31, 2019
@Trott
Member

Trott commented Feb 3, 2019

Is this worth putting on the Build WG agenda? Might be a good idea to lay out the case for retiring them in a sentence or two.

@rvagg
Member

rvagg commented Feb 13, 2019

FYI I spent the day tinkering with the Pi1's. Ended up reprovisioning 4 of the Pi1's entirely and replacing two of the SD cards. So currently Jenkins sees them all as online (first time in a long while), but I'm dubious about a few of them. I've been keeping a log for the past 18 months of maintenance so I can better track which Pi's have repeating problems and might suggest more fundamental problems than just a dodgy OS or SD card. Over the next few days we should keep an eye on repeat failures on individual Pi's and take repeat offenders offline. If any of the ones I fixed up today show recurring problems, particularly ones that I replaced SD cards in, then it might be time to be retiring hardware.

@rvagg
Member

rvagg commented Feb 15, 2019

graph

I think we're in a better place with the whole complement online, and there's a ton of green, so there don't appear to be any problems (yet) with the machines I've brought back online.

This is not to nullify the original point though, it's certainly worth considering retirement, especially as the test suite grows and these are holding us back. Mean execution time for test-binary-arm seems to be ~42 minutes with the current test suite. That's up from under 30 minutes a couple of years ago.

If we drop the Pi1's, we're essentially dropping ARMv6 support unless we want to go an emulation route (that might be more pain than the Pi1's are though!). Maybe it's time though? There's very little ARMv6 hardware being shipped anymore, it's nearly 20 years old and the main consumers are users with old devices, like Pi1's.

Pretty much the only market data we have is the download numbers, see 2019 so far below. I honestly don't know how to frame this. ARMv6 is only 0.08%, but that's still more than ARM64 (whodathunk?) and double any of the IBM platforms. What do we do with that information?

arch_distribution_2019

(This is ordered by total recorded download count, not just for the time period I've selected, hence s390x being last even though it beats ppc64 which has been downloaded more times in its lifetime).

@rvagg rvagg changed the title RFC: Consider retiring the PI1s RFC: Consider retiring the PI1s ARMv6 Feb 19, 2019
@Trott
Member

Trott commented Feb 19, 2019

Since this has evolved (during today's Build WG meeting, anyway) into a discussion about dropping support for armv6/Pi 1 devices entirely: /ping @nodejs/hardware

@refack refack changed the title RFC: Consider retiring the PI1s ARMv6 RFC: Downgrading support for PI1 (ARM6) to experimental Feb 19, 2019
@rvagg
Member

rvagg commented Feb 19, 2019

We are considering dropping official ARMv6 support for Node 12 similar to x86 being dropped for Node 10. The cost of running our growing test suite and maintaining the hardware for it isn't small and each LTS release line locks us in for 3 years.

This would mean that Raspberry Pi 1 and 1+ would not have official binaries available for download from nodejs.org, but it would not prevent anyone from offering unofficial ones. Unfortunately building directly on ARMv6 hardware takes a long time and cross-compiling is extremely complicated.

We would like feedback from anyone this might impact so we can better understand the costs to users; we have very little insight.

Node 10 and below would still support ARMv6 and still ship binaries on nodejs.org for the duration of their support lifetime.

@dceejay

dceejay commented Feb 19, 2019

I think this would also affect Pi0 and Pi0w

@knolleary

knolleary commented Feb 19, 2019

The Pi 1 and 1+ have both been superseded by newer ARMv7 models.

The Pi Zero is ARMv6 and from its site:

End of life of the Raspberry Pi Zero is currently stated as being not before January 2022.

(edited to add link: see Specification tab on https://www.raspberrypi.org/products/raspberry-pi-zero/)

If that means this gen of the Pi Zero is here to stay for a while yet, there will be an 8-month window at least once Node 10 reaches EOL where there will be no official build of Node for the Pi Zero. Of course they might refresh the hardware before then.

We in the Node-RED project do have a user-base running on Pis of all shapes and sizes. The Pi Zero is pretty underpowered, so I don't believe there are that many users on that particular device, but there will be some. We can try to gauge interest from our user community on this topic.

@boneskull
Contributor

I use node on Pi0 boards. They’re definitely popular. not sure about node on them though—it’s awful slow.

still, if there’s a path forward for armv6, I’d like to see support retained.

Unfortunately this seems like one of those things that will be tough to gather much feedback on. I’ll try to put some feelers out in the nodebots community.

@francovp

francovp commented Feb 20, 2019

I use nodejs on A LOT of Pi0 boards. This would affect me for sure if I want to update node version on those boards in the future :/

@thisdavej

I'm a huge fan of Node with RPi and have written several guides (downloads exceeding 500,000) teaching people how to install and use Node on the RPi. While the RPi1 is obsolescing, the RPi0 (also based on the ARMv6 architecture) is alive and well and used in Node/IoT projects.

The PI Zero performance is quite acceptable for hobby IoT projects when using Node in conjunction with Raspbian Lite. It is the small form factor that makes the Pi Zero with Node so compelling and this helps make up for any deficits in the performance arena.

I hope support will continue for ARMv6 since the lack of support will alienate current and future users of Node.js until such time that the Pi Zero architecture is updated beyond ARMv6.

@Trott
Member

Trott commented Feb 20, 2019

Am I correctly interpreting this table to indicate that Pi Zero devices have faster CPUs and more RAM than Pi 1 devices? And faster CPUs (but not more RAM) than Pi 2 devices?

And that a Pi Zero device was introduced/released as recently as last year?

I don't know how much effort/pain it will be or how effective it will be, but I wonder if swapping out the Pi 1 devices in CI for Pi Zero devices might be a way to reduce our ARMv6 pain? (Doesn't help with cross-compiling, though.)

@nebrius

nebrius commented Feb 20, 2019

Echoing what's been said above, the Pi Zero and Zero W are both still in production and are armv6. In my experience as the author of Raspi IO, which brings Raspberry Pi to the Johnny-Five Node.js robotics framework, a sizeable portion of my users use the Zero/Zero W. I don't have exact stats unfortunately as I don't gather that sort of data, but based on issues filed/people who reach out to me to ask questions/show off projects, I'd estimate that Zero/Zero W users make up anywhere from 25-40% of my userbase.

I would recommend waiting to drop armv6 support until the Zero and Zero W are EOLed which are stated as "being not before January 2022." I don't know when the Zero W is slated to be EOLed, but I'd imagine it's at most a year after the Zero.

@nebrius

nebrius commented Feb 20, 2019

I also just discovered that the original compute module, which has the same processor as the Pi 1, is still available for sale. I suspect it will likely be retired soon though (and admittedly I thought it already had been).

@nebrius

nebrius commented Feb 20, 2019

I also support @Trott's recommendation to replace the aging RPi1s with RPi Zero W's because they're still in production and, as mentioned, they have a faster CPU.

@boneskull
Contributor

I can donate some lightly used hardware (Pi0-W, memory card, OTG dongle, etc) if you tell me where to ship it.

@bnb

bnb commented Feb 20, 2019

I'm happy to purchase + ship new Pi0/Pi0w as well if it would be helpful.

@nebrius

nebrius commented Feb 20, 2019

+1 to Tierney's suggestion. I'm sure we can get Microsoft to sponsor some hardware, if you're interested.

@dtex

dtex commented Feb 20, 2019

@thisdavej @boneskull In practice, do y'all use the official node binaries? @nebrius' raspi-io wiki pages point users to the Nodesource binaries.

@thisdavej

My Beginner's Guide to Installing Node.js on a Raspberry PI also points people to use the NodeSource binaries. I direct people to articles like this one which instructs people to download binaries from https://nodejs.org/dist/ when they are seeking to run Node on the Pi Zero W or an RPi1.

@rvagg
Member

rvagg commented Feb 21, 2019

Thanks for the feedback so far folks, it'd be great to hear more if others are reading this, we've had such a hard time connecting with the Node+ARM user community so we end up making guesses and assumptions.

To be honest, I hadn't even considered the Zero but that does seem like it might be a compelling reason to continue support if we can solve some speed problems we're facing.

I like the idea of replacing the Pi 1 B+'s with Zeros but the challenge is that we NFS-boot (edit: NFS-root is probably more accurate, we still load the initial bootcode via SD for full Pi compatibility) everything now and it's given us a lot more stability than relying on SD cards. NFS-boot without an ethernet port is going to be a bit of a challenge. Since the Zero can act as a device over USB and it's apparently possible to NFS-boot over USB, we might be able to come up with a novel setup for a cluster of Zeros.

I've ordered a Zero W to do some experimenting with. If practical, maybe we do another community-donor drive to get a cluster of them and aim for ~18 of them to future-proof ourselves a bit better. It'll depend on performance and the practicality of running a cluster with our infra. Procuring them might be a bit tricky since it seems that the Pi Foundation are enforcing a 1-per-customer limit on retailers at the moment.

@boneskull
Contributor

I always use NodeSource binaries on Linux.

@refack
Contributor Author

refack commented Feb 26, 2019

@mhdawson do those times include compilation? As a point of comparison, ATM we only actually test the cross-compiled binary on the PI1, with a multiplicity of 6.
Our typical test jobs take ~40m, so 40 * 6 = ~240m which is double what you got.

@mhdawson
Member

mhdawson commented Feb 26, 2019

@refack, unfortunately, I could not even get them to compile on the PI zero. They would take a very long time and then the compile was killed. Most likely I think by the OOM killer.

I'll have to get the specific command line when I'm back home but what I did instead was:

  1. do git checkout in advance (took long time)
  2. pull down and unzip the node.js ARM6 binary from nodejs.org (In advance)
  3. use tools/test.py to run the tests. I ran the "default" target which is a subset of what we normally test.

The times shown were only for step 3 above.

So I think we need to understand what is not covered by default (which I know includes addon tests, and more) as well as the time taken to clone etc. to be able to compare.

@refack
Contributor Author

refack commented Feb 26, 2019

> 3. use tools/test.py to run the tests. I ran the "default" target which is a subset of what we normally test.
>
> The times shown were only for step 3 above.
>
> So I think we need to understand what is not covered by default (which I know includes addon tests, and more) as well as the time taken to clone etc. to be able to compare.

That's good news. AFAIK make test is a superset of make test-ci. So what you are describing is encouraging preliminary results.

IMHO with the support of the community we could make progress with such a migration.

@mhdawson
Member

mhdawson commented Feb 26, 2019

@refack to clarify I did not run make test. I ran tools/test.py specifying default as the tests to run. I'm pretty sure that is a subset of test-ci so I still think we need more info to have a good comparison.

@Trott
Member

Trott commented Feb 26, 2019

> @refack to clarify I did not run make test. I ran tools/test.py specifying default as the tests to run. I'm pretty sure that is a subset of test-ci so I still think we need more info to have a good comparison.

At the current time, the tests that are run by test-ci that are not run by tools/test.py default are:

  • addons
  • doctool
  • js-native-api
  • node-api

@refack
Contributor Author

refack commented Feb 26, 2019

> I ran tools/test.py specifying default as the tests to run.

Ok. Less clear indication, but still in the ballpark IMO...

@mhdawson
Member

later tonight I'll disable from the farm and then log into test-requireio--mhdawson-debian9-armv6l--pi1p-1 and run the same set of tests as I did on the PiZero and that should give us a closer comparison.

@mhdawson
Member

mhdawson commented Feb 27, 2019

Command line for reference:

time tools/test.py -j 1 -p tap --logfile test.tap --mode=release --flaky-tests=dontcare default >withj1

@mhdawson
Member

Have had some trouble getting the output of time. The ssh connection seems to drop at some point during the run. Had tried running in the background but the command I used still did not pipe the time output to a file. Running again and hope to get it this time.
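One way around the lost timing output, sketched here rather than taken from what was actually run: in bash, `time` reports on the shell's stderr, so a `>file` redirect on the timed command most likely does not capture it. The wrapper below measures wall-clock and child CPU time itself and writes them to a file. The function name `run_and_record` and the file names are hypothetical; the test command is the one quoted above.

```python
import resource
import subprocess
import time

# The command line quoted above, without the shell's `time` and redirect.
TEST_COMMAND = [
    "tools/test.py", "-j", "1", "-p", "tap", "--logfile", "test.tap",
    "--mode=release", "--flaky-tests=dontcare", "default",
]


def run_and_record(command, stdout_path, timing_path):
    """Run `command`, save its output, and write real/user/sys times to a file."""
    start = time.monotonic()
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    with open(stdout_path, "w") as out:
        completed = subprocess.run(command, stdout=out, stderr=subprocess.STDOUT)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    real = time.monotonic() - start
    user = after.ru_utime - before.ru_utime   # CPU time in user mode (children)
    system = after.ru_stime - before.ru_stime  # CPU time in kernel mode (children)

    with open(timing_path, "w") as timing:
        timing.write(f"real {real:.3f}s\n")
        timing.write(f"user {user:.3f}s\n")
        timing.write(f"sys {system:.3f}s\n")
        timing.write(f"exit {completed.returncode}\n")
    return completed.returncode
```

Calling `run_and_record(TEST_COMMAND, "withj1", "timing.txt")` from the node checkout, in a script started under `nohup` or a terminal multiplexer, should leave the timings on disk even if the ssh session drops. The `resource` module is Unix-only, which fits the Pi hosts discussed here.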

@mhdawson
Member

ok equivalent run on the Pi1

real 184m8.136s
user 152m9.797s
sys 13m8.520s

Which means the Pi Zero does show to be about 37% faster, which makes sense given the increased clock frequency. Given that there are a few mins for the git work, I guess we'd expect 6 PI zeros to reduce the time down to ~32 mins instead of the current ~40 mins. That excludes the time for the cross-compile as well.
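As a rough check of the arithmetic, the sketch below parses the `time` output above and splits the wall-clock time evenly across 6 workers. Reading "37% faster" as a 1.37x speed ratio is an assumption, as is the even split; the helper names are hypothetical.

```python
import re

# `time` output from the Pi1 run quoted above.
PI1_TIME_OUTPUT = """\
real 184m8.136s
user 152m9.797s
sys 13m8.520s
"""


def parse_time_output(text):
    """Convert bash-style `time` lines (e.g. 'real 184m8.136s') to seconds."""
    result = {}
    for line in text.splitlines():
        match = re.match(r"(real|user|sys)\s+(\d+)m([\d.]+)s", line.strip())
        if match:
            label, minutes, seconds = match.groups()
            result[label] = int(minutes) * 60 + float(seconds)
    return result


def per_worker_minutes(total_seconds, workers, speedup=1.0):
    """Wall-clock minutes per worker if the work splits evenly across workers."""
    return total_seconds / workers / speedup / 60


pi1 = parse_time_output(PI1_TIME_OUTPUT)
pi1_split = per_worker_minutes(pi1["real"], workers=6)
# Assumption: "37% faster" means the Pi Zero does 1.37x the work per unit time.
zero_split = per_worker_minutes(pi1["real"], workers=6, speedup=1.37)

print(f"6 x Pi1, default suite only:     ~{pi1_split:.1f} min")
print(f"6 x Pi Zero, default suite only: ~{zero_split:.1f} min")
```

These figures cover only the `default` subset, so they should sit below the ~40 and ~32 minute job totals mentioned above, which presumably also include the git work and the suites that `default` leaves out.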

@rvagg
Member

rvagg commented Mar 12, 2019

I got mine but have only had a brief play. I'm currently stuck on how I'd get a cluster to boot via NFS at will (including reboot), it's not very straightforward and quite hacky at the moment.

Here's a thought I've been toying with, and it came up with x86 Linux support that we dropped but apparently still ship Docker images for (!). We could set up a parallel project, "unofficial builds", maybe as part of nodejs/build, but it might work better as an independent project that outsiders can contribute to and "own" in a sense. We could get unofficial.nodejs.org (or similar) to point to a place where binaries are put that are part of this grey area of builds that are wanted, but don't meet our threshold for support in our stretched resources here at nodejs/build. I could imagine communities owning their bit, x86, armv6, and could even expand to more obscure binary types, like x64-musl for Alpine so the docker-node folks don't need to compile in-container for each release or x64-libressl for some of the *BSD folks.

Such a project would not over-burden the nodejs/build team because in being "unofficial", if it's broken then it's up to users to fix it and it certainly won't stop Node.js releases from moving forward.

So my question here is: is it just the binaries you care about? If you could continue to get binaries for each release from some source then do you care much if we don't test every commit against armv6 and don't have armv6l binaries on nodejs.org/dist?

@nebrius

nebrius commented Mar 12, 2019

@rvagg do you have historical data on test failures on armv6? I'd be curious to know if those failures closely tracked failures on armv7+ or not. If they do (which seems likely to me), then I think moving it to a new "unofficial builds" project would be fine.

The binaries are the big thing I care about, yes. I would be fine getting them from another source if that source is reliable (i.e. not having to wait days/weeks for the latest release after the official builds are released).

@mhdawson
Member

I agree with @rvagg on the boot front. I think it would likely require some scripting as well as programmatic control over the power to the USB port powering the Pi Zero. Not impossible (I've already bought a USB hub that switches could be wired into) but would definitely require some work.

@rvagg
Member

rvagg commented Mar 13, 2019

@nebrius no data unfortunately but I can't remember the last time we had something serious that was isolated to ARMv6 aside from resource constraint problems that we regularly have (some tests need skipping because they test allocation of lots of memory, for example). My subjective impression is that there's a tight coupling between ARMv6 and ARMv7 for any bugs we've had in the past and I'd be confident that in the near future at least this would continue. It starts to break down if V8 de-prioritises ARMv6 (and I don't know the status of their testing), same goes for OpenSSL although they have more natural pressure to retain good support.

@rvagg
Member

rvagg commented Mar 13, 2019

So we had a discussion about this in our Build WG meeting today and the approach we'd like to propose goes something like this for Node.js 12+ (everything remains as-is for <=11).

  1. Move ARMv6 to "Experimental", which means that we don't test every commit in our CI infrastructure, and therefore don't ship official binaries.
  2. "Experimental" comes with a caveat that we could, at any time, turn it back into a Tier 2 supported platform if we come up with solutions that ease the burden on this team. So perhaps someone invests time and comes up with a magical qemu solution that is easy and efficient so we opt to take it back on again.
  3. We try to spin up an "unofficial builds" project like I mentioned a couple of comments above ^. This would be an arms-length project such that breakages and failure to deliver don't fall back on either the Build WG or the TSC, but rather it's a community-driven project, where the community is comprised of people like those in this thread and people focused on other compilation targets. Build can lend some minimal resources, a single server would get it off the ground I think. But it would need to stand alone. The docker-node project is an example, there's almost no overlap between people who push that forward and Build or even much of the nodejs/node collaborator base. Plus docker-node have developed all of the valuable relationships they need to make it official and well supported. Docker releases of Node are part of our normal release schedule but it's a throw-it-over-the-wall approach where releasers just give a trigger for the docker-node folks to take over with.

I'll outline the "unofficial builds" idea a bit more in an issue or PR to this repo in the near future. For now though, know that we want to continue shipping binaries but we'd like to reduce the support burden on this team and the way to do that is to (1) decouple ARMv6 from our test-all-commits infrastructure and (2) decouple it from the critical release infrastructure (where breakage can mean lost sleep).

I don't think Build really has the last say on this, it's ultimately up to the TSC to decide what burden the project wishes to take on. But it'll probably end up depending entirely on what Build says it can handle.

@vielmetti

This is an interesting post from a Microsoft employee on the challenges of cross-build of Arm images on Arm hardware, noting in particular ARMv6 issues.

https://apebox.org/wordpress/linux/1281

@rvagg
Member

rvagg commented Mar 13, 2019

The most interesting part of that post for me is that they don't seem to even bother testing on real ARMv{6,7} hardware, they just run the binaries on an ARM64 host in a Raspbian chroot. The problem being addressed comes from missing instructions that have to be trapped and emulated by the kernel, causing delay. The "solution" is simply to emulate ARM64 so it can run in a single core, I guess this has something to do with core affinity and the cache advantage, or something like that? But it's still ARM64. That's not an approach we've even considered as an option but I guess it is? I have some doubts about the utility of such testing, does it get you close enough to be even worth doing? Something to consider at least.

@rvagg
Member

rvagg commented Apr 24, 2019

OK folks, so this has panned out in the following way:

  • ARMv6 was unfortunately demoted to "Experimental" for Node 12: https://github.com/nodejs/node/blob/master/BUILDING.md#platform-list
  • With Experimental status, binaries are no longer being released for ARMv6 to nodejs.org for Node 12 and later.
  • Our Raspberry Pi 1 B+ set has been pulled from CI for Node 12+ code (still active for prior), which has helped with our test suite speed problems. Unfortunately there is no longer any ARMv6 testing occurring on Node commits for Node 12+ (there's still two different ARMv7 variants being tested).

But it's not all bad news. I'm attempting to start an "unofficial-builds" project as I mentioned earlier in this thread. It's producing ARMv6 binaries automatically following every release. The catch is that it's automatic, so may break and may be delayed. The intention is also not for the Build Working Group to be the owner of it, it shouldn't stretch Build resources at all because they're already stretched.

The project is housed at https://github.com/nodejs/unofficial-builds and it's looking for contributors and people to help maintain it. It has a single server that's (so far) pumping out 3 types of binaries that folks have been asking for but the core project (via Build) hasn't been able to accommodate: linux-x86, linux-x64-musl and linux-armv6. Those binaries are published to https://unofficial-builds.nodejs.org/ where you'll find a /download/ directory that's very similar to nodejs.org/download, complete with index.tab and index.json (perhaps someone could talk Jordan into hooking nvm up to it one day).
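For anyone wanting to script against that index, here is a minimal sketch of picking out the releases that ship an ARMv6 binary. It assumes the unofficial index.json mirrors the nodejs.org layout (a list of entries, each with a `version` string and a `files` list); the `/download/release/index.json` path and the `linux-armv6l` file key are assumptions to verify against the live site, and the sample entries are made up for illustration.

```python
import json
import urllib.request

# Assumed location, modelled on nodejs.org/download/release/index.json.
INDEX_URL = "https://unofficial-builds.nodejs.org/download/release/index.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the release index (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def versions_with_file(index, file_key):
    """Return the versions whose `files` list contains `file_key`."""
    return [entry["version"] for entry in index if file_key in entry.get("files", [])]


# Hypothetical entries for illustration only, not real release data.
SAMPLE_INDEX = [
    {"version": "v0.0.2-example", "files": ["linux-armv6l", "linux-x64-musl", "linux-x86"]},
    {"version": "v0.0.1-example", "files": ["linux-x64-musl"]},
]

armv6_versions = versions_with_file(SAMPLE_INDEX, "linux-armv6l")
print(armv6_versions)
```

Against the live site, `versions_with_file(fetch_index(), "linux-armv6l")` would be the equivalent call, assuming the layout matches.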

So this issue is considered closed as far as Build is concerned but I'd encourage you to consider whether there are ways you might be able to contribute to making unofficial-builds sustainable, even if that's just clicking 'Watch' and helping deal with easy issues as they come in. With no ARMv6 testing of new commits, the users of these binaries are going to have to be the test platform and will have to help report and fix problems.
