Understanding WeeDogLinux init operation

Posted: Sun Apr 04, 2021 11:41 am
by wiak

I haven't studied Puppy Linux's initrd/init (I looked a few years ago, but back then at least it was long, looked complex, and I was lazy). However, per the Puppy init readme:

The init script has been called the heart of Puppy Linux...

Puppy Linux, by default, has always used aufs for its layer functionality (until some relatively recent experiments with overlayfs).

WeeDogLinux initrd/init was designed to work with either aufs or overlayfs. Currently published versions use overlayfs, which I prefer, but only a couple of lines of code need to be modified for aufs use. For your convenience (as a potential developer), please find below an explanation of what is perhaps the core part of the WeeDogLinux initrd/init creative design:
------------------------------------------------------------------------

Unique layer code routine used in WeeDogLinux init (as created July 2019).

I am writing this post to explain a bit about an important part of the work:

The init used in WeeDogLinux initrd/init

An understanding of this key portion should help a WeeDogLinux user better understand how it works, and thus how to modify its operation to suit their own wishes and coding preferences, or even to create a derived work in an alternative coding language.

The most important part of the original code is actually quite short (in fact, the whole WDL init uses less than a third of the lines of code of most other init scripts), but this design achieves a very flexible filesystem layering structure that effectively allows any number of layers (be it for overlayfs, or for the alternative aufs implementation). The currently published version, with its two-digit NN prefix, allows for 100 layers (00 to 99, where 00 is reserved for the optional firmware/modules layer). If you need more, simply rewrite it to use NNN instead of NN for 1000 layers, though for the most part I doubt the need for, or efficiency of, that many! Of course you could use a single digit N for up to ten layers only, but I consider that too low if you want a rollback facility, which the existing 100-layer capability usefully caters for.

In terms of acknowledgement to other distros, it does borrow from something I have seen in practice, which is that Debian Live uses numbers to order the layers. I haven't at all studied how Debian Live does that, but I created my own scheme, shown here but long downloadable in source code format, which has proved to be very efficient in practice.

In terms of implementation, I used shell script for the prototype of this creative work. It could easily be converted into pure C in the unlikely event that greater speed is required, but that would come at the expense of readability and ease of modification and understanding, and would make it harder to provide such great overall flexibility.

How it is done in WeeDogLinux init

In its simplest-to-understand form (maybe), WeeDogLinux's _addlayer algorithm is basically (in rough English) as follows:

Code: Select all

Make a directory in tmpfs for mounting layer filesystems to.
while not all files done:
  Look through the directory where the filesystem modules are
  looking for .sfs module filesystems and raw module filesystems.
  For each one found, mount it ready for use in unionfs layers.
  Keep track (variable or array) of each found.
end_while_loop (i.e. keep doing the loop till finished)
Sort all the results found above in the order you want them in the layers.
mount the overlay (be it with aufs or overlayfs) to a tmpfs directory in RAM using the stored list of filesystem modules found above.

Earliest WeeDogLinux systems did the above and ended with a chroot into the root filesystem rather than having the usual switch_root approach (though I prefer the latter now).

Simple in the end (but takes a lot of work for the simple to become obvious - pity Puppy didn't think of this approach) - elegant and powerful in use.

Different 'sort' mechanisms can of course be used. In WeeDogLinux itself, numeric sorting (easy to rearrange) is preferred over simpler but less flexible alphabetical sorting.

In practice, it is relatively trivial for a C programmer to translate shell script into C (difficult translating C back to shell script though), the result being a derived or part-derived work. Indeed you can easily convert the _addlayer code below to C for even greater speed efficiency. I wouldn't recommend that here though; it is already very fast and efficient and shell script is easier for user-modification and understanding. Using C for any part of WDL init would simply obfuscate the design and make it less accessible/understandable to others.

1. I define a directory on which all layers will be mounted. As a default I used:

Code: Select all

layers_base=/mnt/layers

which is arranged to be in a tmpfs.

2. I called a function, which I named _addlayer (shown in shell script below), to determine the numbered sfs files (and/or numbered raw directories) to use for the filesystem layers, and stored the result in a variable (which I named 'lower'). This is followed by a sort of the layers. This loop plus flexible sort is a key part of the WeeDogLinux creative design. I will explain its operation in detail in a moment...

3. I sorted the list in variable lower and removed any duplicates (using reverse sort for overlayfs but using ascending sort for aufs version)

Code: Select all

# Sort resulting overlay 'lower' layers list
# add new NN item to overlay $lower list, reverse sort the list, and mount NNfirstrib_rootfs
lower="$(for i in $lower; do echo "$i"; done | sort -ru)"  # sort the list and remove duplicates

4. Finally (for the overlayfs case), per https://www.kernel.org/doc/html/latest/ ... layfs.html, quote:

At mount time, the two directories given as mount options “lowerdir” and “upperdir” are combined into a merged directory:

mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,workdir=/work /merged

The shell script code I used to do this was:

Code: Select all

cd ${layers_base}	# Since this is where the overlay mountpoints are
# Combine the overlays with result in ${layers_base}/merged
mount -t overlay -o lowerdir=${firmware_modules_sfs}${lower},"${upper_work}" overlay_result merged

NOTE that ${firmware_modules_sfs} is an optional 00 numbered sfs or raw directory used to mount firmware and modules for the system's use (particularly useful if using a huge_kernel such as those Puppy Linux uses). I won't detail its operation here since it isn't important for overall understanding of the _addlayer function.
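
Going back to the sort in step 3, here is a rough sketch of how a space-separated list of NN values could be sorted, de-duplicated, and joined with colons (the separator overlayfs expects in lowerdir). The helper name sort_layers and the use of paste for the joining are my assumptions for illustration, not an extract from WDL init. Reverse order suits overlayfs because the leftmost lowerdir entry is treated as the topmost of the lower layers; ascending order is what the aufs version is described as using.

```shell
# Hypothetical helper (not from WDL init): sort NN values, drop duplicates,
# and join them with ':' for use in an overlayfs lowerdir= option.
# $1 = space-separated list of NN values
# $2 = "reverse" for highest NN first; anything else gives ascending order
sort_layers() {
    if [ "$2" = "reverse" ]; then
        sort_flags="-ru"
    else
        sort_flags="-u"
    fi
    # One value per line -> sort and de-duplicate -> join the lines with ':'
    for i in $1; do
        printf '%s\n' "$i"
    done | sort $sort_flags | paste -sd: -
}

sort_layers "01 03 02 03" reverse     # order as used for overlayfs
sort_layers "01 03 02 03" ascending   # order as described for aufs
```

Because every value has the same number of digits, a plain text sort gives the same order as a numeric one, which is one reason a fixed-width NN prefix is convenient.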
-----------------------------------------------------------------------

The MAIN THING, though, is the creation/operation of the WeeDogLinux initrd/init function used to identify the (numbered) layers and to mount them, which in the shell script implementation is function _addlayer. I explain its basic operation as follows (as you read this, please refer to its shell code implementation - that code extract is also provided shortly below for your convenience):

1. Change to directory where you will mount the layers

Code: Select all

cd ${layers_base}	# Since this is where the overlay mountpoints are

2. Loop through all the files in the partition/directory being booted from (I used a for loop but while or similar could be used of course - we are going to be looking for sfs modules here, but in flexible WeeDog also for raw directories we want mounted to the layers).

Code: Select all

for addlayer in *; do

3. Use the numeric part of the file or directory name to determine whether that file (.sfs) or directory (raw) is to be one of the filesystem layers. The 'trick', indeed the key concept in WeeDog's boot init design, is elegantly simple: the first two characters of the name are compared numerically against zero. If they are not numeric, the test fails with an error (which is discarded via 2>/dev/null) and the entry is skipped; if they are numeric and greater than zero, the entry is a layer. In other words, NN greater than zero (NN != 0) means a numeric start to the name, such as 01firstrib_rootfs.sfs or even raw directory 01firstrib_rootfs/:

Code: Select all

NN="${addlayer:0:2}" # gets first two characters and below checks they are numeric (-gt 00)
if [ "$NN" -gt 0 ] 2>/dev/null; then

4. Add the detected numeric named file/dir to the variable 'lower' that will be used in the layer (overlay or aufs) mount instruction.

Code: Select all

lower="${NN} ${lower}"

5. Physically mount the numbered sfs file (or the numbered raw directory if that is used instead) to the layers_base (i.e. to /mnt/layers):

Code: Select all

mount "${addlayer}" "${layers_base}/$NN"

# for the case of a numbered sfs file
or

Code: Select all

mount --bind "${addlayer}" "${layers_base}/$NN"

# for the case of a numbered raw (uncompressed) directory

6. That's the main part of the WeeDogLinux initrd/init design.
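
To experiment with the numeric-prefix test from step 3 without mounting anything, something like the following sketch can be run in any shell. The helper name is_numbered_layer and its optional width argument (for trying an NNN scheme) are my own assumptions, and printf '%.2s' stands in for ${addlayer:0:2} only so that the snippet also runs in shells lacking substring expansion.

```shell
# Hypothetical helper (not from WDL init): succeed if the first $2 characters
# of the name in $1 (default width 2) form a number greater than zero.
is_numbered_layer() {
    width="${2:-2}"
    # printf '%.Ns' keeps only the first N characters of the name;
    # WDL init itself uses ${addlayer:0:2} for this step.
    prefix="$(printf "%.${width}s" "$1")"
    # A non-numeric prefix makes the -gt test fail with an error message,
    # which is discarded, so the result is simply "false".
    [ "$prefix" -gt 0 ] 2>/dev/null
}

for name in 01firstrib_rootfs.sfs 08extra 00firmware_modules.sfs vmlinuz initrd.gz; do
    if is_numbered_layer "$name"; then
        echo "$name: numbered layer"
    else
        echo "$name: ignored by _addlayer"
    fi
done
```

Note that a 00 prefix is deliberately rejected here, just as in the original test, since 00 is handled separately as the firmware/modules layer.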

The WDL init goes on to switch_root to the merged filesystem. However, earlier versions ran the final init via a chroot instead, which remains a documented option (on github FirstRib site).

I hope this explanation helps those wishing to modify WeeDogLinux initrd/init or derive some other work from it. By the way, in the latest design of WeeDogLinux, this code is immediately editable (as an optional module) OUTSIDE of the initrd, so it is particularly easy to read and modify between boots using a simple text editor such as geany. A major part of WDL is in fact to make it as modular and user-accessible/easy-to-modify as possible, both as a build system and also during boot (both via optional plugins).

Of course, all WeeDogLinux code has been published and opensource (MIT licensed) for over a year, so any developer reasonably fluent in shell scripting, who is interested in copying any part of this work already has easy access to it online, but hopefully this further explanation will help those who are not so experienced in reading the existing shell script implementation of the design.

wiak
------------------------------------------------------------------------

The complete _addlayer function code extract follows:

The _addlayer function code actually used in the shell script implementation as of July 2019 modification (change to C as an exercise if you want more speed!!!... though I doubt you'll need it and so, as I say above, better to keep more user-accessible and thus easier-to-modify):

Code: Select all

# mount any NNsfs files or NNdir(s) to layers_base/NN layer
# and add to overlay "lower" list
_addlayer (){
  for addlayer in *; do
	NN="${addlayer:0:2}" # gets first two characters and below checks they are numeric (-gt 00)
	if [ "$NN" -gt 0 ] 2>/dev/null; then
		if [ "${addlayer##*.}" == "sfs" ]; then
			# layer to mount is an sfs file
			lower="${NN} ${lower}"
			mkdir -p "${layers_base}/$NN"
			# umount any previous lower precedence mount
			mountpoint -q "${layers_base}/$NN" && umount "${layers_base}/$NN"
			mount "${addlayer}" "${layers_base}/$NN"
		elif [ -d "$addlayer" ]; then
			# layer to mount is an uncompressed directory
			lower="${NN} ${lower}"
			mkdir -p "${layers_base}/$NN"
			# umount any previous lower precedence mount
			mountpoint -q "${layers_base}/$NN" && umount "${layers_base}/$NN"
			mount --bind "${addlayer}" "${layers_base}/$NN"
		fi
	fi
  done
  sync
  echo -e "\e[95mCurrent directory is `pwd`\e[0m" >/dev/console
  echo -e "\e[95mlower_accumulated is ${lower:-empty list}\e[0m" >/dev/console
}

As you can see, this critical function design is extremely simple in the end, but very efficient and flexible in practice. As I said, it currently allows mixing up to 100 layers of sfs OR raw directory filesystems in any order, and via additional plugin capability WDL init allows further non-numeric additions of either sfs or raw directory layers, again in any order of layer overwrite preference you desire. That goes for any and all WeeDogLinux variants, including the recent, soon to be further developed ultra-mini-modular LGO series, and even that one-off WDL_Slitaz production...

By the way, on looking at it again, I should get rid of the double equals (==) in the if statements above, since inside [ ] that is an unnecessary 'bashism' (POSIX test uses a single =) that actually comes from my earlier habit of mainly programming in C... One of my many bad habits, I'm sorry.
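
For reference, the extension test from the function above can be written with a single = so that it does not depend on == being accepted by the shell's test command. The helper name has_sfs_extension is mine, purely for illustration:

```shell
# Hypothetical helper: succeed if the text after the last dot in $1 is "sfs"
# (the same check as in _addlayer, written with a single "=").
# "${1##*.}" strips everything up to and including the last dot.
has_sfs_extension() {
    [ "${1##*.}" = "sfs" ]
}

for name in 01firstrib_rootfs.sfs 02adrv 03notes.txt; do
    if has_sfs_extension "$name"; then
        echo "$name: sfs file"
    else
        echo "$name: not an sfs file"
    fi
done
```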
--------------------------------------------------------------------------------------------------------

NOTES:

I've seen pdfs of both of the following online, but do not know their legality. I own printed copies of the two books.

Quote from "The UNIX Programming Environment":

The crucial observation with both zap and idiff (C implementations of well-known shell script commands) is that most of the hard work has been done by someone else... It's worth watching for opportunities to build on someone else's labor instead of doing it your self - it's a cheap way to be more productive.

A not quite so old book that effectively showed how easy it was to port shell scripts to C was: "Beginning Linux Programming", by Neil Matthew and Richard Stones. Publ by Wrox Press. Quote:

Our first discussion of this application occurs at the end of Chapter 2 and shows how a fairly large shell script is organized, how the shell deals with user input, and how it can construct menus and store and search data. After recapping the basic concepts of compiling programs, linking to libraries, and accessing the online manuals, you will take a sojourn into shells. You then move into C programming, where we cover working with files, getting information from the Linux environment, dealing with terminal input and output, and the curses library (which makes interactive input and output more tractable). You’re then ready to tackle re-implementing the CD application in C. The application design remains the same...

By the way, in case you imagine that WDL is a one-man developer distro: well, that is partly true. I develop the core build scripts and it is a for-my-family distro, so in that sense I suppose its model is that of benevolent dictator... but with an important twist (read on). As I say above, the whole system is designed to be modular and, to a large extent, open to major user contributions via its flexible plugin system. In practice, pretty much all current WDL_Void Linux flavours are due to the build plugin contributions of rockedge (indeed, I rely on rockedge for that because I myself forget most details of using the excellent xbps package manager and runit init-system employed by Void). Furthermore, I have two young sons who are computer literate and use WeeDogLinux as their daily workstations, and of course I am training them to maintain the overall core system because I want more time to drink coffee. Moreover, my partner uses WDL_Arch64 as the main operating system for her business needs, which forces me to maintain it. Having said that, I purposely use upstream repos (Arch Linux for WDL_Arch64) along with their official package managers rather than some home-baked mechanism of my own, which greatly aids me in maintaining WDL distros' security and reliability. You can therefore expect WeeDogLinux (which has its own domain) to be around, in many a shape and form, for a long time.

wiak


Re: Understanding WeeDogLinux init operation

Posted: Sun Apr 04, 2021 5:35 pm
by rockedge

@wiak
Blown away!
I am getting more excited to dig in and try to make something out of this cool stuff


Re: Understanding WeeDogLinux init operation

Posted: Sun Apr 04, 2021 9:07 pm
by wiak

So there is a demonstrated side effect of the WeeDogLinux initrd _addlayer function design: it can easily be used to boot a traditional Puppy system, which has a main rootfs called "puppy_XXX.sfs" and a number of other sfs files. I posted about that in Dec 2020 (and rockedge, I believe, has duplicated it), showing via attached images how it was arranged to create both a WDL_Fossapup64 and a WDL_BionicPup32. See the link below:

viewtopic.php?p=13074#p13074

I've now re-attached the main descriptive image from then to this post, using the WDL_BionicPup32 screenshot as the exemplar for how WDL init so easily does this via its loop that examines every file in the boot directory for the .sfs extension. Being a WeeDog, it could also use uncompressed puppy_XXX, adrv, fdrv, zdrv, ydrv, and more, if you wanted, and that would also work fine with this "loop/compare/sort/layer-arrange" creative design. An explanation in words of how WDL init can successfully boot Puppy follows:

1. First, puppy_XXX.sfs has to be identified since the other pup-related sfs files were designed to overwrite that in the unionfs layers. WDL init _addlayer function makes that elegantly easy; all that needed to be done was to put a 01 at the front (i.e. becomes 01puppy_XXX.sfs - though in test I actually called it 01firstrib_rootfs.sfs since 'puppy' part of the name is irrelevant in this more elegant numeric based filename and layer position identifying system).

2. The other sfs files found in the directory simply needed to be numbered in whatever order wanted: in quick run test illustrated I just renamed adrv into 02adrv; fdrv into 03fdrv; zdrv into 04zdrv - I think you get the picture (see image attached).
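
The renaming in the two steps above can be tried out safely in a scratch directory first. The following is only a sketch: the helper name number_layer is hypothetical, and the *_XXX.sfs names are empty stand-in files rather than real Puppy filenames.

```shell
# Hypothetical helper (not part of WDL): give a file a two-digit numeric
# prefix so that WDL init's _addlayer loop would treat it as layer NN.
# $1 = two-digit layer number, $2 = path to the existing file
number_layer() {
    dir="$(dirname "$2")"
    base="$(basename "$2")"
    mv -- "$2" "${dir}/${1}${base}"
}

# Demonstration with empty stand-in files in a temporary directory
demo="$(mktemp -d)"
touch "$demo/puppy_XXX.sfs" "$demo/adrv_XXX.sfs" "$demo/fdrv_XXX.sfs" "$demo/zdrv_XXX.sfs"
number_layer 01 "$demo/puppy_XXX.sfs"
number_layer 02 "$demo/adrv_XXX.sfs"
number_layer 03 "$demo/fdrv_XXX.sfs"
number_layer 04 "$demo/zdrv_XXX.sfs"
ls "$demo"        # list the renamed files
rm -rf "$demo"    # clean up the scratch directory
```

Changing a layer's position later is then just another rename with a different number.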

Of course, if the user of this WeeDogLinux init design insisted on the importance of the 'puppy' name in traditional Puppy filename form, and insisted on the WDL _addlayer function using the alphabetic names of the other pup-sfs files for the layer order, then that could be achieved by slightly modifying _addlayer to:

1. look for the 'puppy' string in the .sfs filename and fix its position in the _addlayer variable lower
2. use the alphabetic order of the remaining found .sfs files (in the _addlayer variable lower) instead of putting numeric values at the front of their names

Furthermore, for those who prefer aufs, which is already an option in WDL (being just a minor adjustment per below):

3. the line that reverse sorts in WDL init should simply be flagged, for aufs use, to not do 'reverse' (in fact I have a version of WDL that uses the aufs alternative exactly like that, so it involved only a couple of lines of code difference).
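
For anyone curious about the alphabetical variant in items 1 and 2, one possible starting point is sketched below. It is not WDL code: the function name alpha_layer_order and the sort-key trick are my own assumptions, and it only prints an ordering rather than mounting anything.

```shell
# Hypothetical sketch (not WDL code): print the .sfs files found in
# directory $1, one per line, with any name containing "puppy" first
# and the remaining names in alphabetical order.
alpha_layer_order() {
    (
        cd "$1" || exit 1
        for f in *.sfs; do
            [ -e "$f" ] || continue     # skip the literal "*.sfs" if nothing matches
            case "$f" in
                *puppy*) printf '0 %s\n' "$f" ;;   # key 0 sorts ahead of key 1
                *)       printf '1 %s\n' "$f" ;;
            esac
        done | LC_ALL=C sort | cut -d' ' -f2-      # sort by key, then strip the key
    )
}

# Example with empty stand-in files in a temporary directory
demo="$(mktemp -d)"
touch "$demo/zdrv_XXX.sfs" "$demo/puppy_XXX.sfs" "$demo/adrv_XXX.sfs"
alpha_layer_order "$demo"
rm -rf "$demo"
```

The fixed 'puppy' match is exactly the kind of hard-coding that the numeric NN scheme avoids.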

So as a challenge, you might like to modify WDL init as above to use alphabetic sort order rather than the more flexible numeric filename positioning method. Tiny change to the design, but I don't recommend the wasted effort. Such very minor changes to the _addlayer function 'for loop' (i.e. while no files left to be examined in the bootfrom directory) would be silly, unnecessary, and less flexible. Because, as things stand, in unmodified WDL init design, ANY sfs (or raw uncompressed directory for that matter) can be positioned in ANY position of the layering structure simply by giving its filename the appropriate numeric value. And as I've now explained, the puppyXXX.sfs can keep the 'puppy' part of the name and even itself be repositioned anywhere at all in the layer order (simply by appropriate numeric NN put in front of filename... nice, eh...).

In other words, the key WeeDogLinux creative work design is simply that "search for .sfs loop" followed by the "sort" (being reverse sort for overlayfs use, and normal sort for aufs). It is as simple as that really - hence the small size of WDL init. But... the flexibility (as shown by the earlier posted WDL_BionicPup32 and WDL_FossaPup64 creations) is the WDL init's use of a numeric to determine layer position rather than just rudimentary alphabetic sort (where two-digit numeric allows 100 layers, and 3-digit numeric alternative can allow 1000 layers...). ;-)

So am I suggesting that the WDL loop search for .sfs files design should be adopted by Puppy Linux? Not at all, unless you want to derive your Puppy init from WDL design (modified or otherwise) - it comes across that Puppy prides itself on being an independent distro, so better in that case not to make a modified/re-hashed WDL init its controlling heart. On the other hand... WDL initrd/init design can certainly (as has been shown) be used to control a Puppy system, providing it with all the basic frugal install layering facilities that it could need, plus the tons of optional extra functionality via WDL init's overall less-than-400 lines of user-friendly shell code. I will certainly myself be producing WDL_Puppy isos for those who prefer WDL init's frugal install versatility.

wiak


Re: Understanding WeeDogLinux init operation

Posted: Mon Apr 05, 2021 2:35 am
by wiak

Continuing this examination of how WeeDogLinux initrd/init operates.

You may remember, for overlayfs use, WDL init employs the following code lines

Code: Select all

cd ${layers_base}	# Since this is where the overlay mountpoints are
# Combine the overlays with result in ${layers_base}/merged
mount -t overlay -o lowerdir=${firmware_modules_sfs}${lower},"${upper_work}" overlay_result merged

and that for temporary simplicity I bypassed any detailed explanation of the appearance of variable "${firmware_modules_sfs}", simply stating:

NOTE that ${firmware_modules_sfs} is an optional 00 numbered sfs or raw directory used to mount firmware and modules for the system's use (particularly useful if using a huge_kernel such as those Puppy Linux uses). I won't detail its operation here since it isn't important for overall understanding of the _addlayer function.

So what does variable ${firmware_modules_sfs} contain, and what is its purpose?

In practice it is either empty, or it holds the name of an external filesystem that contains the firmware and modules that would be needed (in /lib or sometimes /usr/lib) by any huge style of kernel (huge_kernel types being commonly used in the Puppy world). Or more precisely, it contains the likes of sfs name: "00firmware_modules.sfs:" (note well that the end colon is included in the contents string, because colons are used to separate the filesystems to be included in the mount overlayfs command above).

00firmware_modules.sfs, when used, thus contains the firmware and modules inside its internal /lib/modules and /lib/firmware (or for Void/Arch structure, inside /usr/lib/modules and /usr/lib/firmware) ready for merging into the overall initrd internal filesystem structure (and for later use by the main rootfs similarly).

i.e. the mount overlayfs line above uses this reference for determining what is to be merged:

Code: Select all

lowerdir=${firmware_modules_sfs}${lower}

Note that I chose to put that firmware_modules into the lowest layer (that's a bit inflexible so I may revisit that part of the design someday, but I chose 'lowest' so higher layers could overwrite it with new modules).
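
To make the string handling concrete, here is a small sketch of how the pieces of the mount option fit together, with and without the firmware/modules prefix. The helper name overlay_opts is hypothetical, and the upperdir/workdir format shown for ${upper_work} is an assumption on my part, since its contents are not shown in this post.

```shell
# Hypothetical helper (not from WDL init): build the -o option string for
# the overlay mount from its three parts.
# $1 = firmware prefix: either empty or a name ending in ':'
# $2 = colon-separated list of numbered layers
# $3 = upperdir/workdir options (assumed format of ${upper_work})
overlay_opts() {
    printf 'lowerdir=%s%s,%s\n' "$1" "$2" "$3"
}

# With the optional firmware/modules layer present
overlay_opts "00firmware_modules:" "02:01" "upperdir=upper,workdir=work"
# Without it: the empty prefix simply disappears from the string
overlay_opts "" "02:01" "upperdir=upper,workdir=work"
```

Keeping the trailing colon inside the prefix variable is what lets the same mount line work unchanged whether or not the 00 layer exists.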

In the context of possible re-use of Puppy Linux components by forum members, the code segment related to ${firmware_modules_sfs} provides WDL with that very useful and important optional bolt-on firmware/modules facility.

In terms of the WDL creative work, that optional extra functionality is provided via another unique piece of WDL init magic (actually a modified copy of the previously explained _addlayer function loop, which examines every file in the directory):

Code: Select all

for fm in *;do
	NN=${fm:0:2}
	if [ "$NN" = "00" ];then
		if [ "${fm##*.}" = "sfs" ];then
			fw_modules_lowest="00firmware_modules:"  ##################################
			mkdir -p ${layers_base}/00firmware_modules /usr/lib/modules
			mount "${mountfrom}"/${fm} ${layers_base}/00firmware_modules
			sleep 1  # may not be required
			# fwmod is compared as a variable here; the literal string "fwmod" could never equal "usrlib"
			if [ "$fwmod" = "usrlib" ];then
				mount --bind ${layers_base}/00firmware_modules/usr/lib/modules /usr/lib/modules  # needed for overlayfs module
			else
				mount --bind ${layers_base}/00firmware_modules/lib/modules /usr/lib/modules  # default (as in debian and most pups)
			fi
...

Here, in some detail, is how that WDL mechanism works:

1. Directories are created: the mount point in the tmpfs layers_base location, and /usr/lib/modules in the initrd if it does not already exist:

Code: Select all

mkdir -p ${layers_base}/00firmware_modules /usr/lib/modules

2. As described above, if a 00firmware_modules.sfs is found, variable ${firmware_modules_sfs} (assigned under the name fw_modules_lowest in the extract above) stores the name of that squashed filesystem with a colon at the end, for later use when merging into the overlayfs in front of ${lower}, which itself contains a colon-separated list of layers to merge.

The bigger code section (for fm in *; do loop) shown above then first:

3. mounts the squashed filesystem named 00firmware_modules.sfs onto the tmpfs directory created at location /mnt/layers/00firmware_modules:

Code: Select all

mount "${mountfrom}"/${fm} ${layers_base}/00firmware_modules

4. But the really clever magic comes next... The layout of the firmware_modules structure is identified via an optional kernel command-line argument variable, which is either empty or set to "usrlib" (be it the Debian/Puppy /lib/modules style of layout or instead, say, the /usr/lib/modules of Void or Arch Linux). WDL init then bind-mounts the modules directory inside the already mounted 00firmware_modules.sfs onto the initrd's /usr/lib/modules location (shown here for the case of the Debian/Puppy structure):

Code: Select all

mount --bind ${layers_base}/00firmware_modules/lib/modules /usr/lib/modules  # default (as in debian and most pups)

Owing to that special bind mount, squashed filesystem content (in this use, the firmware and modules) will accordingly remain available to the underlying initrd and ALSO to the main root_filesystem that is later used following the switch_root that finally takes place.
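
The choice between the two layouts in step 4 can be sketched as a small function that only prints the source path, so it can be tried without root privileges. The helper name modules_source is hypothetical, and I am assuming the layout flag arrives in a variable named fwmod, going by the extract above.

```shell
# Hypothetical helper (not from WDL init): print the modules directory
# inside the mounted firmware/modules layer that would be bind-mounted
# onto the initrd's /usr/lib/modules.
# $1 = mount point of the layer
# $2 = value of fwmod: "usrlib", or empty for the default layout
modules_source() {
    if [ "$2" = "usrlib" ]; then
        printf '%s\n' "$1/usr/lib/modules"   # Void/Arch style layout
    else
        printf '%s\n' "$1/lib/modules"       # default: Debian/Puppy style layout
    fi
}

# The real bind mount needs root, so this demo only prints the commands
echo mount --bind "$(modules_source /mnt/layers/00firmware_modules usrlib)" /usr/lib/modules
echo mount --bind "$(modules_source /mnt/layers/00firmware_modules "")" /usr/lib/modules
```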

That's enough for now, and hopefully enough to get WDL user/creators a deeper understanding of WDL's core initrd/init.

Of course, there is far more functionality and many more facilities built into the rest of that init, but I'll leave details of their operation to sometime later.
That _addlayer-type algorithm (a loop that examines every file for the .sfs extension), along with the above bolt-on firmware_module creation, is a unique part of the WeeDogLinux boot/control design.

Thus far, from experience of its use, I would recommend the use of the whole or part of this MIT licensed creative work (function) in derivations for other distros that would like to be able to handle multiple sfs modules, rather than any clumsy hard-coded individually named sfs layer mounts, and more so since the unique WDL loop searching code function includes ability to use uncompressed raw directories instead (or alongside any squashed filesystem layers).

NOTE that in the earliest version of bootable WeeDogLinux (documented on the old forum), where a chroot was used rather than a switch_root, the whole main root_filesystem was actually left available to the initrd (rather than needing a switch_root at all). At that time a symlink, rather than a bind mount, was used to create the bolt-on firmware/modules magic (from the mounted squash filesystem to the initrd /lib/modules and so on). To be honest, I was surprised back then that my symlink method actually worked... I abandoned that chroot version in favour of using switch_root, but for history's sake here is a small extract of the old chroot-related code (from the very old, obsolete build_firstrib_initramfs01_ver006.sh chroot version):

Code: Select all

# Layer "merged" contains the overlay result of all above layers merged together 
# Init does a chroot to merged and starts a job control shell
# If provided, firstrib_firmware_modules.sfs (e.g. renamed from bionicpup zdrv) is also loaded and available
...
mkdir -p lower firmware_modules middle merged
# If using firstrib_firmware_modules.sfs do the following
# Otherwise you need to make sure required /usr/lib/firmware and modules are in firstrib_rootfs
if [ -s "/mnt/$boot_partition/firstrib/firstrib_firmware_modules.sfs" ];then
	mount /mnt/$boot_partition/firstrib/firstrib_firmware_modules.sfs firmware_modules
	[ -L /usr/lib/firmware ] || ln -sv /mnt/layers/firmware_modules/lib/firmware /usr/lib/firmware
	[ -L /usr/lib/modules ] || ln -sv /mnt/layers/firmware_modules/lib/modules /usr/lib/modules
fi
...
# and need a bind mount of firmware_modules for symlinks /usr/lib/firmware and /usr/lib/modules to find it
[ -s "/mnt/$boot_partition/firstrib/firstrib_firmware_modules.sfs" ] && mount --bind /mnt/layers/firmware_modules merged/mnt/layers/firmware_modules

# Change rootfs / to /mnt/layers/merged and start a job control shell
while true;do chroot merged setsid sh -c 'exec sh </dev/tty1 >/dev/tty1 2>&1';done

Didn't use a proper getty at that early design stage. Overall it was a messy early creative design idea/approach - you are best to ignore it. The change to switch_root version instead really opened the design up for greater community involvement via less obscure coding techniques, much greater flexibility, and greatly improved user plugin ability.

wiak