Understanding WeeDogLinux init operation
I haven't studied Puppy Linux's initrd/init (I looked a few years ago, but back then at least it was long, looked complex, and I was lazy). However, per the Puppy init readme:
The init script has been called the heart of Puppy Linux...
Puppy Linux, by default, has always used aufs for its layer functionality (until some relatively recent experiments with overlayfs).
WeeDogLinux initrd/init was designed with a view to using either aufs or overlayfs. Currently published versions use overlayfs, which I prefer, though only a couple of lines of code need to be modified for aufs use. For your convenience (as a potential developer), please find below an explanation of what is perhaps the core part of the WeeDogLinux initrd/init design:
------------------------------------------------------------------------
Unique layer code routine used in WeeDogLinux init (as created July 2019).
I am writing this post to explain a bit about an important part of the work:
The init used in WeeDogLinux initrd/init
An understanding of this key portion should help a WeeDogLinux user better understand how it works, and thus how to modify its operation according to their own wishes and coding preferences, or even how to create a derived work in an alternative coding language.
The most important part of the original code is actually quite short (in fact, the whole WDL init uses less than a third of the lines of code of most), but it achieves a very flexible filesystem layering structure that in principle allows any number of layers (be it for overlayfs or for an alternative aufs implementation). The current published version uses a two-digit NN prefix, which allows for up to 100 layers. If you need more, simply rewrite it to use NNN instead of NN for 1000 layers... but, for the most part, I would doubt the need for, or efficiency of, that many! Of course, you could just use a single digit N for up to ten layers only, but I consider that too low if you want a rollback facility, which the existing 100-layer capability usefully caters for.
In terms of acknowledgement to other distros, it does borrow from something I have seen in practice: Debian Live uses numbers to order its layers. I haven't studied how Debian Live does that at all; I created my own scheme, shown here and long available for download as source code, which has proved to be very efficient in practice.
In terms of implementation, I used shell script for the prototype. It could easily be converted into pure C in the unlikely event that greater speed is required, but that would be at the expense of readability and ease of modification and understanding, and would make it harder to provide such great flexibility overall.
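As a rough sketch of the NN versus NNN idea (not the published code), the prefix test could take the prefix width as a parameter. The function name layer_prefix and its optional width argument are hypothetical, introduced here only for illustration.

```shell
# Hypothetical helper (not part of WDL init): print the numeric prefix of a
# name when its first <width> characters form a number greater than zero.
#   $1 = file or directory name
#   $2 = prefix width: 2 for NN (as published), 3 for NNN (default 2)
layer_prefix() {
    prefix=$(printf "%.${2:-2}s" "$1")      # first <width> characters of the name
    if [ "$prefix" -gt 0 ] 2>/dev/null; then
        printf '%s\n' "$prefix"
    else
        return 1                            # not a numbered layer name
    fi
}

layer_prefix 01firstrib_rootfs.sfs      # two-digit scheme
layer_prefix 012extra.sfs 3             # three-digit scheme
```

With a width of 3, a name such as 01firstrib_rootfs.sfs no longer qualifies, because its first three characters are not all digits, so existing modules would need renaming if the scheme were widened.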
How it is done in WeeDogLinux init
In its simplest-to-understand form (maybe), WeeDogLinux's _addlayer algorithm is basically (in rough English) as follows:
Code: Select all
Make a directory in tmpfs for mounting layer filesystems to.
while not all files done:
    Look through the directory where the filesystem modules are,
    looking for .sfs module filesystems and raw module filesystems.
    For each one found, mount it ready for use in unionfs layers.
    Keep track (variable or array) of each one found.
end_while_loop (i.e. keep doing the loop till finished)
Sort all the results found above into the order you want them in the layers.
Mount the overlay (be it with aufs or overlayfs) to a tmpfs directory in RAM using the stored list of filesystem modules found above.
Earliest WeeDogLinux systems did the above and ended with a chroot into the root filesystem rather than having the usual switch_root approach (though I prefer the latter now).
Simple in the end (but takes a lot of work for the simple to become obvious - pity Puppy didn't think of this approach) - elegant and powerful in use.
Different 'sort' mechanisms can of course be used. In WeeDogLinux itself, numeric sorting (easy to rearrange) is preferred over simpler but less flexible alphabetical sorting.
In practice, it is relatively trivial for a C programmer to translate shell script into C (difficult translating C back to shell script though), the result being a derived or part-derived work. Indeed you can easily convert the _addlayer code below to C for even greater speed efficiency. I wouldn't recommend that here though; it is already very fast and efficient and shell script is easier for user-modification and understanding. Using C for any part of WDL init would simply obfuscate the design and make it less accessible/understandable to others.
1. I define a directory on which all layers will be mounted. As a default I used:
Code: Select all
layers_base=/mnt/layers
which is arranged to be in a tmpfs.
2. I called a function, which I named _addlayer (shown in shell script below), to determine the numbered sfs files (and/or numbered raw directories) to use for the filesystem layers, and stored the result in a variable (which I named 'lower'). This is followed by a sort of the layers. This loop plus flexible sort is a key part of the WeeDogLinux design. I will explain its operation in detail in a moment...
3. I sorted the list in variable lower and removed any duplicates (using a reverse sort for overlayfs, but an ascending sort for the aufs version):
Code: Select all
# Sort resulting overlay 'lower' layers list
# add new NN item to overlay $lower list, reverse sort the list, and mount NNfirstrib_rootfs
lower="`for i in $lower; do echo $i; done | sort -ru`" # sort the list and remove duplicates
4. Finally (for the overlayfs case), per https://www.kernel.org/doc/html/latest/ ... layfs.html, quote:
At mount time, the two directories given as mount options “lowerdir” and “upperdir” are combined into a merged directory:
mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,workdir=/work /merged
The shell script code I used to do this was:
Code: Select all
cd ${layers_base} # Since this is where the overlay mountpoints are
# Combine the overlays with result in ${layers_base}/merged
mount -t overlay -o lowerdir=${firmware_modules_sfs}${lower},"${upper_work}" overlay_result merged
NOTE that ${firmware_modules_sfs} is an optional 00-numbered sfs or raw directory used to mount firmware and modules for the system's use (particularly useful if using a huge_kernel such as those Puppy Linux uses). I won't detail its operation here since it isn't important for an overall understanding of the _addlayer function.
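To make steps 3 and 4 a little more concrete, here is a minimal sketch run on a made-up list. The helper name join_lowerdir is hypothetical, and joining the entries with colons is an assumption about how the sorted list reaches lowerdir=; the extracts above do not show that part of the published init, which may well do it differently.

```shell
# Hypothetical helper (not from WDL init): join a whitespace-separated NN
# list with colons, the separator overlayfs uses between lowerdir entries.
join_lowerdir() {
    joined=""
    for nn in $1; do
        joined="${joined:+${joined}:}${nn}"   # add ":" only between entries
    done
    printf '%s\n' "$joined"
}

# Made-up accumulated list containing a duplicate.
lower="01 03 02 03"

# Step 3 as above, with $(...) instead of backticks:
# reverse sort the list and remove duplicates.
lower="$(for i in $lower; do echo "$i"; done | sort -ru)"

# Assumed glue between step 3 and the mount in step 4.
lowerdir_arg=$(join_lowerdir "$lower")
echo "lowerdir=${lowerdir_arg}"
```

Overlayfs treats the leftmost lowerdir entry as the topmost layer, which is why the reverse sort gives the highest NN precedence. The relative names work because the mount is issued from ${layers_base}.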
-----------------------------------------------------------------------
The MAIN THING, though, is the WeeDogLinux initrd/init function used to identify the (numbered) layers and to mount them, which in the shell script implementation is the function _addlayer. I explain its basic operation as follows (as you read this, please refer to its shell code implementation; that code extract is also provided shortly below for your convenience):
1. Change to directory where you will mount the layers
Code: Select all
cd ${layers_base} # Since this is where the overlay mountpoints are
2. Loop through all the files in the partition/directory being booted from (I used a for loop, but while or similar could of course be used). We are looking for sfs modules here, but in flexible WeeDog also for raw directories we want mounted to the layers.
Code: Select all
for addlayer in *; do
3. Use the numeric part of the file or directory name to determine that the file (.sfs) or directory (raw) is to be one of the filesystem layers. The 'trick', indeed the key concept in WeeDog's boot init design, is elegantly simple: the code recognises whether the start of the filename is numeric by testing whether its first two characters, NN, are greater than 0. For a non-numeric start the -gt test fails (its error message is discarded by 2>/dev/null), so only a numeric start greater than zero passes, such as 01firstrib_rootfs.sfs or even the raw directory 01firstrib_rootfs/:
Code: Select all
NN="${addlayer:0:2}" # gets first two characters and below checks they are numeric (-gt 00)
if [ "$NN" -gt 0 ] 2>/dev/null; then
4. Add the detected numeric named file/dir to the variable 'lower' that will be used in the layer (overlay or aufs) mount instruction.
Code: Select all
lower="${NN} ${lower}"
5. Physically mount the numbered sfs file (or the numbered raw directory if that is used instead) to the layers_base (i.e. to /mnt/layers):
Code: Select all
mount "${addlayer}" "${layers_base}/$NN"
# for the case of a numbered sfs file
or
Code: Select all
mount --bind "${addlayer}" "${layers_base}/$NN"
# for the case of a numbered raw (uncompressed) directory
6. That's the main part of the WeeDogLinux initrd/init design.
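The numeric test from step 3 can be tried on its own. The sketch below uses made-up names and a hypothetical helper, check_name; it takes the first two characters with printf rather than ${var:0:2} so that it also runs in shells without substring expansion.

```shell
# Hypothetical helper for experimenting with the step 3 test.
check_name() {
    NN=$(printf '%.2s' "$1")            # first two characters, like ${1:0:2}
    if [ "$NN" -gt 0 ] 2>/dev/null; then
        echo "$1: layer $NN"
    else
        echo "$1: skipped"              # non-numeric start, or 00
    fi
}

# Made-up names: two numbered layers, a kernel image, and a 00 module.
for name in 01firstrib_rootfs.sfs 15extras vmlinuz 00firmware.sfs; do
    check_name "$name"
done
```

Note that a 00 prefix does not pass the test, which fits with 00 being handled separately for the optional firmware and modules layer.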
The WDL init goes on to switch_root to the merged filesystem. However, earlier versions ran the final init via a chroot instead, which remains a documented option (on github FirstRib site).
I hope this explanation helps those wishing to modify WeeDogLinux initrd/init or derive some other work from it. By the way, in the latest design of WeeDogLinux, this code is immediately editable (as an optional module) OUTSIDE of the initrd, so it is particularly easy to read and modify between boots using a simple text editor such as geany. A major part of WDL is in fact to make it as modular and user-accessible/easy-to-modify as possible, both as a build system and also during boot (both via optional plugins).
Of course, all WeeDogLinux code has been published as open source (MIT licensed) for over a year, so any developer reasonably fluent in shell scripting who is interested in copying any part of this work already has easy access to it online. Hopefully this further explanation will help those who are not so experienced in reading the existing shell script implementation of the design.
wiak
------------------------------------------------------------------------
The complete _addlayer function code extract follows:
The _addlayer function code actually used in the shell script implementation, as modified in July 2019 (change it to C as an exercise if you want more speed!!!... though I doubt you'll need it and so, as I say above, it is better to keep it user-accessible and thus easier to modify):
Code: Select all
# mount any NNsfs files or NNdir(s) to layers_base/NN
# and add to overlay "lower" list
_addlayer (){
  for addlayer in *; do
    NN="${addlayer:0:2}" # gets first two characters and below checks they are numeric (-gt 00)
    if [ "$NN" -gt 0 ] 2>/dev/null; then
      if [ "${addlayer##*.}" == "sfs" ]; then
        # layer to mount is an sfs file
        lower="${NN} ${lower}"
        mkdir -p "${layers_base}/$NN"
        # umount any previous lower precedence mount
        mountpoint -q "${layers_base}/$NN" && umount "${layers_base}/$NN"
        mount "${addlayer}" "${layers_base}/$NN"
      elif [ -d "$addlayer" ]; then
        # layer to mount is an uncompressed directory
        lower="${NN} ${lower}"
        mkdir -p "${layers_base}/$NN"
        # umount any previous lower precedence mount
        mountpoint -q "${layers_base}/$NN" && umount "${layers_base}/$NN"
        mount --bind "${addlayer}" "${layers_base}/$NN"
      fi
    fi
  done
  sync
  echo -e "\e[95mCurrent directory is `pwd`\e[0m" >/dev/console
  echo -e "\e[95mlower_accumulated is ${lower:-empty list}\e[0m" >/dev/console
}
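For anyone who wants to experiment without root or real mounts, here is a dry-run sketch of the same selection logic. It is not part of WDL init: the function name _addlayer_dryrun, the scratch directory and the file names are all made up, and it only prints the mount commands it would otherwise run.

```shell
# Dry-run sketch: same selection logic as _addlayer, but it only prints
# the mount commands instead of running them.
_addlayer_dryrun() {
    for addlayer in *; do
        NN=$(printf '%.2s' "$addlayer")         # like ${addlayer:0:2}
        if [ "$NN" -gt 0 ] 2>/dev/null; then
            if [ "${addlayer##*.}" = "sfs" ]; then
                # numbered sfs file
                echo "mount $addlayer ${layers_base}/$NN"
            elif [ -d "$addlayer" ]; then
                # numbered uncompressed directory
                echo "mount --bind $addlayer ${layers_base}/$NN"
            fi
        fi
    done
}

layers_base=/mnt/layers
scratch=$(mktemp -d)                # throwaway stand-in for the boot directory
touch "$scratch/01firstrib_rootfs.sfs" "$scratch/vmlinuz"
mkdir "$scratch/05changes"
dryrun_output=$(cd "$scratch" && _addlayer_dryrun)
echo "$dryrun_output"
rm -rf "$scratch"
```

Only the numbered sfs file and the numbered directory produce a mount line; the unnumbered vmlinuz is ignored.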
As you can see, this critical function is extremely simple in the end, but very efficient and flexible in practice. As I said, it currently allows mixing up to 100 layers of sfs OR raw directory filesystems in any order, and via additional plugin capability WDL init allows further non-numeric additions of either sfs or raw directory layers, again in any order of layer overwrite preference you desire. That goes for any and all WeeDogLinux variants, including the recent, soon to be further developed ultra-mini-modular LGO series, and even that one-off WDL_Slitaz production...
By the way, on looking again at it, I should get rid of the double equals (==) in if statements above, since that is an unnecessary 'bashism' (I think) that actually comes from my earlier habit of mainly programming in C... One of my many bad habits I'm sorry.
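As a small sketch of that change (the helper name is_sfs and the sample names are made up for illustration), the extension test works the same with the single = that POSIX specifies for string comparison inside [ ]:

```shell
# Same extension test as in _addlayer, but with a single "=".
is_sfs() {
    [ "${1##*.}" = "sfs" ]      # strip everything up to the last dot, compare the rest
}

for name in 01firstrib_rootfs.sfs 01firstrib_rootfs; do
    if is_sfs "$name"; then
        echo "$name: sfs file"
    else
        echo "$name: not an sfs file"
    fi
done
```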
--------------------------------------------------------------------------------------------------------
NOTES:
I've seen pdfs of both of the following online, but do not know their legality. I own printed copies of the two books.
Quote from "The UNIX Programming Environment":
The crucial observation with both zap and idiff (C implementations of well-known shell script commands) is that most of the hard work has been done by someone else... It's worth watching for opportunities to build on someone else's labor instead of doing it your self - it's a cheap way to be more productive.
A not quite so old book that effectively showed how easy it was to port shell scripts to C was: "Beginning Linux Programming", by Neil Matthew and Richard Stones. Published by Wrox Press. Quote:
Our first discussion of this application occurs at the end of Chapter 2 and shows how a fairly large shell script is organized, how the shell deals with user input, and how it can construct menus and store and search data.
After recapping the basic concepts of compiling programs, linking to libraries, and accessing the online manuals, you will take a sojourn into shells. You then move into C programming, where we cover working with files, getting information from the Linux environment, dealing with terminal input and output, and the curses library (which makes interactive input and output more tractable). You’re then ready to tackle re-implementing the CD application in C. The application design remains the same...
By the way, in case you imagine that WDL is a one-man developer distro: well, that is partly true. I develop the core build scripts and it is a for-my-family distro, so in that sense I suppose its model is that of benevolent dictator... but with an important twist (read on). As I say above, the whole system is designed to be modular and, to a large extent, open to major user contributions via its flexible plugin system. In practice, pretty much all current WDL_Void Linux flavours are due to the build plugin contributions of rockedge (indeed, I rely on rockedge for that because I myself forget most details of using the excellent xbps package manager and runit init system employed by Void). Furthermore, I have two young sons who are computer literate and use WeeDogLinux on their daily workstations, and of course I am training them to maintain the overall core system because I want more time to drink coffee. Moreover, my partner uses WDL_Arch64 as the main operating system for her business needs, and that forces me to maintain it. Having said that, I purposely use upstream repos (Arch Linux for WDL_Arch64) along with their official package managers rather than some home-baked mechanism of my own, which greatly aids me in maintaining the security and reliability of WDL distros. You can therefore expect WeeDogLinux (which has its own domain) to be around, in many a shape and form, for a long time.
wiak