From "Hey, it's a nice day I think I'll open my window." to "It's almost 2am, I better get to bed soon.", here's another Linux install and bootloader misadventure.
Goal ∞
Install TinyME. root = sda3, home = sda6 (both are empty partitions)
Problem ∞
LILO, GRUB and drakboot are steaming piles of crap, and refuse to work in a predictable way for my everyday task.
Solution ∞
Fairy dust.
No seriously, magic. And probably Texstar not sleeping last night either.
The Story ∞
I have a simple partitioning scheme that looks like this:
/dev/sda1 / - test OS
/dev/sda2 / - main OS
/dev/sda3 / - alternate OS
/dev/sda5 "swap" - 8MB, only exists for stupid installers.
/dev/sda6 home - alternate home, only exists for stupid installers.
/dev/sda7 home - current home, only exists for stupid installers.
/dev/sda8 * - big fat partition
Yes, my partitioning scheme will offend some people. I came from Slackware and I know very well how to do better partitioning. Frankly it does not matter one bit. There is no good argument for any partitioning other than to separate / and home. None. There is also no good argument for a swap.. I haven't needed one for years, if ever.
2015-05-08 - Yes, there's a very good reason to separate home. It keeps reinstallations from obliterating user preferences. I've known this for a very long time now.
I examined sda3 and sda6 to make sure there was nothing I would miss from a previous test installation. I formatted them myself at the commandline, since I don't trust installer formats.
My intention was to install TinyME. I had already downloaded it and burned it, and for mystifying reasons graveman said the burn came up as a failure even when I tried it a second time. I know how to do a manual md5sum check on the disc to compare it with the ISO, but I don't care to troubleshoot this kind of inane bullshit anymore.
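For the record, the manual check I skipped looks roughly like the sketch below. It assumes the ISO size is a whole number of 2048-byte sectors (normal for ISO 9660 images) and that md5sum is installed. The function name verify_burn and both arguments are made up for illustration.

```shell
# Compare an ISO file against a burned disc (or any other copy of it).
# Only as many 2048-byte sectors as the ISO contains are read from the
# device, because a burned disc usually has padding after the image.
verify_burn() {
    iso=$1
    dev=$2
    size=$(wc -c < "$iso" | tr -d ' ')
    blocks=$((size / 2048))
    want=$(md5sum < "$iso" | cut -d' ' -f1)
    got=$(dd if="$dev" bs=2048 count="$blocks" 2>/dev/null | md5sum | cut -d' ' -f1)
    [ "$want" = "$got" ]
}
```

Called with the ISO path and the drive's device node (whatever those happen to be on the system), it returns success when the checksums match and failure otherwise.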
I booted into the TinyME LiveCD. This was trivial since I had already struggled with - and solved - some mystifying bootup issues with this system. I'm not even going to bother writing about that nonsense.
Glaring installer issues ∞
This first issue is what set off this whole nightmare.
The installer successfully copied the distribution files. Next, it presented me with the bootloader install. The first issue is that there is no way to skip the bootloader install. I already had a nice and simple and functional Lilo install which would also automagically work for this new /dev/sda3 install. I even paused here, and grumbled to myself at this annoyance and its potential damage.
In the bootloader install screen, while modifying an entry in the list, if you decide you don't want to modify the entry you selected, and you choose to close/cancel that editing window, the installer exits without a prompt. Lame.
I ended up having to go through the whole install three times because of this installer exit issue. Luckily I had done a "copy to ram" boot on the LiveCD, and with TinyME being what it is, the file copy took about five seconds. Seriously.
I booted into TinyME. It began with its graphical bootup, but the progress bar went nowhere. Looks like it hung. Scroll lock did nothing, but I'm not so surprised.
I reset and retried. No change.
I reset and retried in safe mode. No change, but this time I learned more:
RAMDISK: Couldn't find valid RAM disk image starting at 0.
VFS: Cannot open root device "863" or unknown-block(8,99)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,99)
<6>Time: tsc clocksource has been installed.
Great. That tells me that a root= option is wrong, right? But how could that happen?
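The numbers in that panic can be decoded by hand. A minimal sketch, assuming the classic encoding where the high byte of the hex number is the major and the low byte is the minor, and the standard allocation of 16 minors per disk on major 8 (the sd driver). The function names decode_root and sd_name are just for illustration.

```shell
# Split a hex root device number, as printed in the kernel's
# "Cannot open root device" message, into "major minor".
decode_root() {
    dev=$((0x$1))
    echo "$((dev >> 8)) $((dev & 255))"
}

# Map a minor number on major 8 to a device name. Each disk owns 16
# minors: minor / 16 picks the disk (0 = sda, 1 = sdb, ...) and
# minor % 16 picks the partition (0 = the whole disk).
sd_name() {
    minor=$1
    letters="abcdefghijklmnop"
    disk=$((minor / 16))
    part=$((minor % 16))
    letter=$(printf '%s' "$letters" | cut -c $((disk + 1)))
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}

decode_root 863   # prints "8 99"
sd_name 99        # prints "sdg3"
```

So the kernel was being told to mount the third partition of a seventh disk, which does not exist on a machine that only has sda.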
I booted back into the TinyME LiveCD except this time I didn't do the "copy to RAM" feature of the LiveCD. I figured maybe that could be the culprit and perhaps the install was acting like the first copy to RAM bootup and was somehow expecting a RAM disk image. It was worth a shot.
No change.
I booted up and did a media check. It passed. Since I had recently done 14 passes of a memory check, I didn't think my memory had gone bad in the last few days and continued on..
So I booted back into TinyME and searched me up some help on the error message. Hmm, no web browser was included. Ok, I ran Synaptic and pulled Firefox from the repository. The attempt failed, with Synaptic saying it couldn't find a dependency. I ended up installing links with Synaptic. My brief searching with links didn't help much.
Whatever, I'll re-install the bootloader with drakboot. Except an older issue I had with it now returned: it fails to work with my existing lilo.conf, and if I delete lilo.conf, drakboot fails to create a new one. I just get one of those generic "I hate you" error messages.
I tried GRUB with drakboot, and rebooted. No change. In fact, GRUB didn't even get installed!
Booted back into the TinyME LiveCD. Confirmed that GRUB just silently does nothing when I try to use it with drakboot. Sigh. Of course, GRUB being what it is, it's completely useless for a normal human and so I have no hope of ever being able to use it to manually install the bootloader myself.
When working with hand-reinstalling a Lilo bootloader, I noticed that the TinyME LiveCD does not see my main HDD as /dev/sda but instead sees it as /dev/sdg .. lolwut?
So I booted back up into my previous install. Except it wouldn't work either. It gave the same error. Now I know it's not the install but the bootloader which is the culprit. Now I'm stuck with no working setup. My backups don't include an MBR backup, and after this mess I intend to remedy that somehow.
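An MBR backup is not much work. A rough sketch, assuming a traditional MBR layout where the first 512-byte sector holds 446 bytes of boot code followed by the partition table and signature. The function names and the arguments are placeholders, not something from my notes.

```shell
# Save the first 512-byte sector of a disk (boot code, partition table
# and signature) to a file.
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Write back only the 446 bytes of boot code from a saved sector,
# leaving the partition table on the target untouched.
# conv=notrunc matters when the target is a regular file (a disk image).
restore_bootcode() {
    dd if="$1" of="$2" bs=446 count=1 conv=notrunc 2>/dev/null
}
```

Both need root when pointed at a real disk. Whether the restored boot code still boots depends on the rest of the bootloader's files being where they were when the backup was taken, so this is a safety net, not a guarantee.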
After lots of playing with chroot and all that jazz, I gave up and booted into my PCLinuxOS 2007 LiveCD.
It booted fine. I didn't even bother with drakboot, I just hand-installed Lilo. No problems.
Well I must have had a problem, because my notes say that I reinstalled PCLinuxOS a second time. My notes are pretty vague around here because I wasn't doing note-taking seriously yet. I couldn't yet grasp the full tidal force of bullshit that was to come.
Now I must have actually installed the full PCLinuxOS into /dev/sda1 at this point. I did confirm that the PCLinuxOS installer's use of Lilo worked just fine. I added my own bootup for sda1, sda2, sda3 like I always do, and I even changed the three defaults it suggests so that instead of booting PCLinuxOS on sda1, it would boot my new TinyME on sda3. At this point I decided to make a name change to one of the items so I could tell if this Lilo install worked.
I booted up. I saw that the Lilo install worked. Except the three defaults I changed to point to sda3 .. were still pointing at sda1! How that's possible is beyond me.
Ok, so I rebooted and manually chose sda3. It booted up just fine this time.
I logged into TinyME. And. The networking wouldn't work.
Ok, I'll restart it myself.
# service network restart
Shutting down loopback interface: [ OK ]
FATAL:Could not load /lib/modules/2.6.18.8.tex5/modules.dep: No such file or directory
Bringing up loopback interface: arping: socket: Address family not supported by protocol [ OK ]
Bringing up interface eth0: forcedeth device eth0 does not seem to be present, delaying initialization. [FAILED]
FATAL:Could not load /lib/modules/2.6.18.8.tex5/modules.dep: No such file or directory
Umm. How. The.
FINE. I booted back up into my main setup at sda2. Except its bootup failed! Looks like an fstab issue. Ok, emergency repairs ensue.
Booted into the new PCLinuxOS. I walked away and came back to a single-user mode prompt and no explanation as to how that happened. Whatever, let's check out sda2's fstab. I learn that somehow my fstab got changed to point to sdg! Fixed and rebooted.
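A quick way to spot this kind of damage is to list the fstab entries whose device nodes don't exist. A small sketch, assuming plain /dev/... names in the first column (no labels or UUIDs). The function name check_fstab is made up.

```shell
# Print every /dev/... entry from the first column of an fstab file
# that has no matching node on the running system.
# Comment lines are skipped because their first field starts with "#".
check_fstab() {
    awk '$1 ~ /^\/dev\// { print $1 }' "$1" | while read -r dev; do
        [ -e "$dev" ] || echo "missing: $dev"
    done
}
```

Pointed at the fstab of a mounted install, it would flag sdg entries on a system where the disk shows up as sda. It only makes sense when run from an environment that names the disk the same way the install will.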
I booted into my main sda2 install and this time I got a bunch of "unable to load" messages re. some kernel modules. This is bad.
I checked things out, and learned that /boot/kernel.h was symlinked to an old version. I fixed it and rebooted.
There was no change. I decided to be a brute about it and I symlinked the /lib/modules reference to the existing kernel. I knew this was probably a bad idea, but I was out of options. Rebooted.
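Before symlinking anything, it helps to see which kernel releases an install actually has modules for. A minimal sketch under the usual layout of /lib/modules/RELEASE/modules.dep. The function names and arguments are hypothetical.

```shell
# Succeed if the root filesystem at $1 has a modules.dep for kernel
# release $2 (for example the output of uname -r).
has_modules() {
    root=$1
    release=$2
    [ -f "$root/lib/modules/$release/modules.dep" ]
}

# List the kernel releases that the root filesystem at $1 has module
# directories for.
list_module_releases() {
    for d in "$1"/lib/modules/*/; do
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
}
```

If has_modules fails for the release the bootloader just started, the bootloader entry is loading a kernel that does not belong to that root partition, and a symlink only papers over it.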
No change. Investigated.. and whoa, my kernel.h symlink change undid itself. I redid it.
Then it hung on "starting udev". Fine, I know this one, I'll reboot with acpi=off .. except, I can't choose any options from my bootloader. Sigh.
I booted into my PCLinuxOS install on sda1. Last time it gave single-user mode, this time it booted fine. Except it also spits out those same missing kernel modules messages. And .. it booted into TinyME?! But but.. I had confirmed earlier that the Lilo menu default bootup went to PCLinuxOS. This time it went to TinyME and in between I made no bootloader changes!
Fine fine, I symlinked the /lib/modules to the right kernel. I then tried adding a new network device through the GUI stuff. I got
Unable to find network interface for selected device (using forcedeth driver).
I rebooted and manually chose sda1 to get PCLinuxOS. The bootup worked. The networking worked. How. The. Hell..
Firefox 2 worked and networking was there, but I wanted to get Firefox 3 running. I decided to do nothing else.
I guess the dependency chain was broken after all. Trying to run firefox at the commandline gives
Couldn't load XPCOM
I decided to do a full update. Sigh, 35 minutes of downloading. After 15 minutes, I walked away.
When I came back, synaptic was sitting there staring at me with no status message. Umm, it didn't do anything?
I decided to upgrade a few at a time. I guess the 15 minutes of downloading meant nothing. This was taking forever. I was forced to mark a few at a time, because some items would prompt me to uninstall just about everything, including basesystem.
By gently marking a few at a time, I eventually had all but one item marked for upgrade. The remaining item was missing a dependency. This is very strange.. selecting all wouldn't work, selecting certain packages wouldn't work, but working around the unselectable stuff and leaving them for last did work. I don't get it, and I don't want to.
20 minutes of downloading. The download worked well, and I started the long wait.
It hung on "Hardware Abstraction Layer". There was no disk activity for a long time. I pressed 'x' on the synaptic updating dialogue. The dialogue went away, but synaptic stayed greyed out. Sigh.
I killed synaptic and restarted it. Now I get complaints about broken packages. So I decided to select a few packages at a time and upgrade only a few at a time.
This would take even longer, but it was working.
At one point, marking some packages crashed synaptic with
could not find glade file '/usr/share/synaptic/glade/window_changes.glade'
Sigh, I'll just restart synaptic and continue on. But synaptic wouldn't start. It was gone.
no such file or directory
Fine, let's bring out the big guns: apt-get dist-upgrade. Good thing I remembered this. Yes, good thing I remembered, because man pages are shite as a rule.
apt gave me a nice evil warning, making me type
Yes, do as I say!
It was warning that vim-X11 would affect the basesystem. "Whatever", I thought, since I didn't see the basesystem or any other critical components being removed in apt's list of changes.
A 6 minute download? Nice. It seemed to go perfect. The "cleaning up / removing" list is disgustingly large, and bounced back and forth between one set of stuff to another. This stuff's smart. I likes me some smart package managers, kthx.
Ok, it worked. But there's no synaptic, so let's grab it
apt-get install synaptic
Seems to work. Launched synaptic and did a "mark all" and it found some additional things to update which the previous apt-get dist-upgrade missed somehow.
Firefox is still not installed. Trying to install it through synaptic gives me a missing "xulrunner" dependency.
I reloaded synaptic, and marked all upgrades, and it found some more for me. I applied them. I guess some nonsense was being done to the repository. Perhaps that's what messed me up for a while there. Good thing packagers don't sleep. Well, it was probably Texstar, and he's the man for packaging.
After that update, firefox could be installed. I guess xulrunner was made to work from the last update.
Now I needed to fix an issue with synaptic having broken packages from previous nonsense. I decided to remove some packages from the local/obsolete category. These packages demanded I uninstall basesystem. Uh, no thanks. I selected a few at a time, and it turned out to be a kernel-related package which was asking me to uninstall basesystem too, if I wanted to uninstall it.
I decided to upgrade the kernel and then remove the old kernel stuff. It seems to have worked.
Firefox runs.
I see nothing left to do, so let's try a reboot!
Seems my boot loader was redone. I see a new entry for the new kernel I just installed. Booting from it seems to work just fine.
Logging in works fine.
This post was written from that setup. It's not what I wanted, but at least it works. At this point, I'm not really sure what I'll be doing to recover from this mess. I do have backups, maybe I'll go to them.. but since those backups don't include an MBR, there's not much point.
This misadventure continues with bootloader hell, part 2.
Last updated 2020-02-09 at 19:22:14