Monday, 27 November 2017

EFI System Partition in soft RAID1

One reason you might want to put the EFI System Partition (ESP) in a RAID1 array on a computer with Linux soft RAID is to have redundancy when booting. If one disk fails, you want the boot to continue from the other disk.

At first I thought this wasn't possible since a RAID1 partition wouldn't have the specific FAT filesystem and GUID required by the specification. However the fact that the CentOS 7 install media offered the choice of putting the ESP on a RAID1 array and that it actually works, made me doubt my hypothesis.

The key to this is that the CentOS 7 installer uses RAID metadata format 1.0, which is stored at the end of the partition. It therefore doesn't clash with the beginning of the partition, which is where the firmware checks whether the partition is an ESP. However, most Linux partition tools will detect it first as a RAID member, so it's not immediately obvious that it's also an ESP.
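As a sketch, such a mirror could be created by hand like this (device names are hypothetical; adapt them to your disks, and note these commands are destructive):

```shell
# Hypothetical layout: the ESP is partition 1 on each of two disks.
# Metadata format 1.0 puts the RAID superblock at the END of the
# partition, so the start still looks like a normal FAT filesystem.
mdadm --create /dev/md/esp --level=1 --raid-devices=2 \
      --metadata=1.0 /dev/sda1 /dev/sdb1
mkfs.vfat -F 32 /dev/md/esp   # the ESP must carry a FAT filesystem
```

The CentOS 7 installer does the equivalent for you when you assign the ESP mount point to a RAID1 device.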

There are some caveats to this scheme. All writing of the ESP must be done while it's mounted as a RAID array so that there is no discrepancy between the two members. If the only OS on the disks is Linux, this won't be a problem. But don't use this scheme if the ESP also boots other operating systems that don't know about Linux RAID.

For CentOS, when you look at the choice of boot devices in the BIOS setup, you should see two disk boot candidates, both labelled CentOS.
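The same entries can be listed from within Linux (a sketch; the entry numbers and labels depend on your firmware):

```shell
# List the UEFI boot entries with their device paths;
# expect one CentOS entry per disk in the mirror.
efibootmgr -v
```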

On the machines I used, HP z230 workstations, I found that I had to disable Legacy Boot or errors reading the boot sectors would be triggered.

The bottom line is I now have workstations with soft RAID1 whose disks are fully redundant. If one disk fails, the other will continue to boot and run with degraded arrays for each of the partitions.

Thursday, 23 November 2017

grub2 error: failure reading sector 0x0 from 'hd0'. Press any key to continue

After I had installed CentOS 7 as the only OS on an HP z230 workstation in UEFI boot mode, I got this message before booting. It was actually the last of three errors:

error: failure reading sector 0xfc from 'hd0'. 
error: failure reading sector 0xe0 from 'hd0'. 
error: failure reading sector 0x0 from 'hd0'

Boot would resume from the hard disk after a timeout, but the pause was unacceptable and would worry users.

A search showed many articles like this but none solved my problem. I tried various things: refreshing grub.cfg, disabling the CD/DVD (thinking it might be trying to read the optical drive), and checking whether having the ESP in a RAID1 array was disallowed. (I figured out how the ESP can work with RAID1, and its limitations, but that's for another blog entry.) None of my experiments worked.

However the linked-to web page alluded to turning off Secure Boot, so I went into that part of the BIOS setup. I found that it was already turned off, but there was a setting there for Legacy Boot, which was enabled. I turned it off to see what would happen. Lo and behold, the error messages ceased and UEFI boot worked as expected. The Boot Order menu also stopped showing a Legacy section.

Since debugging the innards of the GRUB2 loader is beyond me, I can only surmise that the presence of Legacy Boot entries in the BIOS makes GRUB2 try to read the sectors in question, but since the disk is formatted with GPT partitions and UEFI is in force, the sector reads fail, for some definition of fail. Maybe somebody can figure out the significance of sectors 0xfc, 0xe0, and 0x0.

Thursday, 16 November 2017

Dos and Don'ts deploying sssd for authentication against Windows AD

New: For deployment on Redhat/CentOS 6, see here.

sssd (and realmd) in RedHat/CentOS 7 offers the chance to use Windows as a single authentication base. The RedHat manual was the most useful resource, but there were also good debugging tips on Stack Overflow and similar forums. However, in deploying sssd I found some things worked for me and some didn't.
  • Do harmonise all the Windows and Linux login IDs. If a user has two different IDs, they'll have to bite the bullet and accept the change of one of them. Unfortunately domain logins cannot have aliases.
  • When you join Linux to AD using the realm command and an unprivileged account, you may encounter this 10 machine limit. Here's how to raise the limit.
  • Do use ntpd to keep all the clients in time sync. Specify the domain controllers as NTP servers in ntp.conf. I had an issue where one client wouldn't authenticate even though its config files were identical to a working client's. Finally I realised I had not enabled and started ntpd; it turned out to be clock skew, to which Kerberos is sensitive.
  • Do enable GSSAPI with MIC in sshd. It really works and you can use putty to ssh to the server without specifying a password provided the Windows user has authenticated to the domain.
  • Do use AD security groups to restrict access to the Linux servers; otherwise all AD users can log in by default. Enrolling a new Linux user across all the servers is then simply a matter of adding the user to your chosen security group (create one if necessary). oddjobd will take care of creating the home directory on first login, which is very nice. I used the simple access_provider; I couldn't get the ad access_provider and ad_access_filter to work, but that is probably because I couldn't work out the correct LDAP strings.
  • You can also use a security group to specify who can have extra privileges in sudo.
  • I used the deterministic hash scheme for mapping SIDs to UIDs because I didn't want to (and didn't have authority to) add attributes to the AD schema.
  • When migrating existing user accounts, make sure you find all the places a user might have a file: not just /home but also /var/spool/cron and /var/spool/mail. Kick all the users off and kill all of their processes before you do the chown. Since after the switchover the names will map to the new UIDs, you can cd /home and run a loop: for u in *; do chown -R "$u" "$u"; done. Do the same for the cron and mail directories.
  • If you have software that must have simple login IDs, i.e. fred and not fred@example.com, then you should set use_fully_qualified_names = False. This implies you cannot have a default_domain_suffix. If you have a single domain, then you don't need domain suffixes. If you have multiple domains, then this is beyond my knowledge. I found that some applications cannot handle usernames of the domain form. Even the crontab command will create and require cron files of the domain form if domain suffixes are enabled.
  • I couldn't get the sssd idmap to work with Samba so I chose winbind. Also you have to use winbind if you have to support NTLM authentication.
  • New: If you are running 32-bit applications, you should also install the 32-bit libsss* shared libraries corresponding to the 64-bit ones, otherwise those applications may not be able to get user account info via PAM. This showed up in icfb, an old 32-bit Cadence executable, which worked for local users (in /etc/passwd) but failed for SSSD-authenticated users.
  • New: If oddjob_mkhomedir doesn't work, as evidenced by no home directory created for a new login, check /var/log/messages. SELinux is probably blocking this. Either make the policy permissive, or create a policy for this.
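To tie several of the points above together, here is a sketch of the relevant config fragments. The domain and group names (example.com, linux-users, linux-admins) are hypothetical stand-ins; substitute your own.

```
# /etc/sssd/sssd.conf (fragment)
[sssd]
domains = example.com
services = nss, pam

[domain/example.com]
id_provider = ad
access_provider = simple
simple_allow_groups = linux-users
use_fully_qualified_names = False
fallback_homedir = /home/%u

# /etc/ssh/sshd_config (fragment) - enable GSSAPI single sign-on
GSSAPIAuthentication yes

# /etc/sudoers.d/ad-admins - let one AD security group use sudo
%linux-admins ALL=(ALL) ALL
```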
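The migration chown step above can be sketched as a small function (run as root after kicking everyone off; it assumes the directory names under /home match the login names):

```shell
#!/bin/sh
# Re-own each user's files after the switchover, when the old names
# map to the new AD-derived UIDs. $1 is the directory holding the
# per-user directories or files (normally /home, /var/spool/cron,
# /var/spool/mail).
reown_users() {
    cd "$1" || return 1
    for u in *
    do
        chown -R "$u" "$u"
    done
}
# Usage (as root):
#   reown_users /home
#   reown_users /var/spool/cron
#   reown_users /var/spool/mail
```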

Friday, 15 September 2017

Use crontab to notify when a piece of software has been released

Sometimes I eagerly await the release of a distro version or a new version of a piece of software. But I don't want to remember to check constantly, so I wrote a script that can be run from cron to let me know.

#!/bin/sh
case $# in
0|1)
       echo "Usage: $0 url message" >&2
       exit 1
       ;;
esac
url="$1"
shift
wget -q --spider "$url" && echo "$@"


As you can see, this runs wget in spider mode on a specified URL, which produces no output, but if the URL exists it prints the message. Put this in crontab and the message will be mailed to you.

This script depends on knowing a URL that will exist when the release happens. Often you can guess the URL from previous releases. Here are a couple of examples, each is all on a single line:

1. Check at 0808 every day to see if AntiX 17 has been released:

8 8 * * * watchurl https://sourceforge.net/projects/antix-linux/files/Final/antiX-17/ AntiX 17 released

2. Check at 0809 every day to see if VirtualBox 5.1.30 has been released:

9 8 * * * watchurl https://www.virtualbox.org/download/hashes/5.1.30/MD5SUMS VirtualBox 5.1.30 has been released


Rebuild kernel modules for VirtualBox on Linux automatically

I prefer to install packages distributed by VirtualBox rather than packages built by the distro maintainers as I don't have to wait for a new package when a new kernel is released. Unfortunately this involves rebuilding the vbox kernel modules after the new kernel is booted. So I decided to devise a way to automate this.

First I wrote a script to check if the vbox modules are loaded and if not to run the setup script. It's just this:

#!/bin/sh
sleep 120
if ! lsmod | grep -q vbox
then
       /usr/lib/virtualbox/vboxdrv.sh setup
fi


The sleep is needed because in some init systems cron comes up before all the modules are loaded. Next, I installed a crontab entry that runs the script above once at bootup, using the @reboot notation:


@reboot /root/bin/vboxdrv

This goes into root's crontab, using crontab -e.

And voilĂ , that's all that's needed.

Friday, 11 August 2017

Fix for format problem importing contacts from Yahoo into Outlook via CSV format

After some unpleasantness with Yahoo Mail mistakenly labelling my messages as spam and blocking mail to friends addressed in the BCC field, which forced me to change my password, I decided to phase out my Yahoo mail account and use Outlook instead for sending out tidbits of interest. So I needed to export my contacts from Yahoo to Outlook.

I followed instructions found on the Internet for transferring contacts from Yahoo to Outlook, but Outlook kept saying that the imported CSV file was not in the required format.

So I decided to look at a contact export file from Outlook to see what the difference might be. The first thing I noticed was that it had the Byte Order Mark, U+FEFF, at the beginning.

So I processed the CSV file with this Linux pipeline; the unix2dos changes NL line endings to CR-NL for good measure:

unix2dos < yahoo_contacts.csv | uconv -f ascii -t utf-8 --add-signature > y.csv

uconv is from the ICU package. After that I imported y.csv with no problems. Success!
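If uconv isn't to hand, the same transformation can be sketched with plain shell tools (the function name is my own; this assumes the input is already plain ASCII or UTF-8):

```shell
# Prepend the UTF-8 byte order mark (bytes EF BB BF) and convert
# LF line endings to CR-LF, mimicking unix2dos, without uconv.
add_bom_crlf() {
    printf '\357\273\277'               # UTF-8 BOM, as octal escapes
    awk '{ printf "%s\r\n", $0 }' "$1"  # LF -> CR-LF on every line
}
# Example: add_bom_crlf yahoo_contacts.csv > y.csv
```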

Friday, 12 May 2017

A variety of mail address change protocols

I'm currently in the process of phasing out one email address and going through accounts I have on websites. In the process I've encountered a lot of variations on protocol and praxis. I'll start with the most secure examples:
  • Confirmation required on both the old and new addresses.
  • Confirmation required on the new address and notification to the old one, or vice versa.
  • Can just edit and save, perhaps with a notification afterwards. This is bad: anybody who gains access to the account can change it. And if you make a typo, you're locked out.
  • No way to edit, have to ask support to change it. Surprisingly some large sites require this. One asked me to create a new account, then they would migrate the history to it.
  • There is a user-initiated process, but there is a glitch, such as the contact email being changed while the login ID remains the old email. Have to contact support again.
  • The user-initiated process doesn't work at all. Have to contact support.
And a lot of websites have no way to delete the account. All you can do is hope that your password is unique and securely hashed, and that their database doesn't get stolen some day.