grub Error 2: Bad file or directory type
has been annoying me for a
few months, as of end of 2008. I often copy a whole linux-based (usually Gentoo)
system to another computer, on which the receiving partition will have a
filesystem made by the mkfs of some Gentoo or Knoppix boot-cd; I then chroot
into the copied system from the boot-cd to run whatever version of grub is in
that copied system, for installing it to the MBR. The above error has come
up sometimes, only with ext3 roots. I sometimes just switched to jfs to avoid
the problem, but now that I've looked around a bit on the web I see that
recent mkfs.ext3 has a larger default inode size (256B) than before, which
upsets grub. One quick solution: use
mkfs.ext3 -I 128 /dev/DISKNAME
to keep to the old default inode size. I'd hope that another solution is to get a recent grub and perhaps some other updated libraries, but this one is quicker for me to do...
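To check whether an existing ext3 filesystem has the troublesome inode size (same placeholder device name as above):
tune2fs -l /dev/DISKNAME | grep 'Inode size'
A reported size of 256 is the newer default that old grub stumbles on; 128 is the safe older value.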
config/hal: couldn't initialise context: (null) ((null)
Recent xorg seems to prefer `hal' (hardware abstraction layer)
for access to these devices. Any of the following should help:
compile without support for hal,
make sure that hald is present and running correctly to make
keyboard/mouse available,
or add the following to the xorg.conf file (add to
the existing "ServerFlags" section if there is one).
Section "ServerFlags" Option "AutoAddDevices" "false" Option "AllowEmptyInput" "false" EndSection
On a (Gentoo) vmware-player installation, the following error:
$ vmplayer /space/vms/
/opt/vmware/player/lib/vmware/bin/vmware-modconfig-console: unrecognized option '--icon=vmware-player'
Must use a valid mode. Use one of:
   --get-kernel-headers
   --get-gcc
   --validate-kernel-headers
   [etc. etc.]
The error was first tackled by removing the appropriate checks from the vmplayer executable script, since one of those checks was actually producing it. But it persisted. Checking the DISPLAY environment variable showed it to be empty (I run proprietary programs [often with bad manners about where they write their files] as another user from my main session). Setting DISPLAY correctly removed the problem. No useful site was turned up by Google on this problem.
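For the record, the sort of thing needed when running such a program as a second local user (the user name and vm path are just examples):
xhost +si:localuser:vmuser     # in the main session, allow the other local user onto the display; 'vmuser' is an example name
su - vmuser
export DISPLAY=:0
vmplayer /space/vms/somevm.vmx   # example vm path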
Note: in general, an Xorg resolution problem, especially if apparently happening `for no reason' (no config changes, updates, different monitor, etc.), may be due to a poor connection of the monitor plug, preventing the monitor from sending the information about its capabilities (EDID) to the graphics card (the Xorg logs are a good check of this). A more specific problem follows:
Not using default mode "1600x1200" (exceeds panel dimensions).
This was rather puzzling until searching for it as a bug. The resulting maximum of 1280x1024 seems a common result of the `nv' driver (open-source driver for nvidia cards) being used with a monitor on its digital output. By connecting the monitor to the analogue output, the 1600x1200 asked for in xorg.conf became available. It seems the driver gets the wrong idea about what the monitor claims about itself when on the DVI connection.
* ERROR: app-office/libreoffice-3.4.5.2 failed (pretend phase):
*   Build requirements not met!
*
* Call stack:
*   ebuild.sh, line 75:  Called pkg_pretend
*   libreoffice-3.4.5.2.ebuild, line 211:  Called check-reqs_pkg_pretend
*   check-reqs.eclass, line 105:  Called check-reqs_pkg_setup
*   check-reqs.eclass, line 96:  Called check-reqs_output
*   check-reqs.eclass, line 237:  Called die
* The specific snippet of code:
*   [[ ${EBUILD_PHASE} == "pretend" && -z ${I_KNOW_WHAT_I_AM_DOING} ]] && \
*       die "Build requirements not met!"
*
How good that it makes clear that one can easily override the (failed) pre-checks by setting I_KNOW_WHAT_I_AM_DOING. Compare this to proprietary software, or even to some other Free-software installers, which would simply exit without any options if believing there to be a problem.
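Judging from the quoted snippet, any non-empty value does it, e.g.:
I_KNOW_WHAT_I_AM_DOING=1 emerge libreoffice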
(Quick answer: use ext4, but it's still not at all good.)
On changing a set of 4 320GB disks in (linux md) raid5, to 3 1TB disks,
giving about a 1.8TiB ext3 filesystem, a problem was immediately apparent:
sometimes a mkdir would take an annoyingly long time, even into tens of
seconds! The disks were new, and so was the filesystem on them, which
had even been given options ( -E stride=16,stripe-width=32
)
supposedly optimal for the raid array. All the other parts of the
computer were as before. This was tried with several kernels, from 2.6.29
through to 2.6.32, with and without gentoo patches.
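For reference, the filesystem-creation command was along these lines (the device name is assumed; the stride/stripe-width figures fit a 64KiB md chunk with 4KiB blocks and two data disks in the 3-disk raid5):
mkfs.ext3 -E stride=16,stripe-width=32 /dev/md0    # /dev/md0 is an assumed name
and presumably the same -E options would be passed to mkfs.ext4 for the later rebuild.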
Primitive tests suggest the large size to be important. Some confirmation of this, and of others having the problem of long delays, has been found on the web. The number of options (commit=1 or commit=60, writeback, directory hashing, simultaneous long read/write, raid stripe sizes, etc.) and the tediously time-consuming rebuilding of filesystems and testing precluded serious study -- only half a day.... Judging from Documentation/filesystems/ext4.txt under the kernel sourcecode, much of the point of ext4 is making things work better with big filesystems. This was tried, and worked much better. I've not been keeping up to date: it must be years since I glanced at it first and thought I'd wait a while for the end of `experimental'.
Of course, one might ask why not switch from ext3 to reiser or xfs or jfs, none of which had the same problem. The reason is that the data is important. Experience with all of these filesystems has been that the effect of disk corruption, power failure or kernel problems is worse than with the `native' ext[234] filesystems: XFS had its trick of replacing file contents with zeros, reiserfs would be hopelessly lost a few years ago if trying to mount without basic /dev files (like null) at least in a gentoo system [ext3 wasn't], reiser has more easily than ext3 lost all track of data in the event of read-failures on a disk ... can't think of any dirt on jfs, I confess, but I haven't much experience with it. I see that opinions differ on the relative data-protecting merits of different filesystems, but in the absence of a detailed study I go by my experience rather than hearsay. (I'd love to hear of such a study, by the way...)
It's distressing how often I see people struggling to remove one of the traditional viruses (in the generic sense: worm is more correct to the case considered here) from their ms-windows computers, but meeting the usual trouble: scan (minutes on end); find; ask for 'remove it'; be told it's ok; reboot, and find it's all back again.
This is of course the way that the authors of malicious programs want it: the running program ensures that any attempt at removing its copy on disk is met with immediate reinstatement, and attempts at changing registry settings or antivirus installations may also be prevented. What is surprising is how many so-called antivirus programs waste so much time pretending to help, when often they don't actually help at all. It's appalling what time gets wasted by the users, who should not even have continued after the first failure.
Often, the removal is very easy, as long as the disk can be accessed from another system that isn't running the virus. Look up the virus (see the name in results from a scan by an antivirus program, or search the web on the problems or error-messages being encountered). Find what its critical files are; sometimes it's as simple as a single additional file (a true virus would be nastier, in having modified an existing program; depending on the importance of that program you may still care to remove it).
Start the computer from a 'live cd' (or usb key, or take out
the hard-disk and put it as an extra disk in another system).
The other system (live cd etc) must have read-write support
for whatever filesystem the virus is on.
One good choice is a linux-based
knoppix cd (Free); it
might have a point-and-click way to mount the necessary
disk-partition, but otherwise something like
mkdir /d ; mount -t ntfs-3g -o rw /dev/sda1 /d
will do the job (sda1 is assumed to be the windows partition;
perhaps it will actually be sda2 or hda1 etc -- look using
fdisk -l /dev/sda
).
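Once mounted, the actual removal is typically no more than an rm of the identified files followed by a clean unmount (the file path below is purely illustrative -- use whatever the virus description names):
rm /d/WINDOWS/system32/nastyworm.exe    # example name only
umount /d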
A good choice for people who like ms-windows and nothing else
might be
bartpe (pebuilder),
which requires fiddling around to make a bootable windows
cd, which will then give an ms-windows interface.
Then simply remove those critical files, unmount the partition, and reboot. I've had great success with this in the last few years, helping people in minutes who'd wasted whole days fiddling with the usual tools. An absurd situation, but that's how it is!
Trying to run PSCAD/EMTDC, with EGCS Fortran-compiler, in an m$ windows
system of vista or 7 vintage, this error arises when using "example"
models: the models are under "c/program files/..." so are protected
from user modification. Apparently the compiler is asked to dump things
there too, which it can't if run as a normal user. The solution is
to copy the examples to one's own directory, or ask for pscad to be
fixed.
Much as I love to knock m$, this choice is the right one: it's
absurd that one could ever change things under installation roots
as a normal user: on multiuser machines it's likely to cause real
problems, and even on single-user machines it promotes careless
distribution of user-modified and system files, making backups much
harder. Pity this wasn't the case from ages back, as typical in unixes.
From about 2003 to 2009 I formed the strong impression, from some tens of disks, that Hitachi was no bad choice. During that time, almost every hard-worked Maxtor died or at least got very slow and had plenty of read errors reported by 'smart'; two Samsungs in a row broke completely and suddenly on power-loss to a computer (the second was a replacement under warranty); nothing too bad was seen for Seagate or WD, but as the Hitachis had been so good -- outnumbering the Maxtors yet having not a single trouble -- these eventually won favour.
Last summer, 2009, things started to change: a 500GB Hitachi failed (very annoying and remotely, albeit fairly slowly) within months of new, a 1TB Hitachi in an array of three was in a few months slow enough to slow the array acutely, and its replacement was also showing oodles of Raw_Read_Errors within the time taken to build it into the array, although the other two in the array, and two slightly older ones in another computer, had zero such errors after several months. One should always consider chance with these small numbers of disks, but it's not such a small number as to be worth continuing when there are other options: it begins to seem that at least a certain age of Hitachi disks has rather a `bathtub' curve of reliability, such that some have a rapid decline and possibly failure, even when very new, and those that don't may well live long.
For the latest array, WD 'Green' 1TB disks have been used. After building and rebuilding an array a few times (testing things) there's no evidence of slowing and no reported errors: that's better than the bad ones of the Hitachis.
Some recent (2009/2010) disks with 4KB blocks risk the problem that the kernel doesn't properly report the blocksize, and fdisk either believes this or has a `normal' 512B size hardwired: if partitions then start at points within the actual block-size, rather than at starts of blocks, performance can be reduced -- search for lots of results on the web! By using the `u' command in fdisk before choosing boundaries, then choosing all starting points to be multiples of 8 (for a 4KB actual block-size, 8 times the 512B), the problem is avoided. Some tens of percent difference were seen for ext3 writes in my example, between a first block of 63 (wrong) or 64 (right).
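A quick way to check an existing disk is to list in sectors (older fdisk syntax; newer partitioning tools tend to align to 1MiB by default anyway):
fdisk -lu /dev/sda
A partition whose start sector is divisible by 8 begins on a 4KiB boundary: a start of 63 is misaligned, 64 (or 2048) is fine.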
Having been forced (proprietary drivers in the lab) to go back from
current 2.6.3x kernels to pre-2.6.20 (in fact, 2.6.16), my NFS v3 mount
of working files started to be read-only, although the client reported
(using ls and friends) that the files were owned by the current user
and were rw for user. Nothing at the server end (RHEL5.2) helped, and
anyway that server was running fine with lots of other more modern
clients. Trying NFS v4 didn't help, nor did turning on all related
services, changing the username--uid mapping to match the server
(only the uid matched before), or making sure that only the one
line in the server's exports applied to this export (i.e. that there
wasn't the broader export line exporting the whole /srv
partition
to all, but read-only). Going back to an old nfs-utils might have
been good, since this one was compiled with the rest of the system
for a new kernel. Trying a work-around of using CIFS (samba with
extensions) led to some annoyances with file-times and speed. In the
end, NFS v2 was used, successfully -- nfsvers=2
in the options for
the mount in /etc/fstab
was all that was needed. For this use, with
editing a few smallish text files, the NFS v2 limitations aren't a
problem. Perhaps the following is the reason for the success, that
it sticks to a simple check of uid, rather than something that
apparently messes up.
``Version 2 clients interpret a file's mode bits themselves to determine whether a user has access to a file.
Version 3 clients can use a new operation (called ACCESS) to ask the server to decide access rights.''
(http://nfs.sourceforge.net/)
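For completeness, the sort of /etc/fstab line that did the job:
# server name and paths here are placeholders
fileserver:/srv/work   /mnt/work   nfs   rw,nfsvers=2   0 0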
Since September 2009, several copies of a newly compiled amd64 Gentoo system (at home, at family home, at work) have had Xorg using more and more memory, until needing to be killed from its several GB after a few days of intensive working. I'd never seen such behaviour before, having been used to work-computers running Gentoo for a whole year without ever even logging out!
I suspected Xorg itself, or certain drivers: but nv or nvidia, and Xorg 1.6 or 1.7, all did the same. And strange that no other user noticed it: it couldn't /just/ be that I open and close more things than they do. It only happened on the main 64-bit systems, not on 32-bit laptops or lab computers.
The very useful program found from a webpage about memory-hungry X is
xrestop
(X resources top -- a moving list, like top, htop, iotop, apachetop). Running xrestop immediately pointed out that Kompose was the problem. Kompose is a sort of desktop switcher that tries to remember what each desktop looked like. I'm the only one here who uses it ... and actually, I never use it but just think it's rather nice to have. So, a problem simply but slowly solved: don't use it. Another suggestion from a website was disabling Xinerama, by adding
Option "Xinerama" "false"
in the ServerFlags
section of xorg.conf. Not necessary in my case.
The TeXmacs wysiwyg `mathematical word-processor' allows one to
insert a session with an external program, where the commands are
remembered and the responses are neatly formatted. This is a good
way to view the normally ascii-based output of Axiom, making long
expressions much more easily read. TeXmacs will include Axiom in the
list iff it finds the command AXIOMsys
available on its path.
Sometimes only a link to axiom
is available from an
on-path directory (e.g. gentoo of 2009) in which case a further link
or adding the axiom bin/ directory to the path will get TeXmacs to
provide Axiom sessions on its next start.
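A sketch of the two fixes, substituting the real location of the axiom installation for the placeholder path:
ln -s /path/to/axiom/bin/AXIOMsys /usr/local/bin/AXIOMsys    # placeholder path
or
export PATH="$PATH:/path/to/axiom/bin"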
The list of necessary updates on the RHN (RedHat Network -- management)
web-interface grew longer and longer, and attempts at selecting all for
immediate update came to nothing. The command yum check-update
returned an empty list as though nothing needed doing.
The solution was this:
yum clean all
after which yum started realising the true situation.
Some other entries on this page use chroots for dealing with compiling certain modules or programs. Another use for chroots is when compiling or trying out another system (of course, the kernel is the same inside and outside the chroot, but all the libraries and applications aren't).
I upgrade my Gentoo systems (used for almost any computer other than
servers managed in collaboration with other people [RedHat] or on
certain laptops [FreeBSD]) perhaps once in two years; in the interim
it's just occasional security updates from glsa-check --list
.
For an upgrade I simply compile from scratch, taking the latest
baselayout: Gentoo changes so quickly it's not worth fooling about
trying to do an update after two years, or to keep updating things
in between and suffering library breakages and subsequent rebuilding
of large numbers of programs, update of configuration files, and so on.
Compilation takes some days for my huge list of things to install,
particularly with all the likely problems of circular dependencies
when including USE flags such as doc
(which are worked
around by building without documentation, then rebuilding with it
once the required programs are in place).
The compilation is done in a chroot, to allow the existing
system to keep running until the new one is complete: for example,
mkdir /NEW
cd /NEW
tar -xjf /tmp/stage3-tarball-xxxxxx.tar.bz2
rm -rf dev ; mkdir dev
mount --bind /dev dev ; mount --bind /proc proc
mount --bind /sys sys ; mount --bind /usr/portage usr/portage
cp /etc/resolv.conf etc/
chroot . /bin/bash
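Inside the chroot, the usual Gentoo first steps (as in the handbook) apply before starting to emerge things:
env-update && source /etc/profile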
Recently there's been cause to try running the full system graphically from a chroot, preferably on a further X display (e.g. Ctrl-Alt-F12) while still having the existing system on its normal X display. This was because KDE3 has been removed from Gentoo, and KDE4 was (correctly) felt likely to be fraught with regressions of functionality such as to make an upgraded system pointless ... it is fortunate that it was thus tested, to establish the importance of keeping the old system running for a few more years.
The cheat method used to get both systems running graphically was
to set the real system not to listen for XDMCP from kdm, but
the new chrooted one to listen; the chrooted one was started with
/etc/init.d/xdm start
.
Then, the real system's display manager could be set to broadcast for
remote logins, causing it to find a host of its own name,
which was then selected.
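The XDMCP switch itself lives in each system's kdmrc (the path varies between KDE versions; for KDE3 it is somewhere under /usr/kde/*/share/config/kdm/), with lines along these lines in the chrooted system only:
# in the chrooted system's kdmrc; leave the real system's Enable at its default (false)
[Xdmcp]
Enable=true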
There was a problem: xterms (konsole etc.) didn't open: no prompt
came up. Suspecting some sort of pty trouble, a google search was
made which turned up
this,
going into further uses of chroots and pointing out the need to bind-mount not only /dev but also /dev/pts (and possibly /dev/shm too):
mount --bind /dev/pts dev/pts
All worked easily then.
Using realVNC and tightVNC for computation servers has been a problem for those users who run applications with graphics acceleration; back in 2007 some technical FEM applications had to be worked around to force them to use a software (mesa) opengl implementation. Rather obvious `jpeg-style' noise around edges of fonts was also a highly-noticeable annoyance.
Just now, in 2010, I've `discovered' TigerVNC and have updated all our servers and my own fleet to have this instead. Its X-server (Xvnc) includes acceleration extensions, so applications work happily and acceptably fast even for 3D work. It can be built based on the same Xorg release as the computer's physical X-server (if it has one...) with the advantage of similar appearance of default fonts. The quality of the image is, for whatever reason, higher, with scarcely-detectable noise around edges.
(Note that at least as of May 2010, a fully-functional tigervnc installation in gentoo requires editing the ebuild file as described in bug 308465, to cause the Xorg build within the tigerVNC build to use 'glx-tls'.)
Sometimes a change in graphics card or driver makes all the fonts on
the screen get so much bigger or smaller that changes of several 'points'
are needed to make them acceptable. I have seen it written that this is
particularly common with nvidia hardware, but that the solution (other
than adjusting all font-sizes) applies to all cases. In the Monitor section
of xorg.conf (probably /etc/X11/xorg.conf), the Option "DPI"
line can force the DPI:
Section "Monitor" ...... Option "UseEdidDPI" "false" Option "DPI" "96 x 96" EndSectionThe "UseEdidDPI" turned out not to be necessary, but perhaps it would be in some cases. The 96x96 is pretty common.
My installations of texlive-2008 (in gentoo) were done without the
xetex
option. A user of one of these systems wanted to
compile a CV template from
here, which relies on xetex.
XeTeX permits direct use of TrueType and OpenType fonts. It was
easily installed, but the example wouldn't compile:
$ xelatex cv_template_xetex_gentium.tex
.....
(/usr/share/texmf-dist/tex/xelatex/xetexconfig/geometry.cfg))
kpathsea: Invalid fontname `Gentium Basic', contains ' '
! Font \zf@basefont="Gentium Basic" at 10.0pt not loadable: Metric (TFM) file o
r installed font not found.
The sourcecode of kpathsea makes clear that this is a fatal error in itself. Judging from a web-search this error doesn't mean it was wrong to have a space in the font-name, but very likely just that the font isn't there, so the search has come to places where it would be wrong... Gentoo's
sil-gentium
font package was installed,
but was found not to contain the 'Basic' set, so both lots were got
from Gentium download, along with another ttf file (monaco) needed by
the template.
Then how to install the fonts so that xelatex would find them?
Fortunately, it does follow the general font/desktop conventions
rather than having its own special index: simply copying all the ttf
files into ~/.fonts/
made it all work. xelatex produced
pretty pdf output.
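In shell terms, all it took was something like this (the source path for the ttf files is just a placeholder; the fc-cache run is probably not even needed, but is harmless):
mkdir -p ~/.fonts
cp /path/to/downloaded/*.ttf ~/.fonts/    # placeholder path
fc-cache -f ~/.fonts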
The simplest, best thing I've found is
bibutils,
which provides a set of simple commands, using some XML format as an
intermediary: for example, ris2xml, xml2bib, xml2ris
.
The annoyance that prompted this: something called RIS, which several journals, all on the same day, wanted me to accept instead of bibtex for downloading citations. Clearly I'd chosen a bad subject that day.
The first couple were manually edited. Then it seemed so frequent
a problem I went to the Web and found a perlscript that basically
didn't work (too many assumptions about the input field-order etc).
Then came cb2bib, which was GUI-based, required lots of dependencies, and wasn't really the quick one-shot command I sought.
Finally, bibutils was found (via cb2bib!), so my final ris2bib
command is just
#!/bin/sh
ris2xml "$@" | xml2bib -b -nb
with apparently good robustness.
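Usage is then simply:
ris2bib downloaded_citations.ris >> references.bib    # filenames are examples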
All that's needed is to add an argument ARCH=i386
after
the `make
' command, when running such commands as
make ARCH=i386 menuconfig
, make ARCH=i386
,
make ARCH=i386 install
, etc. Without this option,
the presence of a 64bit running kernel causes the kernel build to
assume that only 64bit targets should be considered. My need for
forcing this choice was that I was preparing a (copied) 32bit system's
kernel within a chroot running from a 64bit livecd; it should have
been ok to have had a 64bit kernel in my new system, but the 32bit
chroot had no compiler capable of 64bit output, so the kernel's
autodetection of 64bit caused compilation to fail.
Following the nvidia-related points above, here's another point continuing the line of huge wastes of time due to hardware manufacturers persisting in insulting their customers with proprietary drivers. NI (National Instruments) makes various control/measurement/instrumentation hardware. It therefore makes use of kernel modules to communicate with specialist PCI, USB etc. hardware. These, unfortunately, are proprietary. They work competently with a few `approved' distributions, but attempting to use other systems, or newer systems (the release cycle of the drivers is infrequent) generally gives errors, even if the installer sometimes (wrongly) reports a glowing success after the shower of error messages.
One way to have a better chance of the drivers working is to go (back) to a kernel version very close to those in the supported distributions. I've tried this by going back to linuxes 2.6.16 or 2.6.18 (to match RedHat EL5) instead of the natural choice of 2.6.30 (as of August 2009) for my recent Gentoo system.
The first trouble came about ten lines into
compiling this older kernel with my gcc-4.3.2:
error: ‘PATH_MAX’ undeclared
. By mounting the kernel
source within the (alternative-boot) RedHat system, the kernel compiled
happily (`make') with gcc-4.1.2.
I then ran make install && make modules_install
from
within the kernel source-directory when running the Gentoo system.
At later times during the experimentation, I instead mounted
the relevant parts of the Gentoo system (/boot /lib/modules
)
within the RedHat chroot, so as to allow the compilation and
installation of kernel and modules all in one go.
The next problem came when running the installation for the NI
drivers (GPIB) or later when running updateNIDrivers
to compile the wrappers for the kernel modules. First some check
of kernel version and kernel source version failed: it was claimed
they didn't match, though they clearly must as the kernel came
from the sources. A first step was to follow advice to make it happy:
cd /lib/modules/$(uname -r)/source/include/asm/
ln -s asm-offsets.h asm_offsets.h
cd /lib/modules/$(uname -r)/source/include/linux/
cat utsrelease.h >> version.h
This removed the nonsense-error about versions, but left (understandably) the warning that different compiler versions had been used for the kernel and for the new modules. The intended Gentoo kernel was running, and the RedHat system was mounted under
/sl
(Scientific
Linux). By some bind mounts, the relevant parts of the Gentoo system
replaced the RedHat ones, allowing a chroot into the RedHat system
to compile the modules with the RedHat compiler:
list="dev proc sys usr/local lib/modules boot usr/src usr/include" for t in $list; do mount --bind /$t /sl/$t; done # where /sl is RedHat mount chroot /sl /bin/bash cd /usr/src/linux make && make install && make modules_install updateNIDrivers for t in $list; do umount /sl/$t; done
The final trouble was my fault, for not considering that
/usr/local/lib
was hitherto unused, and wasn't included
by the linker. In matlab (instrument control) the following error came up:
??? Error using ==> visa.visa at 242
Invalid RSRCNAME specified. Type 'instrhelp visa' for more information.
Error in ==> icm_connection>icm_connection_open at 82
ih.icm = visa('ni', 'GPIB0::1::INSTR');
This RSRCNAME error turns out to be nothing to do with the
'GPIB0::1::INSTR'
actually being wrong, or with the specification having changed, but just
that there's some problem with the drivers. In my case, this turned
out to be that the loader path hadn't yet been updated to include
the new NI files that had been copied into /usr/local/lib
:
running env-update (a Gentoo-specific wrapper for ldconfig and other things)
then restarting matlab from a new shell, was enough to get GPIB working.
So now it's running in recent Gentoo (2.6.31-kernel age) on a 2.6.18
kernel (full of holes) that was compiled within a RHEL-5 system on its
gcc-4.1.2 compiler. If only NI would get drivers into the kernel.
On any spell-checking attempt in Kile or Kate, an error came up (mentioning only /usr/bin/).
This seems to be another strangeness of KDE and perhaps of old config
files. The working solution came from:
http://wisconsinloco.ubuntuforums.org/showthread.php?t=600772 ,
which suggested editing ~/.kde/share/config/kdeglobals
to change the value of KSpell_Client from 0 to 1, so that the KSpell
section starts as:
[KSpell]
KSpell_Client=1
....
It was clear that kcontrol had written the later options into the config correctly, but that this 0 (surmised in the above link to indicate the ispell program) was preventing attempts at using aspell. Everything went fine after a restart of kile or kate.
Page started: 2009-01-14
Last change: 2012-04-11