Developers’ Weblog



A coworker and I debugged a fascinating problem today.

They had a tomcat7 installation with a couple of webapps, and one of the bundled libraries was logging in German. Everything else was logging in English (the webapps themselves, and the things the other bundled libraries did).

We searched around a bit, and eventually found that the wrongly-logging library (something jaxb/jax-ws) was using, after unravelling another few layers of “library bundling another library as convenience copy” (gah, Java!), which contains quite a few com.sun.istack.localization.Localizable members. Looking at the other classes in that package, in particular Localizer, showed that it defaults to the java.util.Locale.getDefault() value for the language.

Which is set from the environment.

Looking at /proc/pid-of-JVM-running-tomcat7/environ showed nothing, “of course”. The system locale was, properly, set to English. (We mostly use en_GB.UTF-8 for better paper sizes and the metric system (unless the person requesting the machine, or the admin creating it, still likes the system to speak German *shudder*), but that one still had en_US.UTF-8.)
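For anyone retracing this step, here is a small sketch of pulling the locale-related variables out of a running process's start-up environment. The function name `show_locale_env` is made up for this example, and it assumes a Linux-style /proc. Note that it can only show what the process was started with, not anything changed later from inside the JVM.

```shell
# Hypothetical helper: print the locale-related variables a process
# was started with. /proc/PID/environ is NUL-separated, so turn the
# separators into newlines before filtering.
show_locale_env() {
	tr '\000' '\n' <"/proc/$1/environ" |
	    grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)='
}

# usage (the PID is a placeholder):
#   show_locale_env 12345
```

grep exits non-zero when no such variable is set, which is a result in itself.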

Browsing the documentation for java.util.Locale proved more fruitful: it also contains a setDefault method, which sets the new “default” locale… JVM-wide.

Turns out another of the webapps used that for some sort of internal localisation. Clearly, the containment of tomcat7 is incomplete in this case.

Documenting for the larger ’net, in case someone else runs into this. It’s not as if things like this would be showing up in the USA, where the majority of development appears to happen.

OK, time to clean up ↳ tarent so people can work again tomorrow.

Not much to clean though (the participants were nice and cleaned up after themselves ☺), so it’s mostly putting stuff back to where it belongs. Oh, and drinking more of the cool Belgian beer Geert (Linux upstream) brought ☻

We were productive, reporting and fixing kernel bugs, fixing hardware, swapping and partitioning discs, upgrading software, getting buildds (mostly Amiga) back to work, trying X11 (kdrive) on a bare metal Atari Falcon (and finding a window manager that works with it), etc. – I hope someone else writes a report; for now we have a photo and a screenshot (made with trusty xwd). Watch the debian-68k mailing list archives for things to come.

I think that, issues with electric cars aside, everyone liked the food places too ;-)

As I said, I did not let certain events that begin with “lea” and end with “ing” prevent me from organising a Debian/m68k hack weekend. Well, that weekend is now.

I’m too unorganised, and I spent too much time in the last few evenings to organise things so I built up a sleep deficit already ☹ and the feedback was slow. (But so are the computers.) And someone I’d have loved to come was hurt and can’t come.

On the plus side, several people I’ve long wanted to meet IRL are coming, either already today or tomorrow. I hope we all will have a lot of fun.

Legal disclaimer: “Debian/m68k” is a port of Debian™ to m68k. It used to be official, but now isn’t. It belongs to, which may run on DSA hardware, but is not acknowledged by Debian at large, unfortunately. Debian is a registered trademark owned by Software in the Public Interest, Inc.

If you’re a Unix person instead of e.g. a Microsoft® Windows® person, you’ve probably been annoyed by Iceweasel (or Mozilla™ Firefox®) creating a ~/Desktop directory, among others (things like ~/Downloads).

Here’s a quick fix I found somewhere in the ’net:

mkdir -p -m0700 ~/.config
cat >~/.config/user-dirs.dirs <<'EOF'

Upon next start, Iceweasel (and other XDG-compliant applications) will throw stuff into ~/ instead.
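The body of the here-document did not survive in the snippet above. As a minimal sketch of what such a file can look like, assuming the standard xdg-user-dirs key names and pointing only the two directories mentioned at the home directory itself (the wrapper function `write_user_dirs` is made up for this example, and other keys would follow the same pattern):

```shell
# Hypothetical helper: write a user-dirs.dirs into the given config
# directory. The two entries are an assumption based on the text, not
# the original file contents.
write_user_dirs() {
	mkdir -p -m0700 "$1" || return 1
	cat >"$1/user-dirs.dirs" <<'EOF'
XDG_DESKTOP_DIR="$HOME/"
XDG_DOWNLOAD_DIR="$HOME/"
EOF
}

# usage:
#   write_user_dirs ~/.config
```

The quoted 'EOF' keeps $HOME literal in the file; it is expanded by whoever reads the file later.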

WTF is Jessie; PA4 paper size

12.12.2014 by tg@
Tags: debian pcli rant

My personal APT repository now has a jessie suite – currently just a clone of the sid suite, but this way people can already get on the correct “upgrade channel”.

Besides that, the usual small updates to my metapackages, bugfixes, etc. – You might have noticed that it’s now at a (hopefully permanent) location. I’ve put a donated eee-pc from my father to good use and am now running a Debian system at home. (Fun, as I’m emeritus now, officially, and haven’t had one during my time as active uploading DD.) I’ve created a couple of cowbuilder chroots (pbuilderrc to achieve that included in the repo) and can build packages, though for i386 only (amd64 is still done on the x32 desktop at work); more importantly, I can build, sign and publish the repo, so it may grow. (popcon data is interesting. More than double the number of machines I have installed that stuff on.)

Update: I’ve started writing a NEWS file and cobbled together an RSS 2.0 feed from that… still plaintext content, but at least signalling in feedreaders upon updates.

Installing gimp and inkscape, I’m asked for a default paper size by libpaper1. PA4 is still not an option, I wonder why. I also haven’t managed to get MirPorts GNU groff and Artifex Ghostscript to use that paper size, so the various PDF manpages I produce are still using DIN ISO A4, rendering e.g. Mexicans unable to print them. Help welcome.

Note, for arngc, you need a server component (MirBSD-current, of course; we’re rolling release nowadays). Config included, but I’m willing to open my firewall to people I know, provided they won’t use “too much” traffic (running a couple of arngc instances is fine, according to what I estimated).

A largish article about how to use some other packages in the repo, such as dash-mksh, is yet to come. In the meantime, I wrote a bit more in README.Debian in mirabilos-support.

A surprise to see my box booting up with the default GRUB 2.x menu, followed by “cannot find a working init”.

What happened?

Well, grub:i386 and grub:x32 are distinct packages, so APT helpfully decided to purge the GRUB config. OK. Manual boot menu entry editing later, re-adding “GRUB_DISABLE_SUBMENU=y” and “GRUB_CMDLINE_LINUX="syscall.x32=y"” to /etc/default/grub, removing “quiet” again from GRUB_CMDLINE_LINUX_DEFAULT, and uncommenting “GRUB_TERMINAL=console”… and don’t forget to “sudo update-grub”. There. This should work.
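Put together, the relevant part of /etc/default/grub would look roughly like the fragment below. Three of the settings are exactly the ones named above; the empty GRUB_CMDLINE_LINUX_DEFAULT is an assumption (the stock value minus “quiet”), so check what your file had before.

```shell
# /etc/default/grub (fragment; the file uses shell syntax)
GRUB_CMDLINE_LINUX_DEFAULT=""        # assumption: stock value was just "quiet"
GRUB_CMDLINE_LINUX="syscall.x32=y"   # enable the x32 syscall ABI
GRUB_DISABLE_SUBMENU=y               # flat menu, no "Advanced options" submenu
GRUB_TERMINAL=console                # text console instead of graphical terminal
```

The change only takes effect after running “sudo update-grub”.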

On the plus side, nvidia-driver:i386 seems to work… but not with boinc-client:x32 (why, again? I swear, its GPU detection has been driving me nuts on >¾ of all systems I installed it on, already!).

On the minus side, I now have to figure out why…

tglase@tglase:~ $ sudo ifup -v tap1
Configuring interface tap1=tap1 (inet)
run-parts --exit-on-error --verbose /etc/network/if-pre-up.d
run-parts: executing /etc/network/if-pre-up.d/bridge
run-parts: executing /etc/network/if-pre-up.d/ethtool
ip addr add broadcast   peer  dev tap1 label tap1
Cannot find device "tap1"
Failed to bring up tap1.

… this happens. This used to work before the cktN kernels.

Bernhard’s article on Plänet Debian about the “colon” command in the shell could use a clarification and a security-relevant correction.

There is, indeed, no difference between the : and true built-in commands.

Stéphane Chazelas points out that writing : ${VARNAME:=default} is bad, while : "${VARNAME:=default}" is correct. Reason: the unquoted expansion is subject to globbing, so someone could preset $VARNAME with, for example, /*/*/*/*/../../../../*/*/*/*/../../../../*/*/*/* which will exhaust system resources during globbing.
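The difference is easy to see with a harmless pattern. The sketch below uses “set --” instead of the colon, purely so that the resulting words can be counted; the colon would perform the same expansion and then throw the result away. The scratch directory and variable names are only for illustration.

```shell
# Work in a scratch directory with two known files, so the glob is harmless.
demo=$(mktemp -d) && cd "$demo" && touch a b

VARNAME='*'                     # "preset" by someone else

set -- ${VARNAME:=default}      # unquoted: the value is globbed
unquoted=$#                     # words: a b
set -- "${VARNAME:=default}"    # quoted: stays one literal word
quoted=$#

echo "unquoted: $unquoted word(s), quoted: $quoted word(s)"
```

With two files in the directory this reports two words for the unquoted form and one for the quoted form; the hostile pattern from above would instead make the shell walk large parts of the filesystem.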

Besides that, the article is good. Thanks Bernhard for posting it!

PS: I sometimes use the colon as comment leader in the last line of a script or function, because it, unlike the octothorpe, sets $? to 0, which can be useful.
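A tiny sketch of that trick, with two made-up functions that differ only in the kind of comment on their last line:

```shell
f() {
	false
	: this "comment" is a command and resets the exit status
}

g() {
	false
	# a real comment leaves the status of false in place
}

f && echo "f succeeded"
g || echo "g failed with status $?"
```

Keep in mind that the words after the colon are still parsed and expanded by the shell, so avoid unbalanced quotes, command substitutions and the like in such a line.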

Update: As jilles pointed out in IRC, “colon” (‘:’) is a POSIX special built-in (most importantly, it keeps assignments), whereas “true” is a regular built-in utility.


03.12.2014 by tg@
Tags: geocache

If I measured my geocaches as “precisely” as the Munzees around here are placed, angry finders would be beating down my door…

And why on earth can I select DNF as a log type in the app, only to be told during sync that it isn’t possible? (NM works. Note should, too.)

All in all: better than Ingress (not hard…), but similarly power-hungry; more annoying than Geocaching (the app is also terribly slow). And: it’s all about the numbers, but in my case it simply shares the statistics with other GPS games…

Did you know that a junction (another road, or even just a field track) is marked by lying between roadside posts with orange instead of white reflectors?

(tg@ continuing…) No, I didn’t, but now that you mention it… thanks! Helpful! That the posts are marked like a colon on the left and like a vertical bar on the right, at least, I did already know. Yes, I can imagine it helps when navigating in snow. No, I heard neither this nor quite a few other things in driving school… in hindsight I feel cheated…

RNG for MirBSD and subprojects

29.11.2014 by tg@
Tags: plan

Feel free to ignore those semi-unsorted ramblings of mine, they are unfinished, not binding, notes of plans that may come if I ever learn 影分身の術 (Kage Bunshin no Jutsu) or bilocality…

We currently have arc4random(9) in the kernel and arc4random(3) in userspace. We also have the urandom(4) stuff, but nobody should really use it. OpenBSD simplified theirs, but lost functionality like arc4random_addrandom(3) during that. I complicated ours, to get e.g. arc4random_pushb_fast(3), and for using userspace as additional pools, but that grew complex too, and few applications really add to its state rather than just using it anyway.

My idea thus far is to begin with those applications. That would be mksh(1) and ntpd(8) only, AFAICT. On the basis of the recently published Spritz, an aRC4 successor with great sponge properties, I plan on creating s4random, which could serve their specific needs: an output state Spritz (like arc4random has); an input Spritz (which corresponds to the arc4random_roundhash) tweaked to have, every time Shuffle() is called by the absorption functions, four bytes sent to a BAFH state from Drip(); that 32-bit state is then used to randomly drop from the output state (in addition to a value from the output state itself like arc4random uses) for faster feedback (think state recovery attacks). The output state can then be seeded less often but in larger blocks, taking from the input state as well as arc4random(3) or sysctl(3) KERN_ARND or OpenBSD getentropy() or Linux getrandom() or /dev/urandom, with the usual pushback. It could also need only 16 bytes instead of 128/256 bytes from the kernel on such calls (possibly lowering the a4s_count equivalent for the first two trips). It would also need to work on lesser operating systems, so it can probably have a function to determine seed status (2 = third trip, kernel entropy; 1 = first or second trip, or Win32 CryptGenRandom; 0 = untrusted). Also consider skipping initialisation by hardcoding one at compile time, facilitated through Mirtoconf v2. (Also, reducing the maximum Squeeze() parameter to 64 before random dropping engages, instead of 256, makes sense. The BAFH state also needs feedback from the output state…)

Then, I could simplify MirBSD libc arc4random(3) as all other applications than those mentioned above (and maybe libcrypto, but that’s a special case anyway) don’t need this sort of fast feedback loop. I’ve not yet planned that part out. – Finally, the kernel may or may not adopt Spritz but I’ve got ideas wrt. that, faster feedback loops, less overhead for interrupt handlers, etc. as well. This can wait a bit, as Spritz is still very new, so I’d prefer to not lower the security level accidentally, but it can be prototyped for something eventually ending up in ntpd(8) where it has low impact, and mksh, where the MirJSON and Mirkev code will need it.

OpenSSL’s libcrypto is another case. Just using arc4random(3) now has effectively reduced its state size from about 8184 bit to about 1700 bit of aRC4 state while a Spritz state has about 1476‒1604 bit. Of course, it reads from the kernel, which doesn’t offer more anyway, and people say about security levels, but there’s still always EGD and, more importantly, ~/.rnd (or RANDFILE to be exact). So, an upscaling solution is needed, too, but I can construct one, similar to how arc4random_roundhash is comprised of 32 32-bit BAFH states with appropriate (but slow) mixing. But that’s specific to MirBSD anyway, and can take time.

Meh. Reminds me, I probably should add getentropy() before upgrading OpenSSH to a version doing the sandboxing. And let arc4random(3) use the new MAP_INHERIT_ZERO stuff; at least minherit(2) throws EINVAL as safe fallback but it still requires updating the kernel first. But then it has been there for months already.

d-i preseeding is not the answer

25.11.2014 by tg@
Tags: debian rant work

This post details what the d-i team currently shows as the only way.

It has several shortcomings and one missing documentation part.

Shortcoming: --purge is missing from the apt-get invocation. This leaves packages in “rc” state (requiring a manual dpkg --purge to completely remove them later, as they are then invisible to apt).
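If a system has already ended up with such leftovers, they can at least be found from the first column of the “dpkg -l” listing. A minimal sketch; the function name `list_rc_packages` is made up for this example:

```shell
# Hypothetical helper: read "dpkg -l" output on stdin and print the names
# of packages in state "rc" (removed, configuration files remaining).
list_rc_packages() {
	awk '$1 == "rc" { print $2 }'
}

# usage:
#   dpkg -l | list_rc_packages
#   dpkg -l | list_rc_packages | xargs -r sudo dpkg --purge
```

Review the list before piping it into the purge; the -r flag (a GNU xargs extension) merely keeps dpkg from being run with no arguments when nothing is left over.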

Worse shortcoming: this still leaves all dependencies pulled in by systemd around on the system, because packages installed by debootstrap are not eligible for “apt-get --purge autoremove”. Additionally, it does not influence debootstrap’s (nōn-existent, see #557322, #668001, #768062) dependency resolver, leading to possibly pessimistic package selections.

Missing: you can just hit Alt-F2 and enter the command…

	in-target apt-get --purge -y install sysvinit-core

… there, no need to preseed. But this does not eliminate the aforementioned shortcomings, of course.

Another PSA: something surprising about XML.

As you might all know, XML must be valid UTF-8 (or UTF-16 (or another encoding supported by the parser, but one which yields valid Unicode codepoints when read and converted)). Some characters, such as the ampersand ‘&’, must be escaped (“&#38;” or “&#x26;”, although “&amp;” may also work, depending on the domain) or put into a CDATA section (“<![CDATA[&]]>”).

A bit surprisingly, a literal backspace character (ASCII 08h, Unicode U+0008) is not allowed in the text. I filed a bugreport against libxml2, asking it to please encode these characters.

A bit more research followed. Surprisingly, there are characters that are not valid in XML “documents” in any way, not even as entities or in CDATA sections. (xmlstarlet, by the way, errors out somewhat nicely for an unescaped literal or entity-escaped backspace, but behaves absolutely hilariously for a literal backspace in a CDATA section.) Basically, XML contains a whitelist for the following Unicode codepoints:

  • U+0009
  • U+000A
  • U+000D
  • U+0020‥U+D7FF
  • U+E000‥U+FFFD
  • U-00010000‥U-0010FFFF

Additionally, a certain number of codepoints is discouraged: U+007F‥U+0084 (IMHO wise), U+0086‥U+009F (also wise, but why allow U+0085?), U+FDD0‥U+FDEF (a bit surprisingly, but consistent with disallowing the backspace character), and the last two codepoints of every plane (U+FFFE and U+FFFF were already disallowed, but U-0001FFFE, U-0001FFFF, …, U-0010FFFF weren’t; this is extremely wise).

The suggestion seems to be to just strip these characters silently from the XML “document”.
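For the control characters below U+0020, such stripping can be sketched as a plain byte filter, since in UTF-8 those bytes never occur inside a multi-byte sequence. The function name `strip_xml_c0` is made up for this example, and it deliberately does not handle the other excluded codepoints (surrogates, U+FFFE, U+FFFF), which would need a UTF-8-aware tool.

```shell
# Hypothetical helper: delete the C0 control characters that XML 1.0
# does not allow, keeping tab (011), line feed (012) and carriage
# return (015). Operates on bytes, hence LC_ALL=C.
strip_xml_c0() {
	LC_ALL=C tr -d '\000-\010\013\014\016-\037'
}

# usage:
#   printf 'a\010b\n' | strip_xml_c0
```

The result still needs the usual escaping of ‘&’, ‘<’ and friends before it goes into a document.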

I’m a bit miffed about this, as I don’t even use XML directly (I’m extending a PHP “webapplication” that is a SOAP client and talks to a Java™ SOAP-WS) and would expect this to preserve my strings, but, oh my. I’ve forwarded the suggestion to just strip them silently to the libxml2 maintainers in the aforementioned bug report, for now, and may even hack that myself (on customer-paid time). More robust than hacking the PHP thingy to strip them first, anyway – I’ve got no control over the XML after all.

Sharing this so that more people know that not all UTF-8 is valid in XML. Maybe it saves someone else some time. (Now wondering whether to address this in my xhtml_escape shell function. Probably should. Meh.)
