And here's some fun you can have with busybox. Spot 3 glitches.
root@socrates:/mnt/my# dd if=u-boot.img of=/dev/mmbclk0p1 bs=64K seek=4
3+1 records in
3+1 records out
224220 bytes (224 kB) copied, 0.00113385 s, 198 MB/s
root@socrates:~# dd if=/dev/mmcblk0p1 of=orig.spl.1 bs=64k seek=0 count=0
0+0 records in
0+0 records out
0 bytes (0 B) copied, 3.972e-05 s, 0.0 kB/s
socrates login: riit^H^H^H^[[2~^H^H^[[3~
login: loginprompt.c:164: login_prompt: Assertion `wlen == (int) len -1' failed
Aside from a number of minor improvements and fixes to various pages, the notable changes in man-pages-3.64 are the following:
- I've written a new open_by_handle_at(2) page that documents the name_to_handle_at() and open_by_handle_at() system calls that were added to Linux in version 2.6.39.
- I've made substantial updates to the inotify(7) man page to document some of the limitations, complexities, and pitfalls of the inotify API.
Aside from a number of minor improvements and fixes to various pages, the notable changes in man-pages-3.65 are the following:
- A new inet_net_pton(3) page describes the inet_net_pton() and inet_net_ntop() library functions.
- The fallocate(2) page adds documentation for the FALLOC_FL_COLLAPSE_RANGE operation added in Linux 3.15.
- The prctl(2) page adds documentation of the PR_SET_THP_DISABLE and PR_GET_THP_DISABLE operations added in Linux 3.15.
- Peng Haitao continued the task of adding thread-safety information to various man pages.
|Opened since 2014-04-11||3||14||8||(25)|
|Closed since 2014-04-11||6||9||6||(21)|
|Changed since 2014-04-11||6||29||14||(49)|
Added some code to trinity to use random open flags on the fd’s it opens on startup.
Spent most of the day hitting the same VM bugs as yesterday, or others that Sasha had already reported.
Later in the day, I started seeing this bug after applying a not-yet-merged patch to fix a leak that Coverity had picked up on recently. Spent some time looking into that, without making much progress.
Rounded out the day by trying out latest builds on my freshly reinstalled laptop, and walked into this.
Spent all of yesterday attempting recovery (and failing) on the /home partition of my laptop.
On the weekend, I decided I’d unsuspend it to send an email, and just got a locked up desktop. The disk IO light was stuck on, but it was completely dead to input, couldn’t switch to console. Powered it off and back on. XFS wanted me to run xfs_repair. So I did. It complained that there was pending metadata in the log, and that I should mount the partition to replay it first. I tried. It failed miserably, so I re-ran xfs_repair with -L to zero the log. Pages and pages of scrolly text zoomed up the screen.
Then I rebooted and.. couldn’t log in any more. Investigating as root showed that /home/davej was now /home/lost+found, and within it were a couple dozen numbered directories containing mostly uninteresting files.
So that’s the story about how I came to lose pretty much everything I’ve written in the last month that I hadn’t already pushed to github. I’m still not entirely sure what happened, but I point the finger of blame more at dm-crypt than at xfs at this point, because the non-encrypted partitions were fine.
Ultimately I gave up, reformatted and reinstalled. Kind of a waste of a day (and a half).
Things haven’t been entirely uneventful though:
So there’s still some fun VM/FS horrors lurking. Sasha has been hitting a bunch more huge page bugs too. It never ends.
It describes a couple of attacks. The first is that some platforms store their Secure Boot policy in a run time UEFI variable. UEFI variables are split into two broad categories - boot time and run time. Boot time variables can only be accessed while in boot services - the moment the bootloader or kernel calls ExitBootServices(), they're inaccessible. Some vendors chose to leave the variable containing firmware settings available during run time, presumably because it makes it easier to implement tools for modifying firmware settings at the OS level. Unfortunately, some vendors left bits of Secure Boot policy in this space. The naive approach would be to simply disable Secure Boot entirely, but that means that the OS would be able to detect that the system wasn't in a secure state. A more subtle approach is to modify the policy, such that the firmware chooses not to verify the signatures on files stored on fixed media. Drop in a new bootloader and victory is ensured.
But that's not a beautiful approach. It depends on the firmware vendor having made that mistake. What if you could just rewrite arbitrary variables, even if they're only supposed to be accessible in boot services? Variables are all stored in flash, connected to the chipset's SPI controller. Allowing arbitrary access to that from the OS would make it straightforward to modify the variables, even if they're boot time-only. So, thankfully, the SPI controller has some control mechanisms. The first is that any attempt to enable the write-access bit will cause a System Management Interrupt, at which point the CPU should trap into System Management Mode and (if the write attempt isn't authorised) flip it back. The second is to disable access from the OS entirely - all writes have to take place in System Management Mode.
The MITRE results show that around 0.03% of modern machines enable the second option. That's unfortunate, but the first option should still be sufficient. Except the first option relies on the SMI actually firing. And, conveniently, Intel's chipsets have a bit that allows you to disable all SMI sources, and then have another bit to disable further writes to the first bit. Except 40% of the machines MITRE tested didn't bother setting that lock bit. So you can just disable SMI generation, remove the write-protect bit on the SPI controller and then write to arbitrary variables, including the SecureBoot enable one.
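To make the chain of conditions above explicit, here is a minimal sketch that models it as plain booleans. The class and field names are hypothetical, invented for illustration; they are not real register names and this is not taken from the MITRE paper.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """Hypothetical model of the two firmware settings discussed above."""
    smm_only_writes: bool  # second option: flash writes only allowed from SMM
    smi_lock_set: bool     # lock bit that stops the OS disabling SMI generation


def os_can_rewrite_variables(p: Platform) -> bool:
    """Return True if code running in the OS could write to the variable flash."""
    if p.smm_only_writes:
        # Writes from outside System Management Mode are refused outright.
        return False
    # Otherwise protection depends on the SMI firing when the write-enable
    # bit is set. If SMI generation can be switched off, nothing flips it back.
    return not p.smi_lock_set
```

With neither protection in place, `os_can_rewrite_variables(Platform(False, False))` returns `True`, which is the case the paragraph above describes.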
This is, uh, obviously a problem. The good news is that this has been communicated to firmware and system vendors and it should be fixed in the future. The bad news is that a significant proportion of existing systems can probably have their Secure Boot implementation circumvented. This is pretty unsurprising - I suggested that the first few generations would be broken back in 2012. Security tends to be an iterative process, and changing a branch of the industry that's historically not had to care into one that forms the root of platform trust is a difficult process. As the MITRE paper says, UEFI Secure Boot will be a genuine improvement in security. It's just going to take us a little while to get to the point where the more obvious flaws have been worked out.
 Unless the malware was intelligent enough to hook GetVariable, detect a request for SecureBoot and then give a fake answer, but who would do that?
 Impressively, basically everyone enables that.
 Great for dealing with bugs caused by YOUR ENTIRE COMPUTER BEING INTERRUPTED BY ARBITRARY VENDOR CODE, except unfortunately it also probably disables chunks of thermal management and stops various other things from working as well.
|Opened since 2014-04-04||3||19||7||(29)|
|Closed since 2014-04-04||7||13||6||(26)|
|Changed since 2014-04-04||9||32||10||(51)|
|Opened since 2014-03-28||4||17||9||(30)|
|Closed since 2014-03-28||12||17||4||(33)|
|Changed since 2014-03-28||15||29||10||(54)|
Weekly Fedora kernel bug statistics – April 04 2014 is a post from: codemonkey.org.uk
|Opened since 2014-03-01||32||92||19||(143)|
|Closed since 2014-03-01||156||183||99||(438)|
|Changed since 2014-03-01||83||111||31||(225)|
In terms of background: in 2008, Brendan donated money to the campaign for Proposition 8, a Californian constitutional amendment that expressly defined marriage as being between one man and one woman. Both before and after that he had donated money to a variety of politicians who shared many political positions, including the definition of marriage as being between one man and one woman.
Mozilla is an interesting organisation. It consists of the for-profit Mozilla Corporation, which is wholly owned by the non-profit Mozilla Foundation. The Corporation's bylaws require it to work to further the Foundation's goals, and any profit is reinvested in Mozilla. Mozilla developers are employed by the Corporation rather than the Foundation, and as such the CEO is responsible for ensuring that those developers are able to achieve those goals.
The Mozilla Manifesto discusses individual liberty in the context of use of the internet, not in a wider social context. Brendan's appointment was very much in line with the explicit aims of both the Foundation and the Corporation - whatever his views on marriage equality, nobody has seriously argued about his commitment to improving internet freedom. So, from that perspective, he should have been a fine choice.
But that ignores the effect on the wider community. People don't attach themselves to communities merely because of explicitly stated goals - they do so because they feel that the community is aligned with their overall aims. The Mozilla community is one of the most diverse in free software, at least in part because Mozilla's stated goals and behaviour are fairly inspirational. People who identify themselves with other movements backing individual liberties are likely to identify with Mozilla. So, unsurprisingly, there's a large number of socially progressive individuals (LGBT or otherwise) in the Mozilla community, both inside and outside the Corporation.
A CEO who's donated money to strip rights from a set of humans will not be trusted by many who believe that all humans should have those rights. It's not just limited to individuals directly affected by his actions - if someone's shown that they're willing to strip rights from another minority for political or religious reasons, what's to stop them attempting to do the same to you? Even if you personally feel safe, do you trust someone who's willing to do that to your friends? In a community that's made up of many who are either LGBT or identify themselves as allies, that loss of trust is inevitably going to cause community discomfort.
The first role of a leader should be to manage that. Instead, in the first few days of Brendan's leadership, we heard nothing of substance - at best, an apology for pain being caused rather than an apology for the act that caused the pain. And then there was an interview which demonstrated remarkable tone deafness. He made no attempt to alleviate the concerns of the community. There were repeated non-sequiturs about Indonesia. It sounded like he had no idea at all why the community that he was now leading was unhappy.
And, today, he resigned. It's easy to get into hypotheticals - could he have compromised his principles for the sake of Mozilla? Would an initial discussion of the distinction between the goals of members of the Mozilla community and the goals of Mozilla itself have made this more palatable? If the board had known this would happen, would they have made the same choice - and if they didn't know, why not?
But that's not the real point. The point is that the community didn't trust Brendan, and Brendan chose to leave rather than do further harm to the community. Trustworthy leadership is important. Communities should reflect on whether their leadership reflects not only their beliefs, but the beliefs of those that they would like to join the community. Fail to do so and you'll drive them away instead.
 For people who've been living under a rock
 Proposition 8 itself was a response to an ongoing court case that, at the point of Proposition 8 being proposed, appeared likely to support the overturning of Proposition 22, an earlier Californian ballot measure that legally (rather than constitutionally) defined marriage as being between one man and one woman. Proposition 22 was overturned, and for a few months before Proposition 8 passed, gay marriage was legal in California.
 Brendan made a donation on October 25th, 2008. This postdates the overturning of Proposition 22, and as such gay marriage was legal in California at the time of this donation. Donating to Proposition 8 at that point was not about supporting the status quo, it was about changing the constitution to forbid something that courts had found was protected by the state constitution.
At the start of the week my local Red Hat IT guy asked me if I knew anything about DP MST. It turns out the Lenovo T440s and T540s docks have started to use DP MST, so they have one DP port to the dock, and then the dock has DP->VGA, DP->DVI/DP and DP->HDMI/DP ports on it, all using MST. So when they bought some of these laptops and plugged two monitors into the dock, it fell back to using SST mode and only showed one image. This is not optimal, I'd call it a bug :)
Now I have a damaged in transit T440s (the display panel is in pieces) with a dock, and have spent a couple of days with DP 1.2 spec in one hand (monitor), and a lot of my hair in the other. DP MST has a network topology discovery process that is built on sideband msgs sent over the auxch, which is used in normal DP to read/write a bunch of registers on the plugged in device. You can then send auxch msgs over the sideband msgs over auxch to read/write registers on other devices in the hierarchy!
Today I achieved my first goal of correctly encoding the topology discovery message and getting a response from the dock:
[ 2909.990743] link address reply: 4
[ 2909.990745] port 0: input 1, pdt: 1, pn: 0
[ 2909.990746] port 1: input 0, pdt: 4, pn: 1
[ 2909.990747] port 2: input 0, pdt: 0, pn: 2
[ 2909.990748] port 3: input 0, pdt: 4, pn: 3
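For anyone wanting to read that reply, here is a rough sketch that parses the port lines and names the peer device type. The reading of "pdt" as peer device type, and the names in the table, are my understanding of the DP 1.2 spec, so treat the mapping as an assumption and check it against the spec. The function name `parse_ports` is made up for this example.

```python
import re

# Peer device type names as I understand them from the DP 1.2 spec.
# This mapping is an assumption; verify it before relying on it.
PDT_NAMES = {
    0: "nothing connected",
    1: "source or SST branch device",
    2: "MST branching device",
    3: "SST sink",
    4: "DP to legacy converter",
}

PORT_RE = re.compile(r"port (\d+): input (\d+), pdt: (\d+), pn: (\d+)")


def parse_ports(log_text):
    """Extract one dict per 'port N: ...' line found in the log text."""
    ports = []
    for m in PORT_RE.finditer(log_text):
        port, is_input, pdt, pn = (int(g) for g in m.groups())
        ports.append({
            "port": port,
            "input": bool(is_input),
            "pdt": pdt,
            "pdt_name": PDT_NAMES.get(pdt, "unknown"),
            "pn": pn,
        })
    return ports


LOG = """\
[ 2909.990743] link address reply: 4
[ 2909.990745] port 0: input 1, pdt: 1, pn: 0
[ 2909.990746] port 1: input 0, pdt: 4, pn: 1
[ 2909.990747] port 2: input 0, pdt: 0, pn: 2
[ 2909.990748] port 3: input 0, pdt: 4, pn: 3
"""

if __name__ == "__main__":
    for p in parse_ports(LOG):
        direction = "input" if p["input"] else "output"
        print(f"port {p['port']} ({direction}): {p['pdt_name']}")
```

If the mapping is right, the reply shows one input port facing the laptop, two output ports with legacy converters behind them, and one output port with nothing plugged in, which would fit a dock with VGA and DVI/HDMI outputs.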
There are a lot more steps to take before I can produce anything, along with dealing with the fact that KMS doesn't handle dynamic connectors so well. It should make for a fun tangent away from the job I should be doing, which is finishing virgil.
I've ordered another DP MST hub that I can plug into AMD and nvidia gpus that should prove useful later, also for doing deeper topologies, and producing loops.
Also, some 4k monitors use DP MST as they are really two panels, but I don't have one of them, so unless one appears I'm mostly going to concentrate on the Lenovo docks for now.
The first big thing is Ben Widawsky's support for per-process address spaces. For a long time we've supported the per-process gtt page tables the hardware provides, but only with one address space. With Ben's work we can now manage multiple address spaces and switch between them, at least on Ivybridge and Haswell. Support for Baytrail and Broadwell for this feature is still in progress. This finally allows multiple different users to use the gpu concurrently without accidentally leaking information between applications. Unfortunately there are still a few issues with the code, so we had to disable this by default for 3.15.
Another really big feature was the much more fine-grained display power domain handling and a lot of runtime power management infrastructure work from Imre and Paulo. Intel gfx hardware always had lots of automatic clock and power gating, but recent hardware started to have some explicit power domains which need to be managed by the driver. To do that we need to keep track of the power state of each domain, and if we need to switch something on we also need to restore the hardware state. On top of that every platform has its own special set of power domains and how the logical pieces are split up between them also changes. To make this all manageable we now have an extensive set of display power domains and functions to handle them as abstraction between the core driver code and the platform specific power management backend. Also a lot of work has happened to allow us to reuse parts of our driver resume/suspend code for runtime power management. Unfortunately the patches merged into 3.15 are all just groundwork, new platforms and features will only be enabled in 3.16.
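The bookkeeping described above (track each domain's state, and restore hardware state whenever a domain is switched on) can be illustrated with a small reference-counting sketch. This is not the i915 code; the class, method and callback names are all hypothetical, and the real driver is written in C with platform specific backends.

```python
class PowerDomain:
    """Hypothetical refcounted power domain; not the i915 implementation."""

    def __init__(self, name, power_on, power_off, restore_state):
        self.name = name
        self._power_on = power_on            # platform specific backend hook
        self._power_off = power_off          # platform specific backend hook
        self._restore_state = restore_state  # reprogram state lost while off
        self.refcount = 0

    def get(self):
        """Take a reference; the first user powers the domain up."""
        if self.refcount == 0:
            self._power_on(self.name)
            self._restore_state(self.name)
        self.refcount += 1

    def put(self):
        """Drop a reference; the last user powers the domain down."""
        if self.refcount == 0:
            raise RuntimeError(f"unbalanced put on {self.name}")
        self.refcount -= 1
        if self.refcount == 0:
            self._power_off(self.name)
```

The point of the abstraction is that core code only ever calls `get()` and `put()`, while the three callbacks hide whatever a given platform needs to do.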
Another long-standing nuisance with our driver was the take-over from the firmware configuration. With the reworked modesetting infrastructure we could take over the output routing, and with the experimental i915.fastboot=1 option we could eschew the initial modeset. This release Jesse provided another piece of the puzzle by allowing the driver to inherit the firmware framebuffer. There's still more work to do to make fastboot solid and enable it by default, but we now have all the pieces in place for a smooth and fast boot-up.
There have been tons of patches for Broadwell all over the place. And we still have some features which aren't yet fully enabled, so there will be lots more to come. Other smaller features in 3.15 are improved support for framebuffer compression from Ville (again with more work still left) and 5.4GHz DisplayPort support, which is required to get 4k working. Unfortunately most 4k DP monitors seem to expose two separate screens and so need a driver with working MST (multi-stream transport), which we don't yet support. On that topic: the i915 driver now uses the generic DP aux helpers from Thierry Reding. Having shared code to handle all the communication with DP sinks should help a lot in getting MST off the ground. And finally I'd like to highlight the large cursor support, which should be especially useful for high-dpi screens.
And of course there's been fixes and small improvements all over the place, as usual.
The big thing that stands out this cycle is that the defect ratio was going down until we hit around 3.14-rc7, and then we got a few hundred new issues. What happened?
Nothing in the kernel thankfully. This was due to an upgrade server side to a new version of Coverity which has some new checkers. Some of the existing ones got improved too, so a bunch of false positives we had sitting around in the database are no longer reported. The number of new issues unfortunately was greater than the known false positives. In the days following, I did a first sweep through these and closed out the easy ones, bringing the defect density back down.
note: I stopped logging the ‘dismissed’ totals. With Coverity 7.0, the number can go backwards.
If a file gets deleted, the issues against that file that were dismissed also disappear.
Given this happens fairly frequently, the number isn’t really indicative of anything useful.
With the 3.15 merge window now open, I’m hoping a bunch of the queued fixes I sent over the last few weeks get merged, but I’m fully expecting to need to do some resending.
 It was actually worse than this, the ratio went back up to 0.57 right before rc7
It’s been a busy week.
A week ago I flew out to Napa, CA for two days of discussions with various kernel people (ok, and some postgresql people too) about all things VM and FS/IO related. I learned a lot. These short focussed conferences have way more value to me personally these days than the conferences of years ago with a bunch of tracks, and day after day of presentations.
I gave two sessions relating to testing; there are some good write-ups on lwn. It was more of an extended Q&A than a presentation, so I got a lot of useful feedback (especially afterwards in the hallway sessions). A couple of people asked if trinity was doing certain things yet, which led to some code walkthroughs, and a lot of brainstorming about potential solutions.
By the end of the week I was overflowing with ideas for new things it could be doing, and have started on some of the code for this already. One feature I’d had in mind for a while (children doing root operations) but hadn’t gotten around to writing could be done in a much simpler way, which opens the doors to a bunch more interesting things. I might end up rewriting the current ioctl fuzzing (which isn’t finding a huge amount of bugs right now anyway) once this stuff has landed, because I think it could be doing much more ‘targeted’ things.
It was good to meet up with a bunch of people that I’ve interacted with for a while online and discuss some things. Was surprised to learn Sasha Levin is actually local to me, yet we both had to fly 3000 miles to meet.
Two sessions at LSF/MM were especially interesting outside of my usual work.
The postgresql session where they laid out their pain points with the kernel IO was enlightening, as they started off with a quick overview of postgresql’s process model, and how things interact. The session felt like it went off in a bunch of random directions at once, but the end goal (getting a test case kernel devs can run without needing a full postgresql setup) seemed to be reached the following day.
The second session I found interesting was the “Facebook linux problems” session. As mentioned in the lwn write-up, one of the issues was this race in the pipe code. “This is *very* hard to trigger in practice, since the race window is very small”. Facebook were hitting it 500 times a day. Gave me thoughts on a whole bunch of “testing at scale” problems. A lot of the testing I do right now is tiny in comparison. I do stress tests & fuzz runs on a handful of machines, and most of it is done by hand. Doing this kind of thing on a bigger scale makes it a little impractical to do in a non-automated way. But given I’ve been buried alive in bugs with just this small number, it has left me wondering “would I find a load more bugs with more machines, or would it just mean the mean time between reproducing issues gets shorter”. (Given the reproducibility problems I’ve had with fuzz testing sometimes, the latter wouldn’t necessarily be a bad thing). More good thoughts on this topic can be found in a post Google made a few years ago.
Coincidentally, I’m almost through reading How Google Tests Software, which is a decent book, but with not a huge amount of “this is useful, I can apply this” type knowledge. It’s very focussed on the testing of various web-apps, with no real mention of testing of Android, Chrome etc. (The biggest insights in the book aren’t actually testing related, but more the descriptions of Google’s internal re-hiring processes when people move between teams).
Collaboration summit followed from Wednesday onwards. One highlight for me was learning that the tracing code has something coming in 3.15/3.16 that I’ve been hoping for for a while. At last year’s kernel summit, Andi Kleen suggested it might be interesting if trinity had some interaction with ftrace to get traces of “what the hell just happened”. The tracing changes landing over the next few months will allow that to be a bit more useful. Right now, we can only do that on a global system-wide basis, but with that moving to be per-process, things can get a lot more useful.
Another interesting talk was the llvmlinux session. I haven’t checked in on this project in a while, so was surprised to learn how far along they are. Apparently all the necessary llvm changes to build the kernel are either merged, or very close to merging. The kernel changes still have a ways to go, but this too has improved a lot since I last looked. Some good discussion afterwards about the crossover between things like clang’s static analysis warnings and the stuff I’m doing with Coverity.
Speaking of, I left early on Friday to head back to San Francisco to meet up with Coverity. Lots of good discussion about potential workflow improvements, false positive/heuristic improvements etc. A good first meeting if only to put faces to names I’ve been dealing with for the last year. I bugged them about a feature request I’ve had for a while (that a few people in the days preceding had also nagged me about): the ability to have per-subsystem notification emails instead of the one global email. If they can hook this up, it’ll save me a lot of time having to manually craft mails to maintainers when new issues are detected.
Busy, busy week, with so many new ideas I felt like my head was full by the time I got on the plane to get back.
Taking it easy for a day or two, before trying to make progress on some of the things I made notes on last week.
For years, I did my best to ignore the problem, but CKS inspired me to blog the curious networking banality, in case anyone has wisdom to share.
The deal is simple: I have a laptop with a VPN client (I use vpnc). The client creates a tun0 interface and some RFC 1918 routes. My home RFC 1918 routes are more specific, so routing works great. The name service does not.
Obviously, if we trust the DHCP-supplied nameserver, it has no work-internal names in it. The stock solution is to let vpnc install an /etc/resolv.conf pointing to work-internal nameservers. Unfortunately this does not work for me, because I have a home DNS zone, zaitcev.lan. Work-internal DNS does not know about that one.
Thus I would like some kind of solution that routes DNS requests according to a configuration. Requests to work-internal namespaces (such as *.redhat.com) would go to nameservers delivered by vpnc (I think I can make it write something like /etc/vpnc/resolv.conf that does not conflict). Other requests go to the infrastructure name service, be it a hotel network or home network. Home network is capable of serving its own private authoritative zones and forwarding the rest. That's the ideal, so how to accomplish it?
I attempted to apply a local dnsmasq, but could not figure out whether it can do what I want and, if so, how.
For now, I have some scripting that caches work-internal hostnames in /etc/hosts. That works, somewhat. Still, I cannot imagine that nobody thought of this problem. Surely, thousands are on VPNs, and some of them have home networks. And... nobody? (I know that a few people just run VPN on the home infrastructure; that does not help my laptop, unfortunately).
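To pin down what "routes DNS requests according to a configuration" would mean, here is a minimal sketch of the selection rule only: pick the nameservers whose domain suffix matches the query name, longest suffix first, and fall back to the DHCP-supplied ones otherwise. The function name and all addresses are made-up placeholders, and this only chooses a server; it does not send any DNS traffic.

```python
def pick_nameservers(hostname, routes, default):
    """Return the nameserver list for hostname by longest matching suffix.

    routes maps a domain suffix (e.g. "redhat.com") to a list of nameserver
    addresses; default is used when no suffix matches.
    """
    name = hostname.rstrip(".").lower()
    best_len, best_servers = -1, default
    for suffix, servers in routes.items():
        s = suffix.strip(".").lower()
        # Match the domain itself or any name below it, but not lookalikes
        # such as "notredhat.com".
        if (name == s or name.endswith("." + s)) and len(s) > best_len:
            best_len, best_servers = len(s), servers
    return best_servers


# Placeholder addresses for illustration only.
VPN_ROUTES = {"redhat.com": ["10.11.12.13"]}
INFRASTRUCTURE = ["192.168.1.1"]

if __name__ == "__main__":
    for host in ("intranet.redhat.com", "www.zaitcev.lan", "example.org"):
        print(host, "->", pick_nameservers(host, VPN_ROUTES, INFRASTRUCTURE))
```

As far as I can tell from its documentation, dnsmasq's server=/domain/address option expresses this same kind of per-domain forwarding, so it may be the piece to look at.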
Gnome since 3.8 has restricted the Blank Screen time to between 1 and 15 minutes, or “Never”, to disable screen blanking/locking entirely. If this isn’t granular enough, you can set other values like so:
dconf write /org/gnome/desktop/session/idle-delay 1800
gsettings set org.gnome.desktop.session idle-delay 1800
The value is in seconds, so here we set the delay to 30 minutes (60*30=1800). It seems that after doing this, the UI will show “Never”, but the set value is still used correctly.
There is also a “Presentation Mode” shell extension that adds a button to inhibit screen lock, but for me, I still wanted to have it automatically lock, but just a little bit slower.
EDIT: dconf didn’t actually work! Apparently gsettings is the way to go.
I spent most of the 90s growing up in an environment that was rather more interested in cattle than in computers, and had very little internet access during that time. My entire knowledge of the wider free software community came from a couple of CDs that contained a copy of the jargon file, the source code to the entire GNU project and an early copy of the m68k Linux kernel.
But that was enough. Before I'd even got to university, I knew what free software was. I'd had the opportunity to teach myself how an operating system actually worked. I'd seen the benefits of being able to modify software and share those modifications with others. I met other people with the same interests. I ended up with a job writing free software and collaborating with others on integrating it with upstream code. And, from there, I became more and more involved with a wider range of free software communities, finding an increasing number of opportunities to help make changes that benefited both me and others.
Without free software I'd have started years later. I'd have lost the opportunity to collaborate with people spread over the entire world. My first job would have looked very different, as would my entire career since then. Without free software, almost everything I've achieved in my adult life would have been impossible.
To me, free software means I've lived a significantly better life than would otherwise have been the case. But more than that, it means doing what I can to make sure that other people have the same opportunities. I am here because of the work of others. The most rewarding part of my continued involvement is the knowledge that I am part of a countless number of people working to make sure that others can tell the same story in future.
 I'd link to the actual press release, but it contains possibly the worst photograph of me in the entire history of the universe
As my previous post documented, I’ve experimented with localbitcoins.com. Following the arrest of two Miami men for trading on localbitcoins, I decided to seek legal advice on the situation in Australia.
Online research led me to Nick Karagiannis of Kelly and Co, who was already familiar with Bitcoin: I guess it’s a rare opportunity for excitement in financial regulatory circles! This set me back several thousand dollars (in fiat, unfortunately), but the result was reassuring.
They’ve released an excellent summary of the situation, derived from their research. I hope that helps other bitcoin users in Australia, and I’ll post more in future should the legal situation change.