A week late, and about a year since the last weekly, here it is.

Work

This week I quit my SRE job at Arista, and returned to my other job, my job at Arista. It was a good run for a while: I spent half of my time doing the SWE work I usually do and the other half shaping up Jenkins. I got a few good things done: improving availability, halving peak and average memory use, introducing and building CI/CD infra around managing all jobs with jenkins-job-builder (a.k.a. storing all jobs completely in git), and helping oh so many people with their own pipelines.

At this point, I am happy with what was done. I broke enough things and learned a few more, so I decided to call it a day. If you are somehow here: Arista is hiring a full-time SRE to maintain our internal Jenkins/Gerrit/Kubernetes infra. If you are interested, drop me a line and I can talk to Emer, our v. nice and cool recruiter.

Bug of the week

Along with two other people, I spent a few days debugging recurring mystery segfaults in the Arista OpenConfig agent. Most of them happened while trying to access the same map, in these lines of runtime/map_faststr.go:

func mapaccess2_faststr(t *maptype, h *hmap, ky string) (unsafe.Pointer, bool) {
	if raceenabled && h != nil {
		callerpc := getcallerpc()
		racereadpc(unsafe.Pointer(h), callerpc, funcPC(mapaccess2_faststr))
	}
	if h == nil || h.count == 0 { // <- SIGSEGV!
		return unsafe.Pointer(&zeroVal[0]), false
	}

According to delve, h was valid (and even if it wasn’t, how can you crash in h == nil || h.count == 0!?). A few crashes happened in syscalls, and one showed up while manipulating a value, in code that worked perfectly fine when put in a unit test.

After a few days of staring at cores and raw bytes with gdb, adding debugging logs, going over assembly manuals, disabling and enabling optimisations, and questioning many things about the nature of computers and what pointers are, we rather unceremoniously discovered [spoiler alert!] that the switch had buggy RAM. Good thing we know a company that makes them.

Link roundup

Markets, discrimination, and “lowering the bar”

Your Pipeline Argument Is Bullshit

Viewpoint: ‘I feel like I was accidentally hired’

Nix Cookbook

Why Learning NixOS is Difficult, and How to Fix It

Personal OKRs for Success

Teach Yourself Computer Science

Abstracting away correctness

You Might as Well Be a Great Copy Editor: Maybe I should learn a bit about copy editing?

It’s probably time to stop recommending Clean Code: This really tickled my anti-uncle b*b inclinations. At best I don’t care for him and his opinions. At some point he was a bit ahead of the curve, or at least it seemed so at the time, but that’s it. In 2020 there are way better resources. In any case, I’ve steered far away from any people and jobs that really buy into him, and so far I have not regretted it at all.

Advice for new developers, or Things I wish I had known when I started programming, Part 4