I've also landed the vc4 part of the DSI driver in the drm-misc-next tree, along with a fixup. Working with the new drm-misc-next trees for vc4 submission is delightful -- once I get a review, I can just push code to the tree and it will magically (through Daniel Vetter's pull requests and some behind-the-scenes tools) go upstream at the appropriate time. I am delighted with the work that Daniel has been doing to make the DRM subsystem more welcoming of infrequent contributors and reducing the burden on us all of participating in the Linux kernel process.
In 3D land, the biggest news is that I've fixed a kernel oops (producing a desktop lockup) when CMA returns out of memory. Unfortunately, VC4 doesn't have an MMU, so we require that memory allocations for graphics be contiguous (using the Contiguous Memory Allocator), and to make things worse we have a limit of 256MB for CMA due to an addressing bug of the V3D part, so CMA returning out of memory is a serious and unfortunately frequent problem. I had a bug where a CMA failure would put the failed allocation into the BO cache, so if you asked for a BO of the same size soon enough, you'd get the bad BO back and crash. I've been unable to construct a good minimal testcase for it, but the patch is on the mailing list and in the rpi-4.4.y tree now.
I've also fixed a bug in the downstream tree's "firmware KMS" mode (where I use the closed source firmware for display, but open source 3D) where fullscreen 3D rendering would get a single frame displayed and then freeze.
In userspace, I've fixed a bit of multisample rendering (copies of MSAA depth buffers) that gets more of our regression testing working, and worked on some potential rasterization problems (the texwrap regression tests are failing due to texture filtering troubles, and I'm not sure if we're actually doing anything wrong or not because we're near the cutoff for the "how accurate does the filtering have to be?").
Coming up, I'm looking at fixing a cursor updates bug that Michael Zoran has found, and fixing up the DSI panel driver so that it can hopefully get into 4.12. Also, after a recent discussion with Eben, I've realized that we can actually scan out tiled buffers from the GPU, so we may get big performance wins for un-composited X if I can have Mesa coordinate with the kernel on producing tiled buffers.