On my side, I spent some more time on HDMI audio and the DSI panel. For audio, I'm now emitting the GCP packet for audio mute appropriately (I think), and with some more clocking fixes the hardware is now accepting the audio data at the expected rate. For DSI, I fixed a bit of sequencing and added debugfs for the registers like we have in our other encoders. There's still no actual audio coming out of HDMI, and only white coming out of the panel.
The DSI situation is making me wish for someone else's panel that I could attach to the connector, so I could see if my bugs are in the Atmel bridge programming or in the DSI driver.
I did some more analysis of 3DMMES's shaders and improved our code generation, for wins of 0.4%, 1.9%, 1.2%, 2.6%, and 1.2%. I also experimented with changing the UBO (indirectly addressed uniform array) upload path, which showed no effect. 3DMMES's uniform arrays are tiny, though, so it may be a win in some other app later.
I also got a couple of new patches from Jonas Pfeil. I went through his thread switch delay slots patch, which is pretty close to ready. He has a similar patch for branching delay slots, though apparently that one isn't showing wins yet in the things he's tested. Perhaps most exciting, though, is that he went and implemented an idea I had dropped on GitHub: replacing our shadow copies of raster textures with a bunch of math in the shader and using general memory loads. This could potentially fix X performance without a compositor, which we otherwise really don't have a workaround for other than "use a compositor." It could also improve GL-in-a-window performance: right now all of our DRI surfaces are raster format, because we want to be able to get full screen pageflipping, but that means we do the shadow copy if they aren't fullscreen. Hopefully this week I'll get a chance to test and review it.
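To give a rough idea of what "a bunch of math" means for a raster texture, here is a minimal CPU-side sketch in C of the address computation. It is not Jonas's patch and it is not shader code: the `raster_tex` struct and the function names are made up for illustration, and it assumes nearest sampling, clamp-to-edge wrapping, and a 4-byte-per-texel format. The point is only that a raster (linear, row-major) layout makes the texel address a simple function of the coordinates, so a general memory load can stand in for the texture unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a raster (linear, row-major) texture. */
struct raster_tex {
    const uint8_t *base; /* start of the pixel data */
    uint32_t width;      /* in texels */
    uint32_t height;     /* in texels */
    uint32_t stride;     /* bytes per row, may include padding */
    uint32_t cpp;        /* bytes per texel */
};

/* Clamp-to-edge: force an integer coordinate into [0, size - 1]. */
static uint32_t clamp_coord(int32_t v, uint32_t size)
{
    if (v < 0)
        return 0;
    if ((uint32_t)v >= size)
        return size - 1;
    return (uint32_t)v;
}

/* Byte offset of texel (x, y) from the start of the texture. */
static size_t raster_texel_offset(const struct raster_tex *t,
                                  int32_t x, int32_t y)
{
    assert(t->width > 0 && t->height > 0);
    uint32_t cx = clamp_coord(x, t->width);
    uint32_t cy = clamp_coord(y, t->height);
    return (size_t)cy * t->stride + (size_t)cx * t->cpp;
}

/* The "general memory load": fetch one 32-bit texel at that offset. */
static uint32_t raster_load_rgba8(const struct raster_tex *t,
                                  int32_t x, int32_t y)
{
    uint32_t texel;
    assert(t->cpp == 4);
    memcpy(&texel, t->base + raster_texel_offset(t, x, y), sizeof(texel));
    return texel;
}
```

A real shader version would also have to handle normalized coordinates, other wrap modes, and format conversion, which is where most of the extra instructions would presumably go.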