Nyx problems [Sunday 12th June 2016 at 4:46 pm]

Nyx has just crashed in some fashion - both monitors spontaneously dropped into power-save mode, and the music that was playing stopped a few seconds later. The keyboard is no longer responding either (protip: the Num Lock light is software-controlled, so if nothing happens when you press Num Lock then things have gone very wrong). So that looks like a rather major kernel fault, and I suspect the graphics card is involved (when I first upgraded to the GTX 950, it had a spate of graphics driver reloads - I thought those had gone away, but evidently not).

Except... Nyx is still present on the network. I can ping it from Hemera. The video output has failed, it's not responding to keyboard input... but I can ping it over a wifi connection. I can see the file shares in Explorer. I can even get as far as opening a remote desktop connection, though it gets stuck on "configuring remote session".
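For the curious, the "is it really still alive?" check can be scripted. This is only a rough sketch: it assumes the standard Windows default ports (445 for file shares, 3389 for Remote Desktop), and the function names are made up for illustration.

```python
import socket

# Standard Windows defaults (an assumption about Nyx's configuration):
# 445 = SMB file sharing, 3389 = Remote Desktop.
PORTS = {"SMB file shares": 445, "Remote Desktop": 3389}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or host not resolvable: treat all as "not open".
        return False


def probe(host, ports=PORTS):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling `probe("Nyx")` from Hemera would report which services still accept connections. An open port only proves the network stack is answering, which fits with Remote Desktop connecting and then hanging at "configuring remote session".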

Seriously, WTF is going on here? This is unusual even by my computing standards.

Ah well, since it's running and network-visible, it's time to do some poking around. There's a fair amount of remote administration capability built into Windows - most of the stuff in Administrative Tools can control another computer if you've got a suitable account to log in to. So let's see what I can find out!

First off, let's try Event Viewer. Launch Event Viewer, right-click on the top-level Event Viewer item in the tree, pick Connect to Another Computer, and enter Nyx (the same trick works for Computer Management). Hmm... well, the Application log shows that Desktop Window Manager disabled itself (this is the service that does all the Aero effects), and then Windows Error Reporting logged a kernel error. Nothing particularly useful here, but it does point towards graphics. The System log, on the other hand, starts off with a Display event claiming that nvlddmkm hung and was "successfully" recovered (hah), followed by some error events from nvlddmkm. Now I happen to know that this is part of the nVidia drivers, so I was right - it was indeed the graphics card and/or driver that failed.
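The same logs can also be pulled from a command line with Windows' built-in wevtutil tool rather than the Event Viewer GUI. This is a sketch rather than what was actually done here: the helper names are hypothetical, and it assumes the current account has rights on the remote machine.

```python
import subprocess


def event_query_cmd(computer, log="System", provider=None, count=20):
    """Build a wevtutil command that reads the newest events from a remote log."""
    cmd = [
        "wevtutil", "qe", log,
        f"/r:{computer}",   # query the remote computer
        f"/c:{count}",      # limit the number of events returned
        "/rd:true",         # reverse direction: newest events first
        "/f:text",          # human-readable output instead of XML
    ]
    if provider:
        # XPath filter on the event source, e.g. the nVidia driver nvlddmkm.
        cmd.append(f"/q:*[System[Provider[@Name='{provider}']]]")
    return cmd


def fetch_events(computer, **kwargs):
    """Run the query (Windows only) and return wevtutil's text output."""
    result = subprocess.run(
        event_query_cmd(computer, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Something like `fetch_events("Nyx", provider="nvlddmkm")` should then list the driver's recent System log entries.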

Next... hmm, I wonder if I can get Device Manager working? Well, I need to start the Remote Registry service according to the error message from Device Manager (easy enough), and then from the instructions here I also need a group policy tweak and then to run gpupdate.exe, which I can do remotely with Sysinternals' PsExec tool. And with that I now have remote Device Manager! Sadly it's only a read-only view and so doesn't let me do anything interesting.
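Roughly, the two remote steps boil down to one `sc` call and one PsExec call. The sketch below only builds the command lines. The helper names are invented, PsExec is assumed to be on the PATH, and the group policy tweak itself is not covered because it comes from the linked instructions.

```python
import subprocess


def start_remote_registry_cmd(computer):
    """Command to start the Remote Registry service on another machine."""
    return ["sc", rf"\\{computer}", "start", "RemoteRegistry"]


def remote_gpupdate_cmd(computer):
    """Command to run gpupdate on another machine via Sysinternals PsExec."""
    return ["psexec", rf"\\{computer}", "gpupdate"]


def run_all(computer):
    """Run both steps in order (Windows only); stops at the first failure."""
    for cmd in (start_remote_registry_cmd(computer), remote_gpupdate_cmd(computer)):
        subprocess.run(cmd, check=True)
```

`run_all("Nyx")` would start the service and then refresh policy, after which Device Manager should be able to connect.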

Well, my conclusion is that there was a serious graphics card/driver error, the system failed to recover the driver, and some aspect of that completely broke the GUI. I can't get any further remotely, so my best option is to trigger a remote shutdown (in the hope of preserving whatever state I had in open programs) and then, if that doesn't work, use the reset button.
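For reference, a remote shutdown can be triggered with the built-in `shutdown` command. This is a minimal sketch with a hypothetical helper name. It defaults to a zero delay without `/f`, on the assumption that a non-forced shutdown gives open programs the best chance to save their state.

```python
def remote_shutdown_cmd(computer, restart=False, delay_s=0, force=False):
    """Build a shutdown command targeting another machine.

    Note: per the shutdown documentation, a timeout greater than 0
    implies /f, so the default delay is 0 to keep the shutdown graceful.
    """
    cmd = [
        "shutdown",
        "/r" if restart else "/s",   # restart or plain shutdown
        "/m", rf"\\{computer}",      # target computer
        "/t", str(delay_s),          # delay in seconds
    ]
    if force:
        cmd.append("/f")             # close applications without warning
    return cmd
```

Passing the result to `subprocess.run` on Hemera would send the request. Whether a machine with a dead GUI actually honours it is another matter.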

And then head out and enjoy the unexpectedly sunny afternoon because my computers are conspiring against me (Hemera's currently refusing to enable its wifi radio so I've got it on the floor plugged directly into the router). I did at least manage to load the latest geocaches onto the phone first, so I achieved something!
