Anyway, I had the idea to check whether the last second of each track was similar to the first second of the next one, perhaps by running both through a Fourier transform and looking for similar peaks in the output. To do that I need to do two things: decode the audio track (these are in FLAC format), and run an FFT over the output.
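For what it's worth, the comparison half might look something like the rough sketch below. It assumes the decoded audio is already sitting in `double[]` arrays of mono samples, and it uses a naive DFT rather than a proper FFT so that it stays self-contained. The class name, the method names, and the "take the N loudest bins" notion of a peak are all made up for illustration, not taken from any library.

```java
import java.util.Arrays;

public class SpectrumPeaks {
    // Naive O(n^2) DFT: magnitude of bins 0 .. n/2-1 (a real FFT would replace this).
    public static double[] magnitudes(double[] samples) {
        int n = samples.length;
        double[] mags = new double[n / 2];
        for (int k = 0; k < mags.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    // Crude "peaks": indices of the `count` largest bins, returned in ascending order.
    public static int[] topPeaks(double[] mags, int count) {
        Integer[] idx = new Integer[mags.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(mags[b], mags[a]));
        int[] top = new int[Math.min(count, idx.length)];
        for (int i = 0; i < top.length; i++) top[i] = idx[i];
        Arrays.sort(top);
        return top;
    }

    // Fraction of the bins in `a` that also appear in `b` (which must be sorted).
    public static double peakOverlap(int[] a, int[] b) {
        if (a.length == 0) return 0.0;
        int shared = 0;
        for (int bin : a) {
            if (Arrays.binarySearch(b, bin) >= 0) shared++;
        }
        return (double) shared / a.length;
    }

    // Stand-in for decoded audio: a pure sine tone.
    public static double[] tone(double freqHz, int sampleRate, int n) {
        double[] out = new double[n];
        for (int t = 0; t < n; t++) {
            out[t] = Math.sin(2 * Math.PI * freqHz * t / sampleRate);
        }
        return out;
    }

    public static void main(String[] args) {
        int rate = 8000, n = 1024;
        int[] endOfTrack = topPeaks(magnitudes(tone(440, rate, n)), 5);
        int[] startOfNext = topPeaks(magnitudes(tone(440, rate, n)), 5);
        int[] unrelated = topPeaks(magnitudes(tone(2010, rate, n)), 5);
        System.out.println("same tone:      " + peakOverlap(endOfTrack, startOfNext));
        System.out.println("different tone: " + peakOverlap(endOfTrack, unrelated));
    }
}
```

With the synthetic tones in `main`, the two 440 Hz buffers should share all their peak bins and the 2010 Hz one none. Real music would almost certainly need a window function and some tolerance for neighbouring bins before a score like this meant anything.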
Surprisingly, it's the first part that's proving to be the hard one. Java-based FLAC decoders are a bit thin on the ground and the only one I've found is jFLAC, last released in 2008. It looks suspiciously like a port of some C-based code with chunks commented out until it compiled... but it does at least open the stream and parse the header with what's not the most terrible API I've come across. Having got that far I then spent the past hour bashing out code to seek to a probable location, search for the start of the last second of audio, and then pull all the data out.
Does my sample-munging code work? Well, I don't know, because jFLAC just spat a STREAM_DECODER_UNPARSEABLE_STREAM exception out at me. That turns out to be because it's an incomplete implementation that doesn't support a slightly newer encoding added to the spec in 2007. Sigh.
Ironically enough, I'd been considering writing my own Java FLAC decoder because of how unfriendly jFLAC's API was in places, so it looks like I will be doing that after all... (Yes, I could raise a bug and/or try to fix jFLAC, but it hasn't been touched since 2011 and appears abandoned.)