Fascinating example. In many ways it reminds me of Milton Friedman's thermostat. Outside of your randomization approach, I'm not sure you can really apply a conventional confusion matrix analysis here. (Incidentally, have you tried computing an AUC for the model? It might be a more useful way of assessing the tradeoff between sensitivity and specificity than a single point estimate.)
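
To make the AUC suggestion concrete, here is a rough sketch of the difference between a single sensitivity/specificity pair and a threshold-free AUC. The labels and scores are made up purely for illustration, and the function names (`roc_auc`, `sensitivity_specificity`) are my own placeholders, not anything from your model. It assumes the model outputs a continuous score, with higher meaning "more likely positive".

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Single-point estimate: classify as positive when score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data, invented for this example only.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC = {auc:.2f}; at threshold 0.5: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The confusion matrix only describes the one threshold you happened to pick, whereas the AUC summarizes how well the scores rank positives above negatives across all thresholds. It does not get around the thermostat-style problem with the labels themselves, though.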

