I forgot to remove the underscore from BAIL_OUT above, but fixing that had the same result.
Seems to be because of this:
Since the startup method has no tests, the plan isn't output until the runner hits a method that does have tests. It bails out before it reaches such a method, so even though the plan is for 99 tests and none of them are run, no failures are reported. It seems like the plan is the plan, regardless of how far you get, so _show_header should just be called at the beginning of runtests no matter what. Maybe I'm missing some piece of the puzzle, though.
If you don't want to make that the default behavior, could you allow passing in a flag to turn it on? It would help us a lot with our nightly automated tests to actually know how many tests failed because the run aborted in a failed startup method.
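
To make the difference concrete, here is a toy model of the two behaviors in Python. This is only a sketch of my understanding, not the module's actual code: `run_tests`, `eager_plan`, `BailOut`, and the output format are names and details I made up for illustration.

```python
class BailOut(Exception):
    """Hypothetical signal a method raises to abort the whole run."""


def run_tests(methods, eager_plan=False):
    """Run (name, planned_count, func) triples and return the lines emitted.

    eager_plan=False models the lazy behavior described above: the plan is
    emitted just before the first method that declares tests.
    eager_plan=True models the proposed behavior: emit the plan up front.
    """
    total = sum(count for _, count, _ in methods)
    out = []
    plan_shown = False
    test_number = 0

    def show_header():
        # Emit the plan line once, however many times this is called.
        nonlocal plan_shown
        if not plan_shown:
            out.append(f"1..{total}")
            plan_shown = True

    if eager_plan:
        show_header()
    try:
        for name, count, func in methods:
            if count > 0:
                show_header()
            func()
            for _ in range(count):
                test_number += 1
                out.append(f"ok {test_number} - {name}")
    except BailOut as exc:
        out.append(f"Bail out! {exc}")
    return out


def failing_startup():
    raise BailOut("startup failed")


# A startup method with no tests, followed by a method planning 99 tests.
methods = [("startup", 0, failing_startup), ("test_a", 99, lambda: None)]

print(run_tests(methods))                   # lazy plan: no plan line at all
print(run_tests(methods, eager_plan=True))  # eager plan: "1..99" comes first
```

With the lazy version, whatever reads the output never learns that 99 tests were planned, so it has nothing to count as missing. With the plan emitted first, the gap between 99 planned and 0 run is visible.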