The current code is factored such that reads to the entirety of the last
byte should be dropped. This was relying on the fact that last would be
one past the end in that case. Instead of actually reading that byte
when it's completely out of bounds of the bitmask, just skip reads that
would be invalid. Add more tests to make sure that the behavior is
correct for byte aligned reads of byte aligned bitmaps.