  • Custom User Avatar

    The Unicode standard only uses 1- to 4-byte sequences, but the original UTF-8 design technically allows up to 6 bytes per character. Once decoded, though, those longer sequences just won't resolve to any actual character. The kata doesn't ask for the character's representation anywhere in the code, just the code point that it would represent. If you have further questions, please feel free to reply with them.

  • Custom User Avatar

    Issue fixed! Tests now cover sequences of 6 bytes and longer, and an empty array is expected on error. I have also added other randomized error checks, such as continuation byte errors.

  • Custom User Avatar

    Thanks for the update! At the time I was developing the tests, I was not thinking about anything higher than what the Unicode standard supports (up to 0x10FFFF, although that code point itself is a noncharacter). I will keep this thread updated, although it should be as simple as increasing a number in the test code.
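
For anyone following the thread, here is a minimal Python sketch of the 1- to 6-byte layout discussed above. It is not the kata's reference solution: the function name `decode_code_points` and its signature are hypothetical, and returning an empty list on any error is only an assumption modeled on the "empty arrays are returned on error" behaviour mentioned in the thread. It also does not reject overlong encodings or surrogates, which a strict decoder would.

```python
def decode_code_points(data):
    """Decode bytes in the original 1- to 6-byte UTF-8 layout into code points.

    Hypothetical helper for illustration. Returns [] on any malformed input.
    """
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte's high bits give the sequence length and payload bits.
        if lead < 0x80:
            length, value = 1, lead
        elif 0xC0 <= lead < 0xE0:
            length, value = 2, lead & 0x1F
        elif 0xE0 <= lead < 0xF0:
            length, value = 3, lead & 0x0F
        elif 0xF0 <= lead < 0xF8:
            length, value = 4, lead & 0x07
        elif 0xF8 <= lead < 0xFC:
            length, value = 5, lead & 0x03
        elif 0xFC <= lead < 0xFE:
            length, value = 6, lead & 0x01
        else:
            # Stray continuation byte (0x80-0xBF) or 0xFE/0xFF as a lead byte.
            return []

        if i + length > len(data):
            return []  # sequence is truncated

        for byte in data[i + 1:i + length]:
            if byte & 0xC0 != 0x80:
                return []  # continuation byte error: must look like 10xxxxxx
            value = (value << 6) | (byte & 0x3F)

        code_points.append(value)
        i += length
    return code_points
```

A 4-byte sequence such as `[0xF0, 0x9F, 0x98, 0x80]` decodes to `[0x1F600]`, while a 6-byte sequence such as `[0xFD, 0xBF, 0xBF, 0xBF, 0xBF, 0xBF]` decodes to `[0x7FFFFFFF]`, far beyond the Unicode limit of 0x10FFFF. That is why only the code point, not a character, can be reported for the longer forms.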