• Custom User Avatar

    The total average is the average of all non-discarded values, not the average of the averages.
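
    A minimal sketch of the difference, with made-up values that are assumed to be already filtered:

    ```python
    # Hypothetical groups of non-discarded values.
    groups = [[1, 2, 3], [10]]

    # Total average: one average over all non-discarded values.
    values = [v for g in groups for v in g]
    total_average = sum(values) / len(values)        # (1+2+3+10)/4 = 4.0

    # Average of the per-group averages: a different number.
    average_of_averages = sum(sum(g) / len(g) for g in groups) / len(groups)
    # (2.0 + 10.0) / 2 = 6.0

    print(total_average, average_of_averages)
    ```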

  • Custom User Avatar

    This comment is hidden because it contains spoiler information about the solution

  • Custom User Avatar

Non-spoiler comment: I welcome ideas, suggestions, or simply edits to change the tests so that they are about as hard on current hardware as they were on old hardware seven years ago.

  • Custom User Avatar

    This comment is hidden because it contains spoiler information about the solution

  • Custom User Avatar

    In general, you should avoid vague phrases like "does not work" when describing a programming problem. Others can't help you without more specific explanations.

  • Custom User Avatar

    Thanks! Approved.

  • Custom User Avatar

    Looks good! Thanks.

  • Custom User Avatar

    @lechevalier and I came up with a pure Python solution. Finding the fastest Python data structures for this task took a bit of trial and error, but it's definitely possible. Maybe someone will find a pure Python solution that's even faster...

  • Custom User Avatar

    bitarray would be great, but it's an external library and not available on Codewars.

  • Custom User Avatar

    @lechevalier and I came up with a pure Python solution that passes the full tests. No NumPy, no gmpy2. Just over twenty lines, quite readable, nothing fancy. Finding the most efficient data types and iterators for the task took a bit of trial and error, but it's definitely possible.
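
    For readers who haven't seen the bytearray slice-assignment trick the thread keeps referring to, here is a minimal sketch in that style. It is not the solution described above, just a plain Eratosthenes sieve:

    ```python
    def primes_below(n):
        """Plain bytearray sieve; a sketch, not the solution referenced above."""
        if n < 2:
            return []
        sieve = bytearray([1]) * n            # 1 means "possibly prime"
        sieve[0:2] = b'\0\0'
        for p in range(2, int(n ** 0.5) + 1):
            if sieve[p]:
                # Clear all multiples of p with a single slice assignment.
                sieve[p * p::p] = bytes(len(range(p * p, n, p)))
        return [i for i in range(n) if sieve[i]]
    ```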

  • Custom User Avatar

    I played around with the code some more and ran more measurements.

I went back to bytearray, mostly because x * b'\0' is already a bytes object, and wrapping it in array('B', ...) seems to be slower.

    I also ran lots of measurements with several ways of generating the list from the sieve. See my fork for details.
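
    To make the bytes vs. array('B', ...) point concrete, a rough micro-benchmark along these lines (timings obviously vary by machine and Python version):

    ```python
    from array import array
    from timeit import timeit

    n = 10_000_000

    # Zero-filled buffer via bytes multiplication...
    t_bytes = timeit(lambda: n * b'\0', number=20)
    # ...versus copying it into an array('B', ...) on top.
    t_array = timeit(lambda: array('B', n * b'\0'), number=20)

    print(f"bytes:      {t_bytes:.3f}s")
    print(f"array('B'): {t_array:.3f}s")   # the extra copy costs time
    ```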

  • Custom User Avatar

    Thanks a lot! Great stuff!

    The more I try to optimize this thing, the more I understand that I don't understand Python performance...

For example, setting the entries for the multiples to zero: I thought that using an iterator would be the fastest approach, and I didn't even try using an array. But your solution shows that creating an array is actually faster, even though it may have to allocate quite a bit of memory!

For example, to mark all the multiples of 5 in the sieve for 375 million numbers, the array size (n-s1+p2-1)//p2 is about 375 million / 3 / 10 = 12.5 million bytes. And we use two arrays of that size!

But maybe Python is smart enough not to actually allocate the arrays, because they're all zeros anyway. Or maybe Linux's memory management helps: big allocations come from mmap(), and the kernel doesn't actually commit the pages until they are first written. I have no idea.

Or maybe it actually does allocate the arrays, and that's still faster than the iterators, because with the iterators we'd make 12.5 million function calls (times two)? I have no idea...
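
    To make the comparison concrete, a rough sketch of the two approaches: the zero-filled temporary stands in for the "array" variant, and a plain loop stands in for the iterator variant (names and sizes are made up):

    ```python
    from timeit import timeit

    n = 10_000_000
    p, start = 5, 25

    sieve = bytearray([1]) * n

    def with_slice():
        # One big zero-filled temporary, stored with a single slice assignment.
        sieve[start::p] = bytes(len(range(start, n, p)))

    def with_loop():
        # One Python-level store per multiple.
        for i in range(start, n, p):
            sieve[i] = 0

    print(timeit(with_slice, number=10))
    print(timeit(with_loop, number=10))
    ```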

  • Custom User Avatar

    I don't know why this has been invalidated. Looks good to me. I guess the test server was slow.

  • Custom User Avatar

    Thanks! Approved.

  • Custom User Avatar

    Generated by ChatGPT.
