I see many of the solutions use a divide and conquer approach (recursive), whereas I picked the random algorithm from footnote 5 of the Wikipedia article (non-recursive). With this algorithm, you can actually skip the call to sqrt to shave a little time.
In any case, good kata!
This solution always passes, but it would fail at ['zabab', 'ababz', 'abz']. Corresponding tests should be added.
['zabab', 'ababz', 'abz']
This solution doesn't pass the random tests only sometimes, but it passes all current fixed tests. An example of failure:
Input: "134", "222", "313", "321", "422", "233":
Shortest common superstring should be no longer than "11" (e.g. "31342223321") , got "12" ("422233132134")
The arguments of assertEquals are in the wrong order.
Thanks! The amount of frustration is usually proportional to the amount of satisfaction after finally finding a working solution.
I am sorry to hear about your frustration. I just hope you find your efforts worth it.
Congratulations for solving it!
The most frustrating kata I have completed so far. It was driving me nuts. Great kata nontheless.
I was trying to use numpy arrays (python) thinking it would make sorting and filtering much quicker. Somehow numpy arrays made my solution time-out. After I got rid of NumPy I passed in 3 or 4 seconds. Go figure!
Hard to give any advice without more information.
Thanks for the reply!
I will try to figure it out.
Okay so you mean not the manhattan distance, but you think that comparison with or without sqrt should yield the same result. This is true only partially.
Omitting sqrt will indeed yield the same result of comparison of the distances. But in a properly applied O(n log n) algorithm you also need to use an actual distance at one step, and if you do not apply sqrt, you end up with a value being too high. As a condequence, too many points are left for further analysis. Calculating actual value of the distance is required to apply the O(n log n) algorithm correctly.
I verified a couple of Python solutions and they still pass. It's difficult for me to give any specific advice to you without seeing your solution and tracking down the bug in it.
I am not calculating hypot I am calculating the square of the hypot.
But why on earth you need a sqrt in the in the distance function? We don't care about the absolute value of the distance! We can compare the squares. If hypot_a^2 > hypot_b^2 then hypot_a > hypot_b!
This kata does in fact have issue with the input!
Your problems do not seem to be caused by any issue with the kata.
About timeout: there's a couple of possibilities which I would expect to cause problems, but proper implementation completes the full test suite in 5-7 seconds. I believe you are still missing some case and perform too many calculations.
One gotcha is than built-in hypot tends to be slower than just calculation with sqrt, because hypot implements some more complex algorithm which handles overflow etc. But still, original Python solution with sqrt replaced with hypot still finishes well below 10 seconds.
About euclidean distance vs manhattan distance: your reasoning is not correct. With sqrt, distance between (0,0) and (1,1) is ~1.414... and is smaller than distance between (0,0) and (0,2). With manhattan distance (without sqrt), the distance between (0,0) and (1,1) is the same as between (0,0) and (0,2).
About floats: changing to ints would not help, your solution would most probably still be buggy or not performant enough.
This is not a kata issue but your code's issue. You may want to ask a question, maybe, so please post it as a question. And without providing your code, I don't see how anyone could help you.
Pls help me out!
Since we only need to compare disatnces, taking the square root should not be necessary in the distance function. However, if I get rid of the square root, my solution (in Python) starts failing. Why is that? As far as I know calculating a square root is computationaly very expensive so it would make sense to get rid of it.
Why the points are given as such a ridicolous floats? If these points are generated randomly, perhaps it would make sense to scale the points to ints?! I have a suspicion that I am timing out mainly due to the calculations within the distance function (stupid floats). I can optimize other parts of my code but I feel that distance funtion is the biggest hole.
Is this kata about Closest pairs or is it about the floats?