Discuss Earth-mover's distance

6 kyu

Earth-mover's distance

18 of 96geoffp

Fundamentals

Algorithms

Mathematics

Arrays

natan (1 kyu)

2 years ago

Issue
python:

tests are affected by solutions mutating input, and should change to decorator-style and import test lib/solution

I would have submitted a fork, but Python (like all the other languages) was added directly from within the kata editor, so I'm not sure how I'd go about that.
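
A minimal sketch of the mutation problem raised in this issue, assuming a generic test helper rather than the kata's real test code. The names `check`, `reference` and `mutating_solution` are hypothetical. The idea is to compute the expected value on a private copy before the user's solution ever touches the input.

```python
import copy


def check(user_solution, reference_solution, *args):
    """Compare a user solution with the reference without sharing inputs."""
    # Compute the expected value first, on a private copy, so a solution that
    # mutates its arguments cannot corrupt the expected result.
    expected = reference_solution(*copy.deepcopy(args))
    # Keep a pristine copy of the input for the failure message.
    shown = copy.deepcopy(args)
    actual = user_solution(*args)
    assert actual == expected, (
        f"input {shown!r}: expected {expected!r}, got {actual!r}"
    )
    return True


def reference(values):
    # Hypothetical reference solution.
    return sum(values)


def mutating_solution(values):
    # Hypothetical user solution that empties its input while working.
    total = 0
    while values:
        total += values.pop()
    return total


data = [1, 2, 3]
check(mutating_solution, reference, data)  # passes although data is now empty
```

A real test for this kata would presumably compare floating-point results with a tolerance instead of `==`.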
- hobovsky (1 dan)
  
  2 years ago
  
  Kata forks work well whether the language was added with a translation or with the editor. That was one of the reasons for introducing the ability to fork the current state of a kata.
  
  Reply
- natan (1 kyu)
  
  2 years ago
  
  I've submitted an update with various touch-ups:
  
  - calls the reference solution first, in case the codewarrior's solution mutates the input
  - shows the input in assertion failures (except for the big tests, which don't fit as plain text within the 1.5MB output limit)
  - turned the RNG into a class so that instances hold the state instead of a global; no functional change
  - added imports of codewars_test and solution
  - switched to decorator-style "it" blocks
  
  Reply
- geoffp (1 kyu)
  
  2 years ago
  
  Thanks for doing this.
  
  Issue marked resolved by geoffp 2 years ago
  
  Reply
- Reply
depial (1 kyu)

2 years ago

Issue
R version has no random tests
- geoffp (1 kyu)
  
  2 years ago
  
  The random tests are generated using a fixed pseudo-random seed, so they are the same every time. This ensures that any bugs in your code will be reproducible.
  
  Issue marked resolved by geoffp 2 years ago
  
  Reply
- depial (1 kyu)
  
  2 years ago
  
  This comment has been hidden.
  
  Reply
- geoffp (1 kyu)
  
  2 years ago
  
  This is a conversation I've had several times before on Codewars. For software testing in general, unpredictable unit tests are a really bad idea - they lead to bugs that occur only sometimes, unpredictably. Such bugs can be very hard to find and fix. But in an adversarial context like Codewars, something is needed to prevent the adversary easily defeating the system. There are some compromise solutions possible, which I've experimented with in my other kata. But for a simple kata like this one, I'll concede the point and switch to using unpredictable random seeds.
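
To make the trade-off concrete, here is a small sketch (not geoffp's actual test code) contrasting a fixed seed with an unpredictable one. The generator `make_cases`, its parameters and the seed value 2020 are assumptions for illustration. Printing the unpredictable seed is one possible way to keep failures replayable; nothing in the thread says the kata does this.

```python
import random


def make_cases(rng, count=3, size=4):
    """Generate `count` random test inputs using the supplied generator."""
    return [[rng.randint(-100, 100) for _ in range(size)] for _ in range(count)]


# Fixed seed: every run sees exactly the same cases, so failures are reproducible.
fixed_a = make_cases(random.Random(2020))
fixed_b = make_cases(random.Random(2020))
assert fixed_a == fixed_b

# Unpredictable seed: drawn from OS entropy, but reported so that a failing
# run can still be replayed later.
seed = random.SystemRandom().getrandbits(32)
print(f"random tests seeded with {seed}")
cases = make_cases(random.Random(seed))
assert cases == make_cases(random.Random(seed))
```

Using a `random.Random` instance rather than the module-level functions keeps the generator state local to the tests.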
  
  Reply
- depial (1 kyu)
  
  2 years ago
  
  Since you've had this conversation before, I don't want to make you repeat yourself. However, if you could point me to a more detailed conversation you've had, I'd be interested in reading it.
  
  On Codewars, I find that the benefits of random tests outlined in the docs most often far outweigh the downsides I've seen, so I don't know if there is something outside my experience that you are talking about. In short, I could understand that unpredictable unit tests could be bad for software testing in principle, but which of those downsides readily transfer to the Codewars specific case in practice?
  
  Reply
- geoffp (1 kyu)
  
  2 years ago
  
  My Optical Character Recognition kata had some discussion of this about 3 years ago.
  
  As for practical Codewars-specific downsides: I actually came across one only yesterday, while changing the R version of this kata. I removed the call to set.seed(), checked that everything still seemed to be working (with different test cases on each submission), and hit "Re-publish" - only to be told that the kata couldn't be published because my solution was failing tests. (Wait, what?) It turned out that the test code itself contained a rarely-occurring bug (arising from the way the sample() function in R behaves differently when its main argument is a vector of length 1). This hadn't been an issue before, because it didn't affect any of the 512 random test cases generated by the fixed random seed I was using. But with a different seed on each submission, you occasionally get test cases that are affected, causing a crash. It was sheer good luck that such a test case was generated by Codewars' (one and only!) final check before publication. If that hadn't happened, then some hapless codewarrior would have had to discover the problem, slowly and painfully realize that the problem wasn't in their own code, get frustrated, raise an issue on the discussion board, etc. - all the stuff that unit testing is supposed to prevent.
  
  Having unpredictable random test cases makes it harder to create good kata that work reliably, because the users will be exposed to test cases that the kata author never used or considered when developing the kata. Often this doesn't cause any problems, but sometimes - as I was reminded yesterday - it does.
  
  Another Codewars-specific problem relates to execution time. With some kata (7x7 Skyscrapers is a good example) randomly-generated test cases can vary widely in required execution time. Algorithmic complexity is usually quoted for the worst case, but what Codewars measures is more like the best case, because you can keep re-trying until you get a test set that doesn't include any hard examples. I've certainly had the experience of writing a not-really-good-enough kata solution and getting it to pass anyway by repeatedly hitting "Attempt" until it gets a problem set that it can solve within the required 12 seconds.
  
  There are some compromise solutions. You can have a fixed set of test cases, but present them in an unpredictable random order; this at least makes cheating a bit more difficult. Or you can minimize the number of unpredictable test cases (only a few are really needed to catch the cheaters). It would also help if the Codewars test frameworks all stopped execution after the first failed test, instead of trying all the tests and showing what all the expected answers are.
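
The compromises listed above could be sketched roughly as follows. Everything here is hypothetical: `FIXED_CASES`, `build_test_plan` and `run_until_first_failure` are illustrative names, and the inputs are plain integer lists rather than this kata's real data.

```python
import random

# Hypothetical inputs that the author has already vetted.
FIXED_CASES = [[1, 2, 3], [5, 5], [-4, 0, 9, 2], [7]]


def build_test_plan(n_random=2, rng=None):
    """Fixed cases plus a few unpredictable ones, in a shuffled order."""
    if rng is None:
        rng = random.Random()  # unseeded, so it differs between runs
    plan = [list(case) for case in FIXED_CASES]  # copies; master list untouched
    for _ in range(n_random):
        plan.append([rng.randint(-50, 50) for _ in range(rng.randint(1, 5))])
    rng.shuffle(plan)
    return plan


def run_until_first_failure(plan, solution, reference):
    """Return the first failing case, or None if every case passes."""
    for case in plan:
        if solution(list(case)) != reference(list(case)):
            return case  # stop here; later cases are never revealed
    return None


plan = build_test_plan()
print(run_until_first_failure(plan, sum, sum))  # None
```

Only the few random cases are unvetted, which limits how often users meet inputs the author never tried.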
  
  In the end, I guess it comes down to what the purpose of Codewars is. If it's just to help people learn to program, then perhaps cheating doesn't matter too much and we should emphasize reliable, well-tested, maintainable code. But if the value of 1kyu status goes beyond bragging rights with your friends (if it's helping people get real-world jobs, for example), then it needs to be as difficult as possible to defeat.
  
  Anyway, thanks for your interest. If you have any further thoughts, please post them here.
  
  Reply
- depial (1 kyu)
  
  2 years ago
  
  Thanks for the detailed response. I am acutely aware of the execution time issue, since this effectively shelved a kata I had made. The algorithm (which, judging by your OCR kata, you are likely familiar with) I was promoting is quite sensitive to differing inputs and Voile found an alternative algorithm which can usually solve the problem, but can be made to time out or fail. However, this also resulted in the reference algorithm timing out occasionally. If I used a set random seed to find a quickly executing input set, then I could probably get around this, but I suppose this could (unfairly?) promote my (standard but still arbitrary) initialization.
  
  The sample() bug is an interesting case as well. I've had "fun" with making translations when I've come across these little quirks in a language. They can definitely be frustratingly educational.
  
  As a somewhat relevant aside... I recently solved a coding problem on an educational platform which uses fully fixed tests. When I found that codewars also had a kata with the same problem, I tried the algorithm that I had come up with. While it passed all the tests on the other site, it failed here. It turns out there were some edge cases which their test suite had missed, but the random tests on codewars found.
  
  I guess finding the right balance isn't as straightforward as I had initially thought. Cheers!
  
  Reply
- Kacarott (2 dan)
  
  2 years ago
  
  Having unpredictable random test cases makes it harder to create good kata that work reliably, because the users will be exposed to test cases that the kata author never used or considered when developing the kata.
  
  I would argue that this actually helps to make better kata. Occasionally people run into situations that were never considered by the author; they raise an issue, it gets fixed, and now the kata is better. While it is true that it's not great that a user will occasionally run into this kind of bug, I also think it's not great to have (rare) kata which are actually incomplete due to author solutions simply never being tested on inputs which they fail on.
  
  Reply
- depial (1 kyu)
  
  2 years ago
  
  I have to admit that I generally agree with Kacarott here, but I do greatly value the insight from geoffp. Indeed, I used it to reconfigure the test set on the kata I mentioned above that I had made and given up on. I think now it could work, so I republished it, albeit with a variation of the ideas that we've been discussing. Namely, I used a set of favorable seeds from which one is chosen at random to run the test set. The seeds take out the extreme sensitivity of the algorithm to differing inputs, which allows the alternative algo to consistently fail.
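
A rough sketch of the "pool of favorable seeds" idea described above, not depial's actual test code. The seed values and the helper `make_testset` are made up for illustration.

```python
import random

# Hypothetical seeds the author has checked in advance.
VETTED_SEEDS = (11, 42, 2024)


def make_testset(seed, count=5):
    """Build a deterministic test set from one seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


chosen = random.choice(VETTED_SEEDS)  # the only unpredictable step
testset = make_testset(chosen)
```

Each submission gets one of a small number of known test sets, so the author can verify all of them beforehand while users still cannot rely on a single fixed set.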
  
  It's still not the strongest way I could make the tests, but I think it could be a good alternative to what I had before, while still requiring relatively little extra time in test production.
  
  Reply
- jpssj (1 dan)
  
  2 years ago
  
  1 edit
  
  Somewhat off-topic, but I really like to read detailed and insightful exchanges like this one. If you power users could make more of it, it would be really great (to me, at least).
  
  Reply
- Reply
Koistinen (3 kyu)

4 years ago

Issue
This is for C: While the probability values look ok, the position values of test #4 in Random_Tests are consistently wrong and seem to be very large and not in order.

11,8 {(-16807268829305383559342375576237254726043224244224.000000, 0.054688), (-16441893419972657829791454368058183971129241108480.000000, 0.109375), (24845527834625349609462642156176811334150853230592.000000, 0.015625), (-8769009823985417509222108996297698117935595257856.000000, 0.234375), (28499281927952606904971854237967518883290684588032.000000, 0.132812), (365375409332725729550921208179070754913983135744.000000, 0.046875), (24480152425292623879911720947997740579236870094848.000000, 0.070312), (28133906518619881175420933029788448128376701452288.000000, 0.085938), (10595886870649046156976715037193051892505510936576.000000, 0.070312), (-2923003274661805836407369665432566039311865085952.000000, 0.039062), (-8038259005319966050120266579939556608107628986368.000000, 0.140625)}

{(-13884265554643577722935005910804688686731359158272.000000, 0.109375), (-7307508186654514591018424163581415098279662714880.000000, 0.156250), (-21191773741298092313953430074386103785011021873152.000000, 0.031250), (-12788139326645400534282242286267476421989409751040.000000, 0.234375), (-30326158974616235552726460278862872657860600266752.000000, 0.078125), (-5115255730658160213712896914506990568795763900416.000000, 0.171875), (-25210903243958075339013563364355882089064836366336.000000, 0.171875), (-2192252455996354377305527249074424529483898814464.000000, 0.046875)}
- geoffp (1 kyu)
  
  4 years ago
  
  Those numbers are correct. The test case just has some big numbers in it, that's all. True, they aren't sorted into any kind of order -- but the kata description never promised they would be.
  
  Issue marked resolved by geoffp 4 years ago
  
  Reply
- Koistinen (3 kyu)
  
  4 years ago
  
  This comment has been hidden.
  
  Reply
- Reply
elmstedt (3 kyu)

5 years ago

Issue
R translation is broken.

Test cases include random NA values in px and py, and there is no consistency in the expected test result. Sometimes the test wants the function to return NA other times it seems to want a missing px value to be treated as 1.

Nowhere in the description is it suggested there may be NA values in the test cases, if you intend to include them you must indicate the proper way to handle them.
- geoffp (1 kyu)
  
  5 years ago
  
  Thanks for giving the R translation a try! The bug that was creating the NA values should be fixed now.
  
  Issue marked resolved by geoffp 5 years ago
  
  Reply
- Reply
u10root (5 kyu)

5 years ago

1 edit

Question
This comment has been hidden.
- geoffp (1 kyu)
  
  5 years ago
  
  Your code changes the content of the arrays x, y, px, and py. Arrays in Javascript are passed by reference, so this can cause problems if the code that called your function needed that data for other purposes.
  
  I've just tweaked the test setup so that it doesn't use those arrays for anything after calling your function; your solution should pass now.
  
  In general, though, if you are passed an array, it's safest to avoid changing its content unless you know for sure that it's not being used anywhere else.
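
The reply above is about JavaScript, but Python lists are shared with the caller in the same way. A small illustration, with hypothetical function names:

```python
def median_mutating(values):
    values.sort()  # sorts the caller's list in place
    return values[len(values) // 2]


def median_safe(values):
    ordered = sorted(values)  # new list; the caller's data is untouched
    return ordered[len(ordered) // 2]


data = [3, 1, 2]
median_safe(data)
assert data == [3, 1, 2]

median_mutating(data)
assert data == [1, 2, 3]  # the caller now sees the reordered list
```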
  
  Question marked resolved by geoffp 5 years ago
  
  Reply
- Reply
u10root (5 kyu)

5 years ago

2 edits

Question
This comment has been hidden.
- geoffp (1 kyu)
  
  5 years ago
  
  This comment has been hidden.
  
  Question marked resolved by geoffp 5 years ago
  
  Reply
- dfhwze (2 dan)
  
  2 years ago
  
  I used that same algorithm, just need to tweak for the general case.
  
  Reply
- Reply
hobovsky (1 dan)

5 years ago

Question
In C++, is there some particular reason to have params passed as const value? Why not const reference, or just mutable value?
- geoffp (1 kyu)
  
  5 years ago
  
  Sorry, my mistake. It was meant to be const reference.
  
  Question marked resolved by geoffp 5 years ago
  
  Reply
- Reply
Madjosz (1 kyu)

5 years ago

1 edit

Suggestion
Java has no sample Tests.

Maybe mention in the description that the x and y are not sorted.
- geoffp (1 kyu)
  
  5 years ago
  
  Thanks, the Java sample tests are there now. Some of them have the x and y not sorted, so anyone who skips the sorting isn't going to get far.
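
The thread never spells out the algorithm, so the following is only a sketch under an assumption: that the kata asks for the standard one-dimensional earth-mover's (Wasserstein-1) distance between two discrete distributions, given as positions `x`, `y` with probabilities `px`, `py`. The function name `emd_1d` is hypothetical. It shows why sorting matters: the distance is accumulated by sweeping the positions from left to right.

```python
def emd_1d(x, px, y, py):
    """1-D earth-mover's distance between two discrete distributions.

    Assumes non-empty inputs and that px and py each sum to the same total.
    """
    # Merge both distributions into (position, signed mass) events.
    # sorted() builds a new list, so the caller's arrays are not modified.
    events = sorted(
        [(xi, pi) for xi, pi in zip(x, px)]
        + [(yi, -qi) for yi, qi in zip(y, py)]
    )
    total = 0.0
    cdf_diff = 0.0  # difference between the two cumulative distributions
    prev = events[0][0]
    for pos, mass in events:
        # Mass equal to |cdf_diff| has to be carried across this gap.
        total += abs(cdf_diff) * (pos - prev)
        cdf_diff += mass
        prev = pos
    return total


# Unsorted positions: each half unit of mass moves a distance of 1.
print(emd_1d([2, 1], [0.5, 0.5], [3, 2], [0.5, 0.5]))  # 1.0
```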
  
  Suggestion marked resolved by geoffp 5 years ago
  
  Reply
- Reply
mouwat (4 kyu)

5 years ago

Suggestion
This comment has been hidden.
- dramforever (1 kyu)
  
  5 years ago
  
  3 edits
  
  First, you can make your code look good in comments using markdown formatting
  
  Regarding your code, while I do agree that 6kyu is a bit low for this kata, it's not a 'math problem' requiring a 'formula' at all. In fact, at a glance your approach is valid, and you should consider ways it could be made faster.
  
  Reply
- geoffp (1 kyu)
  
  5 years ago
  
  This comment has been hidden.
  
  Suggestion marked resolved by geoffp 5 years ago
  
  Reply
- Reply
tonylicoding (2 kyu)

5 years ago

Suggestion
Maybe test.expect would be better than test.assert_equals for the last test prohibiting the use of the module.
- geoffp (1 kyu)
  
  5 years ago
  
  Done. Thanks for the suggestion.
  
  Suggestion marked resolved by geoffp 5 years ago
  
  Reply
- Reply
geoffp (1 kyu)

5 years ago
I'll admit I was disconcerted to see my kata approved so soon. I'd have preferred to leave it in beta for a while longer to see some solutions in other languages and gather opinions on the right rank for it. I'd thought 5-6 kyu, but wouldn't have been put out if it ended up 4 kyu.

I knew about the existing R package that solves this problem, but wasn't aware of the Python module. Evidently some people did, though; I guess that's the beta process working as intended. By the time I'd added a test to block the use of the library in Python, the kata was already approved.

The reason this kata appears to be from 2017 is that I had a couple of draft kata that I'd been using for casual experimentation and trying stuff out. Then in 2020 I decided I didn't need both of them, and turned one of them into the Earth-mover's distance kata.
- hobovsky (1 dan)
  
  5 years ago
  
  To prevent exactly such cases of premature approval, I usually create an issue in my kata with a comment like "do not resolve this issue, kata is not ready yet and I want more feedback". This usually stops trigger-happy users who are overly enjoying their new approval privileges from clicking buttons blindly.
  
  Reply
- Reply
user7820265 (1 dan)

5 years ago
This is only a math problem; you don't need to know it or solve it. Waste of time!
- user9644768 (1 dan)
  
  5 years ago
  
  1 edit
  
  => is math mere timewaste?
  
  O_O
  
  Reply
- Blind4Basics (2 dan)
  
  5 years ago
  
  when it ends up ranked 6 kyu while it should be around 4, yes...
  
  Remember this?
  
  Reply
- user9644768 (1 dan)
  
  5 years ago
  
  4 edits
  
  2017 kata, estimated rank = 5, community feedback rank = 6. No |ZEDCWT, mrtp0| feedback
  
  Reply
- user9644768 (1 dan)
  
  5 years ago
  
  While approving libraries weren't disabled => I thought author didn't want to disable that => Just a library kata. (mark this as spoiler)
  
  Reply
- Blind4Basics (2 dan)
  
  5 years ago
  
  3 edits
  
  lol...
  
  from 2017
  
  Except that 1 week ago, there weren't any rank suggestions, and you know it perfectly well since you were the first to complete it. So it's just as if it were from 2020. You're dishonest to a point it makes me sick...
  
  And I'm curious about who are the voters...?
  
  MercyMadmask? Cool we now have another user who began to rank without realizing what he's doing.
  
  I bet you finally voted to get it approved. And we both know you don't get anything to rank suggestions.
  
  who's the third?
  
  (eager to see your flagged message...)
  
  Good for you if you know about that. I don't.
  
  Reply
- user9644768 (1 dan)
  
  5 years ago
  
  unflagged. :D
  
  Reply
- user9644768 (1 dan)
  
  5 years ago
  
  In 2017, the author estimate was 5 kyu.
  
  Reply
- Blind4Basics (2 dan)
  
  5 years ago
  
  and why do I feel like remembering a blue estimation on this, then...?
  
  Reply
- Mercy Madmask (1 dan)
  
  5 years ago
  
  Oh, B4B raging, now that's a sight. :o
  
  Reply
- G_kuldeep (1 dan)
  
  5 years ago
  
  1 edit
  
  it's usual nowadays ;)
  
  Reply
- Mercy Madmask (1 dan)
  
  5 years ago
  
  2 edits
  
  Haven't looked at comments much recently. ¯\_(ツ)_/¯
  
  Reply
- dfhwze (2 dan)
  
  2 years ago
  
  Maybe difficulty across languages differs a lot, but I used the simplest of algorithms to solve this kata in JS in 3 seconds. It does look 6 kyu to me, at least in JS.
  
  Reply
- Reply
Mercy Madmask (1 dan)

5 years ago

Issue
In the description, you write (1/6) * (2-1) + (2/6) * (4-2) + (1/6) * (5-3) = 1.

But if I'm not wrong this is: 1/6 * 1 + 2/6 * 2 + 1/6 * 2 = 1/6 + 4/6 + 2/6 = 7/6 != 1
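
The arithmetic can be checked exactly with Python's `fractions` module; this only verifies the sum quoted above and says nothing about how the kata itself computes the distance.

```python
from fractions import Fraction

# The three terms from the description, kept as exact rationals.
terms = [
    Fraction(1, 6) * (2 - 1),
    Fraction(2, 6) * (4 - 2),
    Fraction(1, 6) * (5 - 3),
]
total = sum(terms)
print(total)  # 7/6
```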
- user9644768 (1 dan)
  
  5 years ago
  
  Fixed.
  
  Issue marked resolved by user9644768 5 years ago
  
  Reply
- geoffp (1 kyu)
  
  5 years ago
  
  Thanks.
  
  Reply
- Reply
