Discuss Mean without outliers

5 kyu

Mean without outliers

679 of 783 kingcobra

Recursion

Statistics

Algorithms

Data Science

richardjana (2 kyu)

9 months ago

Issue
In Python, I seem to have had the same problem as some others: in the random tests, my result was sometimes off by 0.01. This happened a few times, each time on only one of the tests. I was able to pass by resubmitting several times, though...
- Reply
kit_sho_ets (5 kyu)

10 months ago
Please help: all my tests pass except two: "27.99 should equal 27.98" and "5.4 should equal 5.39". My approach: get the std and mean of the sample -> create new_sample (with outliers removed) -> if len(sample) != len(new_sample), call clean_mean(new_sample, cutoff) -> return the result. For example, with sample = [1.01, 0.99, 1.02, 1.01, 0.99, 0.97, 1.03, 0.99, 1.02, 0.99, 3, 10] and cutoff = 2, I call clean_mean 3 times (getting rid of 10, then 3, and returning the mean of [1.01, 0.99, 1.02, 1.01, 0.99, 0.97, 1.03, 0.99, 1.02, 0.99]).
- Reply
Just4FunCoder (2 dan)

15 months ago

Issue
This comment has been hidden.
- Reply
Just4FunCoder (2 dan)

15 months ago

2 edits

Issue
Python: Tests should use approximate equality (test.assert_approx_equals) instead of rounding + test.assert_equals when comparing floating-point numbers [Doc]
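
To illustrate the point above, here is a minimal sketch of why rounding followed by exact comparison can be brittle. It uses only the standard library's `math.isclose` as a stand-in for an approximate assertion; the values `a` and `b` are made up for the demonstration and are not taken from the kata's tests.

```python
import math

# Two values that differ only by a tiny floating-point error...
a = 2.675            # stored as slightly less than 2.675
b = 2.675 + 1e-12    # slightly more than 2.675

# ...can still land on different sides of a rounding boundary.
rounded_a = round(a, 2)
rounded_b = round(b, 2)
exact_match = rounded_a == rounded_b

# A tolerance-based comparison treats them as the same result.
approx_match = math.isclose(a, b, abs_tol=1e-9)

print(rounded_a, rounded_b, exact_match, approx_match)
```

This may explain reports of results that are occasionally off by 0.01 in random tests, although the discussion itself does not confirm the cause.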
- Reply
saudiGuy (8 kyu)

15 months ago

Suggestion
The new Python test framework is required. Updated in this fork.
- Reply
transan (3 kyu)

2 years ago
Nice Kata, thanks
- Reply
Vedanta war (5 kyu)

2 years ago
This comment has been hidden.
- NunoOliveira (1 kyu)
  
  2 years ago
  
  Please don't post solutions in discourse. Read this: https://docs.codewars.com/training/troubleshooting/#post-discourse
  
  Reply
- Reply
Vedanta war (5 kyu)

2 years ago
my solution
- Reply
Vedanta war (5 kyu)

2 years ago
I got it right, but it's not working
- Reply
Hunter_71 (3 kyu)

8 years ago
I had some problems getting the correct result until I carefully read the whole description :) Nice kata! Thx :)
- Reply
mentalplex (3 kyu)

8 years ago

1 edit

Suggestion
Your Python test suite is a little inefficient, specifically this part:

    cutoff = random.random()
    while cutoff < 0.5:
        cutoff = random.random()
    cutoff = round(cutoff * 5, 2)

It looks like you want the cutoff to be a real number chosen randomly from a uniform distribution between 2.5 and 5, rounded to 2 decimal places.
Might I suggest you replace that with: round(random.uniform(2.5, 5), 2)
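
As a rough sketch of the comparison, the two approaches can be put side by side. The function names `cutoff_by_rejection` and `cutoff_by_uniform` are hypothetical, chosen here only for illustration; the bodies follow the snippet and the suggestion quoted above.

```python
import random

def cutoff_by_rejection():
    """The quoted loop: redraw until the value is at least 0.5, then scale."""
    cutoff = random.random()
    while cutoff < 0.5:
        cutoff = random.random()
    return round(cutoff * 5, 2)

def cutoff_by_uniform():
    """The suggested replacement: draw directly from the target range."""
    return round(random.uniform(2.5, 5), 2)

random.seed(0)  # arbitrary seed, only to make the check repeatable
samples = [cutoff_by_rejection() for _ in range(1000)]
samples += [cutoff_by_uniform() for _ in range(1000)]
all_in_range = all(2.5 <= c <= 5.0 for c in samples)
print(all_in_range)
```

Both versions produce values in the same range; the second simply avoids discarding, on average, half of the draws.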
- kingcobra (2 kyu)
  
  8 years ago
  
  Thank you! A bit of a hack really, and not a particularly clever one. Your suggestion is much better, I'll change it.
  
  Suggestion marked resolved by kingcobra 8 years ago
  
  Reply
- Reply
mentalplex (3 kyu)

8 years ago

Suggestion
R Translation

Please carefully review and approve. The reference solution is commented, as is the test suite, to help you understand an unfamiliar language (in case you're not familiar with R).

I used the same basic tests as python, and similar parameters for the random tests. Though I chose a different structure to generate the random arguments.

The main difference is that I added a guaranteed outlier 50% of the time (rather than a likely outlier 2% of the time).
- kingcobra (2 kyu)
  
  8 years ago
  
  Looks good to me! Thank you for taking the time to comment the code. I have some knowledge of R, but your comments were a good help.
  
  Suggestion marked resolved by kingcobra 8 years ago
  
  Reply
- Reply
Voile (2 dan)

8 years ago
Approved
- Reply
KenKamau (1 kyu)

8 years ago
Almost gave up due to the nuisance rounding. All in all, a great Kata.
- kingcobra (2 kyu)
  
  8 years ago
  
  Thank you for mentioning that! I've updated the description to make it clearer that you are only supposed to round at the end.
  
  Reply
- Reply
ZozoFouchtra (1 dan)

8 years ago

Suggestion
I didn't understand a word of what was expected with cutoff and outliers until I read these lines in the Discourse:

If the cutoff is 3, then any value that is more than 3 standard deviations from the mean must be removed. We first calculate the mean and standard deviation. Then we multiply the standard deviation by 3 to get our actual cutoff value. Then for each value in the sample, we calculate its distance from the mean, i.e. abs(xi - x̅). If this distance is greater than our cutoff value, then the value is an outlier. No?

I think these words should be placed in the Description instead of the Discourse!

Happy coding! ; ) )
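
The rule quoted above can be sketched as a single "sweep" in Python. This is only an illustration under assumptions: the helper name `find_outliers` is hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed because it reproduces the std figure of 27.3045... quoted elsewhere in this discussion.

```python
from statistics import mean, pstdev

def find_outliers(sample, cutoff):
    """Return the values lying more than `cutoff` standard deviations from the mean."""
    centre = mean(sample)
    # Assumption: population standard deviation (divide by n, not n - 1).
    bound = cutoff * pstdev(sample)
    return [x for x in sample if abs(x - centre) > bound]

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
outliers = find_outliers(sample, 3)
print(outliers)
```

Note that the distance is measured from the mean, not from zero, which is the point the quoted explanation is making.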
- kingcobra (2 kyu)
  
  8 years ago
  
  There is a similar explanation in the description, but I agree that this one may be clearer. I'll add it to the description, thanks!
  
  Suggestion marked resolved by kingcobra 8 years ago
  
  Reply
- ZozoFouchtra (1 dan)
  
  8 years ago
  
  Thank you. (As neither a native English speaker nor a 'native' statistician, I found the cutoff-outlier game a real headache. I asked Google to translate the description, but it was even harder to understand, and I think there are some non-native English-speaking statisticians in CV.)
  
  ; ) )
  
  Reply
- Reply
Blind4Basics (2 dan)

8 years ago

Question
This comment has been hidden.
- Reply
ChristianECooper (1 kyu)

8 years ago

1 edit

Issue
This comment has been hidden.
- Voile (2 dan)
  
  8 years ago
  
  Your function must remove any outliers and return the mean of the sample, rounded to two decimal places.
  
  You just need to round off your result :)
  
  Reply
- ChristianECooper (1 kyu)
  
  8 years ago
  
  Well now I feel like an idiot, I reread that paragraph several times to be sure I wasn't missing anything!
  
  Issue marked resolved by ChristianECooper 8 years ago
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  The description is pretty long, I must admit... thank you for your patience!
  
  Reply
- Reply
Blind4Basics (2 dan)

8 years ago

1 edit

Question
Seems I'm not fully awake today... Or is there some issue?

    sample [1.01, 0.99, 1.02, 1.01, 0.99, 0.97, 1.03, 0.99, 1.02, 0.99, 3, 10]
    mean 1.91833333333
    sd 2.49805201885
    cutoff 2
    [1.01, 0.99, 1.02, 1.01, 0.99, 0.97, 1.03, 0.99, 1.02, 0.99, 3] => 1.1836363636363636 should equal 1.0

What!? (I know, I didn't round the result. But rounding leads to 1.2, so... :/ )

Note: the information about rounding is wrong, I think: 5.5 is rounded to 1 decimal place, not 2 (at least, that is the way to say it in French... :o ).
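
For readers hitting the same result, here is a hedged sketch of the repeated sweep that the replies below describe. It is not the kata's reference solution: the name `clean_mean_sketch` is hypothetical, the population standard deviation is assumed because it matches the sd of 2.498... in the output above, and a cutoff of at least 1 is assumed so that a sweep never removes every value.

```python
from statistics import mean, pstdev

def clean_mean_sketch(sample, cutoff):
    """Remove outliers sweep by sweep until none remain, then round the mean."""
    values = list(sample)
    while True:
        centre = mean(values)
        # Assumption: population standard deviation, cutoff >= 1.
        bound = cutoff * pstdev(values)
        kept = [x for x in values if abs(x - centre) <= bound]
        if len(kept) == len(values):
            # Nothing was removed in this sweep, so round only now.
            return round(centre, 2)
        values = kept

sample = [1.01, 0.99, 1.02, 1.01, 0.99, 0.97, 1.03, 0.99, 1.02, 0.99, 3, 10]
result = clean_mean_sketch(sample, 2)
print(result)
```

On this sample the first sweep removes 10, the second removes 3 (which only becomes an outlier once 10 is gone), and the third removes nothing.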
- Voile (2 dan)
  
  8 years ago
  
  You need to perform the process multiple times until there are no outliers.
  
  Reply
- ChristianECooper (1 kyu)
  
  8 years ago
  
  I think you need to keep repeating the process until your sample set doesn't change any more, only then do you return your mean value.
  
  Reply
- Blind4Basics (2 dan)
  
  8 years ago
  
  1 edit
  
  oh damn... :o
  
  I think it would be useful to rewrite this sentence of the description:
  
  Notice that, once outlying values are removed in a first "sweep", other less extreme values may then "become" outliers...
  
  to
  
  Notice that, once outlying values are removed in a first "sweep", other less extreme values may then "become" outliers that you'll have to remove too...
  
  Question marked resolved by Blind4Basics 8 years ago
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  I was trying not to be too explicit on that for a bit more of a challenge, but maybe it is only confusing and not actually challenging in the true sense. I'll update the description, thank you for your observation!
  
  Reply
- Reply
Voile (2 dan)

8 years ago

Question
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
    cutoff = 3
    test.assert_equals(clean_mean(sample, cutoff), 5.5)

This set of data has a mean and stdev of 14.090909090909092 and 28.637229424141385, which gives a maximum bound of 100.00259736333325.
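
The two stdev figures quoted in this thread (28.637... here and 27.304... further down) are consistent with the sample and population standard deviations respectively. The following sketch, using only Python's `statistics` module, shows how that choice alone decides whether 100 counts as an outlier for this data; the variable names are made up for illustration.

```python
from statistics import mean, pstdev, stdev

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
cutoff = 3

centre = mean(sample)
distance = abs(100 - centre)  # distance of the suspect value from the mean

# Population standard deviation divides by n; sample standard deviation by n - 1.
population_bound = cutoff * pstdev(sample)
sample_bound = cutoff * stdev(sample)

outlier_with_population_sd = distance > population_bound
outlier_with_sample_sd = distance > sample_bound

print(pstdev(sample), stdev(sample))
print(outlier_with_population_sd, outlier_with_sample_sd)
```

With the population figure, 100 lies just beyond three standard deviations; with the sample figure, it falls just inside.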
- kingcobra (2 kyu)
  
  8 years ago
  
  Since the cutoff is 3, according to my calculations the maximum bound would be ≈ 28.637 * 3, or 85.911.
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  Actually, your calculation of stdev is incorrect.
  
  Reply
- Voile (2 dan)
  
  8 years ago
  
  But you need to add the mean to that bound :P
  
  Otherwise consider this:
  
    sample = [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010]
    cutoff = 3
  
  The maximum bound of 85.911 will cut off everything.
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  Well, it depends how you look at it. ;)
  
  From my point of view, we are interested only in the standard deviation and the distance of each observation to the mean. For example :
  
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
    cutoff = 3
    mean = 14.090909090909092, std = 27.304526915561905
    "bound" = 81.91358074668571
    100 - 14.090909090909092 > bound, i.e. it is an outlier
  
  Reply
- Voile (2 dan)
  
  8 years ago
  
  If you don't add the mean to the range, then it's relative to 0, not to the mean.
  
  An outlier is defined as having a large distance to the mean.
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  Hmm. To recapitulate :
  
  If the cutoff is 3, then any value that is more than 3 standard deviations from the mean must be removed. We first calculate the mean and standard deviation. Then we multiply the standard deviation by 3 to get our actual cutoff value. Then for each value in the sample, we calculate its distance from the mean, i.e. abs(x_i - x̅). If this distance is greater than our cutoff value, then the value is an outlier. No?
  
  Reply
- Voile (2 dan)
  
  8 years ago
  
  Yeah, that's correct.
  
  Looks like it's good to go, I debugged my solution code and now it's working. :)
  
  Question marked resolved by Voile 8 years ago
  
  Reply
- kingcobra (2 kyu)
  
  8 years ago
  
  Thanks for the upvote on the kata! :)
  
  Reply
- codyhan94 (2 kyu)
  
  8 years ago
  
  @Voile that is a much cleaner way to do what I ended up doing..
  
  Reply
- Voile (2 dan)
  
  8 years ago
  
  :)
  
  Reply
- Reply
