Discuss Japanese Romaji-to-Hiragana Converter

Mohrezakhorasany (3 kyu)

4 years ago

Thanks a lot, this was lots of fun. I also learnt about the Japanese language, double win! :)

Blind4Basics (2 dan)

6 years ago

3 edits

Suggestion

hi,

The sample tests are lacking a case with nyo or one of the two others like it.
Knowing nothing about this language, I ended up printing the whole map to understand how to get the ん "char" out of it. That wasn't clear to me from the fourth bullet point, and you didn't mention it in the "challenge" part, even though you reminded us about tsuSmall there. It might be worth adding explicitly to one of those sections.
The final note about ny[aou] totally lost me as to how to handle those three cases, because you provide the y[auo]Small stuff in the map just before it. Those 3 keys are actually totally useless for solving the task, so I'd recommend putting those three in a note, not in the "general challenge/task description".

ewingsa (3 kyu)

6 years ago

1 edit
All the stuff with nya was just to satisfy Voile, and it is not critical for the overall solution due to its ambiguity. However, "gyuunyuu" ("ぎゅうにゅう") is tested in the basic tests, which I believe is sufficient for testing that concatenation.

I am sorry you had trouble finding ん, but 4/5 of my kata is devoted to showing tables of hiragana, and I explicitly say "A map called hiragana with romaji keys and hiragana values, containing everything from the above tables has been provided." And honestly, the map is a convenience for challengers, and if challengers are confused, they can just print it like you did.

nyus are not useless to the task because of the "gyuunyuu" test, but I will move the yaSmall stuff to be a note.
Suggestion marked resolved by ewingsa 6 years ago

Blind4Basics (2 dan)

6 years ago

1 edit

nope it's not in java (edit: I talked about the sample tests)

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ExampleTest {
    @Test
    public void exampleTests() {
        assertEquals("かた", Kata.romajiToHiragana("kata")); // form (of practice)
        assertEquals("いえ", Kata.romajiToHiragana("ie")); // house
        assertEquals("おまつり", Kata.romajiToHiragana("omatsuri")); // festival
        assertEquals("ちはやふる", Kata.romajiToHiragana("chihayafuru")); // a popular manga
        assertEquals("さっか", Kata.romajiToHiragana("sakka")); // writer
        assertEquals("がんばって", Kata.romajiToHiragana("ganbatte")); // good luck
        assertEquals("びょういん", Kata.romajiToHiragana("byouin")); // hospital
        assertEquals("こんにちは", Kata.romajiToHiragana("konnichiha")); // hello
    }
}

It seems you don't understand the reason behind my suggestions. Note that I didn't raise an issue, just a suggestion. Meaning everything works, but the description is somewhat suboptimal for Western readers. Why not make some small changes to it so that it's easier for "us" to find our way around it? I realized after I posted my previous message that the n/ん was indeed in the first table, where it doesn't really belong, since it's the only entry there that is inconsistent with the principle of the table itself (it's in the first row, but it isn't followed by an a, which doesn't make sense).
Look at my solution, you'll see that those three keys aren't necessary.

Final note: Are you really downvoting the guy who's trying to improve your kata...? That'd be a nice joke...


Voile (2 dan)

7 years ago

Issue

I just remembered that the romaji representation of hiragana has a specific case of ambiguity due to ny, e.g. unyu can be interpreted as うんゆ (運輸) or うにゅ (something notable in Touhou). How should it be handled?

(I haven't found a standardized way to handle this, and there might be none. But at least for the purpose of this kata, it should only map to a specific, defined output ;-))

ewingsa (3 kyu)

7 years ago

1 edit
Good memory! My test cases already made sure that no ん was followed by や, ゆ, or よ, in addition to あ, い, う, え, お, and ん. I updated the description to make that explicit.
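
Under that guarantee, a greedy longest-match pass is unambiguous: "nyu" can always be read as にゅ. The following is only a minimal sketch of that idea, not the kata's reference solution. The class name, the tiny `HIRAGANA` subset, and the `TSU_SMALL` constant are assumptions made for illustration; the kata's preloaded map is far larger.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and the reduced map below are illustrative assumptions.
final class RomajiSketch {
    // Small subset standing in for the kata's preloaded romaji -> hiragana map.
    static final Map<String, String> HIRAGANA = new HashMap<>();
    static final String TSU_SMALL = "っ";

    static {
        String[][] pairs = {
            {"a", "あ"}, {"i", "い"}, {"u", "う"}, {"e", "え"}, {"o", "お"},
            {"ka", "か"}, {"ko", "こ"}, {"sa", "さ"}, {"ta", "た"}, {"te", "て"},
            {"ga", "が"}, {"ba", "ば"}, {"ni", "に"}, {"ha", "は"}, {"chi", "ち"},
            {"yu", "ゆ"}, {"gyu", "ぎゅ"}, {"nyu", "にゅ"}, {"byo", "びょ"}, {"n", "ん"}
        };
        for (String[] p : pairs) {
            HIRAGANA.put(p[0], p[1]);
        }
    }

    static String romajiToHiragana(String romaji) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < romaji.length()) {
            char c = romaji.charAt(i);
            // A doubled consonant other than n becomes a small tsu.
            boolean doubled = i + 1 < romaji.length() && romaji.charAt(i + 1) == c;
            if (doubled && c != 'n' && "aiueo".indexOf(c) < 0) {
                out.append(TSU_SMALL);
                i++;
                continue;
            }
            // Try the longest key first, so "nyu" wins over "n" followed by "yu".
            boolean matched = false;
            for (int len = Math.min(3, romaji.length() - i); len >= 1; len--) {
                String kana = HIRAGANA.get(romaji.substring(i, i + len));
                if (kana != null) {
                    out.append(kana);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("No kana for input at index " + i);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(romajiToHiragana("gyuunyuu"));
        System.out.println(romajiToHiragana("unyu"));
    }
}
```

With this sketch, "unyu" always comes out as うにゅ and never as うんゆ, which is why the test inputs have to avoid the second reading.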

And the only way to handle ambiguity would be to parse a longer string of text and figure out what the word is based on a language model. This is used all the time behind the scenes of machine translation, text-to-speech, part-of-speech analysis, and other natural language processing endeavors. For example, the pronunciation of 'read' (present tense), and 'read' (past tense) cannot be distinguished without context.

On a related note, Chinese and Japanese text needs to be word segmented in order for a language model to even be built, since there are no spaces between words in these languages, and characters can take on different pronunciations and meanings in different contexts. For example, hito, 人 becomes jin in nihonjin, 日本人. The basic word segmentation algorithm is to repeatedly iterate over a sentence, marking the recognized words from largest to smallest. For example, ASMARTWORD would start at length 10, not recognize ASMARTWORD, and eventually recognize SMART (and not MART), then WORD, then A. If it is not already a kata, I might just make it.
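
A rough sketch of that largest-to-smallest marking, assuming a plain set of known words. The class name, method names, and the toy dictionary are made up for illustration; real segmenters use far richer dictionaries and scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of longest-first segmentation; not a production segmenter.
final class SegmenterSketch {
    static List<String> segment(String text, Set<String> dictionary) {
        boolean[] taken = new boolean[text.length()];     // characters already claimed by a word
        TreeMap<Integer, String> found = new TreeMap<>(); // start index -> recognized word

        // Scan with a window that shrinks from the full length down to one character.
        for (int len = text.length(); len >= 1; len--) {
            for (int start = 0; start + len <= text.length(); start++) {
                String candidate = text.substring(start, start + len);
                if (dictionary.contains(candidate) && isFree(taken, start, len)) {
                    Arrays.fill(taken, start, start + len, true);
                    found.put(start, candidate);
                }
            }
        }
        // Words are returned in the order they appear in the text.
        return new ArrayList<>(found.values());
    }

    // True if none of the characters in [start, start + len) belong to a word yet.
    private static boolean isFree(boolean[] taken, int start, int len) {
        for (int i = start; i < start + len; i++) {
            if (taken[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("A", "ART", "MART", "SMART", "WORD"));
        System.out.println(segment("ASMARTWORD", dictionary)); // prints [A, SMART, WORD]
    }
}
```

MART and ART are in the toy dictionary but are skipped, because their characters were already claimed by the longer SMART.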
Issue marked resolved by ewingsa 7 years ago
Unnamed (2 dan)

7 years ago
Shouldn't apostrophe be used in cases like "んゆ" => "n'yu"?
Voile (2 dan)

7 years ago
It is one of the possible solutions, though far from widely used.
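
To illustrate the apostrophe convention being discussed, here is a small sketch going in the opposite direction (hiragana to romaji), where an apostrophe is written after ん whenever the next syllable starts with a vowel or y. This is not part of the kata; the class name and the five-entry `ROMAJI` map are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: shows how an apostrophe can keep うんゆ and うにゅ apart.
final class ApostropheSketch {
    // Tiny illustrative kana -> romaji map; a real table would cover all kana.
    static final Map<String, String> ROMAJI = new HashMap<>();

    static {
        ROMAJI.put("う", "u");
        ROMAJI.put("ん", "n");
        ROMAJI.put("ゆ", "yu");
        ROMAJI.put("に", "ni");
        ROMAJI.put("にゅ", "nyu");
    }

    static String hiraganaToRomaji(String kana) {
        StringBuilder out = new StringBuilder();
        boolean previousWasN = false;
        int i = 0;
        while (i < kana.length()) {
            // Prefer two-character contracted sounds (e.g. にゅ) over single kana.
            String chunk = null;
            int used = 0;
            for (int len = Math.min(2, kana.length() - i); len >= 1; len--) {
                chunk = ROMAJI.get(kana.substring(i, i + len));
                if (chunk != null) {
                    used = len;
                    break;
                }
            }
            if (chunk == null) {
                throw new IllegalArgumentException("Unknown kana at index " + i);
            }
            // After ん, mark the boundary if the next syllable starts with a vowel or y.
            if (previousWasN && "aiueoy".indexOf(chunk.charAt(0)) >= 0) {
                out.append('\'');
            }
            out.append(chunk);
            previousWasN = kana.startsWith("ん", i);
            i += used;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(hiraganaToRomaji("うんゆ")); // prints un'yu
        System.out.println(hiraganaToRomaji("うにゅ")); // prints unyu
    }
}
```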

Of course there are still other problems with this scheme, specifically with distinguishing long vowels from two short vowels (which macrons are supposed to solve, as in yōkai, but they aren't widely used either). And we haven't even handled the supposedly now-deprecated kana yet, which do pop up from time to time. Romaji is never really a fully adequate system.

Voile (2 dan)

7 years ago

Issue

A map called hiragana with romaji keys and hiragana values, containing everything from the above tables has been provided

The contracted sounds are also in a table, but apparently they aren't inside the map object.

ewingsa (3 kyu)

7 years ago
Added the concatenated syllables table to the map instead of just changing the description. It's probably the best design choice.
ewingsa (3 kyu)

7 years ago

1 edit
resolved.
Issue marked resolved by ewingsa 7 years ago

Voile (2 dan)

7 years ago

Issue

Initial code still has the wrong function name ;-)

ewingsa (3 kyu)

7 years ago
Let's just forget that ever happened. ; △ ;
Issue marked resolved by ewingsa 7 years ago