System Name: moct-kor-Hang-Latn-2000
Authority ID | moct |
System ID | 2000 |
Language | Korean |
Source Script | Hang |
Destination Script | Latn |
Name | Ministry of Culture and Tourism Korean Romanization System (2000) |
URL | https://www.korean.go.kr/front_eng/roman/roman_01.do |
Description | Generation of Jamo from Hangul |

This is how the Hangul-to-Jamo maps are generated. For details about Korean text handling in Unicode, refer to this page:
http://gernot-katzers-spice-pages.com/var/korean_hangul_unicode.html

The following formula, copied from the page above, is used:

[stem]
====
tail = mod (Hangul codepoint − 44032, 28)
vowel = 1 + mod (Hangul codepoint − 44032 − tail, 588) / 28
lead = 1 + int [ (Hangul codepoint − 44032)/588 ]
====

[source,python]
----
import pandas as pd
import re
import math

# 19 leading consonants: ᄀᄁᄂᄃᄄᄅᄆᄇᄈᄉᄊᄋᄌᄍᄎᄏᄐᄑᄒ
leadjamo = [chr(0x1100+i) for i in range(0,19)]
# 21 vowels: ᅡᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵ
voweljamo = [chr(0x1161+i) for i in range(0,21)]
# 27 trailing consonants, preceded by '' for "no tail": ᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂ
tailjamo = ['']+[chr(0x11A8+i) for i in range(0,27)]

# All precomposed Hangul syllables, U+AC00 (44032) through U+D7A3 (55203)
hanguls = [chr(i) for i in range(44032,55204)]

# The lists are zero-based, so the "1 +" offsets of the formula are dropped
tails = [tailjamo[(i-44032) % 28] for i in range(44032,55204)]
vowels = [voweljamo[((i-44032-((i-44032) % 28)) % 588) // 28] for i in range(44032,55204)]
leads = [leadjamo[math.floor((i-44032)// 588)] for i in range(44032,55204)]

kr_df = pd.DataFrame({'Hangul':hanguls, 'Lead':leads,'Vowel':vowels, 'Tail':tails})
----

The first rows of `kr_df`:

  Hangul Lead Vowel Tail
0 가 ᄀ ᅡ
1 각 ᄀ ᅡ ᆨ
2 갂 ᄀ ᅡ ᆩ
3 갃 ᄀ ᅡ ᆪ
4 간 ᄀ ᅡ ᆫ
5 갅 ᄀ ᅡ ᆬ
6 갆 ᄀ ᅡ ᆭ
7 갇 ᄀ ᅡ ᆮ
8 갈 ᄀ ᅡ ᆯ
9 갉 ᄀ ᅡ ᆰ
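The same arithmetic can be applied to a single syllable without building the full table. The following is a minimal sketch, not part of the original map: the helper name `decompose` and the constant names are hypothetical, and it assumes the input is one precomposed Hangul syllable in the range U+AC00 to U+D7A3.

```python
HANGUL_BASE = 0xAC00          # 44032, code point of 가
SYLLABLE_COUNT = 19 * 21 * 28  # 11172 precomposed syllables

LEADS = [chr(0x1100 + i) for i in range(19)]
VOWELS = [chr(0x1161 + i) for i in range(21)]
TAILS = [""] + [chr(0x11A8 + i) for i in range(27)]  # index 0 means "no tail"


def decompose(syllable):
    """Hypothetical helper: split one Hangul syllable into (lead, vowel, tail)."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    lead, rest = divmod(offset, 588)   # 588 = 21 vowels * 28 tails
    vowel, tail = divmod(rest, 28)     # 28 = 27 tails + "no tail"
    return LEADS[lead], VOWELS[vowel], TAILS[tail]


print(decompose("한"))
```

`divmod` is used here instead of the separate `%` and `//` expressions; it yields the same zero-based indices.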
Hang | Latn | Condition
- none | space | after: digit, not before: any (digit, space)
- 1 | 일 |
- 2 | 이 |
- 3 | 삼 |
- 4 | 사 |
- 5 | 오 |
- 6 | 육 |
- 7 | 칠 |
- 8 | 팔 |
- 9 | 구 |
- none | - | after: any (도, 시, 군, 구, 읍, 면, 리, 동, 가) + line end, not before: line start
- Run var-kor-Hang-Hang-jamo
- line start | space |
- line end | space |
- ᆩᄋ | ᆨᄁ |
- ᆩ | ᆨ |
- ᆪᄋ | ᆨᄉ |
- ᆪ | ᆨ |
- ᆬᄋ | ᆫᄌ |
- ᆬ | ᆫ |
- ᆭᄀ | ᆫᄏ |
- ᆭᄃ | ᆫᄐ |
- ᆭᄇ | ᆫᄑ |
- ᆭᄌ | ᆫᄎ |
- ᆭ | ᆫ |
- ᆮ | ᆺ | after: any (ᄀ, ᄁ, ᄂ, ᄃ, ᄄ, ᄅ, ᄆ, ᄇ, ᄈ, ᄉ, ᄊ, ᄌ, ᄍ, ᄎ, ᄏ, ᄐ, ᄑ, ᄒ)
- ᆳᄋ | ᆯᄉ |
- ᆳ | ᆯ |
- ᆴᄋ | ᆯᄐ |
- ᆴ | ᆯ |
- ᆵᄋ | ᆯᄑ |
- ᆵ | ᆯ | after: any (ᄃ, ᄄ, ᄐ)
- ᆵ | ᄇ |
- Parallel
- Parallel
- ᄀ | g | before: any (alpha, digit, jamo vowel, -)
- ᄂ | n | before: any (alpha, digit, jamo vowel, -)
- ᄃ | d | before: any (alpha, digit, jamo vowel, -)
- ᄅ | r | before: any (alpha, digit, jamo vowel, -)
- ᄆ | m | before: any (alpha, digit, jamo vowel, -)
- ᄇ | b | before: any (alpha, digit, jamo vowel, -)
- ᄉ | s | before: any (alpha, digit, jamo vowel, -)
- ᄋ |  | before: any (alpha, digit, jamo vowel, -)
- ᄌ | j | before: any (alpha, digit, jamo vowel, -)
- ᄎ | ch | before: any (alpha, digit, jamo vowel, -)
- ᄏ | k | before: any (alpha, digit, jamo vowel, -)
- ᄐ | t | before: any (alpha, digit, jamo vowel, -)
- ᄑ | p | before: any (alpha, digit, jamo vowel, -)
- ᄒ | h | before: any (alpha, digit, jamo vowel, -)
- ᄁ | kk | before: any (alpha, digit, jamo vowel)
- ᄄ | tt | before: any (alpha, digit, jamo vowel)
- ᄈ | pp | before: any (alpha, digit, jamo vowel)
- ᄊ | ss | before: any (alpha, digit, jamo vowel)
- ᄍ | jj | before: any (alpha, digit, jamo vowel)
- Parallel
- ᄀ | g | before: space
- ᄂ | n | before: space
- ᄃ | d | before: space
- ᄅ |  | before: space, after: any (ᅣ, ᅤ, ᅧ, ᅨ, ᅭ, ᅲ)
- ᄅ | n | before: space
- ᄆ | m | before: space
- ᄇ | b | before: space
- ᄉ | s | before: space
- ᄋ |  | before: space
- ᄌ | j | before: space
- ᄎ | ch | before: space
- ᄏ | k | before: space
- ᄐ | t | before: space
- ᄑ | p | before: space
- ᄒ | h | before: space
- ᄁ | kk | before: space
- ᄭ | kk | before: space
- ᄄ | tt | before: space
- ᄯ | tt | before: space
- ᄈ | pp | before: space
- ᄲ | pp | before: space
- ᄊ | ss | before: space
- ᄍ | jj | before: space
- ᄶ | jj | before: space
- Parallel
- ᆨ | k | after: any (space, -)
- ᆫ | n | after: any (space, -)
- ᆮ | t | after: any (space, -)
- ᆯ | l | after: any (space, -)
- ᆷ | m | after: any (space, -)
- ᆸ | p | after: any (space, -)
- ᆺ | t | after: any (space, -)
- ᆼ | ng | after: any (space, -)
- ᆽ | t | after: any (space, -)
- ᆾ | t | after: any (space, -)
- ᆿ | k | after: any (space, -)
- ᇀ | t | after: any (space, -)
- ᇁ | p | after: any (space, -)
- ᆰ | k | after: any (space, -)
- ᆲ | p | after: any (space, -)
- line start + space | none |
- space + line end | none |
- Title case
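
The rows above read as context-conditioned substitutions. The following is a rough sketch of how a few of them could be applied, under two assumptions that the source does not state: that `before:` constrains the text preceding the match and `after:` the text following it, and that rules may be applied one after another (which ignores whatever the `Parallel` grouping implies). The helper `sub` is hypothetical and built on Python's `re` lookarounds; it is not the implementation behind this map.

```python
import re


def sub(text, src, dst, before=None, after=None):
    """Hypothetical helper: replace src with dst under optional context.

    `before` and `after` are regex fragments, assumed here to describe the
    text preceding and following the match respectively.
    """
    pattern = re.escape(src)
    if before is not None:
        # Python's `re` requires look-behind patterns to be fixed-width.
        pattern = f"(?<={before})" + pattern
    if after is not None:
        pattern += f"(?={after})"
    return re.sub(pattern, dst, text)


# Jamo of the syllable 각: lead U+1100, vowel U+1161, tail U+11A8.
text = "\u1100\u1161\u11A8"

text = " " + text + " "                        # line start -> space, line end -> space
text = sub(text, "\u1100", "g", before=" ")     # lead ᄀ -> g, before: space
text = sub(text, "\u11A8", "k", after="[ -]")   # tail ᆨ -> k, after: any (space, -)
text = re.sub(r"^ ", "", text)                  # line start + space -> none
text = re.sub(r" $", "", text)                  # space + line end -> none

print(text)
```

The vowel jamo is left untouched because no vowel rows appear in the table above. The padding spaces added at the start show why the `before: space` and `after: any (space, -)` conditions can also match at the edges of a line.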