Korean OCR
Use this free Korean OCR to pull editable text out of an image or scanned PDF — recognition runs entirely in your browser, so your file is never uploaded.
The OCR engine downloads on first use (a few MB) and is then cached.
How to use the Korean OCR
- Drop in an image or scanned PDF — the language is preselected for you.
- Wait while the text is recognised (the language model downloads once, then is cached).
- Copy or download the recognised text.
About Korean OCR
Optical character recognition (OCR) turns the letters in a photo or scan into real, editable text. This Korean OCR uses a language model trained on Korean text, so Hangul syllable blocks are recognised far more accurately than with an English-only model.
Everything happens in your browser — the image or PDF is decoded and recognised locally and never uploaded. The language model is fetched from a CDN on first use and cached, so later runs start instantly. For the best results, use a sharp, well-lit, straight image.
Korean text isn't written letter-by-letter: individual jamo (consonants and vowels) are stacked into square syllabic blocks like 한 and 글, and this OCR reads each composed block as a single character rather than trying to break it back into its parts. That matters for scanning Korean books, signage, hwagongmun and screenshots, and the same model also picks up the hanja (Chinese characters) still sprinkled through older and formal Korean writing.
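To make the idea of a "composed block" concrete, here is a minimal sketch of the standard Unicode arithmetic that maps jamo positions to a single precomposed syllable. It is illustrative only and is not taken from this tool's code: the function names `composeSyllable` and `decomposeSyllable` are hypothetical, and the indices follow the Unicode ordering of initials (19), medial vowels (21) and finals (28, where 0 means no final).

```typescript
// Constants from the Unicode Hangul Syllables block (U+AC00..U+D7A3).
const S_BASE = 0xac00; // code point of 가, the first syllable block
const L_COUNT = 19;    // initial consonants
const V_COUNT = 21;    // medial vowels
const T_COUNT = 28;    // final consonants, index 0 = no final

// Hypothetical helper: build one syllable block from jamo indices.
function composeSyllable(initial: number, medial: number, trailing: number = 0): string {
  if (
    initial < 0 || initial >= L_COUNT ||
    medial < 0 || medial >= V_COUNT ||
    trailing < 0 || trailing >= T_COUNT
  ) {
    throw new RangeError("jamo index out of range");
  }
  return String.fromCodePoint(S_BASE + (initial * V_COUNT + medial) * T_COUNT + trailing);
}

// Hypothetical helper: recover the jamo indices from one syllable block.
function decomposeSyllable(block: string): { initial: number; medial: number; trailing: number } {
  const offset = (block.codePointAt(0) ?? -1) - S_BASE;
  if (offset < 0 || offset >= L_COUNT * V_COUNT * T_COUNT) {
    throw new RangeError("not a precomposed Hangul syllable");
  }
  return {
    initial: Math.floor(offset / (V_COUNT * T_COUNT)),
    medial: Math.floor((offset % (V_COUNT * T_COUNT)) / T_COUNT),
    trailing: offset % T_COUNT,
  };
}

// ㅎ (initial 18) + ㅏ (medial 0) + ㄴ (final 4) gives 한 (U+D55C).
console.log(composeSyllable(18, 0, 4));
// ㄱ (initial 0) + ㅡ (medial 18) + ㄹ (final 8) gives 글 (U+AE00).
console.log(composeSyllable(0, 18, 8));
```

Because every valid combination has its own code point, a recogniser can treat each block as one output class instead of emitting three separate jamo.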
Frequently asked questions
Which language does this recognise?
This page preselects the Korean model, but you can switch to any other supported language, including English, Chinese, Japanese and many European languages, with the picker above the drop zone.
Is the image uploaded?
No. The image or PDF is recognised entirely in your browser, so it never leaves your device — safe for private documents.
Can it read scanned PDFs?
Yes. Scanned PDFs are rasterised page by page and each page is recognised, then the text is joined together.
Why is the first run slower?
The recognition engine and the language model download from a CDN the first time you use them (a few MB), then they’re cached, so later runs start right away.
Will it keep Korean syllables together or split them into separate jamo?
It outputs whole, pre-composed syllabic blocks (so 한 stays as one character), which is what you want for copying into documents or search — it does not decompose them into loose initial/medial/final jamo.
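If you want to check or enforce this in your own scripts after copying the text, Unicode normalisation is the usual tool. The sketch below is an assumption about how you might post-process text, not a description of this tool's internals; `toComposed` and `isFullyComposed` are hypothetical helper names built on the standard `String.prototype.normalize`.

```typescript
// Hypothetical helper: convert any conjoining jamo sequences to precomposed blocks (NFC).
function toComposed(text: string): string {
  return text.normalize("NFC");
}

// Hypothetical helper: true when the text is already in composed (NFC) form.
function isFullyComposed(text: string): boolean {
  return text === text.normalize("NFC");
}

// NFD splits 한 into three conjoining jamo: U+1112, U+1161, U+11AB.
const decomposedHan = "한".normalize("NFD");
console.log(decomposedHan.length);             // 3
console.log(toComposed(decomposedHan) === "한"); // true
```

Text that is already composed passes through `toComposed` unchanged, so it is safe to apply to anything you paste from the recogniser.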