German OCR
Use this free German OCR to pull editable text out of an image or scanned PDF — recognition runs entirely in your browser, so your file is never uploaded.
The OCR engine downloads on first use (a few MB) and is then cached.
How to use the German OCR
- Drop in an image or scanned PDF — the language is preselected for you.
- Wait while the text is recognised (the language model downloads once, then is cached).
- Copy or download the recognised text.
About German OCR
Optical character recognition (OCR) turns the letters in a photo or scan into real, editable text. This German OCR uses a language model trained on German text, so German-specific characters are recognised more accurately than with an English-only model.
Everything happens in your browser — the image or PDF is decoded and recognised locally and never uploaded. The language model is fetched from a CDN on first use and cached, so later runs start instantly. For the best results, use a sharp, well-lit, straight image.
German text leans on the umlauts ä, ö, ü and the eszett (ß), and runs to long compound words like "Lebensversicherungsgesellschaft". These are exactly the features a generic model tends to get wrong: it flattens the characters into a, o, u, ss or splits the compounds with stray spaces. This recogniser is tuned to keep those marks and joins intact, so it suits scanning German invoices, official letters from Behörden (public authorities), contracts, book pages, and older print set in roman (non-Fraktur) type, where a single dropped umlaut can change the word.
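To see why a dropped umlaut matters, here is a minimal sketch of the kind of flattening a generic model can produce. It is an illustration only, not part of this tool: the `flatten` helper is a hypothetical name, and it simulates the error by stripping diacritics with Python's standard `unicodedata` module.

```python
import unicodedata


def flatten(text: str) -> str:
    """Simulate a generic model: drop umlaut dots and turn ß into ss.

    Hypothetical helper for illustration; it is not how the OCR works.
    """
    # NFD splits "ö" into "o" plus a combining diaeresis (category Mn).
    decomposed = unicodedata.normalize("NFD", text)
    no_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # ß has no Unicode decomposition, so it is replaced by hand.
    return no_marks.replace("ß", "ss")


# Each flattened form is itself a real German word with a different meaning.
words = ["schön", "Maße", "drücken"]
flattened = [flatten(w) for w in words]
print(flattened)  # ['schon', 'Masse', 'drucken']
```

"schön" (beautiful) becomes "schon" (already), "Maße" (dimensions) becomes "Masse" (mass), and "drücken" (to press) becomes "drucken" (to print), which is why the original characters are worth preserving.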
Frequently asked questions
Which language does this recognise?
This page preselects the German model, but you can switch to any supported language, including English, Chinese, Japanese, Korean and many European languages, with the picker above the drop zone.
Is the image uploaded?
No. The image or PDF is recognised entirely in your browser, so it never leaves your device — safe for private documents.
Can it read scanned PDFs?
Yes. Scanned PDFs are rasterised page by page and each page is recognised, then the text is joined together.
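The page-by-page flow can be sketched roughly as follows. This is an assumption-laden outline, not this tool's code: `recognise_document` and `recognise_page` are hypothetical names, the pages are assumed to be already rasterised into image bytes, and the blank-line separator between pages is a choice made for the example.

```python
from typing import Callable, Iterable


def recognise_document(
    pages: Iterable[bytes],
    recognise_page: Callable[[bytes], str],
) -> str:
    """Recognise each rasterised page in order and join the results.

    `recognise_page` stands in for whatever OCR call is available;
    it is a hypothetical callback, not a real library function.
    """
    texts = [recognise_page(page).strip() for page in pages]
    # Skip pages that produced no text; separate the rest with a blank
    # line (an assumption, the real separator may differ).
    return "\n\n".join(t for t in texts if t)


# Usage with a fake recogniser standing in for the OCR engine.
fake_results = {
    b"page-1": "Sehr geehrte Damen und Herren,",
    b"page-2": "mit freundlichen Grüßen",
}
result = recognise_document(
    [b"page-1", b"page-2"], lambda page: fake_results[page]
)
print(result)
```

Passing the recogniser in as a callback keeps the joining logic independent of any particular OCR engine.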
Why is the first run slower?
The recognition engine and the language model download from a CDN the first time you use them (a few MB), then they’re cached, so later runs start right away.
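The download-once-then-cache behaviour follows a common pattern, sketched here outside the browser for illustration. Everything in it is assumed rather than taken from this tool: `load_model`, the `fetch` callback, the `.model` file name and the `"deu"` language code are all hypothetical, and a browser would use its own cache storage instead of a directory on disk.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_model(
    lang: str, cache_dir: Path, fetch: Callable[[str], bytes]
) -> bytes:
    """Return model bytes for `lang`, downloading only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / f"{lang}.model"  # hypothetical file layout
    if cached.exists():
        return cached.read_bytes()  # later runs: no download needed
    data = fetch(lang)  # first run: the slow step
    cached.write_bytes(data)
    return data


calls: list[str] = []


def fake_fetch(lang: str) -> bytes:
    """Stand-in for a CDN download that records each request."""
    calls.append(lang)
    return b"model-bytes-for-" + lang.encode()


with tempfile.TemporaryDirectory() as tmp:
    first = load_model("deu", Path(tmp), fake_fetch)
    second = load_model("deu", Path(tmp), fake_fetch)

print(calls)  # ['deu']
```

Only the first call reaches the fake download; the second is served from the cache, which mirrors why only the first run is slower.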
Will it keep the umlauts and ß, or turn them into ae/oe/ue/ss?
It preserves ä, ö, ü and ß as the real characters rather than transliterating them to ae/oe/ue/ss, so the recognised text matches the original spelling. A very faint or low-resolution scan can still drop a dot, so check umlauts on blurry pages.