Arabic OCR
Use this free Arabic OCR to pull editable text out of an image or scanned PDF — recognition runs entirely in your browser, so your file is never uploaded.
The OCR engine downloads on first use (a few MB) and is then cached.
How to use the Arabic OCR
- Drop in an image or scanned PDF — the language is preselected for you.
- Wait while the text is recognised (the language model downloads once, then is cached).
- Copy or download the recognised text.
About Arabic OCR
Optical character recognition (OCR) turns the letters in a photo or scan into real, editable text. This Arabic OCR uses a language model trained on Arabic script, so Arabic letters and marks are recognised far more accurately than with an English-only model.
Everything happens in your browser — the image or PDF is decoded and recognised locally and never uploaded. The language model is fetched from a CDN on first use and cached, so later runs start instantly. For the best results, use a sharp, well-lit, straight image.
Arabic is written right-to-left in a connected, cursive script where most letters take on a different shape depending on whether they sit at the start, middle or end of a word, or stand alone, so the model has to recognise each letter's contextual form rather than a single fixed glyph. Optional diacritics (tashkeel, including the short-vowel marks) add another layer above and below the line. Arabic OCR is the practical way to lift text out of scanned books, official documents, signage and screenshots written in Arabic, or in related scripts such as Persian or Urdu, and turn it into editable, searchable, copy-pasteable text.
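To make the point about contextual forms concrete, here is a small Python sketch that uses only the standard library. It says nothing about how this tool's engine works internally. It only illustrates that Unicode keeps separate legacy "presentation form" code points for the four positional shapes of a letter (the letter beh is used here as an example), and that NFKC normalisation folds all four back to the same base letter, which is what editable text normally contains.

```python
import unicodedata

# Legacy presentation-form code points for the four shapes of the letter beh.
BEH_SHAPES = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def base_letter(shape: str) -> str:
    """Fold a presentation-form glyph back to its base letter via NFKC."""
    return unicodedata.normalize("NFKC", shape)


for position, glyph in BEH_SHAPES.items():
    # Print the Unicode name of each shape and of the letter it folds to.
    print(position, unicodedata.name(glyph), "->", unicodedata.name(base_letter(glyph)))
```

In ordinary text only the base letter is stored, and the font picks the right shape when the word is displayed.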
Frequently asked questions
Which language does this recognise?
This page preselects the Arabic model, but you can switch to any supported language (including English, Chinese, Japanese, Korean and many European languages) with the picker above the drop zone.
Is the image uploaded?
No. The image or PDF is recognised entirely in your browser, so it never leaves your device — safe for private documents.
Can it read scanned PDFs?
Yes. Scanned PDFs are rasterised page by page and each page is recognised, then the text is joined together.
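The page-by-page flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `recognise_document`, `recognise_page` and the blank-line separator are hypothetical names and choices made for the example, and the rasterised pages are assumed to arrive as raw image bytes.

```python
from typing import Callable, Iterable


def recognise_document(
    pages: Iterable[bytes],
    recognise_page: Callable[[bytes], str],
    separator: str = "\n\n",
) -> str:
    """Recognise each rasterised page in order and join the page texts.

    `recognise_page` stands in for whatever OCR call is available.
    Skipping pages that yield no text is an assumption of this sketch.
    """
    texts = [recognise_page(page).strip() for page in pages]
    return separator.join(text for text in texts if text)
```

Passing the recogniser in as a function keeps the sketch independent of any particular OCR library.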
Why is the first run slower?
The recognition engine and the language model download from a CDN the first time you use them (a few MB), then they’re cached, so later runs start right away.
Will it keep the right-to-left order and the diacritic marks?
The recognised text is returned in proper right-to-left order, and short-vowel/tashkeel marks are picked up when they're clearly printed. Faint or hand-added diacritics are easy to miss, though, so check vowelled text closely.
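One way to check vowelled text is to compare the recognised output against a trusted copy with the diacritics stripped from both. The Python sketch below uses only the standard library. The helper names are hypothetical, and the mark range U+064B to U+0652 (tanwin, fatha, damma, kasra, shadda, sukun) is a simplification that leaves out rarer marks.

```python
# Common Arabic tashkeel marks: U+064B (fathatan) through U+0652 (sukun).
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0652 + 1)}


def strip_tashkeel(text: str) -> str:
    """Return the text with the common tashkeel marks removed."""
    return "".join(ch for ch in text if ch not in TASHKEEL)


def only_diacritics_differ(ocr_text: str, reference: str) -> bool:
    """True when the strings differ, but match once tashkeel is ignored."""
    return ocr_text != reference and strip_tashkeel(ocr_text) == strip_tashkeel(reference)


# Example: kaf, teh, beh with a fatha on each letter, versus the bare letters.
vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E"
bare = "\u0643\u062A\u0628"
print(only_diacritics_differ(bare, vowelled))
```

If this returns true, the letters were read correctly and only the marks were missed, so only the vowelling needs a manual pass.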