May 28, 2026

End-to-End Machine-Readable Zone Recognition on Augmented Reality Glasses: OCR Studio at SenSys 2026

Ph.D., Chief Technology Officer
IEEE Senior Member

This month, OCR Studio made its debut at the ACM/IEEE International Conference on Embedded Artificial Intelligence and Sensing Systems (SenSys 2026), held in Saint-Malo, France, from May 11 to 14. SenSys is a leading international forum for research on systems, algorithms, and technologies for embedded, energy-constrained, and sensing devices. Augmented reality glasses are one example of such energy-constrained devices. At the conference, OCR Studio researchers explained how MRZs in passports and other identity documents can be recognized directly on this type of wearable device. Read more in today’s blog post.

OCR Studio’s Ultra-Fast MRZ Recognition Algorithm for Low-Power Devices

Poor image quality and low character resolution make recognition on low-power AR glasses extremely challenging and call for fast yet accurate context-aware models. The proposed recognition method combines a computationally efficient YOLO-based detector for MRZ localization, a lightweight fully convolutional recognizer that aggregates local and global context, and result postprocessing that further improves quality. This method significantly outperforms both the dedicated MRZ Scanner by DOCSAID and full-frame text recognition based on PaddleOCR.

MRZ recognition on AR glasses

How Does Our Algorithm Work?

Our algorithm consists of 4 main stages: 

  • Zone localization in the full frame
  • Text line detection
  • String recognition
  • Result postprocessing

At the localization stage, the MRZ region must be found in the input image. For this, we use a lightweight detector based on the popular YOLO object detection approach, which predicts the zone’s bounding box and its tilt angle. We deliberately chose YOLO over semantic segmentation because it performs better, requires simpler postprocessing of the network output, and makes better use of image context when localizing MRZs on complex backgrounds that contain other text.
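To make the "bounding box plus tilt angle" output concrete, here is a minimal sketch of how such a prediction could be turned into four corner points for cropping. The parameterization (center, width, height, angle in degrees) and the function name `rotated_box_corners` are assumptions for illustration; the post does not specify the detector's actual output format.

```python
import math

def rotated_box_corners(cx, cy, width, height, angle_deg):
    """Return the corners of a box rotated by angle_deg around its center.

    Hypothetical parameterization: (cx, cy) is the box center, width and
    height are the side lengths, angle_deg is the tilt angle.
    Corners are listed as top-left, top-right, bottom-right, bottom-left
    in the box's own (unrotated) frame.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    corners = []
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        # Standard 2D rotation of the offset, then translation to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners
```

With the corners known, the tilted zone can be warped to an upright rectangle before the text line detection step.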

In the next step, text strings are detected in the localized region. To facilitate recognition, the detected strings must be tightly bounded and straightened. Since MRZ strings are long and the document may be bent, a bounding box is insufficient to describe curved strings. Therefore, we use a lightweight model similar to the DBNet text detector but with a more compact backbone.

Although most modern OCR networks are based on the Transformer architecture, we use a simple fully convolutional network to improve computational performance. The proposed CNN recognizer fuses features from two branches with local and global receptive fields: one covers 3 characters (character and trigram level) and the other covers 15 characters (MRZ field context level). The final segmentation and textual representation of the recognizer output are generated using dynamic programming.
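As a rough illustration of how receptive fields of 3 and 15 positions can arise in a convolutional network, the sketch below applies the standard receptive field formula for stride-1 1D convolutions. The layer configurations are hypothetical and are not taken from the actual recognizer; the sketch also assumes one feature-map position per character.

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 1D convolutions.

    layers: iterable of (kernel_size, dilation) pairs.
    Each layer widens the field by (kernel_size - 1) * dilation positions.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field

# Hypothetical branch configurations, chosen only to reproduce 3 and 15.
local_branch = [(3, 1)]                    # one plain convolution
global_branch = [(3, 1), (3, 2), (3, 4)]   # three dilated convolutions

print(receptive_field(local_branch))   # 3
print(receptive_field(global_branch))  # 15
```

Dilated convolutions are one inexpensive way to widen context without adding many layers; other designs, such as larger kernels or pooling, would reach the same widths.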

During the postprocessing stage, the structure of the recognized MRZ is verified for compliance with the ICAO standard. MRZ fields, such as the document number, dates, name, and others, have fixed positions in the text lines, which makes it possible to correct recognition errors: for example, replacing “S” with “5” in dates and “0” with “O” in names. The country code and document type in the MRZ can also be corrected using dictionaries.
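The position-based correction described above can be sketched for a passport-size (TD3) MRZ with two lines of 44 characters. The field positions and the check digit rule (weights 7, 3, 1, sum modulo 10) follow ICAO Doc 9303. The confusion tables and function names are illustrative assumptions, not the tables or code used in the actual system.

```python
# Illustrative confusion tables (assumed; a real system may use larger ones).
TO_DIGIT = str.maketrans("OSIB", "0518")
TO_LETTER = str.maketrans("0518", "OSIB")
WEIGHTS = (7, 3, 1)  # repeating weights from ICAO Doc 9303

def char_value(ch):
    """Numeric value of an MRZ character for check digit computation."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"not an MRZ character: {ch!r}")

def check_digit(field):
    total = sum(char_value(ch) * WEIGHTS[i % 3] for i, ch in enumerate(field))
    return str(total % 10)

def fix_td3_line1_name(line1):
    """Force letters in the name field (characters 6 to 44 of line 1)."""
    return line1[:5] + line1[5:].translate(TO_LETTER)

def fix_td3_line2_dates(line2):
    """Force digits in the birth date and expiry date fields of line 2."""
    if len(line2) != 44:
        raise ValueError("a TD3 line must have 44 characters")
    chars = list(line2)
    for start, end in ((13, 19), (21, 27)):  # birth date, expiry date
        chars[start:end] = line2[start:end].translate(TO_DIGIT)
    return "".join(chars)

def dates_valid(line2):
    """Check both date fields against the check digits that follow them."""
    return (check_digit(line2[13:19]) == line2[19]
            and check_digit(line2[21:27]) == line2[27])

# Specimen-style line with "O" and "S" misread inside the date fields.
garbled = "L898902C36UTO74O8122F12041S9ZE184226B<<<<<10"
fixed = fix_td3_line2_dates(garbled)
print(fixed, dates_valid(fixed))
```

Check digits give an independent signal that a correction was right: if a date field still fails its check after substitution, the result can be rejected rather than silently accepted.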

How We Evaluated Performance

To evaluate the performance of the proposed method, the recognition time was measured on 3 different devices. A desktop with an AMD Ryzen Threadripper PRO 7985WX CPU served as the baseline. An Apple iPhone XR (released in 2018, Apple A12 Bionic CPU) served as an entry-level device. Imno Air 3 AR glasses (released in 2025, Snapdragon X 8-core CPU) served as the target platform. The results are as follows:

  • Desktop CPU: 61 ms per frame
  • Apple iPhone XR: 77 ms per frame
  • Imno Air 3 AR glasses: 1020 ms per frame
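To put the timings above in perspective, the short sketch below converts them into frames per second and relative slowdown. Only the three per-frame times come from the measurements; the helper names are introduced here for illustration.

```python
# Per-frame recognition times from the measurements above, in milliseconds.
timings_ms = {
    "Desktop CPU": 61,
    "Apple iPhone XR": 77,
    "Imno Air 3 AR glasses": 1020,
}

def frames_per_second(ms_per_frame):
    """Throughput if frames are processed back to back."""
    return 1000.0 / ms_per_frame

def slowdown(ms_per_frame, reference_ms):
    """How many times slower a device is than a reference device."""
    return ms_per_frame / reference_ms

for device, ms in timings_ms.items():
    print(f"{device}: {frames_per_second(ms):.2f} fps, "
          f"{slowdown(ms, timings_ms['Apple iPhone XR']):.1f}x the iPhone XR time")
```

The glasses process roughly one frame per second, which is about 13 times slower than the iPhone XR and about 17 times slower than the desktop.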

Although recognition on the glasses takes roughly 13 times longer than on even an outdated smartphone, it is still fast enough for comfortable use in real-world scenarios. The recognition system can run continuously on the video stream and take the best recognition result from several frames, which compensates for inconsistent single-frame quality.
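One simple way to take the best result from several frames is sketched below. The post does not say how "best" is defined, so the selection rule here is an assumption: discard results that are not structurally plausible, then keep the one with the highest confidence. The per-frame confidence score and the function names are likewise hypothetical.

```python
import string

MRZ_ALPHABET = set(string.ascii_uppercase + string.digits + "<")

def is_plausible_td3(lines):
    """Two lines of 44 characters, all drawn from the MRZ alphabet."""
    return len(lines) == 2 and all(
        len(line) == 44 and set(line) <= MRZ_ALPHABET for line in lines
    )

def best_result(frame_results):
    """Pick the most confident plausible result.

    frame_results: list of (lines, confidence) pairs, one per frame,
    where lines is a list of recognized MRZ strings.
    Returns None if no frame produced a plausible result.
    """
    plausible = [r for r in frame_results if is_plausible_td3(r[0])]
    if not plausible:
        return None
    return max(plausible, key=lambda r: r[1])
```

A stricter variant could also require the ICAO check digits to pass before a frame's result is considered.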

How We Evaluated Quality

To evaluate the quality of the proposed MRZ recognition framework, a dataset of 184 document images with MRZ was created. It contains two versions of photographs of the same set of documents: one taken with a smartglasses camera, the other with a modern smartphone camera. The image quality varies significantly, with individual characters often illegible in the glasses photos. Nevertheless, the proposed system correctly recognized 100% of smartphone photos and 67% of smartglasses photos. The closest third-party system, DOCSAID MRZ, managed to correctly recognize only 24% of the photos.

MRZ image captured both on a smartphone and on AR glasses

About OCR Studio

OCR Studio, a developer of optical character recognition solutions, remains committed to a science-driven approach to innovation. Every year, our researchers take part in leading international conferences, where they present our latest advances in document recognition, ID authenticity verification, and machine-readable objects scanning. This ongoing scientific work helps us transform cutting-edge research into practical technologies for real-world use.

Find out more about our ultra-lightweight MRZ scanning solution here
