OCR Studio SDK for Android Integration with Kotlin

SDK Contents

  • doc — documentation
  • Sample/ — sample project
  • Sample/app/src/main/jniLibs/ — native libraries for all supported Android ABIs (devices and emulator)
  • Sample/app/src/main/libs/*.jar — wrapper for the C++ library
  • Sample/app/src/main/assets/data/*.ocr — configuration file

All files provided as source code can be modified at your discretion.

Integration Overview

The C++ library is supplied as binary files for different architectures. Java interacts with it via JNI (Java Native Interface); the .jar wrapper for the C++ code is generated using SWIG.

Any object returned by a JNI function (i.e., by methods of our wrapper) is a local reference, valid only during the method call within the current thread. The engine itself, however, continues to exist globally.

If such references are misused, you are likely to encounter NullPointerException at seemingly unrelated points in your app's execution. Therefore, immediately copy JNI call results into plain Java/Kotlin objects instead of keeping references to the internal structures of our library.

More details about JNI best practices: Android docs.
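The copy-out rule can be illustrated with a stand-in wrapper class (NativeField below is hypothetical, not an SDK type): read every value you need into a plain Kotlin object before the wrapper reference goes out of scope.

```kotlin
// NativeField stands in for a SWIG wrapper object whose backing C++ memory
// is only valid for a short time; it is NOT a real SDK class.
class NativeField(private val name: String, private val value: String) {
    fun GetName(): String = name
    fun GetValue(): String = value
}

// A plain Kotlin object: safe to store, cache, or hand to another thread.
data class FieldCopy(val name: String, val value: String)

fun copyOut(field: NativeField): FieldCopy =
    FieldCopy(field.GetName(), field.GetValue())
```

The same idea applies to every wrapper type: extract primitives and Strings right away, and never hold on to the wrapper itself.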

Integration Steps

  1. Library. Copy the jniLibs directory with binary libraries from the SDK and place it in your project at app/src/main/jniLibs/.
  2. Wrapper. Copy the jni*.jar file from the SDK and place it in your project at app/src/main/libs/.
  3. Bundle. Copy the *.ocr file from the SDK and place it in your project under app/src/main/assets/data/.
  4. In build.gradle (module) add the wrapper:
dependencies {
  implementation(fileTree("libs") { include("*.jar") })
}

5. Proguard rules. If you are using code optimization (minifyEnabled true), add the following lines to proguard-rules.pro; otherwise, the release build of your project will fail because the wrapper classes get stripped.

-keep class ai.ocrstudio.common.* { public <methods>; }
-keep class ai.ocrstudio.id.* { public <methods>; }

Connect this file in build.gradle (module):

apply plugin: 'com.android.application'

android {
  compileSdk 36
  ...
  buildTypes {
    release {
      minifyEnabled true
      proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
    }
    debug {
      minifyEnabled false
    }
  }
}
...

Working with the library

The typical workflow with the library is:

  1. Initialize the library and load the configuration file.
  2. Create session settings that define what needs to be recognized.
  3. Prepare the image to be recognized (in a loop for video stream, or once for a gallery image).
  4. Initialize a session and pass the image to it.
  5. Parse the recognition results.

If the library is used throughout the app, initialize it once in Application.onCreate().

If the library is needed only in certain parts of the app, initialize it in Activity.onCreate().
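Whichever entry point you choose, the engine should be created exactly once. One way to guarantee that is a lazy singleton; in this sketch the string literal stands in for the real IdEngine.Create(...) call:

```kotlin
object EngineHolder {
    var initCount = 0
        private set

    // `lazy` is synchronized by default, so the initializer runs exactly once
    // even if several threads touch `engine` at the same time.
    val engine: String by lazy {
        initCount++
        "engine" // stand-in for IdEngine.Create(data, ...)
    }
}
```

In a real app the lazy block would read the configuration file from assets and call IdEngine.Create; callers simply read EngineHolder.engine.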

Loading the native library

System.loadLibrary("jniidengine")

Reading data from the configuration file

val data: ByteArray = context.assets
    .open("your_bundle_filename", AssetManager.ACCESS_STREAMING)
    .use { it.readBytes() } // reads the whole asset and closes the stream, even on error

Initializing the engine

val engine = IdEngine.Create(data,
  true, // lazyConfiguration
  0, // initConcurrency
  true // delayedInitialization
)

Creating a Session

A recognition session can be configured for specific document types, timeout settings, and whether to return projectively corrected images.

val sessionSettings = engine.CreateSessionSettings()

with(sessionSettings) {
    SetOption("common.sessionTimeout", "5.0")
    SetCurrentMode(mode)
    AddEnabledDocumentTypes(mask)
}

val session = engine.SpawnSession(
    sessionSettings,
    signature
)

Creating an Image

Regardless of the source (camera frames or gallery), you must create an image of type ocr.common.Image.

Single image processing example:

fun processPhoto(image: Image) {
  val result = session.Process(image)
}

For video streams, process frames sequentially within the same session.

The system can recognize documents from a single image, but combining multiple frames greatly improves accuracy. The engine merges recognition results across frames, which helps in poor lighting, glare, or other adverse conditions.

Terminality — automatic stop of the recognition process in video streams. It becomes true in two cases:

  • Adding more frames will not change the recognition result.
  • The session timeout (as defined in settings) is reached.

fun processVideoFrame(image: Image) {
  val result = session.Process(image)
  if (result.GetIsTerminal()) {
    // terminal: stop feeding frames and use the accumulated result
  }
}
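The stop condition amounts to a simple loop. Below is a self-contained simulation of that control flow (FakeSession and FakeResult are stand-ins, not SDK classes; the real types are the session returned by SpawnSession and IdResult):

```kotlin
class FakeResult(private val terminal: Boolean) {
    fun GetIsTerminal(): Boolean = terminal
}

// Pretends the recognition result stabilizes after a fixed number of frames.
class FakeSession(private val framesUntilStable: Int) {
    var frameCount = 0
        private set
    fun Process(): FakeResult {
        frameCount++
        return FakeResult(frameCount >= framesUntilStable)
    }
}

// Feed frames until the result is terminal or the frame budget runs out.
fun recognizeStream(session: FakeSession, maxFrames: Int): Int {
    repeat(maxFrames) {
        if (session.Process().GetIsTerminal()) return session.frameCount
    }
    return session.frameCount
}
```

With the real SDK, maxFrames is effectively replaced by the session timeout: the result becomes terminal either when it stabilizes or when the timeout set in the session settings expires.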

Parsing Results

The recognition result (an object of class IdResult) contains all document data. Use the class methods (documented separately) to access them.

To check whether a result exists after the session terminates, use result.GetDocumentType(). It returns a value for the remainder of the session once at least one recognition has been performed.

For intermediate results, use result.GetTemplateDetectionResultsCount(), which helps to check whether a document is present in the current frame.

Example for reading text fields:

val iterator = result.TextFieldsBegin()
val end = result.TextFieldsEnd()
while (!iterator.Equals(end)) {
  val textField: IdTextField = iterator.GetValue()

  with(textField) {
    val info       = GetBaseFieldInfo()
    val key        = GetName()
    val value      = GetValue().GetFirstString().GetCStr()
    val isAccepted = info.GetIsAccepted()
    val attr       = info.parseAttributes()
  }

  iterator.Advance()
}

Memory Management

The Java runtime does not manage C++ memory: the garbage collector cannot see C++ objects. Most library classes have factory methods that return pointers to heap-allocated objects, and freeing that memory is the developer's responsibility.

Always release resources when no longer needed: session.Reset(), image.delete(), engine.delete().
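Because every wrapper object must be released manually, a try/finally guard keeps releases from being skipped when an exception is thrown mid-way. NativeHandle and useNative below are illustrative helpers, not SDK API:

```kotlin
// Illustrative stand-in for a SWIG-generated object exposing delete().
class NativeHandle {
    var deleted = false
        private set
    fun delete() { deleted = true } // would free the underlying C++ object
}

// Runs `block`, then releases the handle even if the block throws.
inline fun <T> NativeHandle.useNative(block: (NativeHandle) -> T): T =
    try { block(this) } finally { delete() }
```

The same pattern applies to the real objects, e.g. calling image.delete() in a finally block after session.Process(image).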

Exception Handling

The library may throw exceptions of type ocr::common::BaseException and its subclasses in case of invalid input, incorrect calls, or other errors. Always handle them.
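On the Java/Kotlin side, SWIG-generated wrappers commonly surface C++ exceptions as java.lang.RuntimeException; check the generated wrapper classes for the exact type this SDK throws. A defensive pattern, with `action` standing in for a call such as session.Process(image):

```kotlin
// Wraps a recognition call and converts a thrown wrapper exception
// into a Result, so callers can branch without try/catch at every site.
fun <T> tryRecognize(action: () -> T): Result<T> =
    try {
        Result.success(action())
    } catch (e: RuntimeException) { // exact type thrown by the wrapper may differ
        Result.failure(e)
    }
```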

RFID Support

To read an NFC chip in an Android app, our SDK implements reading via the open-source JMRTD and SCUBA libraries. Below is sample code for reading data from the chip and working with it. Our SDK can also parse NFC passport data and perform document authentication checks if your product configuration supports parsing such data.

Library

JMRTD, SCUBA

Imports

import org.jmrtd.BACKey;
import org.jmrtd.BACKeySpec;
import org.jmrtd.PassportService;
import org.jmrtd.lds.icao.DG1File;
import org.jmrtd.lds.icao.DG2File;
import org.jmrtd.lds.iso19794.FaceImageInfo;

import net.sf.scuba.smartcards.CardFileInputStream;
import net.sf.scuba.smartcards.CardService;
import net.sf.scuba.smartcards.CardServiceException;

Code

public class PassportReader {
  /**
   * Reads passport data from the NFC chip using Basic Access Control (BAC).
   * The process can take several seconds, so do not run it on the main thread.
   * @param isoDep      NFC transport for the tag (android.nfc.tech.IsoDep)
   * @param passportKey MRZ-derived key: document number, birth date, expiry date
   * @return the data read from the chip
   */
  public PassportData readPassportData(
    IsoDep isoDep,
    PassportKey passportKey
  ) throws CardServiceException, IOException {
    BACKeySpec bacKey = new BACKey(passportKey.passportNumber, passportKey.birthDate, passportKey.expirationDate);
    isoDep.setTimeout(10000); // timeout in ms
    CardService cardService = CardService.getInstance(isoDep);
    cardService.open();
    PassportService service = new PassportService(
        cardService,
        PassportService.NORMAL_MAX_TRANCEIVE_LENGTH,
        PassportService.DEFAULT_MAX_BLOCKSIZE,
        false,
        false
    );
    service.open();
    service.sendSelectApplet(false);
    service.doBAC(bacKey);

    // READ DATA
    CardFileInputStream dg1In = service.getInputStream(PassportService.EF_DG1);
    DG1File dg1File = new DG1File(dg1In);
    // The chip has been read successfully; dg1File.getMRZInfo() holds the MRZ data.
    // PassportData is your own result type -- wrap the parsed data and return it.
    return new PassportData(dg1File.getMRZInfo());
  }
}
}