WebAssembly (Wasm) is a technology for running code compiled from various programming languages directly in the browser.
Our library is written in C++ and ported to WebAssembly using Emscripten.
To cover a wide range of devices, we provide four library builds; load the one matching the browser's feature support:

- nosimd.nothreads
- simd.nothreads
- simd.threads
- nosimd.threads
SIMD — a set of instructions that significantly speeds up computation.
THREADS — multi-threading support via additional web workers that share memory through a SharedArrayBuffer. Using it requires proper server-side configuration of the COOP and COEP headers.
Browser support for SIMD and multi-threading can be detected using the wasm-feature-detect library.
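One way to map the detected features to the four build names listed above might look like this. The `pickBuild` helper is hypothetical, not part of the library; the `simd()` and `threads()` checks come from wasm-feature-detect:

```javascript
// Map feature support to one of the four build names listed above.
function pickBuild(simdSupported, threadsSupported) {
  const simdPart = simdSupported ? "simd" : "nosimd";
  const threadsPart = threadsSupported ? "threads" : "nothreads";
  return `${simdPart}.${threadsPart}`;
}

// Usage sketch (runs in the browser; wasm-feature-detect must be installed):
async function selectBuild() {
  const { simd, threads } = await import("wasm-feature-detect");
  return pickBuild(await simd(), await threads()); // e.g. "simd.nothreads"
}
```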
The server must serve *.wasm files with the header:

```http
Content-Type: application/wasm
```
The server should support compression for .wasm files. WebAssembly files compress well, reducing delivery time.
Check the content-encoding response header in DevTools, or via curl:

```shell
curl -H "Accept-Encoding: gzip" -I https://example.com/yourfile.wasm
```
We recommend serving pre-compressed files to avoid runtime compression overhead.
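As an illustration, with nginx this can be done via gzip_static, assuming a pre-compressed yourfile.wasm.gz sits next to yourfile.wasm (adapt to your own server; this snippet is a sketch, not part of the library):

```nginx
location ~ \.wasm$ {
    # Serve foo.wasm.gz when the client accepts gzip, avoiding runtime compression
    gzip_static on;
    types { application/wasm wasm; }
    default_type application/wasm;
}
```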
Include the compiled OCR Studio WebAssembly module in your project. No additional configuration is required: just place it in the project directory.
To prevent blocking the main execution thread with heavy recognition tasks, a web worker is used to communicate with the Wasm module.
The worker acts as a mediator between client JavaScript and the Wasm module.
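The mediator pattern described above can be sketched as follows. This is a simplified worker.js with the Wasm engine stubbed out; the real module's exports, message fields, and response types depend on the OCR Studio build, so treat every name below as an assumption:

```javascript
// worker.js (sketch) -- dispatches client requests to a (stubbed) Wasm engine.
// `engine` stands in for the real OCR Studio Wasm module instance.
const engine = {
  createSession: (docData) => ({ docData }),   // stub: real call initializes the Wasm session
  processFrame: (imageData) => ({ data: {} }), // stub: real call runs recognition on the frame
  reset: () => {},                             // stub: real call clears session state
};

// Pure dispatcher, kept separate from onmessage so the routing logic is testable.
function handleRequest(request) {
  switch (request.requestType) {
    case "createSession":
      engine.createSession(request.docData);
      return { requestType: "sessionCreated" };
    case "frame":
      return { requestType: "result", data: engine.processFrame(request.imageData).data };
    case "reset":
      engine.reset();
      return { requestType: "resetDone" };
    default:
      return { requestType: "error", message: `unknown request: ${request.requestType}` };
  }
}

// In the actual worker, wire the dispatcher to the message port:
if (typeof self !== "undefined" && typeof self.postMessage === "function") {
  self.onmessage = (e) => self.postMessage(handleRequest(e.data));
}
```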
For real-time recognition, the page needs:
Video and canvas elements to display camera input:
```html
<video id="video" class="video" playsinline muted autoplay></video>
<canvas id="canvas" class="canvas"></canvas>
```
Buttons for scanning documents and face matching:
```html
<button id="scan-button" class="button">Scan Document</button>
<button id="capture-face-button" class="button">Match Face</button>
```
Result container to display recognition results:
```html
<div id="result-wrapper" class="result-wrapper">
  <h2>Recognition Result</h2>
  <div id="output"></div>
</div>
```
Include the JavaScript file for client logic.
Request video stream from the camera:
```javascript
const stream = await navigator.mediaDevices.getUserMedia({
  video: { facingMode: { ideal: "environment" } },
})
```
Assign it to the video element:
```javascript
const videoEl = document.querySelector("#video")
videoEl.srcObject = stream
await videoEl.play()
```
Draw video frames on the canvas:
```javascript
const canvasEl = document.querySelector("#canvas")
const ctx = canvasEl.getContext("2d", { willReadFrequently: true })

const animate = function () {
  ctx.drawImage(videoEl, 0, 0, canvasEl.width, canvasEl.height)
  requestAnimationFrame(animate)
}
animate()
```
Initialize the worker and send the document type for recognition:
```javascript
const OCRStudioWorker = new Worker("./worker.js")
OCRStudioWorker.postMessage({
  requestType: "createSession",
  docData: "*",
})
```
On button click, capture a video frame and send it to the worker:
```javascript
const scanButtonEl = document.querySelector("#scan-button")
scanButtonEl.addEventListener("click", () => {
  OCRStudioWorker.postMessage({
    requestType: "frame",
    imageData: ctx.getImageData(0, 0, canvasEl.width, canvasEl.height),
  })
})
```
The worker returns coordinates of detected elements to highlight them in the video.
Receive and display results on the page:
```javascript
OCRStudioWorker.onmessage = function (message) {
  switch (message.data.requestType) {
    case "result": {
      const result = message.data
      if (Object.keys(result.data).length === 0) {
        console.log("Document not found")
        OCRStudioWorker.postMessage({ requestType: "reset" })
        return
      }
      printResult(result)
      // canvasHandler and canvasOverlayEl are overlay helpers defined elsewhere
      canvasHandler.clear(canvasOverlayEl)
      OCRStudioWorker.postMessage({ requestType: "reset" })
      break
    }
  }
}
```
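The printResult helper used above is not shown in this walkthrough. A minimal sketch, assuming result.data maps field names to recognized string values, might look like this (both function names are hypothetical, not part of the library):

```javascript
// Turn a recognition result into displayable text.
// Assumes result.data is an object mapping field names to recognized values.
function formatResult(result) {
  return Object.entries(result.data)
    .map(([field, value]) => `${field}: ${value}`)
    .join("\n")
}

// Render the formatted result into the #output container.
function printResult(result) {
  const outputEl = document.querySelector("#output")
  outputEl.textContent = formatResult(result)
}
```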
A minimal web application with real-time image recognition is now ready.