Under the Hood

If you wish to understand what this library is actually doing, reference this section where we dive into the details.

Camera Source

Camera2Source.kt

We are using the Camera2 APIs. These are the new(ish) APIs which replace the now-deprecated Camera1 APIs.

Opening

The first step to using the camera to capture images (frames) is to open a camera.

First you need a CameraManager which can be retrieved from a Context by:

context.getSystemService(CAMERA_SERVICE) as CameraManager

Next you need the id of the camera you wish to open. See fun selectCamera(): String for our implementation, but you could use a specific characteristic of a camera to help you choose, or simply take the first one.
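For example, a minimal selection based on the lens-facing characteristic could look like the sketch below (selectBackCamera is a hypothetical helper for illustration, not part of the library):

// Hypothetical helper: pick the first back-facing camera, falling back to the first id.
fun selectBackCamera(cameraManager: CameraManager): String {
    return cameraManager.cameraIdList.firstOrNull { id ->
        val characteristics = cameraManager.getCameraCharacteristics(id)
        characteristics.get(CameraCharacteristics.LENS_FACING) == CameraMetadata.LENS_FACING_BACK
    } ?: cameraManager.cameraIdList.first()
}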

Now ask the CameraManager to open the camera, passing an id and a callback so you know when the camera is opened (and can handle any errors):

cameraManager.openCamera(cameraId, callback, handler)

Within the callback you will be given a CameraDevice, which you can think of as your camera.
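Putting this together, a minimal sketch of opening a camera might look like the following (it assumes the CAMERA permission has already been granted and reduces error handling to the bare essentials):

// Sketch only: permission checks and error reporting omitted for brevity.
fun openCamera(cameraManager: CameraManager, cameraId: String, handler: Handler) {
    cameraManager.openCamera(cameraId, object : CameraDevice.StateCallback() {
        override fun onOpened(camera: CameraDevice) {
            // The camera is ready; hold on to it so you can create a capture session.
        }

        override fun onDisconnected(camera: CameraDevice) {
            camera.close()
        }

        override fun onError(camera: CameraDevice, error: Int) {
            camera.close()
            // Report the error code to the caller.
        }
    }, handler)
}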

Creating a Session

To actually receive any frames from the camera, you create a session and set a request. Think of the session as your photo shoot, and the request as you asking your model for poses.

To start a session call cameraDevice.createCaptureSession(surfaces, callback, handler), passing your surfaces and a callback so you know when your session is configured.

The surfaces you pass in are where you want your frames to be sent: for example, a SurfaceView to show a preview, or an ImageReader to process a frame yourself.

Within fun onConfigured(session: CameraCaptureSession) of your callback, you will then need to make your requests. You do this using a CaptureRequest.Builder and then calling a request method on your CameraCaptureSession. Which method you call depends on your situation, but for a preview to display to the user, you can use session.setRepeatingRequest(request, listener, handler).
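As a rough sketch (assuming a previewSurface and a handler already exist, and ignoring the newer SessionConfiguration overloads), the flow looks something like this:

fun startSession(cameraDevice: CameraDevice, previewSurface: Surface, handler: Handler) {
    cameraDevice.createCaptureSession(listOf(previewSurface), object : CameraCaptureSession.StateCallback() {
        override fun onConfigured(session: CameraCaptureSession) {
            // Build a request targeting the preview surface.
            val request = cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
                .apply { addTarget(previewSurface) }
                .build()
            // Repeat the request so frames keep arriving until we stop or replace it.
            session.setRepeatingRequest(request, null, handler)
        }

        override fun onConfigureFailed(session: CameraCaptureSession) {
            // Surface configuration was rejected; report or retry.
        }
    }, handler)
}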

Recognising a Barcode

We use MLKit to do barcode detection. See Google docs for more information on this.

We used to use Firebase MLKit before it was deprecated in favour of Google MLKit, so some of the architecture may be informed by that.

In order to make the code more understandable (and separate concerns), we have provided an interface (ImageProcessor), a base class (ImageProcessorBase) and a barcode implementation (BarcodeImageProcessor). Together these keep track of what's currently being processed, call into MLKit to process the frame, and report back the results in an abstracted manner.

We call into these classes from the Camera2Source by adding an ImageReader surface to the session request. Then, using an OnImageAvailableListener, we check whether a frame is still being processed. If it is, we simply drop the new frame (knowing another will be along shortly); otherwise we request that the frame be processed. Using two callbacks, onBarcodeListener and onImageProcessed, we pass the results back and make sure every frame gets closed, so we don't leak any resources.
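A simplified sketch of that listener is shown below; processImage is a hypothetical stand-in for the ImageProcessor call, and the AtomicBoolean flag represents the "currently processing" state described above:

val isProcessing = AtomicBoolean(false)

val onImageAvailableListener = ImageReader.OnImageAvailableListener { reader ->
    val image = reader.acquireLatestImage() ?: return@OnImageAvailableListener
    if (isProcessing.getAndSet(true)) {
        // Already busy: drop this frame, another will be along shortly.
        image.close()
    } else {
        // processImage is a hypothetical stand-in for calling into the ImageProcessor / MLKit.
        processImage(image) {
            // Always close the frame and clear the flag, whatever the result.
            image.close()
            isProcessing.set(false)
        }
    }
}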

Output Size

When creating our ImageReader surface we need to decide the size we would like to output. In order to know what sizes the camera can output, we query the characteristics:

val characteristics = cameraManager.getCameraCharacteristics(cameraId)
val configs = characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP)
    ?: throw IllegalStateException()
val sizes = configs.getOutputSizes(IMAGE_FORMAT)

For the IMAGE_FORMAT we use ImageFormat.YUV_420_888, which Google recommends.

Choosing a size requires balancing two considerations:

  • The smaller the size the quicker it is processed
  • The larger the size the better quality the results will be

Breaking down what a barcode actually is gives an indication of the sort of size you require to get good results.

For example, EAN-13 barcodes are made up of bars and spaces that are 1, 2, 3 or 4 units wide. Ideally, then, an EAN-13 barcode image has bars and spaces that are at least 2, 4, 6 and 8 pixels wide respectively, i.e. 2 pixels per unit, which provides sufficient width for recognition. Since an EAN-13 barcode is 95 units wide in total, the barcode should be at least 190 pixels (95 units * 2 pixels) wide.

Therefore, depending on the barcode formats we support, we can deduce a minimum width we would like our barcode image to be.

We also need to consider how far away the user holds the camera from the barcode. For example, on a Google Pixel, holding an EAN-13 barcode (2.5cm in width) 5cm away from the camera lens means the barcode occupies roughly 50% of the resulting image.

Combining these together results in:

// Find the most demanding of the requested barcode formats.
val maxSizedFormat = barcodeFormats.maxBy { it.getMinWidth() ?: 0 }
val minWidth = maxSizedFormat?.getMinWidth() ?: BARCODE_FORMAT_ALL_MIN_WIDTH
// Scale the minimum barcode width up by the proportion of the frame we expect the barcode to occupy.
val minWidthProportioned = minWidth / BARCODE_SCREEN_PROPORTION
val largeEnoughSizes = sizes.filter { it.width > minWidthProportioned }
// Prefer the smallest size that is still large enough; otherwise fall back to the largest available.
return if (largeEnoughSizes.isNotEmpty()) {
    largeEnoughSizes.minBy { it.width }
} else {
    sizes.maxBy { it.width }
} ?: sizes[0]

See _Int.getMinWidth() for the minimum widths for specific barcode formats.
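Once a size has been chosen, creating the ImageReader surface itself is straightforward; a sketch is shown below (size, onImageAvailableListener and handler are assumed to exist, and maxImages of 2 is an illustrative value):

// Create the ImageReader at the chosen size and format, then attach the frame listener.
val imageReader = ImageReader.newInstance(size.width, size.height, ImageFormat.YUV_420_888, 2)
imageReader.setOnImageAvailableListener(onImageAvailableListener, handler)
// imageReader.surface is then included in the list of surfaces passed to createCaptureSession().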

Rotation

To help MLKit process the frame, it requires a rotation to be passed into InputImage.fromMediaImage(image, rotation). To calculate this rotation we have to use three pieces of information:

  • The device's orientation, i.e. which way up the user is holding the phone.
  • The camera sensor orientation
  • The facing direction of the camera

When device manufacturers build their phones they sometimes mount the camera sensors in different orientations (typically at 90 or 270 degrees) to help them manage the limited space available inside the phone.

See BarcodeScanner.getRotationCompensation() and Google Docs for the code and more information.
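The calculation in the Google docs linked above boils down to something like the following sketch (the library's own BarcodeScanner.getRotationCompensation() may differ in detail; front-facing cameras are mirrored, hence the two branches):

private val ORIENTATIONS = SparseIntArray().apply {
    append(Surface.ROTATION_0, 0)
    append(Surface.ROTATION_90, 90)
    append(Surface.ROTATION_180, 180)
    append(Surface.ROTATION_270, 270)
}

fun getRotationCompensation(cameraManager: CameraManager, cameraId: String, deviceRotation: Int, frontFacing: Boolean): Int {
    // Convert the Surface.ROTATION_* constant into degrees.
    val deviceDegrees = ORIENTATIONS.get(deviceRotation)
    val sensorOrientation = cameraManager
        .getCameraCharacteristics(cameraId)
        .get(CameraCharacteristics.SENSOR_ORIENTATION) ?: 0
    // Combine the sensor orientation with the device rotation, mirroring for front-facing cameras.
    return if (frontFacing) {
        (sensorOrientation + deviceDegrees) % 360
    } else {
        (sensorOrientation - deviceDegrees + 360) % 360
    }
}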

Camera Configuration

The Camera2 APIs come with many options you can apply to your camera requests, from color correction to lens aperture.

Feature Availability

The wide array of devices and manufacturers means that different devices offer different features; not all devices have all of them. Therefore, when adding options to your requests, you should query the characteristics of the camera and select an appropriate setting.

To request the use of a feature, add it to your request via the set method on CaptureRequest.Builder:

builder.set(CaptureRequest.FEATURE_CONST, value)

Auto Focus

CaptureRequest.CONTROL_AF_MODE

For a barcode scanner, we want the camera to continually seek the correct focus, therefore using CONTROL_AF_MODE_CONTINUOUS_PICTURE is a good option, if it's available.
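For example, a sketch of checking availability before applying the setting (characteristics and builder are assumed to exist from the earlier snippets, and the fallback to CONTROL_AF_MODE_AUTO is illustrative):

// Only request continuous picture auto focus if the camera actually supports it.
val availableAfModes = characteristics
    .get(CameraCharacteristics.CONTROL_AF_AVAILABLE_MODES) ?: intArrayOf()
val afMode = if (CameraMetadata.CONTROL_AF_MODE_CONTINUOUS_PICTURE in availableAfModes) {
    CameraMetadata.CONTROL_AF_MODE_CONTINUOUS_PICTURE
} else {
    // Fall back to plain auto focus if continuous picture mode isn't supported.
    CameraMetadata.CONTROL_AF_MODE_AUTO
}
builder.set(CaptureRequest.CONTROL_AF_MODE, afMode)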

Tap to Focus

In combination with auto focus, we have also implemented a tap to focus feature. This allows the user to tap on the view and the camera will try to focus where the user has tapped.

This is a very complex feature and requires a few main steps:

  • Recognise that the user has tapped on the screen. This is achieved with an OnTouchListener on the SurfaceView.
  • Translate the user's tap into coordinates which are relative to the camera sensor. This involves using the device rotation, camera sensor rotation, camera facing and camera sensor array. See BarcodeScanner.calculateFocusRegions().
  • Cancel the ongoing repeating request (used for preview and image analysis); session.stopRepeating().
  • Cancel any previous focus requests. This requires making a single capture request passing CameraMetadata.CONTROL_AF_TRIGGER_CANCEL.
  • Create a focus request. This is a single capture request with some specific focus related settings, including the coordinates calculated above. Primarily CameraMetadata.CONTROL_AF_TRIGGER_START and CaptureRequest.CONTROL_AF_REGIONS.
  • Once the focus request has completed, restart the repeating request, passing the same coordinates.
  • After a delay:
      • Cancel the ongoing repeating request, as above.
      • Cancel the focus request, as above.
      • Restart the repeating request without the specific coordinates.

See https://github.com/brightec/KBarcode/pull/43 for more code details.
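A heavily condensed sketch of the trigger sequence is shown below (the regions are calculated elsewhere, error handling is omitted, and the delayed restart is left as a comment); see the pull request above for the real implementation:

fun triggerFocus(session: CameraCaptureSession, builder: CaptureRequest.Builder,
                 regions: Array<MeteringRectangle>, handler: Handler) {
    // Stop the preview/analysis repeating request while we focus.
    session.stopRepeating()

    // Cancel any in-flight auto focus before starting a new one.
    builder.set(CaptureRequest.CONTROL_AF_TRIGGER, CameraMetadata.CONTROL_AF_TRIGGER_CANCEL)
    session.capture(builder.build(), null, handler)

    // Ask the camera to focus on the tapped region.
    builder.set(CaptureRequest.CONTROL_AF_REGIONS, regions)
    builder.set(CaptureRequest.CONTROL_AF_TRIGGER, CameraMetadata.CONTROL_AF_TRIGGER_START)
    session.capture(builder.build(), object : CameraCaptureSession.CaptureCallback() {
        override fun onCaptureCompleted(session: CameraCaptureSession, request: CaptureRequest,
                                        result: TotalCaptureResult) {
            // Focus request finished: restart the repeating request with the same regions,
            // then, after a delay, restart it again without them (as described above).
        }
    }, handler)
}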

Architecture

We have tried, wherever possible, to separate concerns. This has resulted in 4 main classes.

BarcodeView

For the majority of users, this is the view you will incorporate into your apps. It is the main touch point with the library.

It has responsibility for managing the SurfaceView which renders the preview, and coordinating with the BarcodeScanner.

BarcodeScanner

This class manages the connection between the CameraSource and BarcodeImageProcessor. It also manages the connection with any additional surfaces passed in, for example the SurfaceView that the BarcodeView passes in.

Some users may wish to interact directly with this class, if they have a special use case.

CameraSource

This class is responsible for managing the camera. For more information on the camera implementation see above.

BarcodeImageProcessor

This class is responsible for managing the processing of barcodes. For more information on this, see above.