Compose a big image from overlapping parts

root

Automatic mode

This is the workflow: Scan parts of an image that considerably overlap. They must all be approximately oriented correctly. The program uses the overlapping areas for reconstruction of the layout of the parts. If all parts are in the directory part then in the best case you can simply run:

patch-image --output=collage.jpeg part/*.jpeg

If you get blurred areas, you might enable an additional rotation correction:

patch-image --finetune-rotate --output=collage.jpeg part/*.jpeg

It follows an overview of how the program works. It implies some things you should care about when using the program.

The program runs three phases:

Orientate each image part individually
Find overlapping areas in the parts and reconstruct the part positions within the big image
Blend the parts in order to get the big image

The first phase orientates each part such that horizontal structures become perfectly aligned. Only the brightness channel of the image is analysed. Horizontal structures can be text or the border of the image. This also means that you should orientate the parts horizontally, not vertically. I also recommend not to mix horizontal and vertical scanned parts since the horizontal and vertical resolution of your scanner might differ slightly. However, it should be fine to rotate the image source by 180° and rotate it back digitally, before feeding it to the patch-image program.

1st Phase

Options for the first phase:

--maximum-absolute-angle: Maximum angle to test for.
--number-angles: Number of angles minus one to test between negative and positive maximum angle.
--hint-angle: If the search for horizontal structures does not yield satisfying results for an image part, you may prepend the --hint-angle option with the wanted angle to the image path.

2nd Phase

In the second phase the program looks for overlapping parts between all pairs of images. For every pair it computes a convolution via a Fourier transform. Only the brightness channel of the image is analysed.

--pad-size: Computing a convolution of two big images may exceed your graphics memory. To this end, images are shrunk before convolution. The pad size is the size in pixels after shrinking that holds 2x2 shrunken image parts. After determination of the distance between the shrunken parts the matching is repeated on a non-reduced clip of the original image part, in order to get precise coordinates.
--minimum-overlap: There must be a minimum of overlap in order to accept that the images actually overlap. The overlap is measured as a portion of the image part size.
--maximum-difference: The maximum allowed mean difference within an overlapping area of two overlapping images. If the difference is larger, then the program assumes that the parts do not overlap.
--smooth: It is important to eliminate a brightness offset, that is, big black and big white areas should be handled equally. To this end the image is smoothed and the smoothed image is subtracted from the original one. This option allows to specify the degree (radius) of the smoothing. I don't think you ever need to touch this parameter.
--output-overlap: Writes images for all pairs of image parts. These images allow you to diagnose where the matching algorithm failed. It may help you to adjust the matching parameters. In the future we might add an option to ignore problematic pairs.

Since in the first phase every image part is oriented individually, it may happen that the part orientations don't match. This would result in blurred areas in the final collage. In order to correct this, you can run phase two in an extended mode, that also re-evaluates the part orientations. The orientation of the composed image is then determined by the estimated orientation of the first image.

Options:

--finetune-rotate: Enables the extended overlapping mode. The option --output-overlap will then be ignored.
--number-stamps: The extended mode selects many small clips in the overlapping area and tries to match them. We call these clips /stamps/. This option controls the number of stamps per overlapping area minus one.
--stamp-size: Size of a square stamp in pixels.

3rd Phase

The third phase composes a big image from the parts. These parts are weighted such that the part boundaries cannot be seen anymore and differences in brightness are faded into another. The downside is that the superposition may lead to blur.

Options:

--output: Path of the output JPEG image with the weighted collage.
--output-hard: Alternative output of a JPEG collage where the image parts are simply averaged. You will certainly see bumps in brightness at the borders of the image parts. This output may be mostly useful to promote the great weighting algorithm employed by --output.
--output-distance-map: The weight for every pixel is chosen according to the distance to an image part boundary that lies within other parts. The rationale is that the weight shall become zero when the pixel is close to a position that will be affected by a disruption otherwise. This option allows to emit the distance map for every image part.
--distance-gamma: If the distances are used for weighting as they are, the program fades evenly between the overlapping image parts over the entire overapping area. This may mean that the overlapping area is blurred. Raising the distance to a power greater than one reduces the area of blur. The downside is that it also reduces the area for adaption of differing brightness.

Our LLVM implementation provides an additional way to assemble the image parts. The already known weighting approach tries to blend across all the overlapping area. This can equalize differences in brightness. The downside is that imperfectly matching image parts lead to blurred content in the overlapping area. An alternative algorithm tries to make the overlapping as small as possible and additionally performs blending where it hurts least. More precisely, parts are blended where they differ least. However, if the brightness of the image parts differ then the blending boundaries may become visible.

Options:

--output-shaped: Path of the output JPEG image with smoothly blended image parts along curves of low image difference.
--output-shaped-hard: Like before but the image parts are not smoothly faded. Instead, every pixel belongs to exactly one original image part. This is more for debugging purposes than of practical use.
--output-shape: Emit the smoothed mask of each image part used for blending.
--output-shape-hard: Emit the non-smoothed masks.
--shape-smooth: Smooth radius of the image masks. The higher, the smoother is the blending between parts.

General options:

--quality: JPEG quality percentage for writing the images.

Semi-Automatic mode: inspection and correction

If the program does not work correctly on a particular set of images or if you want to archive what it has done, then the following options may help you.

--output-state=state-%s.csv: Use of this option let the program generate three files that correspond to the stages of the image processing. For each file the program replaces the placeholder %s by an identifier for the particular stage. For the example above the generated files are:
- state-angle.csv: After the rotation detection the program emits this file containing the image paths and their angles in degrees.
- state-relation.csv: This file contains for all possible pairs of image parts whether the program found that they overlap and, if yes, what their relative offset is.
- state-position.csv: From the relative offsets the program computes the absolute part positions. This CSV file is essentially an extension of state-angle.csv by the absolute positions. However, in --finetune-rotate mode the angles may differ from the ones in state-angle.csv.
--state=state-position.csv: You can feed the program with various parameters such that the program does not need to compute them itself. This can be useful for repeating a previous computation, speedup a computation or improve a previous computation. The idea is to take either state-angle.csv or state-position.csv as generated by --output-state. You may omit any column or re-order them, or you may clear individual cells. At least the Image column must be present. The most simple CSV file would thus be a text file where the first line is Image and each of the following lines contains one image path. The images listed in the CSV file are loaded additionally to the files given at the command-line. A complete CSV file would contain the columns Image, Angle, X, Y, and in --finetune-rotate mode additionally DAngle.

A missing column or empty cell means that the program computes the parameter itself. You may even fix say an X coordinate but let the program compute the corresponding Y coordinate. I do not know whether there is a useful application of this feature, but the feature is a natural result of how the computation works.

You may use the CSV file to set the rotation angle of the assembled image. To this end, choose an image part where you know the angle for sure and add the Angle to the image's row while omitting the Angle for all other images.

If an image is grossly turned away from horizontal alignment, that is, rotated by, say, more than 2 degree, or the image does not contain a horizontal structure to align to, then the automatic angle detection will fail. In this case you might use the Angle column to specify the angle directly.

In --finetune-rotate mode the Angle column means the following: If an Angle cell is set then the rotation angle is fixed both in the first rotation detection phase, and in the subsequent rotation correction. If an Angle cell is empty then the rotation angle is both determined by horizontal structures in the first phase and then adjusted in the subsequent correction phase. Further, if you want to say that all angles are only hints that may be adjusted in the second phase, then you should add an empty DAngle column. If you want to exclude some angles from correction, then enter a 0 at the corresponding cells in the DAngle column.

Summarized: If both Angle and DAngle columns are present, then the Angle column specifies the angle as determined in the first phase and the DAngle column specifies the angle difference as determined in the second phase. You may enter values different from 0 in the DAngle column. E.g., setting 1 means that any angle determined in the first phase is unconditionally increased by 1 degree in the second phase. I do not know, whether this can ever be useful, but providing the columns Angle and DAngle this way seemed to be the most natural way to support all sensible use cases. If only the Angle column is available this is equivalent to adding a DAngle column and set DAngle=0 if Angle is set and leave DAngle empty if Angle is empty.

If you list images both on the command-line and in a CSV file, then both lists are concatenated. I recommend that you use either command-line arguments or a CSV file to specify the image list. You can easily get this wrong when running the program with a command-line file list first and then rerunning the computation starting with a saved state.csv file.
--relations=state-relation.csv: The table relation.csv contains pairs of images and their relative offsets. The most simple use case is to take the state-relation.csv file as generated by --output-state and alter or delete rows that you find inappropriate. The common columns are ImageA, ImageB, and Rel. Rel can be:
- empty: The user wants the program to determine whether the image parts overlap.
- X: The user knows that the image parts overlap.
- -: The user knows that the image parts do not overlap.
Without --finetune-rotate the columns DX and DY follow. They tell the offset of the image parts to each other. In --finetune-rotate mode there are the additional columns XA, YA, XB, YB. They contain coordinates of corresponding points in the image pairs. There are usually many such correspondences for one image pair and they are listed below header lines for the image pairs.

It is uncommon to have both a state.csv file with absolute positions and a relation.csv file, though the program supports it. If there is a pair of image parts where both images have fixed positions in state.csv and there is also a fixed offset registered in relation.csv, then the fixed positions win. The differing offset between the image parts are seen as an unfortunate impossibility to optimally match the optimization goals.

There is another difficulty in mixing fixed positions together with fixed offsets: For a pair of image parts with fixed positions the program does not match images in order to compute the relative offsets of the parts because it would not affect the final positions, anyway. This optimization makes the program skip the phase of determining image relations completely, if all part positions are fixed. This is the wanted effect. The unwanted effect is that since no relative offsets are computed, no relative offsets can be written to state-relation.csv as requested by --output-state. That is, re-running the program with --state using a CSV file as previously emitted by --output-state while using --output-state anew will generate a new state-relation.csv file with less information than before or no information at all.

Files listed in relation.csv are not loaded. Instead of this, the program checks consistency with state.csv and files listed as command-line arguments.

Trouble shooting

Unrelated images are recognized as overlapping or vice versa

If the program misses related image pairs or recognizes unrelated images as overlapping, you may first watch the overlap differences the program prints. You may then adapt the threshold via the --maximum-difference option.

The total image might contain repetitive elements that make unrelated image parts look like they overlap. So, if there is no clear threshold or you find the above procedure inappropriate, you may alternatively let the program emit a relation.csv file using the --output-state option. You can then edit relation.csv file and re-run the program with the option --relations=relation.csv.

hub.darcs.net :: thielema -> patch-image -> files

root

Automatic mode

1st Phase

2nd Phase

3rd Phase

Semi-Automatic mode: inspection and correction

Trouble shooting

Unrelated images are recognized as overlapping or vice versa