common quantization across multiple files? #124

embeepea · 2016-04-12T17:35:40Z

Is there a way to get mapshaper to quantize coordinates in exactly the same way across multiple topojson files?

I know I can specify an explicit quantization amount using the quantization= output option, but my understanding is that this is interpreted relative to the geometry contained in the file. I have a situation where I want to generate multiple files whose coordinates get quantized in the same way, so that when I combine the files later, coordinates that were equal in the original input files are still equal after quantization.

More specifically, I have a data set consisting of 100,000 or so polygons with a lot of shared arcs, and I want to create a new topojson layer consisting of polygons which are created by dissolving shared borders among various subsets of the original polygons. The problem is large enough, in terms of memory and computation time, that I want to split it into a bunch of smaller problems that I can run separately, but then in the end I will re-combine the results back into one topojson file. The issue I'm running into is that when I split out the original 100,000 polygons into separate files, the coordinates in each file get quantized differently, so that when I recombine them later, the coordinates from different files don't always match up, with the result that some of the shared topology is lost.

I can try using no-quantization when writing the intermediate files, but I'm wondering if there's a better way.


mbloch · 2016-04-12T20:17:51Z

There isn't currently any way to get matching quantization between files. I recommend using no-quantization for your intermediate files, at the expense of larger file size.

If no-quantization doesn't work, we can investigate options for generating quantized files that can later be dissolved together.

mbloch · 2016-07-16T13:31:51Z

I'm taking the position that once files are quantized, they are only useful for display. If you're generating intermediate files that will be recombined with other files, then you shouldn't use quantization.

embeepea · 2016-07-16T15:02:30Z

OK but I think this means that mapshaper doesn't scale to large data sets.

For example, suppose I have a polygon data set that is too large to fit into memory, so it's stored in separate pieces, say shapefiles. I want to do polygon simplification on the entire data set, but I want the simplification to be consistent across the different pieces, so that when I display simplified polygons that share a boundary but that came from different shapefiles, the simplified shared boundaries match. I don't see a way to do this with mapshaper because I think the topological inference that is used to determine shared arcs in the input data depends on quantization, right?

If I'm misunderstanding how this all works and there's another way, let me know, but it seems like this may be a case where there is a use for quantization (topological inference) that is useful for computation (simplification), not just for display.

If your position is that mapshaper is primarily intended for data sets that will fit in memory, I completely understand and I think it's a great tool either way. I just want to make sure I'm not missing something about the way it works that would in fact allow me to solve problems like the above. Thanks!

mbloch · 2016-07-16T22:02:30Z

It sounds like there's a bit of a misunderstanding about the relationship between topology inference and quantization. Mapshaper does not use quantization to identify topology. On the other hand, quantization does affect topology. As you've noted, if you apply quantization to two different files, gaps will form between formerly shared boundaries.
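To illustrate how those gaps can arise, here is a minimal Python sketch. It is not mapshaper's code. It assumes TopoJSON-style quantization, where each file's grid is derived from that file's own bounding box; the function name, the bounding boxes, and the grid size of 100 are all made up for illustration.

```python
def quantize_point(point, bbox, n=100):
    """Snap a point to an n-by-n grid spanning bbox, then map it back.

    Assumed scheme (hypothetical helper, not a mapshaper API):
    scale = extent / (n - 1), translate = bbox minimum.
    """
    x0, y0, x1, y1 = bbox
    kx = (x1 - x0) / (n - 1)
    ky = (y1 - y0) / (n - 1)
    qx = round((point[0] - x0) / kx)  # integer grid column
    qy = round((point[1] - y0) / ky)  # integer grid row
    return (qx * kx + x0, qy * ky + y0)


# A vertex on a boundary shared by polygons stored in two different files.
shared_vertex = (3.3337, 7.7771)

# Each file has its own extent, so each gets its own grid.
bbox_file_a = (0.0, 0.0, 5.0, 10.0)
bbox_file_b = (3.0, 2.0, 9.0, 12.0)

in_file_a = quantize_point(shared_vertex, bbox_file_a)
in_file_b = quantize_point(shared_vertex, bbox_file_b)
print("file A:", in_file_a)
print("file B:", in_file_b)
print("still shared:", in_file_a == in_file_b)

# With one grid derived from the combined extent, both files agree.
common_bbox = (0.0, 0.0, 9.0, 12.0)
common_a = quantize_point(shared_vertex, common_bbox)
common_b = quantize_point(shared_vertex, common_bbox)
print("shared on common grid:", common_a == common_b)
```

The same input vertex lands on different grid nodes in the two files, so the boundary no longer lines up when the files are recombined. A grid computed from the combined extent avoids that, which is essentially what the original question asks for.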

Still, you have a legitimate feature request. If a user has a polygon mosaic that is split across multiple files, currently there's no good way to simplify the files separately while maintaining shared boundaries across different files. This is a problem with or without quantization.

I don't have a good solution presently, but I'll re-open the issue. It's an interesting problem.

mbloch closed this as completed Jul 16, 2016

mbloch reopened this Jul 16, 2016
