common quantization across multiple files? #124

Open
embeepea opened this issue Apr 12, 2016 · 4 comments

Comments

@embeepea

Is there a way to get mapshaper to quantize coordinates in exactly the same way across multiple topojson files?

I know I can specify an explicit quantization amount using the quantization= output option, but my understanding is that this is interpreted relative to the geometry contained in the file. I have a situation where I want to generate multiple files whose coordinates are quantized in the same way, so that when I combine the files later, coordinates that were equal in the original input files are still equal after quantization.

More specifically, I have a data set of roughly 100,000 polygons with a lot of shared arcs, and I want to create a new topojson layer whose polygons are formed by dissolving shared borders among various subsets of the original polygons. The problem is large enough, in terms of memory and computation time, that I want to split it into a bunch of smaller problems that I can run separately and then recombine the results into one topojson file. The issue I'm running into is that when I split the original polygons into separate files, the coordinates in each file get quantized differently. When I recombine the files later, the coordinates from different files don't always match up, and some of the shared topology is lost.
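
To illustrate the mismatch described above, here is a minimal sketch, assuming quantization works the way the TopoJSON transform does: each file's own extent is divided into a fixed number of evenly spaced levels. The extents, the quantization level, and the helper names (`quantize`, `dequantize`, `snap`) are made up for the example, and mapshaper's internals may differ in detail.

```python
def quantize(value, lo, hi, q):
    """Map value to an integer grid index, with q levels spanning [lo, hi]."""
    step = (hi - lo) / (q - 1)
    return round((value - lo) / step)


def dequantize(index, lo, hi, q):
    """Map a grid index back to a coordinate on the same grid."""
    step = (hi - lo) / (q - 1)
    return lo + index * step


def snap(value, extent, q):
    """Quantize then dequantize: the coordinate a reader of the file would see."""
    lo, hi = extent
    return dequantize(quantize(value, lo, hi, q), lo, hi, q)


Q = 10_000                # hypothetical quantization level
x = 3.14159               # x-coordinate of a vertex present in both files

extent_a = (0.0, 7.0)     # hypothetical x-extent of file A
extent_b = (2.0, 12.0)    # hypothetical x-extent of file B
extent_all = (0.0, 12.0)  # hypothetical x-extent of the combined data set

# Per-file grids: the same input coordinate lands on two different values.
xa, xb = snap(x, extent_a, Q), snap(x, extent_b, Q)
print(xa, xb, xa == xb)

# One shared grid: both files snap the vertex to the same value.
shared_a, shared_b = snap(x, extent_all, Q), snap(x, extent_all, Q)
print(shared_a == shared_b)
```

The point of the contrast is that matching output needs a grid defined independently of each file's contents, which is what I'm asking about.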

I can try using no-quantization when writing the intermediate files, but I'm wondering if there's a better way.

@mbloch
Owner

mbloch commented Apr 12, 2016

There isn't currently any way to get matching quantization between files. I recommend using no-quantization for your intermediate files, at the expense of larger file size.

If no-quantization doesn't work, we can investigate options for generating quantized files that can later be dissolved together.

@mbloch
Owner

mbloch commented Jul 16, 2016

I'm taking the position that once files are quantized, they are only useful for display. If you're generating intermediate files that will be recombined with other files, then you shouldn't use quantization.

@mbloch mbloch closed this as completed Jul 16, 2016
@embeepea
Author

OK, but I think this means that mapshaper doesn't scale to large data sets.

For example, suppose I have a polygon data set that is too large to fit into memory, so it's stored in separate pieces, say shapefiles. I want to do polygon simplification on the entire data set, but I want the simplification to be consistent across the different pieces, so that when I display simplified polygons that share a boundary but that came from different shapefiles, the simplified shared boundaries match. I don't see a way to do this with mapshaper because I think the topological inference that is used to determine shared arcs in the input data depends on quantization, right?

If I'm misunderstanding how this all works and there's another way, let me know. Otherwise, this seems to be a case where quantization is useful for computation (topological inference ahead of simplification), not just for display.

If your position is that mapshaper is primarily intended for data sets that will fit in memory, I completely understand and I think it's a great tool either way. I just want to make sure I'm not missing something about the way it works that would in fact allow me to solve problems like the above. Thanks!

@mbloch
Owner

mbloch commented Jul 16, 2016

It sounds like there's a bit of a misunderstanding about the relationship between topology inference and quantization. Mapshaper does not use quantization to identify topology. On the other hand, quantization does affect topology. As you've noted, if you apply quantization to two different files, gaps will form between formerly shared boundaries.

Still, you have a legitimate feature request. If a user has a polygon mosaic that is split across multiple files, currently there's no good way to simplify the files separately while maintaining shared boundaries across different files. This is a problem with or without quantization.
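
As a rough illustration of why this can happen even without quantization, here is a sketch under a stated assumption: the same shared boundary may be broken into arcs differently in each file, because each file contains different neighboring polygons. Douglas-Peucker is used only because it is short to write; it is not meant to represent the exact method or thresholds mapshaper applies. The coordinates, tolerance, and function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, always keeping its two endpoints."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# A boundary shared by two polygons that live in different files.
boundary = [(0, 0), (1, 0.3), (2, 0.4), (3, 0.3), (4, 0)]
tolerance = 0.5

# File A: the boundary is a single arc, so only its endpoints are pinned.
in_file_a = douglas_peucker(boundary, tolerance)

# File B: a third polygon meets the boundary at (2, 0.4), so the boundary
# is two arcs and the junction vertex is pinned as an arc endpoint.
first_arc = douglas_peucker(boundary[:3], tolerance)
second_arc = douglas_peucker(boundary[2:], tolerance)
in_file_b = first_arc[:-1] + second_arc

print(in_file_a)
print(in_file_b)
print(in_file_a == in_file_b)
```

In this sketch the two files keep different vertices for the same boundary, so the simplified edges no longer coincide when the files are displayed together.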

I don't have a good solution presently, but I'll re-open the issue. It's an interesting problem.

@mbloch mbloch reopened this Jul 16, 2016