Back port new functionality to 24.1.x #51

Merged
merged 1 commit into from
Dec 23, 2023
Back port new functionality to 24.1.x
sheinbergon committed Dec 23, 2023
commit 3336776330e85386c1c8ab1642e6ff799bcf21de
42 changes: 23 additions & 19 deletions README.MD
@@ -2,21 +2,20 @@
[![Github Workflow Status](https://img.shields.io/github/actions/workflow/status/sheinbergon/dremio-udf-gis/release-ci.yml?branch=23.1.x&logo=githubactions&style=for-the-badge)](https://github.com/sheinbergon/dremio-udf-gis/actions?query=workflow%3Arelease-actions)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/sheinbergon/dremio-udf-gis?logo=github&color=%2340E0D0&style=for-the-badge)](https://github.com/sheinbergon/dremio-udf-gis/releases/latest)
[![Maven Central](https://img.shields.io/maven-central/v/org.sheinbergon/dremio-udf-gis?logo=apachemaven&color=Crimson&style=for-the-badge)](https://search.maven.org/search?q=g:org.sheinbergon%20a:dremio-udf-gis*)
[![Snyk Vulnerabilities for GitHub Repo](https://img.shields.io/snyk/vulnerabilities/github/sheinbergon/dremio-udf-gis?logo=snyk&color=432f95&style=for-the-badge)](https://app.snyk.io/org/sheinbergon/project/94183993-505b-439c-9078-6276fa4c1626)
[![Coveralls](https://img.shields.io/coveralls/github/sheinbergon/dremio-udf-gis?logo=coveralls&style=for-the-badge)](https://coveralls.io/github/sheinbergon/dremio-udf-gis)
[![Liberapay](https://img.shields.io/liberapay/patrons/sheinbergon?logo=liberapay&style=for-the-badge)](https://liberapay.com/sheinbergon/donate)

# Dremio Geo-Spatial Extensions

### What you get

- Widespread OGC implementation for SQL (adheres to PostGIS standards)
- Supported input formats: `WKT`, `WKB (HEX or BINARY)`
- Supported output formats: `WKT`, `WKB`, `GeoJSON`
- Easily installable shaded jar artifacts, available from Maven Central and GitHub
- Dremio CE version compatibility (new versions will be released with each community edition)
- Up-to-date implementation based on Proj4J and JTS geometry


### Sponsorship

Enjoying my work? A show of support would be much obliged :grin:
@@ -28,6 +27,7 @@ Enjoying my work? A show of support would be much obliged :grin:
</a>

### Installation

- Take the shaded jar for the desired version and place it inside your Dremio installation (`$DREMIO_HOME/jars/3rdparty`)
- Restart your Dremio server(s)
- Rejoice! (and see the [WIKI](https://github.com/sheinbergon/dremio-udf-gis/wiki) for detailed usage instructions)
@@ -36,30 +36,31 @@ Enjoying my work? A show of support would be much obliged :grin:

| Library Version | Dremio Version | Status |
|-----------------|----------------|------------|
| 0.2.x | 20.1.x | Legacy |
| 0.3.x | 21.1.x | Legacy |
| 0.4.x | 21.2.x | Legacy |
| 0.5.x | 22.0.x | Legacy |
| 0.6.x | 22.1.x | Legacy |
| 0.7.x | 23.0.x | Maintained |
| 0.8.x | 23.1.x | Maintained |
| 0.9.x | 24.0.x | Maintained |
| 0.10.x | 24.0.x | Maintained |

| 0.2.x | 20.1.0 | Legacy |
| 0.3.x | 21.1.1 | Legacy |
| 0.4.x | 21.2.0 | Legacy |
| 0.5.x | 22.0.0 | Legacy |
| 0.6.x | 22.1.1 | Legacy |
| 0.7.x | 23.0.1 | Legacy |
| 0.8.x | 23.1.0 | Legacy |
| 0.9.x | 24.0.0 | Maintained |
| 0.10.x | 24.1.0 | Maintained |

### Usage Notes

As opposed to PostGIS, Dremio is only a query engine based on existing/projected data sources/lakes.
That means that `Geometry` is not a natively supported data type, and you can only access it if
it's being properly projected from the data sources (For example, PostGIS Geometry is read as an `EWKB` HEX encoded string).

In order to successfully use the provided GIS functions, you must first make sure the geometry is in `WKB (BINARY)` format.
If it's not, you need to decode it:

- if the input is in `WKT` format, use `ST_GeomFromText`
- if the input is a HEX encoded `WKB`, use Dremio's `FROM_HEX`
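
To make the HEX case concrete, here is a minimal Java sketch of what turning a HEX encoded `WKB` string into raw bytes involves. It only illustrates the kind of conversion the README delegates to `FROM_HEX`; `HexWkb` and `fromHex` are hypothetical names, not part of this library or of Dremio, and the sample string is an illustrative EWKB encoding of `POINT(0.5 0.5)` with SRID 4326.

```java
// Hypothetical helper for illustration only; not part of dremio-udf-gis or Dremio.
final class HexWkb {

    // Converts a HEX string (two characters per byte, any case) into raw bytes.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("HEX input must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a HEX digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample value (an assumption for this sketch), split by field for readability.
        String hex = "01"                 // byte order flag: little-endian
                + "01000020"              // geometry type: Point, SRID flag set
                + "E6100000"              // SRID 4326
                + "000000000000E03F"      // x = 0.5
                + "000000000000E03F";     // y = 0.5
        byte[] wkb = fromHex(hex);
        System.out.println(wkb.length + " bytes, byte order flag = " + wkb[0]);
    }
}
```

Inside Dremio the equivalent step stays in SQL, so a helper like this is only useful for inspecting values outside the engine.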

This library uses Dremio's Arrow buffers (`ArrowBuf`) to keep geometry data in binary (`WKB`) format (for performance and efficiency)
when passing it between GIS functions, which is of course undecipherable to the naked eye. When running queries from the UI,
`WKB` output will always be base64 encoded.
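
If a base64 value copied from the UI needs to be inspected outside Dremio, it can be decoded back into the raw `WKB` bytes. The sketch below assumes the standard RFC 4648 alphabet with padding, which the README does not specify; `Base64Wkb` is a hypothetical helper, and the sample value is an illustrative encoding of `POINT(0.5 0.5)` with SRID 4326.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not part of dremio-udf-gis.
final class Base64Wkb {

    // Decodes base64 text (standard alphabet, an assumption) into raw WKB bytes.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Encodes raw WKB bytes as base64 text, mirroring how such output might be displayed.
    static String toBase64(byte[] wkb) {
        return Base64.getEncoder().encodeToString(wkb);
    }

    public static void main(String[] args) {
        // Sample value (an assumption for this sketch): EWKB for POINT(0.5 0.5), SRID 4326.
        String shown = "AQEAACDm" + "EAAAAAAAAAAA" + "4D8A" + "AAAAAADg" + "Pw==";
        byte[] wkb = fromBase64(shown);
        System.out.println(wkb.length + " bytes, byte order flag = " + wkb[0]);
    }
}
```

For everyday querying, `ST_AsText` and `ST_AsGeoJson` remain the simpler route; decoding by hand is mainly a debugging aid.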

To convert geometry data back to a human-readable format, use `ST_AsText` (`WKT`) or `ST_AsGeoJson` (`GeoJSON`).

@@ -74,11 +75,14 @@ SELECT ST_AsText(
```

### Roadmap

- Frequent version/dependency updates
- Add more OGC/PostGIS matching functionality
- Add Geography type support

### Noteworthy Mentions

Work in this repository was originally based on the following sources:

- [Apache Drill GIS Functionality](https://github.com/apache/drill/tree/master/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/gis)
- [Christy Haragan's initial port](https://github.com/christyharagan/dremio-gis)
2 changes: 1 addition & 1 deletion pom.xml
@@ -21,7 +21,7 @@
        <carrotsearch.version>0.7.0</carrotsearch.version>
        <arrow-memory-netty.version>9.0.0-20221123064031-c39b8a6253-dremio</arrow-memory-netty.version>
    </properties>
    <version>0.10.0-SNAPSHOT</version>
    <version>0.10.2-SNAPSHOT</version>
    <name>dremio-udf-gis</name>
    <description>GIS UDF extensions for Dremio</description>
    <url>https://github.com/sheinbergon/dremio-udf-gis</url>
@@ -0,0 +1,55 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
* <p>
* http://www.apache.org/licenses/LICENSE-2.0
* <p>
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.sheinbergon.dremio.udf.gis;

import com.dremio.exec.expr.SimpleFunction;
import com.dremio.exec.expr.annotations.FunctionTemplate;
import com.dremio.exec.expr.annotations.Output;
import com.dremio.exec.expr.annotations.Param;

import javax.inject.Inject;

@FunctionTemplate(
    name = "ST_GeomFromGeoJSON",
    scope = FunctionTemplate.FunctionScope.SIMPLE,
    nulls = FunctionTemplate.NullHandling.INTERNAL)
public class STGeomFromGeoJson implements SimpleFunction {

    @Param
    org.apache.arrow.vector.holders.NullableVarCharHolder jsonInput;

    @Output
    org.apache.arrow.vector.holders.NullableVarBinaryHolder binaryOutput;

    @Inject
    org.apache.arrow.memory.ArrowBuf buffer;

    public void setup() {
    }

    public void eval() {
        if (org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.isHolderSet(jsonInput)) {
            org.locationtech.jts.geom.Geometry geom = org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.toGeometryFromGeoJson(jsonInput);
            byte[] bytes = org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.toEWKB(geom);
            buffer = buffer.reallocIfNeeded(bytes.length);
            org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.populate(bytes, buffer, binaryOutput);
        } else {
            org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.markHolderNotSet(binaryOutput);
        }
    }
}
@@ -27,6 +27,7 @@
import org.locationtech.jts.algorithm.Angle;
import org.locationtech.jts.geom.*;
import org.locationtech.jts.io.*;
import org.locationtech.jts.io.geojson.GeoJsonReader;
import org.locationtech.jts.io.geojson.GeoJsonWriter;
import org.locationtech.jts.operation.buffer.BufferOp;
import org.locationtech.jts.operation.valid.IsValidOp;
@@ -133,6 +134,17 @@ public static Geometry toGeometry(final @Nonnull NullableVarCharHolder holder) {
        }
    }

    @Nonnull
    public static Geometry toGeometryFromGeoJson(final @Nonnull NullableVarCharHolder holder) {
        try {
            String json = toUTF8String(holder);
            GeoJsonReader reader = new GeoJsonReader();
            return reader.read(json);
        } catch (ParseException x) {
            throw new RuntimeException(x);
        }
    }

    @Nonnull
    public static Geometry toGeometryFromEWKT(final @Nonnull NullableVarCharHolder holder) {
        try {
@@ -0,0 +1,37 @@
package org.sheinbergon.dremio.udf.gis

import org.apache.arrow.vector.holders.NullableVarBinaryHolder
import org.apache.arrow.vector.holders.NullableVarCharHolder
import org.sheinbergon.dremio.udf.gis.spec.GeometryInputFunSpec
import org.sheinbergon.dremio.udf.gis.util.allocateBuffer

internal class STGeomFromGeoJsonTests : GeometryInputFunSpec.NullableVarChar<STGeomFromGeoJson>() {

    init {
        testGeometryInput(
            "Calling ST_GeomFromGeoJSON on a POINT",
            """
            {"type":"Point","coordinates":[0.5,0.5],"crs":{"type":"name","properties":{"name":"EPSG:4326"}}}
            """.trimIndent(),
            byteArrayOf(1, 1, 0, 0, 32, -26, 16, 0, 0, 0, 0, 0, 0, 0, 0, -32, 63, 0, 0, 0, 0, 0, 0, -32, 63)
        )

        testInvalidGeometryInput(
            "Calling ST_GeomFromGeoJSON on rubbish text",
            "42ifon2 fA!@",
        )

        testNullGeometryInput(
            "Calling ST_GeomFromGeoJSON on null input"
        )
    }

    override val function = STGeomFromGeoJson().apply {
        jsonInput = NullableVarCharHolder()
        binaryOutput = NullableVarBinaryHolder()
        buffer = allocateBuffer()
    }

    override val STGeomFromGeoJson.input: NullableVarCharHolder get() = function.jsonInput
    override val STGeomFromGeoJson.output: NullableVarBinaryHolder get() = function.binaryOutput
}
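
The expected byte array in the POINT case reads as little-endian EWKB: a byte-order flag, a geometry type word with the SRID bit set, the SRID itself, and two doubles. Below is a minimal JDK-only sketch that decodes it, assuming the PostGIS EWKB layout; `EwkbPointDecoder` is a hypothetical class for illustration and is not part of the library or its test suite.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of dremio-udf-gis.
final class EwkbPointDecoder {

    // PostGIS EWKB sets this bit in the type word when a 4-byte SRID follows it.
    static final int SRID_FLAG = 0x20000000;
    static final int WKB_POINT = 1;

    // Plain value holder for the decoded fields.
    static final class Point {
        final int srid;
        final double x;
        final double y;

        Point(int srid, double x, double y) {
            this.srid = srid;
            this.x = x;
            this.y = y;
        }
    }

    static Point decode(byte[] ewkb) {
        ByteBuffer buf = ByteBuffer.wrap(ewkb);
        // Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
        buf.order(buf.get() == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        int type = buf.getInt();
        // Only plain 2D points are handled; Z/M flags and other geometry types are rejected.
        if ((type & ~SRID_FLAG) != WKB_POINT) {
            throw new IllegalArgumentException("not a 2D point: type word 0x" + Integer.toHexString(type));
        }
        // 0 stands for "no SRID present" in this sketch.
        int srid = (type & SRID_FLAG) != 0 ? buf.getInt() : 0;
        double x = buf.getDouble();
        double y = buf.getDouble();
        return new Point(srid, x, y);
    }

    public static void main(String[] args) {
        // Same bytes as the expected output of the POINT test case.
        byte[] expected = {1, 1, 0, 0, 32, -26, 16, 0, 0, 0, 0, 0, 0, 0, 0, -32, 63, 0, 0, 0, 0, 0, 0, -32, 63};
        Point p = decode(expected);
        System.out.println("SRID=" + p.srid + " x=" + p.x + " y=" + p.y);
    }
}
```

Read this way, bytes `-26, 16, 0, 0` are `0x000010E6`, i.e. SRID 4326 taken from the `crs` member of the GeoJSON input, and each `0, 0, 0, 0, 0, 0, -32, 63` group is the double `0.5`.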
@@ -44,7 +44,9 @@ abstract class GeometryInputFunSpec<F : SimpleFunction, I : ValueHolder, V : Any
            }
        }

        final override fun NullableVarCharHolder.markNotSet() = this.valueIsNotSet()
        final override fun NullableVarCharHolder.markNotSet() {
            this.valueIsNotSet()
        }

        final override fun NullableVarCharHolder.set(value: String) = this.setUtf8(value)
    }
@@ -47,8 +47,9 @@ abstract class GeometryRelationFunSpec<F : SimpleFunction, O : ValueHolder> : Fu
            this.isSet shouldBeExactly 1
        }

        override fun NullableVarCharHolder.valueIsNotSet() =
        override fun NullableVarCharHolder.valueIsNotSet() {
            this.isSet shouldBe 0
        }
    }

    abstract class NullableBitOutput<F : SimpleFunction> : GeometryRelationFunSpec<F, NullableBitHolder>() {
@@ -91,8 +92,9 @@ abstract class GeometryRelationFunSpec<F : SimpleFunction, O : ValueHolder> : Fu
            }
        }

        override fun NullableBitHolder.valueIsNotSet() =
        override fun NullableBitHolder.valueIsNotSet() {
            this.isSet shouldBe 0
        }
    }

    protected fun testNullGeometryRelation(
protected fun testNullGeometryRelation(
Expand Down