[10383] Support decimal operations in no-precision-loss mode (10383)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
oap-project · Sep 10, 2024 · d9857e2
1 parent f770f80
commit d9857e2

Showing 13 changed files with 356 additions and 58 deletions.
diff --git a/velox/docs/functions/spark/config.rst b/velox/docs/functions/spark/config.rst
@@ -0,0 +1,21 @@
+================================
+SparkRegistration Configuration
+================================
+
+struct SparkRegistrationConfig
+------------------------------
+.. list-table::
+   :widths: 20 10 10 70
+   :header-rows: 1
+
+   * - Property Name
+     - Type
+     - Default Value
+     - Description
+   * - allowPrecisionLoss
+     - bool
+     - true
+     - When true, the result type of an arithmetic operation is established according to Hive behavior and the SQL ANSI 2011 specification, i.e.,
+       the decimal part of the result is rounded if an exact representation is not
+       possible. Otherwise, NULL is returned when the actual result cannot be represented with the calculated decimal type. Currently, add,
+       subtract, multiply and divide operations are supported.
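As a rough illustration of what the flag changes, the sketch below models the config as a Python dataclass and applies the two documented ways of fitting a raw result type ``(p, s)`` into 38 digits: the adjustment formula ``scale = max(38 - (p - s), min(s, 6))`` from decimal.rst when precision loss is allowed, and a plain cap of ``p`` and ``s`` at 38 otherwise. The Python class and the function ``finalize_result_type`` are hypothetical stand-ins for illustration only; they are not part of the Velox C++ API, and the sketch covers only the type computation, not the NULL or rounding behavior at evaluation time.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_DIGITS = 38          # upper bound for precision and scale in the docs
MIN_ADJUSTED_SCALE = 6   # the constant 6 in the documented adjustment formula


@dataclass(frozen=True)
class SparkRegistrationConfig:
    """Hypothetical Python stand-in for the documented struct."""
    allow_precision_loss: bool = True  # documented default value is true


def finalize_result_type(p: int, s: int,
                         config: SparkRegistrationConfig) -> Tuple[int, int]:
    """Fit a raw (precision, scale) pair into at most 38 digits."""
    if config.allow_precision_loss:
        if p <= MAX_DIGITS:
            return p, s  # nothing to adjust
        # Keep the integer digits, shrink the scale (but not below min(s, 6)).
        scale = max(MAX_DIGITS - (p - s), min(s, MIN_ADJUSTED_SCALE))
        return MAX_DIGITS, scale
    # Precision loss not allowed: cap p and s independently at 38.
    return min(p, MAX_DIGITS), min(s, MAX_DIGITS)


if __name__ == "__main__":
    raw = (40, 10)  # an illustrative raw result type that exceeds 38 digits
    print(finalize_result_type(*raw, SparkRegistrationConfig(True)))
    print(finalize_result_type(*raw, SparkRegistrationConfig(False)))
```

With precision loss allowed, the scale is reduced so the integer digits still fit; with it disallowed, the scale is preserved and an overflowing value is handled at evaluation time instead.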
diff --git a/velox/docs/functions/spark/decimal.rst b/velox/docs/functions/spark/decimal.rst
@@ -33,8 +33,11 @@ Division
     p = p1 - s1 + s2 + max(6, s1 + p2 + 1)
     s = max(6, s1 + p2 + 1)
 
+Decimal Precision and Scale Adjustment
+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+
 For above arithmetic operators, when the precision of result exceeds 38,
-caps p at 38 and reduces the scale, in order to prevent the truncation of
+caps p at 38 and reduces the scale when allowing precision loss, in order to prevent the truncation of
 the integer part of the decimals. Below formula illustrates how the result
 precision and scale are adjusted.
 
@@ -43,6 +46,26 @@ precision and scale are adjusted.
     precision = 38
     scale = max(38 - (p - s), min(s, 6))
 
+When precision loss is not allowed, p and s are each capped at 38.
+For decimal addition, subtraction, and multiplication, the precision and scale computation logic is unchanged,
+but for decimal division, it differs as follows:
+::
+
+    wholeDigits = min(38, p1 - s1 + s2)
+    fractionalDigits = min(38, max(6, s1 + p2 + 1))
+
+If ``wholeDigits + fractionalDigits`` is more than 38:
+::
+
+    p = 38
+    s = fractionalDigits - (wholeDigits + fractionalDigits - 38) / 2 - 1
+
+Otherwise:
+::
+
+    p = wholeDigits + fractionalDigits
+    s = fractionalDigits
+
 Users experience runtime errors when the actual result cannot be represented
 with the calculated decimal type.
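The division rule for the no-precision-loss mode can be sketched in Python as below. This is a minimal transcription of the documented formulas, not the Velox implementation; the function name ``divide_type_no_precision_loss`` is hypothetical, and the ``/ 2`` in the scale formula is assumed to be integer division.

```python
from typing import Tuple

MAX_DIGITS = 38  # upper bound for precision and scale


def divide_type_no_precision_loss(p1: int, s1: int,
                                  p2: int, s2: int) -> Tuple[int, int]:
    """Result (precision, scale) of decimal(p1, s1) / decimal(p2, s2)
    when precision loss is not allowed, per the formulas above."""
    whole_digits = min(MAX_DIGITS, p1 - s1 + s2)
    fractional_digits = min(MAX_DIGITS, max(6, s1 + p2 + 1))
    excess = whole_digits + fractional_digits - MAX_DIGITS
    if excess > 0:
        # Assumption: "/ 2" in the documented formula is integer division.
        return MAX_DIGITS, fractional_digits - excess // 2 - 1
    return whole_digits + fractional_digits, fractional_digits


if __name__ == "__main__":
    # decimal(38, 10) / decimal(38, 10): 38 + 38 digits exceed 38, so the
    # scale becomes 38 - 38 // 2 - 1 = 18.
    print(divide_type_no_precision_loss(38, 10, 38, 10))
    # decimal(10, 2) / decimal(5, 3): 11 + 8 digits fit, no reduction needed.
    print(divide_type_no_precision_loss(10, 2, 5, 3))
```

For comparison, the precision-loss formulas give ``decimal(38, 10) / decimal(38, 10)`` a raw type of ``(87, 49)``, which the adjustment rule reduces to ``(38, 6)``, so this mode keeps 18 fractional digits where the default keeps 6.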
 

diff --git a/velox/docs/spark_functions.rst b/velox/docs/spark_functions.rst
@@ -4,6 +4,8 @@ Spark Functions
 
 The semantics of Spark functions match Spark 3.5 with ANSI OFF.
 
+Spark function registration can be configured through :doc:`struct SparkRegistrationConfig <functions/spark/config>`.
+
 .. toctree::
     :maxdepth: 1