Transform specs are ignored if dimension auto detection is used #7952

Open
vogievetsky opened this issue Jun 24, 2019 · 0 comments

The dimension auto-detection feature does not pick up the new columns created by transforms.

Affected Version

All versions of Druid so far (up to 0.15.0)

Description

Say you have data:

{"a":"hello","b":"world"}
{"a":"where","c":"to go"}

In a file that lives at: /Users/vadim/Downloads/test-data.json

And you ingest it with:

{
  "dataSchema": {
    "dataSource": "Downloads",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "!!!_no_such_column_!!!",
          "missingValue": "2010-01-01T00:00:00Z"
        },
        "dimensionsSpec": {}
      }
    },
    "metricsSpec": [
      {
        "name": "count",
        "type": "count"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": "HOUR",
      "rollup": true,
      "intervals": null
    },
    "transformSpec": {
      "filter": null,
      "transforms": [
        {
          "type": "expression",
          "name": "a_prime",
          "expression": "concat(\"a\",'_prime')"
        }
      ]
    }
  },
  "ioConfig": {
    "type": "index_parallel",
    "firehose": {
      "type": "local",
      "baseDir": "/Users/vadim/Downloads",
      "filter": "test-data.json"
    },
    "appendToExisting": false
  },
  "tuningConfig": {
    "type": "index_parallel",
    "maxRowsPerSegment": null,
    "maxRowsInMemory": 1000000,
    "maxBytesInMemory": 0,
    "maxTotalRows": null,
    "numShards": null,
    "indexSpec": {
      "bitmap": {
        "type": "concise"
      },
      "dimensionCompression": "lz4",
      "metricCompression": "lz4",
      "longEncoding": "longs"
    },
    "maxPendingPersists": 0,
    "forceGuaranteedRollup": false,
    "reportParseExceptions": false,
    "pushTimeout": 0,
    "segmentWriteOutMediumFactory": null,
    "maxNumSubTasks": 1,
    "maxRetry": 3,
    "taskStatusCheckPeriodMs": 1000,
    "chatHandlerTimeout": "PT10S",
    "chatHandlerNumRetries": 5,
    "logParseExceptions": false,
    "maxParseExceptions": 2147483647,
    "maxSavedParseExceptions": 0,
    "partitionDimensions": [],
    "buildV9Directly": true
  },
  "type": "index_parallel"
}

Notice how I am trying to create an a_prime column with a transform spec.

The job succeeds, but when you query the data, you see that there is no a_prime column.

It would be great (and would make a ton more sense) if the transforms added themselves to the column list coming from the file.
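
Until that happens, a possible workaround is to list the dimensions explicitly instead of leaving dimensionsSpec empty. The fragment below is only a sketch: it assumes that a transform output is stored once it is named in the dimension list, and it has not been checked against every affected version. It would replace the empty "dimensionsSpec": {} in the parseSpec above.

```json
{
  "dimensionsSpec": {
    "dimensions": ["a", "b", "c", "a_prime"]
  }
}
```

The trade-off is that auto detection is turned off, so every input column (here a, b, and c) has to be listed by hand along with a_prime.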

vogievetsky changed the title from "Transform specs are ignored if dimensions auto detection is used" to "Transform specs are ignored if dimension auto detection is used" on Jun 24, 2019
xvrl added a commit to xvrl/druid that referenced this issue Mar 20, 2020