I'm using Event Hubs with Spark Structured Streaming to process event data. Everything seems to work well, but when I quit the Spark process (or it crashes) I expect Spark to pick up the stream where it left off (exactly-once processing). Instead, when I restart the Spark script, it starts reprocessing event data that was already processed.
In my Spark script I use a checkpoint to save state, but it seems that the state is not up to date...
Am I missing some configuration or option for exactly-once processing?
I'm using the Event Hubs connector 2.3.6 with Spark 2.4.
The snippet below shows the code where I initialize the Event Hubs connector:
```scala
val connectionString = ConnectionStringBuilder()
  .setNamespaceName(NAME)
  .setEventHubName(HUB)
  .setSasKeyName(KEYNAME)
  .setSasKey(KEY)
  .build

val ehConf = EventHubsConf(connectionString)

val ibiDS = spark
  .readStream
  .format("eventhubs")
  .options(ehConf.toMap)
  .load()
  .select($"body".cast("string"))
  .map(row => {
    ...
  })
```
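For context, the checkpoint that drives recovery is configured on the write side of the query. A minimal sketch of how that typically looks (the sink format, paths, and query handling here are illustrative placeholders, not from the original snippet):

```scala
// Hypothetical write side: "checkpointLocation" is the option Structured
// Streaming uses to store offsets and commit logs, and it is what lets a
// restarted query resume instead of starting over. Paths are placeholders.
val query = ibiDS.writeStream
  .format("parquet")                                     // illustrative sink
  .option("checkpointLocation", "/tmp/checkpoints/ibi")  // must be a durable, stable path
  .option("path", "/tmp/output/ibi")
  .start()

query.awaitTermination()
```

The checkpoint directory must stay the same across restarts; pointing a restarted query at a fresh or emptied directory makes it begin again from the configured starting position.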
When a streaming job restarts from a checkpoint, it first checks whether the latest batch has been committed. If it hasn't, it re-executes that batch before moving forward, which is why you can see already-processed events again after a restart. In other words, the checkpoint gives you at-least-once replay; end-to-end exactly-once output additionally requires the sink to be idempotent (or transactional), so that re-executing a batch has no visible effect.
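The replay-plus-idempotent-sink pattern can be sketched without Spark at all. The sink below (all names are illustrative, not a Spark API) tracks the last committed batch id, so a batch replayed after a restart is detected and skipped — the same idea users commonly apply inside a `foreachBatch` sink:

```scala
import scala.collection.mutable

// Sketch: after a restart, Structured Streaming re-executes the last
// uncommitted batch (at-least-once). An idempotent sink turns that into
// effectively exactly-once output by remembering which batchId it has
// already committed. Everything here is an illustrative stand-in.
class IdempotentSink {
  private var lastCommittedBatchId: Long = -1L
  val written = mutable.Buffer[String]()

  // Returns true if the batch was written, false if it was a duplicate
  // replay of an already-committed batchId.
  def writeBatch(batchId: Long, rows: Seq[String]): Boolean = {
    if (batchId <= lastCommittedBatchId) {
      false // replayed batch after restart: skip, output stays unchanged
    } else {
      written ++= rows
      lastCommittedBatchId = batchId
      true
    }
  }
}
```

In a real job the committed batch id would be stored durably (e.g. alongside the data, in one transaction), so the check survives process restarts.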