An Event Hubs customer is using 30 partitions (one partition is heavily overloaded and handling 90% of the traffic). The customer is seeing their streaming job fail with an error observed previously (see below).
Ruidong and I believe the long processing time in Spark causes the AMQP connection to be dropped due to inactivity, and for some reason it is not re-instantiated. The job then fails with the error seen in the past in #454.
They were able to reproduce with client version com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.18.
Can we get a fix for this edge case?
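For context on the error itself: "ReactorDispatcher instance is closed" is the Java SDK refusing to schedule work on a reactor thread that has already shut down. The following minimal sketch (plain JDK, no Event Hubs dependency; the executor stands in for the dispatcher purely as an analogy) shows the same failure mode: once the scheduler is closed, further scheduling attempts are rejected outright instead of the connection being recreated.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;

public class ClosedDispatcherDemo {
    public static void main(String[] args) {
        // Stand-in for the SDK's ReactorDispatcher: a single-threaded scheduler.
        ScheduledExecutorService dispatcher = Executors.newSingleThreadScheduledExecutor();

        // Simulate the dispatcher being closed (e.g. after the idle AMQP
        // connection is torn down) before a receiver tries to recreate its link.
        dispatcher.shutdown();

        try {
            // Analogous to MessageReceiver.createLink scheduling work on the
            // reactor thread: the closed executor rejects the task outright.
            dispatcher.submit(() -> System.out.println("recreating receiver link"));
        } catch (RejectedExecutionException e) {
            System.out.println("rejected: dispatcher already closed");
        }
    }
}
```

This is why retrying the receive (as RetryUtils does in the trace below) cannot recover on its own: the retry keeps scheduling onto a dispatcher that will never accept work again.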
21/04/16 04:15:17 WARN TaskSetManager: Lost task 1.0 in stage 7.0 (TID 454, 10.139.64.133, executor 4): java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: ReactorDispatcher instance is closed.
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:661)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646)
at java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:686)
at java.util.concurrent.CompletableFuture.thenAcceptAsync(CompletableFuture.java:2019)
at com.microsoft.azure.eventhubs.impl.PartitionReceiverImpl.createInternalReceiver(PartitionReceiverImpl.java:110)
at com.microsoft.azure.eventhubs.impl.PartitionReceiverImpl.create(PartitionReceiverImpl.java:98)
at com.microsoft.azure.eventhubs.impl.EventHubClientImpl.createEpochReceiver(EventHubClientImpl.java:284)
at org.apache.spark.eventhubs.EventHubsUtils$.createReceiverInner(EventHubsUtils.scala:148)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$$anonfun$2.apply(CachedEventHubsReceiver.scala:91)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$$anonfun$2.apply(CachedEventHubsReceiver.scala:91)
at org.apache.spark.eventhubs.utils.RetryUtils$$anonfun$retryJava$1.apply(RetryUtils.scala:91)
at org.apache.spark.eventhubs.utils.RetryUtils$$anonfun$retryJava$1.apply(RetryUtils.scala:91)
at org.apache.spark.eventhubs.utils.RetryUtils$.org$apache$spark$eventhubs$utils$RetryUtils$$retryHelper$1(RetryUtils.scala:116)
at org.apache.spark.eventhubs.utils.RetryUtils$.retryScala(RetryUtils.scala:149)
at org.apache.spark.eventhubs.utils.RetryUtils$.retryJava(RetryUtils.scala:91)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver.createReceiver(CachedEventHubsReceiver.scala:90)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver.recreateReceiver(CachedEventHubsReceiver.scala:151)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver.org$apache$spark$eventhubs$client$CachedEventHubsReceiver$$awaitReceiveMessage(CachedEventHubsReceiver.scala:297)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$$anonfun$7.apply(CachedEventHubsReceiver.scala:240)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$$anonfun$7.apply(CachedEventHubsReceiver.scala:239)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.Range.foreach(Range.scala:160)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver.org$apache$spark$eventhubs$client$CachedEventHubsReceiver$$receive(CachedEventHubsReceiver.scala:239)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$.receive(CachedEventHubsReceiver.scala:356)
at org.apache.spark.eventhubs.rdd.EventHubsRDD.compute(EventHubsRDD.scala:120)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:353)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:317)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:353)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:317)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:353)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:317)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:353)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:317)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:353)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:317)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:140)
at org.apache.spark.scheduler.Task.run(Task.scala:113)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:537)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:543)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.RejectedExecutionException: ReactorDispatcher instance is closed.
at com.microsoft.azure.eventhubs.impl.ReactorDispatcher.throwIfSchedulerError(ReactorDispatcher.java:86)
at com.microsoft.azure.eventhubs.impl.ReactorDispatcher.invoke(ReactorDispatcher.java:62)
at com.microsoft.azure.eventhubs.impl.MessagingFactory.scheduleOnReactorThread(MessagingFactory.java:681)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.createLink(MessageReceiver.java:218)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.create(MessageReceiver.java:209)
at com.microsoft.azure.eventhubs.impl.PartitionReceiverImpl.createInternalReceiver(PartitionReceiverImpl.java:106)
... 47 more
This issue was caused by a race condition in the underlying Java SDK. It has been fixed in the new SDK version (3.3.0), which is used by Spark connector version 2.3.20 (PR #604).
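For anyone hitting this on an older connector, upgrading the dependency to 2.3.20 should pick up the fixed SDK. For example, with Maven (assuming Scala 2.12, as in the original report):

```xml
<dependency>
  <groupId>com.microsoft.azure</groupId>
  <artifactId>azure-eventhubs-spark_2.12</artifactId>
  <version>2.3.20</version>
</dependency>
```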