Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RpcEndpointNotFoundException: Cannot find endpoint: spark://PartitionPerformanceReceiver #538

Closed
marcusbb opened this issue Sep 21, 2020 · 7 comments
Assignees
Labels
known issue This is a verified bug in the codebase.

Comments

@marcusbb
Copy link

Bug Report:

com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.17

When run from spark-submit OR databricks notebook, batch query:

val connectionString = ConnectionStringBuilder(s"Endpoint=sb://audit-trunk-eventhub-namespace.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxyyy").setEventHubName("audit-trunk-eventhub").build

val ehConf = EventHubsConf(connectionString).setConsumerGroup("validate_sept21").setMaxEventsPerTrigger(60 * 1200000).setStartingPosition(EventPosition.fromStartOfStream).setEndingPosition(EventPosition.fromEnqueuedTime(Instant.now))

val df = spark.read.format("eventhubs").options(ehConf.toMap).load()
val body = df.select($"body" cast "string")
body.count() //or any query

Works in com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.14.1

Stack Trace in spark submit (spark 2.4.0) (same in databricks runtime 6.5):

[Stage 0:> (0 + 4) / 32]20/09/21 17:17:26 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ExceptionInInitializerError
at org.apache.spark.eventhubs.rdd.EventHubsRDD.compute(EventHubsRDD.scala:118)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:32)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$.(CachedEventHubsReceiver.scala:334)
at org.apache.spark.eventhubs.client.CachedEventHubsReceiver$.(CachedEventHubsReceiver.scala)
... 33 more
Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: Cannot find endpoint: spark://PartitionPerformanceReceiver@10.0.2.15:42395
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$asyncSetupEndpointRefByURI$1.apply(NettyRpcEnv.scala:142)
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$asyncSetupEndpointRefByURI$1.apply(NettyRpcEnv.scala:138)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.spark_project.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at scala.concurrent.Promise$class.complete(Promise.scala:55)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:157)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:326)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:326)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.spark_project.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at scala.concurrent.Promise$class.complete(Promise.scala:55)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:157)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at scala.concurrent.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:63)
at scala.concurrent.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:78)
at scala.concurrent.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:55)
at scala.concurrent.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:55)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at scala.concurrent.BatchingExecutor$Batch.run(BatchingExecutor.scala:54)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:106)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at scala.concurrent.Promise$class.trySuccess(Promise.scala:94)
at scala.concurrent.impl.Promise$DefaultPromise.trySuccess(Promise.scala:157)
at org.apache.spark.rpc.netty.NettyRpcEnv.org$apache$spark$rpc$netty$NettyRpcEnv$$onSuccess$1(NettyRpcEnv.scala:217)
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$ask$1.apply(NettyRpcEnv.scala:226)
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$ask$1.apply(NettyRpcEnv.scala:225)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.spark_project.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at scala.concurrent.Promise$class.complete(Promise.scala:55)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:157)
at scala.concurrent.Promise$class.success(Promise.scala:86)
at scala.concurrent.impl.Promise$DefaultPromise.success(Promise.scala:157)
at org.apache.spark.rpc.netty.LocalNettyRpcCallContext.send(NettyRpcCallContext.scala:50)
at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
at org.apache.spark.rpc.netty.RpcEndpointVerifier$$anonfun$receiveAndReply$1.applyOrElse(RpcEndpointVerifier.scala:31)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:105)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
... 3 more

@nyaghma nyaghma self-assigned this Sep 22, 2020
@nyaghma
Copy link
Contributor

nyaghma commented Sep 22, 2020

@marcusbb Thanks for letting us know about this issue. This issue happened because the RPC endpoint has not been opened on the driver side for PartitionPerformanceReceiver, but the CachedEventHubsReceiver was expecting that to be open. PR #542 fixes the issue.

@marcusbb
Copy link
Author

thanks @nyaghma - is there an eta on this fix, or expected release?

@nyaghma
Copy link
Contributor

nyaghma commented Sep 23, 2020

@marcusbb We'll include the fix in the next release within the next few weeks. Meanwhile, you can use version 2.3.16 which doesn't have this issue.

@y1zh
Copy link

y1zh commented Nov 12, 2020

Hi, @nyaghma. Has this issue been completely fixed? I built a private package from the latest master branch (including PR #542) and used it to createDirectStream, but still ran into this problem.

@nyaghma
Copy link
Contributor

nyaghma commented Nov 16, 2020

The change is included in version 2.3.18.

@nyaghma nyaghma closed this as completed Nov 16, 2020
@galexiou
Copy link

I use version 2.3.21 but I have this issue! Is there any update on this?

@galexiou
Copy link

@nyaghma I don't think that this issue is fixed. It is still happening in version 2.3.21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
known issue This is a verified bug in the codebase.
Projects
None yet
Development

No branches or pull requests

4 participants