StreamingQueryException: Input byte array has wrong 4-byte ending unit #548

Closed
trincaog opened this issue Oct 22, 2020 · 2 comments

@trincaog

When trying to read Azure Event Hubs messages with Spark/Python, I always get the exception "StreamingQueryException: Input byte array has wrong 4-byte ending unit".

Sample code:

conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://XXXXXXXXX.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXX=;EntityPath=XXXXXX"
                                      
read_df  = spark.readStream.format("eventhubs").options(**conf).load()
stream = read_df.writeStream.format("console").start()
stream.awaitTermination()

Error:
StreamingQueryException: Input byte array has wrong 4-byte ending unit
=== Streaming Query ===
Identifier: [id = dc06f9ed-73b3-4198-9f68-4f28f78132be, runId = c34f5655-e2d0-43fe-b101-8b58e5b70a14]
Current Committed Offsets: {}
Current Available Offsets: {}

Current State: INITIALIZING
Thread State: RUNNABLE

Versions:
Spark 3.0.1 (Databricks 7.3LTS)
com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.17

@nyaghma
Contributor

nyaghma commented Oct 26, 2020

@trincaog
As mentioned here, for version 2.3.15 and above, the connection string in the configuration dictionary must be encrypted.
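
For anyone landing here with the same error, a minimal sketch of the change to the sample code above. It assumes the connector is attached to the cluster and that the string is encrypted with the connector's `EventHubsUtils.encrypt` helper, reached through the JVM gateway. The helper names `build_eventhubs_conf` and `read_eventhubs_stream` are hypothetical and only used for illustration. The error message looks like a Base64 decoding failure, which would fit the connector trying to decrypt a plain-text string, but that is an inference rather than something confirmed in this thread.

```python
from typing import Callable, Dict


def build_eventhubs_conf(connection_string: str,
                         encrypt: Callable[[str], str]) -> Dict[str, str]:
    """Build the options dict with the connection string passed through `encrypt`.

    `encrypt` is injected so the helper can be exercised without a cluster.
    """
    return {"eventhubs.connectionString": encrypt(connection_string)}


def read_eventhubs_stream(spark, connection_string: str):
    """Open the stream with an encrypted connection string.

    Assumes `spark` is an active SparkSession on a cluster where the
    azure-eventhubs-spark connector (2.3.15 or later) is installed.
    """
    # The connector's encrypt helper, called on the JVM side.
    encrypt = spark.sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt
    conf = build_eventhubs_conf(connection_string, encrypt)
    return spark.readStream.format("eventhubs").options(**conf).load()
```

In a notebook this would be used as `read_df = read_eventhubs_stream(spark, connection_string)`, followed by the same `writeStream` calls as in the original sample.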

@trincaog
Author

Thank you!
