Using KafkaUtils doesn't work #38

Open
dasch opened this issue Nov 13, 2015 · 1 comment

Comments

@dasch
dasch commented Nov 13, 2015

I'm getting this error:

>>> directKafkaStream = KafkaUtils.createDirectStream(ssc, ["help_center.activity.events"], {"metadata.broker.list": "kafka.service.consul:9092"})

________________________________________________________________________________________________

  Spark Streaming's Kafka libraries not found in class path. Try one of the following.

  1. Include the Kafka library and its dependencies with in the
     spark-submit command as

     $ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka:1.5.1 ...

  2. Download the JAR of the artifact from Maven Central http://search.maven.org/,
     Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-assembly, Version = 1.5.1.
     Then, include the jar in the spark-submit command as

     $ bin/spark-submit --jars <spark-streaming-kafka-assembly.jar> ...

________________________________________________________________________________________________


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/spark/python/pyspark/streaming/kafka.py", line 130, in createDirectStream
    raise e
py4j.protocol.Py4JJavaError: An error occurred while calling o21.loadClass.
: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConne

Would there be a downside to including the necessary libraries in the image?

@Qbin
Qbin commented Feb 22, 2017

I got a similar error. The fix was to get hold of spark-streaming-kafka-assembly.jar.
I used Eclipse to create a Maven project and added the following to pom.xml:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-assembly_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
</dependencies>

Maven then downloaded spark-streaming-kafka-assembly.jar for me, and I ran
bin/spark-submit --jars <spark-streaming-kafka-assembly.jar> file.py
After that everything worked.
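
If you launch the job from a Python wrapper rather than typing the command by hand, the same invocation can be assembled as an argument list. This is only a rough sketch: the helper name `build_submit_command`, the jar path, and the default `bin/spark-submit` location are placeholders I made up for illustration, so adjust them to your install.

```python
import shlex


def build_submit_command(jar_path, script, spark_submit="bin/spark-submit"):
    """Return the argument list for: spark-submit --jars <jar> <script>.

    The helper and its defaults are hypothetical; only the --jars flag
    comes from the spark-submit command shown above.
    """
    if not jar_path.endswith(".jar"):
        raise ValueError("expected a path to a .jar file: %r" % jar_path)
    return [spark_submit, "--jars", jar_path, script]


if __name__ == "__main__":
    # Placeholder path: point this at wherever Maven put the assembly jar.
    cmd = build_submit_command(
        "/path/to/spark-streaming-kafka-assembly_2.10-1.6.1.jar", "file.py"
    )
    # Print a shell-safe rendering of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be handed to `subprocess.check_call(cmd)` to actually start the job. Building a list rather than a single string avoids quoting problems when the jar path contains spaces.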
