Contains spark dataframe solutions of leetcode questions

cM2908/leetcode-spark

Spark Solutions + Leetcode SQL Questions

Want to practice & solve some complex questions using Spark?

  • LeetCode's SQL questions make good practice material. Read on to see how to run Spark against them.
  • You can execute Spark DataFrame, Dataset, SQL, or RDD code against the LeetCode SQL questions.

Problem statements of all questions, including LeetCode premium questions:

Repository Contains :

  • Spark Dataframe/Dataset/SQL/RDD Solutions on Leetcode Questions
  • PostgreSQL Dump File (leetcodedb.sql)

Get Started :

  • This guide assumes that you already have PostgreSQL and Apache Spark installed.
  • Load the dump file into your local PostgreSQL setup.
  • The Leetcode-Sql repository contains all the information needed to load the PostgreSQL dump file (which holds the tables for all LeetCode SQL questions) into your local PostgreSQL setup.

Integrate Apache Spark with PostgreSQL database :

  • Download the PostgreSQL JDBC Connector JAR (choose the version that matches your PostgreSQL setup).

  • Add the PostgreSQL JDBC Connector JAR to the "spark/jars" directory. (This puts the JAR on the classpath when spark-shell starts.)

  • Start Spark-Shell

user@my-machine:~$ spark-shell
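  • Check the classpath in spark-shell (optional; the cast to URLClassLoader below works on Java 8 only, because Java 9 and later use a different system class loader)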
scala> import java.lang.ClassLoader
scala> val cl = ClassLoader.getSystemClassLoader
scala> cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)
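On newer JVMs, a simpler check is to try loading the driver class by name. The sketch below assumes the driver class name `org.postgresql.Driver` from the connection code; `isOnClasspath` is a hypothetical helper, not part of Spark.

```scala
import scala.util.Try

// Hypothetical helper: true if the named class can be loaded,
// false if Class.forName throws ClassNotFoundException.
def isOnClasspath(className: String): Boolean =
  Try(Class.forName(className)).isSuccess

// Prints true only when the PostgreSQL JDBC JAR is visible to spark-shell.
println(s"PostgreSQL driver available: ${isOnClasspath("org.postgresql.Driver")}")
```

If this prints `false`, the JAR is probably not in the "spark/jars" directory that this spark-shell was started from.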
  • Connection code (replace the placeholders with your credentials)
scala> val url = "jdbc:postgresql://localhost:5432/postgres?user=<your-username>&password=<your-password>"
scala> import java.util.Properties
scala> val connectionProperties = new Properties()
scala> connectionProperties.setProperty("driver", "org.postgresql.Driver")
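As an alternative sketch, the same connection can be described through `DataFrameReader` options, which keeps the credentials out of the URL string. This assumes the credentials are exported as the environment variables `PGUSER` and `PGPASSWORD` (an assumption, not something the repository requires); `jdbcOptions` and `readTable` are hypothetical helpers. Run it inside spark-shell (for example with `:paste`), where `spark` is already defined.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper: gathers the Spark JDBC options in one place.
def jdbcOptions(table: String, user: String, password: String): Map[String, String] = Map(
  "url"      -> "jdbc:postgresql://localhost:5432/postgres",
  "dbtable"  -> table,
  "user"     -> user,
  "password" -> password,
  "driver"   -> "org.postgresql.Driver"
)

// Hypothetical helper: reads one table, taking credentials from the
// environment (PGUSER / PGPASSWORD are assumed names) with fallbacks.
def readTable(table: String): DataFrame = {
  val user     = sys.env.getOrElse("PGUSER", "postgres")
  val password = sys.env.getOrElse("PGPASSWORD", "")
  spark.read.format("jdbc").options(jdbcOptions(table, user, password)).load()
}
```

With these in place, `readTable("employee_181")` would return the table as a DataFrame, provided the database is reachable.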
  • Example
scala> val query = "(SELECT * FROM employee_181) AS employee"
scala> val employeeDF = spark.read.jdbc(url, query, connectionProperties)
scala> val joinedDF = employeeDF.as("emp").
     |   join(employeeDF.as("mgr"), $"emp.manager_id" === $"mgr.id" && $"emp.salary" > $"mgr.salary", "inner").
     |   select($"emp.name")
scala> joinedDF.show()
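The same self-join can also be written in Spark SQL by registering the DataFrame as a temporary view. The sketch below uses a few made-up in-memory rows in place of the `employee_181` table so it runs without a database; the column names follow the example above, and the row values are assumptions for illustration only. Run it inside spark-shell, where `spark` is already defined.

```scala
import spark.implicits._

// Hypothetical sample rows standing in for employee_181
// (id, name, salary, manager_id); manager_id is empty for top-level staff.
val sampleDF = Seq(
  (1, "Joe",   70000, Option(3)),
  (2, "Henry", 80000, Option(4)),
  (3, "Sam",   60000, Option.empty[Int]),
  (4, "Max",   90000, Option.empty[Int])
).toDF("id", "name", "salary", "manager_id")

// Expose the DataFrame to Spark SQL under the name "employee".
sampleDF.createOrReplaceTempView("employee")

// Employees who earn more than their own manager.
val resultDF = spark.sql("""
  SELECT emp.name
  FROM employee emp
  JOIN employee mgr ON emp.manager_id = mgr.id
  WHERE emp.salary > mgr.salary
""")

resultDF.show()
```

To run the query against the real table instead, call `createOrReplaceTempView("employee")` on the DataFrame loaded over JDBC.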

Want to Contribute :

  • Contribute a solution to any question in any or all of these dialects: Spark DataFrame, Spark Dataset, Spark RDD, Spark SQL.
  • Fork the repository.
  • Create a solution file with a proper name (e.g. "175. Combine Two Tables (Easy).txt").
  • Create a pull request.
  • After review, I'll merge it into the main repository.
  • Congratulations, you've contributed something to the data community.
