Commit
added connected components algo for spark
huy committed Jan 29, 2016
1 parent e2e65b2 commit 8b15acd
Showing 2 changed files with 21 additions and 4 deletions.
14 changes: 12 additions & 2 deletions connected_components_spark/README.txt
@@ -1,4 +1,4 @@
-This a distributed connected components (union-find) implementation for
+This is a distributed connected components (union-find) implementation for
pyspark based on the following paper.

http://www.cse.unr.edu/~hkardes/pdfs/ccf.pdf
@@ -21,5 +21,15 @@ Usage:

vertices_to_roots = ccf.ccf_run(sc, edges, max_iters=5)
root_to_children = ccf.ccf_group_by_root(vertices_to_roots)
-root_to_children.take(10)
+print root_to_children.take(10)
+
+Expected output:
+
+[
+("a", ["a", "b", "c", "g"]),
+("d", ["e", "f"])
+]



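For intuition about what the usage snippet computes, here is a single-machine union-find sketch in plain Python. It is not the ccf module from this commit and does not use Spark. The function name connected_components and the edge list are hypothetical, and choosing the smallest label as each component's root is an assumption made for illustration, so the output shape may differ from what ccf.ccf_group_by_root returns.

```python
from collections import defaultdict


def connected_components(edges):
    """Group vertices by the smallest vertex label in their component.

    Single-machine illustration only; names and root choice are assumptions.
    """
    parent = {}

    def find(v):
        # Register unseen vertices as their own root.
        parent.setdefault(v, v)
        # Walk up to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru != rv:
            # Keep the smaller label as the root of the merged component.
            lo, hi = (ru, rv) if ru < rv else (rv, ru)
            parent[hi] = lo

    for u, v in edges:
        union(u, v)

    # Collect every vertex, including the root itself, under its root.
    groups = defaultdict(list)
    for v in sorted(parent):
        groups[find(v)].append(v)
    return dict(groups)


# Hypothetical edge list, not taken from the README.
example_edges = [("a", "b"), ("b", "c"), ("c", "g"), ("d", "e"), ("e", "f")]
components = connected_components(example_edges)
print(components)
```

The distributed version in the referenced paper reaches the same grouping through repeated map/reduce passes rather than a shared parent table, which is why the usage above takes a max_iters argument.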

11 changes: 9 additions & 2 deletions connected_components_spark/ccf_spark.py
@@ -1,5 +1,5 @@
"""
-This a distributed connected components (union find) implementation for
+This is a distributed connected components (union-find) implementation for
pyspark based on the following paper.
http://www.cse.unr.edu/~hkardes/pdfs/ccf.pdf
@@ -22,7 +22,14 @@
vertices_to_roots = ccf.ccf_run(sc, edges, max_iters=5)
root_to_children = ccf.ccf_group_by_root(vertices_to_roots)
-root_to_children.take(10)
+print root_to_children.take(10)
+Expected output:
+[
+("a", ["a", "b", "c", "g"]),
+("d", ["e", "f"])
+]
"""
