Integration of the HDRF algorithm within PowerGraph. #177

Merged: 2 commits, Jul 29, 2015

fabiopetroni
Contributor

HDRF is a novel stream-based graph partitioning algorithm that improves partitioning quality over existing solutions.
In particular, HDRF provides the smallest average replication factor with close to optimal load balance. Together, these two characteristics allow HDRF to significantly reduce the time needed to perform computation on graphs and make it the best choice for partitioning graph data.

The HDRF algorithm is extensively described in the following publication:

F. Petroni, L. Querzoni, K. Daudjee, S. Kamali and G. Iacoboni:
"HDRF: Stream-Based Partitioning for Power-Law Graphs".
CIKM, 2015.
http://www.dis.uniroma1.it/~midlab/articoli/PQDKI15CIKM.pdf
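
For readers who want a feel for the idea before opening the paper, here is a rough standalone sketch of the greedy scoring as I read it from the publication: each incoming edge goes to the partition with the highest score, where the score rewards partitions that already hold a replica of either endpoint (favouring the lower-degree endpoint, so that high-degree vertices are the ones that get replicated) plus a balance term. This is not the code in this patch. The class name `HdrfSketch`, its methods, and the defaults `lambda = 1` and `epsilon = 1` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Illustrative sketch only: all names and defaults here are hypothetical
// and are not taken from the PowerGraph patch.
class HdrfSketch {
 public:
  explicit HdrfSketch(std::size_t num_partitions, double lambda = 1.0,
                      double epsilon = 1.0)
      : load_(num_partitions, 0), lambda_(lambda), epsilon_(epsilon) {
    assert(num_partitions > 0);
  }

  // Places edge (u, v) on the highest-scoring partition and returns its index.
  std::size_t assign_edge(long u, long v) {
    // Partial degrees: number of edges seen so far in the stream.
    const double du = static_cast<double>(++degree_[u]);
    const double dv = static_cast<double>(++degree_[v]);
    const double theta_u = du / (du + dv);  // normalized degree of u
    const double theta_v = 1.0 - theta_u;   // normalized degree of v

    std::size_t max_load = load_[0], min_load = load_[0];
    for (std::size_t l : load_) {
      if (l > max_load) max_load = l;
      if (l < min_load) min_load = l;
    }

    auto& replicas_u = replicas_[u];
    auto& replicas_v = replicas_[v];
    std::size_t best = 0;
    double best_score = -std::numeric_limits<double>::infinity();
    for (std::size_t p = 0; p < load_.size(); ++p) {
      // Replication term: larger when p already holds the lower-degree endpoint.
      double rep = 0.0;
      if (replicas_u.count(p)) rep += 1.0 + (1.0 - theta_u);
      if (replicas_v.count(p)) rep += 1.0 + (1.0 - theta_v);
      // Balance term: larger for lightly loaded partitions.
      const double bal = lambda_ * static_cast<double>(max_load - load_[p]) /
                         (epsilon_ + static_cast<double>(max_load - min_load));
      if (rep + bal > best_score) {  // ties go to the lowest partition index
        best_score = rep + bal;
        best = p;
      }
    }
    replicas_u.insert(best);
    replicas_v.insert(best);
    ++load_[best];
    return best;
  }

  // Average number of partitions each seen vertex is replicated on.
  double replication_factor() const {
    if (replicas_.empty()) return 0.0;
    std::size_t total = 0;
    for (const auto& kv : replicas_) total += kv.second.size();
    return static_cast<double>(total) / static_cast<double>(replicas_.size());
  }

 private:
  std::vector<std::size_t> load_;  // edges per partition
  double lambda_;                  // weight of the balance term
  double epsilon_;                 // avoids division by zero
  std::unordered_map<long, std::size_t> degree_;
  std::unordered_map<long, std::unordered_set<std::size_t>> replicas_;
};
```

Feeding edges one at a time through `assign_edge` and then reading `replication_factor()` gives the average replication factor discussed above; raising `lambda` in this sketch trades replication quality for tighter load balance.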

@dbickson
Contributor

Hi Fabio,
The patch looks good, but I did not find the documentation string where you allow the new partitioning method in the engine. Can you please add it as well?

@fabiopetroni
Contributor Author

On 7/29/15 5:54 PM, Danny Bickson wrote:

Hi Fabio,
The patch looks good, but I did not find the documentation string where
you allow the new partitioning method in the engine. Can you please add
it as well?



Hi Danny,

the string is in the src/graphlab/graph/distributed_graph.hpp file:

...} else if (method == "hdrf") {
  if (rpc.procid() == 0)
    logstream(LOG_EMPH) << "Use hdrf oblivious ingress, usehash: " << usehash
                        << ", userecent: " << userecent << std::endl;
  ingress_ptr = new distributed_hdrf_ingress<VertexData, EdgeData>(rpc.dc(), *this, usehash, userecent);
} else...

HDRF can be selected as the input partitioner by specifying the following option:

--graph_opts ingress=hdrf
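
To make the link between that option and the `method == "hdrf"` branch concrete, here is a small standalone sketch of how a `key=value` option string could be turned into a method name. It is only an illustration: `parse_graph_opts` and `wants_hdrf` are hypothetical helpers, not PowerGraph's actual option parser, and the comma separator between entries is an assumption.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: splits "k1=v1,k2=v2" into a key/value map.
// The comma-separated format is an assumption made for this sketch.
std::map<std::string, std::string> parse_graph_opts(const std::string& opts) {
  std::map<std::string, std::string> result;
  std::istringstream stream(opts);
  std::string item;
  while (std::getline(stream, item, ',')) {
    const std::string::size_type eq = item.find('=');
    if (eq == std::string::npos) continue;  // skip entries without '='
    result[item.substr(0, eq)] = item.substr(eq + 1);
  }
  return result;
}

// Hypothetical helper: true when the options request the HDRF ingress method.
bool wants_hdrf(const std::map<std::string, std::string>& opts) {
  const auto it = opts.find("ingress");
  return it != opts.end() && it->second == "hdrf";
}
```

With this sketch, `wants_hdrf(parse_graph_opts("ingress=hdrf"))` is true, which corresponds to the branch quoted above being taken.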

Best,
Fabio

dbickson added a commit that referenced this pull request Jul 29, 2015
Integration of the HDRF algorithm within PowerGraph.
@dbickson dbickson merged commit 1f157b3 into jegonzal:master Jul 29, 2015
@dbickson
Contributor

@fabiopetroni
Contributor Author

On 7/29/15 7:25 PM, Danny Bickson wrote:

To clarify, need to add additional documentation here:
https://github.com/dato-code/PowerGraph/blob/18c21033d77208cedac7661e4fcf35698abb4021/src/graphlab/options/graph_help.txt



Ok, thanks.
I created a new pull request with the updated documentation.
:)
