Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support MergeOperator in Java #2282

Open
sagar0 opened this issue May 12, 2017 · 30 comments · May be fixed by #4805
Open

Support MergeOperator in Java #2282

sagar0 opened this issue May 12, 2017 · 30 comments · May be fixed by #4805
Assignees
Labels

Comments

@sagar0
Copy link
Contributor

sagar0 commented May 12, 2017

MergeOperator functionality is not currently available in the Java API. StringAppendOperator is the only implementation that is exposed as part of Java API today. There has been considerable interest from the community recently to expose full functionality of MergeOperator so that they can take advantage of it in Java as well (Ex: #1988 , #2289)

cc: @adamretter

@adamretter
Copy link
Collaborator

adamretter commented Jun 23, 2017

@sagar0 So I have a work in progress for this. However, trying to figure out an efficient API is very hard as their is substantial cost for transferring the operand_list across the JNI boundary. I wonder if it is possible to publish an experimental API and get some feedback first?

@sagar0
Copy link
Contributor Author

sagar0 commented Jun 23, 2017

I do understand that this is a hard one and I +1 the idea of publishing an experimental API and seeking community feedback, if any.

@MPdaedalus
Copy link

is there any progress on this issue? Maybe something like map.put(key,oldValue,newValue):boolean could be implemented for atomic updates, I know it would not be as efficient but if there is another way to do atomic updates for RockDb in java without using counters i'm all ears.

@StefanRRichter
Copy link

@adamretter Also interested if there is any ETA for this? I also noticed that our workaround for some merge performance problems mentioned in #1988 (setting options.setMergeOperatorName("stringappendtest") instead of using "stringappend") is no longer working for RocksDB >= 5.8.X. I could not find an obvious change that introduced this regression, but it keeps me from updating our dependency to a newer version.

@jcalcote
Copy link

+1 on this issue. @adamretter please provide some feedback on the current status. I would love to test and/or help out if possible.

@GalRogozinski
Copy link

@adamretter do you have any news?

@adamretter
Copy link
Collaborator

@GalRogozinski My plan is to implement this in the later part of November and December.

@guillaumepitel
Copy link

That would be a very lovely feature to have. For now, we are avoiding Gets for updates by using key suffix + a background sweep on the base which does the merge + put + delete. That's nasty, but still has significantly better performance than pure get/put. The PR here #3432 seems to have proposed some interesting ideas, and it feels like apart from requiring cleanup & straightening the code, it was doing the job.

@guillaumepitel
Copy link

We've started working with @gfouquier on a Java-side MergeOperator based on PR #3432 by @publicocean0 however there seems to be quite a few redunduncy between the init.[cc|h] and the portal.[cc|h] and a lot of thing are not working anymore probably due to evolutions in the master.

I'm not sure why @GalRogozinski unassigned @adamretter from this issue, since he seems by far the most experienced in this matter.

@adamretter, can you confirm you're working on it ? We'd love to participate if you have a banch we could test.

@GalRogozinski
Copy link

It was a github bug. I just mentioned @adamretter and it unassigned him. I am not even supposed to have permissions here.

I just mentioned him again because maybe it will assign him back :trollface:

@mrambacher
Copy link
Contributor

Is the ask to have MergeOperators written in Java or to have custom MergeOperators written in C++ available to Java? I am working on something that I hope opens up more of the extensions written in C++ available to Java (the latter of the two).

@guillaumepitel
Copy link

Is the ask to have MergeOperators written in Java or to have custom MergeOperators written in C++ available to Java? I am working on something that I hope opens up more of the extensions written in C++ available to Java (the latter of the two).

I may speak only for me, but I understand that this issue is about the first case. And by the way, we have a working version here, based on #3432 from @publicocean0 - it's not yet ready for a proper PR though, and we're first testing it heavily.

@guillaumepitel
Copy link

@sagar0 So I have a work in progress for this. However, trying to figure out an efficient API is very hard as their is substantial cost for transferring the operand_list across the JNI boundary. I wonder if it is possible to publish an experimental API and get some feedback first?

@adamretter you seem to worry about JNI performance, so I think I could share some stuff I'm working on.

  1. attach/detach thread is extremely costly, at least when the C++ code is in a C++ non-JVM thread an calls back a Java method. We can avoid systematic attach/detach by tracking the threads destruction, taking care to attach the thread as a daemon thread to avoid blocking at JVM destruction (plus some exta tips). Performance gain is huge : my current version of MergeOperator take 10s with long-term attach/detach, and 110s with systematic attach/detach

  2. Accessing a Java byte[] from C++ can be done without copy using GetPrimitiveArrayCritical. It's not for use everywhere, but for writebatch.put/merge (kv_op) it can be done.

  3. Passing a byte[] from C++ to Java can be done using NewDirectByteBuffer (so a java.nio.ByteBuffer) which wraps a native memory address in a ByteBuffer access pattern. Offering a double API (byte[] or ByteBuffer) seems like a good idea for at least a handful of functionnalities : Comparators, MergeOperators, Filters, Iterators. For Java->C++ calls that would mean adding extra methods, like for instance RocksIterator.keyAsByteBuffer . For C++ -> Java CallBacks that would mean doubling hte Abstract Classes, like Comparator and NioComparator or ByteBufferComparator which would have a methode compare(Bytebuffer a, ByteBuffer b) instead of compare(byte[] a, byte[] b)

@sagar0
Copy link
Contributor Author

sagar0 commented Dec 17, 2018

@mrambacher:

Is the ask to have MergeOperators written in Java or to have custom MergeOperators written in C++ available to Java?

The ultimate aim of this issue is to allow RocksJava API users to allow accessing custom merge operators without modifying RocksDB library code. The merge operator code should lie in the application's code repository.
We should allow the merge operators to be written in Java, but before that a good first step would be to allow custom-written C++ merge operators to be exposed to be used in the java API. Maybe something like (just thinking out loud):

RegisterMergeOperator("your-custom-C++-merge-op", java-merge-op-obj-name);
options.setMergeOperator(java-merge-op-obj-name);

@mrambacher
Copy link
Contributor

@sagar0:
I generated a PR that shows the basic thinking of what I am proposing. We are also trying to add functionality to RocksDB without modifying the Rocks code itself. I have not added MergeOperators to the list for this PR, but am working on it now.

To add a MergeOperator written in C++ from a custom shared library to the rocks code, you would do something like:
options.AddExtensionLibrary("Custom Library");
options.SetMergeOperator("YourMergeOperator", "YourMergeOperatorProperties")

I plan to push something out in the next few weeks that does exactly this for Cassandra.

To implement a Java-based MergeOperator, I would assume someone would write a C++ one that delegates back to the Java layer. Then the "SetMergeOperator" would use the name of the C++ one and pass in properties for the Java class.

If there is interest on this thread, I will add a comment here when I generate the PR for Cassandra using this model.

@gfouquier
Copy link

gfouquier commented Dec 17, 2018 via email

@gfouquier
Copy link

gfouquier commented Dec 20, 2018 via email

@gfouquier gfouquier linked a pull request Feb 27, 2020 that will close this issue
@guillaumepitel
Copy link

Hello, is there a reason why the pull request by @gfouquier hasn't been merged ? It's a pretty big deal for us to have this functionality upstream. We use it in production in a modified Rocksdb and we think this would enable lots of people to use merge functionnality of rocksdb in java

@brary
Copy link

brary commented Mar 27, 2020

Hi @adamretter , I would also want this PR merged. Is there any plan for it?

Also, I see there is options.setMergeOperator() available in rocks java, isn't its role to use merge operator in java?

@mrambacher
Copy link
Contributor

There are a couple different things going on in this thread. Are you asking for access to more of the C++ MergeOperators in Java or asking for the ability to write MergeOperators in Java? I have been working on making the former easier to do but not the ability to write Java MergeOperators.

@brary
Copy link

brary commented Mar 27, 2020

Oh, so you mean the ability to write MergeOperators in Java is already available by options.setMergeOperator() and this PR is meant to access more of the C++ MergeOperators in Java?

@guillaumepitel
Copy link

No I think this is the opposite, the ability to use C++ merge operators is already available and this PR is to ba able to write java merge operators. We use them heavily and rely currently on a patched version of RocksDB. We also we really like for our PR to be merged. AFAIK, there has been no news about any plan since my colleague @gfouquier has submitted the last version.

@brary
Copy link

brary commented Mar 27, 2020

Thanks @guillaumepitel for information. Can you clarify what StringAppendOperator.java does then? Looks like that is exposed in java, which can be set using options.setMergeOperator().

I just started exploring merge operator and looks like it would be helpful for our use-case where we have a read-modify-write (key:[v1,v2,v3]) and we want to convert it into an append using merge operator, so all values v1, v2 and v3 can be merged during key reads.

@guillaumepitel
Copy link

@brary this is precisely how C MergeOperators are currently supported from the Java : StringAppendOperator is just a wrapper around a native JNI call that encapsulate a Java Strings concat operation written in C/JNI. That's probably what @mrambacher is trying to improve. Look at the code of StringAppendOperator, it calls a native method to create a new instance.

@mrambacher
Copy link
Contributor

@guillaumepitel @brary Right. I am trying to make any and all MergeOperators (and other "plugin classes") accessible to Java and C++ via simple APIs. See #6533 for more information.

@adamretter
Copy link
Collaborator

We found that performance was quite bad with out initial implementation, we are exploring new routes to working around the performance issues and hope to send a PR in the next month or two.

@east825
Copy link
Contributor

east825 commented Apr 15, 2021

Hi there! Any work in this area lately? We're also really interested in this, but at the same time, at the moment, we don't really need a full-blown API exposing all the features of merge operators in Java, but rather want to overcome a single limitation of StringAppendOperator. Namely, to have an option to concatenate values without a delimiter character, just one next to another. Does it sound like a legitimate feature to have in RocksJava? Is anyone else interested in that?

@mrambacher
Copy link
Contributor

I am working on making MergeOperators into Customizable objects, meaning I can configure and load them more generically and through options file. I have considered changing the StringAppend operator to use a string (instead of char) separator when I complete this work. Having an empty string would allow for the behavior you are looking for.

From there it would become a matter of getting that interface/method exposed to Java...

@east825
Copy link
Contributor

east825 commented Apr 18, 2021

@mrambacher This level of configuration definitely sounds really cool, but, on the other hand, if it's still WIP, it seems not that difficult to add an extra StringAppend constructor accepting a String instance instead of a char already now, updating C++ internals not that drastically (please correct me if I'm fooling myself here). If no one objects to extending the API this way, I think I could try to come up with such a PR.

@eric-maynard
Copy link

@east825 We have a similar use case; were you ever able to move forward with this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.