Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of delta-sync support on client-side. #6297

Closed
wants to merge 3 commits into from

Conversation

ahmedammar
Copy link
Contributor

This commit adds client-side support for delta-sync, this adds a new
3rdparty submodule gh:ahmedammar/zsync. This zsync tree is a modified
version of upstream, adding some needed support for the upload path and
other requirements.

If the server does not announce the required zsync capability then a
full upload/download is fallen back to. Delta synchronization can be
enabled/disabled using command line, config, or gui options.

On both upload and download paths, a check is made for the existance of
a zsync metadata file on the server for a given path. This is provided
by a dav property called zsync, found during discovery phase. If it
doesn't exist the code reverts back to a complete upload or download,
i.e. previous implementations. In the case of upload, a new zsync
metadata file will be uploaded as part of the chunked upload and future
synchronizations will be delta-sync capable.

Chunked uploads no longer use sequential file names for each chunk id,
instead, they are named as the byte offset into the remote file, this is
a minimally intrusive modification to allow fo delta-sync and legacy
code paths to run seamlessly. A new http header OC-Total-File-Length is
sent, which informs the server of the final expected size of the file
not just the total transmitted bytes as reported by OC-Total-Length.

The seeding and generation of the zsync metadata file is done in a
separate thread since this is a cpu intensive task, ensuring main thread
is not blocked.

This commit closes #179.

Copy link
Contributor

@ogoffart ogoffart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't reviewed all of it.

There is quite a lot of jenkins faillure.

ws2_32
)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_FILE_OFFSET_BITS=64")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_FILE_OFFSET_BITS=64")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required?

It would be nice also if everything that is related to zsync lib was together so we could later split it in its own file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of size_t usage and assumptions by zsync. See webarchive comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding putting all the changes together in one place, I can do that, no problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

int metaHandle = zsyncmeta.handle();
int tfHandle = zsynctf.handle();

zsync_unique_ptr<FILE> meta(fdopen(dup(metaHandle), "w"), [](FILE *f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not using QFile here, and everywhere you use raw file operation?
QFile has a constructor that opens a file descriptor as well.
I'd think this would lead to more readable code.

Copy link
Contributor Author

@ahmedammar ahmedammar Jan 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions were copied from zsync source as-is since they are not part of the exported api. Needed when generating a metadata file (part of their make cli tool). My main task was to integrate zsync, not re-write it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please see the archive of all the comments I linked, none of this is new.

extern "C" {
#include "librcksum/rcksum.h"
#include "libzsync/zmap.h"
#include "libzsync/sha1.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, Qt also has a way to compute the sha1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, see previous comment.

@@ -48,6 +48,53 @@
</widget>
</item>
<item row="2" column="0">
<widget class="QGroupBox" name="advancedGroupBox">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think there should be no UI addition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you can feel free to remove it before the merge.


#include <QBuffer>
#include <QFile>

namespace OCC {

class GETJob : public AbstractNetworkJob
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've got time, you could try tomake a separate commit that refactor GETJob and GETFileJob.
That'd make reviewer slightly easier.

Copy link
Contributor Author

@ahmedammar ahmedammar Jan 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My time is all up on this work now, it's taken way too long to get this far ...

@ahmedammar
Copy link
Contributor Author

ahmedammar commented Jan 14, 2018

@guruz @ogoffart @dragotin @ckamm @DeepDiver1975 @PVince81
Based on the amount of time that has elapsed, and the amount of nit-picking happening, my guess is oC don't really want to merge this. That's fine, I can withdraw my PRs and just see if nextCloud want to take them as-is for free.

@ahmedammar
Copy link
Contributor Author

Note: Jenkins failures were resolved after rebase on latest master ...

This commit adds client-side support for delta-sync, this adds a new
3rdparty submodule `gh:ahmedammar/zsync`. This zsync tree is a modified
version of upstream, adding some needed support for the upload path and
other requirements.

If the server does not announce the required zsync capability then a
full upload/download is fallen back to. Delta synchronization can be
enabled/disabled using command line, config, or gui options.

On both upload and download paths, a check is made for the existance of
a zsync metadata file on the server for a given path. This is provided
by a dav property called `zsync`, found during discovery phase. If it
doesn't exist the code reverts back to a complete upload or download,
i.e. previous implementations. In the case of upload, a new zsync
metadata file will be uploaded as part of the chunked upload and future
synchronizations will be delta-sync capable.

Chunked uploads no longer use sequential file names for each chunk id,
instead, they are named as the byte offset into the remote file, this is
a minimally intrusive modification to allow fo delta-sync and legacy
code paths to run seamlessly. A new http header OC-Total-File-Length is
sent, which informs the server of the final expected size of the file
not just the total transmitted bytes as reported by OC-Total-Length.

The seeding and generation of the zsync metadata file is done in a
separate thread since this is a cpu intensive task, ensuring main thread
is not blocked.

This commit closes owncloud#179.
@mrow4a
Copy link
Contributor

mrow4a commented Jan 16, 2018

Is this PR ready from the client side perspective? @guruz @ogoffart

@ckamm
Copy link
Contributor

ckamm commented Jan 17, 2018

@mrow4a Not ready in my opinion. See below.

@ahmedammar I understand your frustration. I genuinely think you've done an amazing job responding and fixing things based on the hundreds of review comments so far. The old PR ended up being too much for github because you were extraordinarily willing to discuss, adjust and fix all of the issues and nitpicks that came up. 👍

Unfortunately this is a large change to a critical component of the sync client and thus everyone seems a bit extra picky and there are worries about reliability. In addition it seems that, with you busy and probably not interested in continuing with this patch after it lands, the existing maintainers will have to fix whatever issues pop up. That makes it necessary for at least two people to understand what's going on well enough to be able to work on it efficiently.

My main issues at this point are:

reliability on Windows

With this patch zsync will be responsible for a lot of the filesystem operations. One wouldn't expect it, but basic stuff for dealing with files can be surprisingly hard to get right (I'm thinking of mtime timezone issues, locked files, shared access permissions, etc). And I'm convinced we'll have several regressions in this area.

reliability generally

While there are now two basic tests (yay!), there will be ample of bugs left to find. Since this patch mostly deals with uploads and downloads these are more likely than usual to be corruption/data loss kind of bugs.

zsync integration generally

The current PR copies some of zsync's code into the client, other code we link against. In any case, the version we link against is a custom modified version of zsync. Essentially zsync becomes part of the client project, but resides in a different repository while being mutually dependent. If there's no strong reason to keep these separate I'd just fold zsync into the client repository to keep things easy.

review age

My review was months ago and there have been many changes since (which is great and has improved things!). But still, I'll need to go over things again to get an up-to-date impression of what's going on.

Path forward?

I'd suggest something like this:

  1. Decide whether we're willing to make the investment of adopting this PR. It's pretty cool, but there are costs.
  2. Review again with a focus on "does this addition break non-delta-sync codepaths?". Make any changes necessary to reduce risk of accidentally breaking default syncing.
  3. Merge in a state where the feature is disabled by default.
  4. Have 1-2 people clean up the remaining loose ends.

And then we need to stabilize. That means adding lots of tests, enabling it on dev machines, pinging interested users whether they're up for testing in low-risk scenarios. Eventually enable it by default.

@ahmedammar
Copy link
Contributor Author

First, thanks for the detailed response and sharing your views so succinctly.

All the points made by @ckamm are valid but I just wanted to respond to a few things.

  1. I want to reaffirm that I am willing to continue to maintain the code after the code lands, I might be busy with work but that doesn't mean I can't continue to maintain this in my spare time.
  2. Regarding folding the zsync code in, I don't think that's a good idea since you'd lose the git commit history. As a compromise, if oC would fork my zsync tree, we could use that as the submodule reference instead.
  3. Merging in a default disabled state is actually how it's implemented today and also the reason I made the GUI addition (to allow users to enable/test it).
  4. I have no problem in helping clean-up any loose ends or adding more tests but I just want to see some positive movement from oC side before investing more time.

@tflidd
Copy link

tflidd commented Jan 23, 2018

From a user's perspective, this would be really a cool new feature. We've never got so close to this feature. There has been such a demand that there will be some people ready to test it. It will certainly require some development costs as well but what better feature to invest in for a sync solution?

@SamuAlfageme
Copy link
Contributor

@tflidd it is indeed; we're thinking about generating some testpilotcloud installers as well as instructions for the server patch to get community to start trying this out as soon as possible to get feedback. We'll announce soon in https://central.owncloud.org.

@PVince81
Copy link
Contributor

I don't think a server patch will be needed. We'll merge the server PR and release it in the upcoming 10.1 pre-alpha/alpha so people can use that for testing

@ogoffart
Copy link
Contributor

ogoffart commented Jan 26, 2018

@ahmedammar Have you had any contact with the zsync maintainer? Why did you not create pull requests there?
(Sorry if you already said that earlier, but i don't have easy access to previous comment)

(In the case we cannot upstream the change, we will make a fork of zsync on github in the owncloud organisation)

Edit: I actually see that you made a pull request. ( cph6/zsync#2 ) But why is it closed?

@ahmedammar
Copy link
Contributor Author

I reached out to him, and got no response.

@ckamm
Copy link
Contributor

ckamm commented Jan 29, 2018

@ahmedammar There was a meeting where we talked about this PR. Upcoming steps are:

  1. Final review round with an eye on reducing risks of breaking unrelated behavior when this feature is disabled. Me and @ogoffart will do that.
  2. Merge into a branch where the three of us can tie up loose ends. We'll also build experimental binaries for people to play with.
  3. Merge into master.

At the same time there'll be another attempt to reach out to the author of zsync. Depending on how this goes we might upstream patches or create a fork in the owncloud organization.

@SamuAlfageme @ogoffart @guruz

Copy link
Contributor

@ckamm ckamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have gone through the diff again, looking primarily at the zsync-disabled branches.

There's one inline comment about progress updating where something seems off.

Besides that the only big change to existing paths is in the upload chunking. The chunk numbering the servers receive will be very different. I'll need to go through the updated logic again tomorrow.

@@ -229,7 +245,7 @@ void ProgressInfo::setProgressComplete(const SyncFileItem &item)
_currentItems.remove(item._file);
Copy link
Contributor

@ckamm ckamm Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to move below the if or _totalSizeOfCompletedJobs won't update correctly. edit: done

* Example: With delta-sync, the actual size of the download will only
* be known during propagation - this function adjusts the total size
* to account for it.
*/
Copy link
Contributor

@ckamm ckamm Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for later: This might need even more documentation: item._size must be the old size when this is called, different from newSize which isn't obvious to me. edit: done

Copy link
Contributor

@ckamm ckamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

bool _removeJobError = false; /// if not null, there was an error removing the job
bool _zsyncSupported = false; /// if zsync is supported this will be set to true
bool _isZsyncMetadataUploadRunning = false; // flag to ensure that zsync metadata upload is complete before job is
quint64 _bytesToUpload; // in case of zsync upload this will hold the actual bytes to upload, normal upload will be file size
Copy link
Contributor

@ckamm ckamm Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later: I was wondering what would happen to _bytesToUpload for the case of resumes. It turns out it works fine since it's always used in the context of "total bytes that need to arrive on the server before we're done".

So if the file size is 10 and a previous job uploaded 6, this member would be 10 even though the actual number of bytes to upload for this job is just 4. Rename / documentation change probably.

edit: done

@@ -344,9 +344,12 @@ class PropagateUploadFileNG : public PropagateUploadFileCommon
private:
quint64 _sent = 0; /// amount of data (bytes) that was already sent
uint _transferId = 0; /// transfer id (part of the url)
int _currentChunk = 0; /// Id of the next chunk that will be sent
int _currentChunk = 0; /// id of the next chunk that will be sent
Copy link
Contributor

@ckamm ckamm Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later: _currentChunk -> _currentChunkStart ? edit: done

}

if (_sent > _item->_size) {
if (!_rangesToUpload.isEmpty())
_currentChunk = _rangesToUpload.first().start;
Copy link
Contributor

@ckamm ckamm Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later: Potentially redundant, _currentChunk is also set in startNextChunk() - only used in logging, it seems. edit: done

*/

void PropagateUploadFileNG::doStartUpload()
{
propagator()->_activeJobList.append(this);

_zsyncSupported = isZsyncPropagationEnabled(propagator(), _item);
if (_zsyncSupported && _item->_remotePerm.hasPermission(RemotePermissions::HasZSyncMetadata)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later: In the case where this is false, one can still have _zsyncSupported be true and use the corresponding code paths below. I haven't looked in detail whether that's correct.

@ckamm
Copy link
Contributor

ckamm commented Jan 30, 2018

I've merged this PR manually into the delta-sync branch (there was a trivial conflict).

@ckamm ckamm closed this Jan 30, 2018
@ahmedammar
Copy link
Contributor Author

@ckamm thanks for the review and pulling into a branch but how should I address the things mentioned above after you've already merged into a branch?

@ckamm
Copy link
Contributor

ckamm commented Feb 1, 2018

@ahmedammar Let's continue working on it collaboratively in the branch by making small PRs against it. I've committed a couple of minor fixes directly but my larger changes will also go through pull requests for review.

@orzeech
Copy link

orzeech commented Feb 5, 2018

@ahmedammar @ckamm Hi, I've been testing this code a bit. I've set up two testing environments and I've found some bigger and smaller problems. From progress bar not being updated correctly to total crash of synchronization after synchronizing many times a group of files from a delta-sync enabled and regular client one after another. There was also a bug when delta-sync was not triggered when adding just the last byte to a file. How do I go on with reporting those? Otherwise it's bringing great speed improvements in some cases.

@ahmedammar
Copy link
Contributor Author

@orzeech are you testing the delta-sync branch that @ckamm is now working in? Since some of these things should be fixed there. Also about reporting issues, reproducibility of the issue is quite important for me to duplicate the situation. Thanks for testing!

@orzeech
Copy link

orzeech commented Feb 5, 2018

I will check on this new branch and get back to you soon

@guruz
Copy link
Contributor

guruz commented Mar 20, 2019

This feature will be in the upcoming 2.6 alpha release.
Meanwhile you can try it inside the daily builds 2.6.x https://download.owncloud.com/desktop/daily/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync only the file change, not entire file [$1,755]
9 participants