feat: add preview provider for emf files based on office #41395

Merged
ChristophWurst merged 3 commits into master from hello-emf on Nov 16, 2023

Conversation

kesselb
Contributor

@kesselb kesselb commented Nov 12, 2023

For nextcloud/files_emfviewer#1

Needs nextcloud/viewer#2065 to display the preview in viewer.

Summary

Add support for emf files.

Chrome and Firefox do not render emf files and therefore we generate a png preview through libreoffice.
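
As a rough illustration of the conversion step, the sketch below builds a headless LibreOffice command line that converts one file to PNG. It is written in Python for brevity and is not the provider's PHP code; the `soffice` binary name, the helper names, and the exact set of flags are assumptions.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: str, outdir: str, binary: str = "soffice") -> list[str]:
    """Return the argument list for a headless EMF -> PNG conversion."""
    return [
        binary,
        "--headless",            # run without a GUI
        "--convert-to", "png",   # target format
        "--outdir", outdir,      # directory LibreOffice writes the result to
        source,
    ]


def expected_output(source: str, outdir: str) -> Path:
    """LibreOffice names the output after the input file, with the new extension."""
    return Path(outdir) / (Path(source).stem + ".png")


def convert(source: str, outdir: str) -> Path:
    # Requires LibreOffice to be installed; raises CalledProcessError on failure.
    subprocess.run(build_convert_command(source, outdir), check=True, timeout=60)
    return expected_output(source, outdir)
```

Calling `convert("/data/logo.emf", "/tmp/out")` would, under these assumptions, leave the preview at `/tmp/out/logo.png`.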

TODO

  • CI
  • Testing
  • Review

Signed-off-by: Daniel Kesselberg <mail@danielkesselberg.de>
@kesselb
Contributor Author

kesselb commented Nov 13, 2023

@st3iny made me aware that existing emf files have the wrong mime type.

Pushed a change for OC\Repair\RepairMimeTypes to set image/emf when upgrading to 28.

Member

@st3iny st3iny left a comment

Tested and works.

Weirdly, some previews are not generated when going to a folder with many emf files. This seems to be random as the set of failing files is different each time when reuploading all files.

Uploading files one by one works just fine. Could this be an issue with too many parallel instances of libreoffice?

LibreOffice only allows one invocation per user profile.[^1]

The office provider sets the user profile to /tmp/owncloud-instanceid, so only one invocation per instance is allowed. This was introduced a while ago, yet it's unclear whether it was intentional or just a side effect.[^2]

Because of this one-invocation limit, preview generation only works for a few files if you upload a whole folder of emf or word files.

This commit removes the limitation by using a new user profile for each preview. This is done by passing the instance id plus the file id as the postfix to getTemporaryFolder.

This has some drawbacks:

- Overload protection: If you upload 100 emf files, you may end up with 100 LibreOffice invocations. Though, you can use preview_concurrency_new to limit the number of previews that can be generated concurrently when php-sysvsem is available.
- New profile: I assume it takes a moment to generate a fresh LibreOffice user profile. It appears that there is no way to ask LibreOffice to skip profile creation and just work with the defaults. The profile is cleaned up after use by our temp manager.
- Remove the configuration option preview_office_cl_parameters: This is not strictly necessary yet, but if you set the configuration option, the generated path for the user profile is also missing. The configuration option is not well documented (e.g., it's unclear that the last option needs to be --outdir), and there should be no reason to change it anyway.
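
The per-preview profile idea described above can be sketched as follows. This is a Python illustration, not the PHP change itself: the directory naming scheme and the helper names are hypothetical, and only the `-env:UserInstallation=file://...` argument is an actual LibreOffice option.

```python
from pathlib import PurePosixPath


def profile_dir(tmp_root: str, instance_id: str, file_id: int) -> PurePosixPath:
    # Hypothetical naming: one directory per (instance, file) pair, so two
    # previews generated at the same time never share a profile.
    return PurePosixPath(tmp_root) / f"lo-profile-{instance_id}-{file_id}"


def user_installation_arg(profile: PurePosixPath) -> str:
    # LibreOffice expects the profile location as a file:// URL.
    # as_uri() requires an absolute path.
    return f"-env:UserInstallation={profile.as_uri()}"


# Two files of the same instance get distinct profiles.
arg_a = user_installation_arg(profile_dir("/tmp", "ocabc123", 1))
arg_b = user_installation_arg(profile_dir("/tmp", "ocabc123", 2))
```

With a shared profile both calls would produce the same argument, which is the case in which LibreOffice refuses a second parallel invocation.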

[^1]: https://wiki.documentfoundation.org/UserProfile
[^2]: owncloud/core#9784

Signed-off-by: Daniel Kesselberg <mail@danielkesselberg.de>
The initial office preview implementation converted an office document to PDF with LibreOffice, used ImageMagick to extract the first page as JPEG, and passed it to OC_Image.

#10198 changed the implementation to use PNG rather than PDF. OC_Image can use a PNG as a preview right away, so the ImageMagick step is unnecessary.

The registration code was updated to not ask ImageMagick if PDF is supported, as PDFs are no longer used to create office document previews.

Signed-off-by: Daniel Kesselberg <mail@danielkesselberg.de>
@kesselb
Contributor Author

kesselb commented Nov 13, 2023

LibreOffice only allows one invocation per user profile.[^1]

b5241d5 changed the implementation to always use a different user profile.

This has some drawbacks:

  • Overload protection: If you upload 100 emf files, you may end up with 100 LibreOffice invocations. Though, you can use preview_concurrency_new to limit the number of previews that can be generated concurrently when php-sysvsem is available.
  • New profile: I assume it takes a moment to generate a fresh LibreOffice user profile. It appears that there is no way to ask LibreOffice to skip profile creation and just work with the defaults. The profile is cleaned up after use by our temp manager.
  • Remove the configuration option preview_office_cl_parameters: This is not strictly necessary yet, but if you set the configuration option, the generated path for the user profile is also missing. The configuration option is not well documented (e.g., it's unclear that the last option needs to be --outdir) and actually, there should be no reason to change it after all.
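
The overload-protection point relies on capping how many previews are generated at once. The following Python sketch shows the general idea with an in-process semaphore; it is only an analogue. Judging by the php-sysvsem requirement, the server presumably uses a semaphore shared across PHP processes, and the class and attribute names here are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PreviewLimiter:
    """Allow at most `max_concurrent` preview jobs to run at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self._sem = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0   # jobs currently running
        self.peak = 0     # highest number of jobs seen running together

    def run(self, job):
        with self._sem:  # blocks while the limit is reached
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return job()
            finally:
                with self._lock:
                    self.active -= 1


def fake_preview() -> int:
    # Stand-in for a LibreOffice invocation; waits briefly, then reports success.
    threading.Event().wait(0.01)
    return 1


limiter = PreviewLimiter(max_concurrent=2)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: limiter.run(fake_preview), range(20)))
```

Even with 8 worker threads and 20 queued jobs, no more than two stand-in conversions run together, which mirrors what a concurrency limit is meant to achieve for a bulk upload.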

475dd60 is a follow-up to #10198 to remove the ImageMagick code.

I'm a bit unhappy. We are very close to 28, and my plan was just to add a new preview provider based on the office one, not to rework the office provider. However, without separate user profiles, preview generation does not work reliably.

If the changes (especially to the office provider) come too late, it should also work to release only the first commit with 28. It will mostly work unless you upload many documents at once. This issue has been around for quite a while for the other office documents and may be acceptable.

Footnotes

  1. https://wiki.documentfoundation.org/UserProfile

@blizzz blizzz mentioned this pull request Nov 14, 2023
@ChristophWurst
Member

Do I need to add any config to turn on libreoffice previews? The EMF preview doesn't work for me

@kesselb
Contributor Author

kesselb commented Nov 16, 2023

> Do I need to add any config to turn on libreoffice previews? The EMF preview doesn't work for me

Yes, the provider for libreoffice is not enabled by default.

For config.php:

  'enabledPreviewProviders' => [
    'OC\Preview\BMP',
    'OC\Preview\GIF',
    'OC\Preview\JPEG',
    'OC\Preview\Krita',
    'OC\Preview\MarkDown',
    'OC\Preview\MP3',
    'OC\Preview\OpenDocument',
    'OC\Preview\PNG',
    'OC\Preview\TXT',
    'OC\Preview\XBitmap',
    'OC\Preview\EMF',
    'OC\Preview\MSOfficeDoc',
    'OC\Preview\MSOffice2003',
    'OC\Preview\MSOffice2007',
  ],

@ChristophWurst
Member

I think it created a preview for me but since I'm using a self-signed certificate the service worker doesn't run and I see an error on the console. AFAIK the requests go through the sw for caching.

Member

@ChristophWurst ChristophWurst left a comment

Had to install a trustworthy SSL cert to get a working service worker

Conversion works. I get a preview for my test emf file 😻

Code looks sane. The performance tradeoff with separate profiles is fine if LibreOffice otherwise fails hard on parallelism.

@ChristophWurst ChristophWurst merged commit c9dc377 into master Nov 16, 2023
51 of 53 checks passed
@ChristophWurst ChristophWurst deleted the hello-emf branch November 16, 2023 18:48
@kesselb kesselb added the pending documentation This pull request needs an associated documentation update label Nov 16, 2023
@kesselb
Contributor Author

kesselb commented Nov 16, 2023

Pending documentation:

  • Configuration flag
  • Add a hint about preview_concurrency_new

@kesselb
Contributor Author

kesselb commented Nov 17, 2023

Documentation: nextcloud/documentation#11292

Labels
3. to review Waiting for reviews enhancement