
File cache table excessively large (and does not shrink after data removal) / Nextcloud should defragment file cache table #7312

Open
ruedigerkupper opened this issue Nov 27, 2017 · 62 comments

@ruedigerkupper

ruedigerkupper commented Nov 27, 2017

Problem

My nextcloud instance (small-scale, single-server setup, 6 users) has an excessively large database (20 GB as of today). This size bears no sensible relation to the amount of data managed (approx. 1 TB originally, now reduced to 4.3 GB).

I have found that almost all of the 20GB is located in a single table, the file cache:

root@helge:~# ls -lh /var/snap/nextcloud/current/mysql/nextcloud/oc_filecache.*
-rw-r----- 1 root root 21K Jan 8 2017 /var/snap/nextcloud/current/mysql/nextcloud/oc_filecache.frm
-rw-r----- 1 root root 19G Nov 27 15:17 /var/snap/nextcloud/current/mysql/nextcloud/oc_filecache.ibd

This file seems to grow and grow, but never shrinks.

Further details

  • My nextcloud used to manage roughly 1 TB of data, most of it located on external storage (external disks attached to my server).
  • External storage was included through nextcloud’s “external storage” app as “Local”.
  • I have since removed all external storage from my nextcloud instance:
root@helge:~# nextcloud.occ files_external:list --all
No mounts configured
  • The data managed by my nextcloud instance is now only 4.3 GB (compared to 1 TB before). Surely this should reduce the file cache?
  • I have called “files:scan” and “files:cleanup”, but to no effect:
root@helge:~# nextcloud.occ files:cleanup
0 orphaned file cache entries deleted

Here’s something I don’t understand: there should be thousands of orphaned entries now that most of the data is gone!

=> I suspect that my file cache table is somehow broken, but I don’t know how to fix it. I believe I should clear the table and let Nextcloud rebuild it, but I do not know how to do that.

Steps to reproduce

Unsure, sorry

Expected behaviour

File cache table should shrink after external storage was removed.

Actual behaviour

File cache table stays the same (and is excessively large)

General server configuration

Operating system: Linux helge 4.13.0-17-generic #20-Ubuntu SMP Mon Nov 6 10:04:08 UTC 2017 x86_64

Web server: Apache/2.4.28 (Unix) OpenSSL/1.0.2g (fpm-fcgi)

Database: mysql 5.7.18

PHP version: 7.0.23

PHP-modules loaded
 - Core
 - date
 - libxml
 - openssl
 - pcre
 - sqlite3
 - zlib
 - bz2
 - ctype
 - curl
 - dom
 - hash
 - fileinfo
 - filter
 - gd
 - SPL
 - iconv
 - intl
 - json
 - mbstring
 - mcrypt
 - PDO
 - session
 - pdo_sqlite
 - posix
 - Reflection
 - standard
 - SimpleXML
 - mysqlnd
 - exif
 - tokenizer
 - xml
 - xmlreader
 - xmlwriter
 - zip
 - pdo_mysql
 - cgi-fcgi
 - redis

Nextcloud configuration

Nextcloud version: 11.0.5 (stable) - 11.0.5.1 (snap version 3680)

Updated from an older Nextcloud/ownCloud or fresh install: Nextcloud snap, updated from previous snap versions (3317, 2707)

Where did you install Nextcloud from: snap

Are you using external storage, if yes which one: \OC\Files\Storage\Local, see description for details!

Are you using encryption: no

Are you using an external user-backend, if yes which one: Nextcloud sync client from the ubuntu store

Enabled apps
 - activity: 2.4.1
 - admin_audit: 1.1.0
 - audioplayer: 2.2.1
 - comments: 1.1.0
 - dav: 1.1.1
 - federatedfilesharing: 1.1.1
 - federation: 1.1.1
 - files: 1.6.1
 - files_external: 1.1.2
 - files_pdfviewer: 1.0.1
 - files_sharing: 1.1.1
 - files_texteditor: 2.2
 - files_trashbin: 1.1.0
 - files_versions: 1.4.0
 - files_videoplayer: 1.0.0
 - firstrunwizard: 2.0
 - gallery: 16.0.0
 - issuetemplate: 0.2.2
 - logreader: 2.0.0
 - lookup_server_connector: 1.0.0
 - nextcloud_announcements: 1.0
 - notifications: 1.0.1
 - password_policy: 1.1.0
 - provisioning_api: 1.1.0
 - serverinfo: 1.1.1
 - sharebymail: 1.0.1
 - survey_client: 0.1.5
 - systemtags: 1.1.3
 - theming: 1.1.1
 - twofactor_backupcodes: 1.0.0
 - workflowengine: 1.1.1
Disabled apps
 - activitylog
 - calendar
 - encryption
 - external
 - files_accesscontrol
 - files_automatedtagging
 - files_retention
 - ownbackup
 - templateeditor
 - user_external
 - user_ldap
 - user_saml
Content of config/config.php
{
    "apps_paths": [
        {
            "path": "\/snap\/nextcloud\/current\/htdocs\/apps",
            "url": "\/apps",
            "writable": false
        },
        {
            "path": "\/var\/snap\/nextcloud\/current\/nextcloud\/extra-apps",
            "url": "\/extra-apps",
            "writable": true
        }
    ],
    "supportedDatabases": [
        "mysql"
    ],
    "memcache.locking": "\\OC\\Memcache\\Redis",
    "memcache.local": "\\OC\\Memcache\\Redis",
    "redis": {
        "host": "\/tmp\/sockets\/redis.sock",
        "port": 0
    },
    "instanceid": "ocwzswpevaos",
    "passwordsalt": "***REMOVED SENSITIVE VALUE***",
    "secret": "***REMOVED SENSITIVE VALUE***",
    "trusted_domains": [
        "rkupper.no-ip.org",
        "helge"
    ],
    "datadirectory": "\/media\/Data\/nextcloud\/data",
    "overwrite.cli.url": "http:\/\/rkupper.no-ip.org",
    "dbtype": "mysql",
    "version": "11.0.5.1",
    "dbname": "nextcloud",
    "dbhost": "localhost:\/tmp\/sockets\/mysql.sock",
    "dbport": "",
    "dbtableprefix": "oc_",
    "dbuser": "***REMOVED SENSITIVE VALUE***",
    "dbpassword": "***REMOVED SENSITIVE VALUE***",
    "logtimezone": "UTC",
    "installed": true,
    "mail_smtpmode": "php",
    "mail_smtpsecure": "tls",
    "mail_from_address": "ruediger",
    "mail_domain": "rkupper.no-ip.org",
    "mail_smtpauthtype": "LOGIN",
    "mail_smtpauth": 1,
    "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
    "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
    "loglevel": 3,
    "maintenance": false,
    "singleuser": false,
    "log_rotate_size": 1073741824
}

Client configuration

Browser: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/605.1 (KHTML, like Gecko) Version/11.0 Safari/605.1 Ubuntu/17.10 (3.26.1-1ubuntu1) Epiphany/3.26.1 (Web App)

Operating system: GNU/Linux (Ubuntu 17.10)

@ArnisR

ArnisR commented Nov 27, 2017

Yes, there is no way with NC commands to shrink the oc_filecache table if files and folders are deleted or renamed outside NC. The same seems to apply to external shares. If changes are made outside, NC can only add files to oc_filecache.

In my case oc_filecache was bloated by the large number of photos and the previews generated by the previewgenerator app (20 previews for each photo by default). Every preview is also registered in the oc_filecache table. I had to modify previewgenerator, manually delete all previews and all DB records for files that referenced the preview folder, then regenerate a reduced set of previews. It's a time-consuming process.

There is definitely a need for an NC maintenance command that keeps the oc_filecache table in sync with the local filesystem. It would also make it much easier to use NC for fast bulk file uploads managed through direct Samba/FTP shares, because http/webdav upload of numerous files lags and is not as robust and safe as Samba/FTP.

@MorrisJobke
Member

cc @icewind1991

@MorrisJobke MorrisJobke added 0. Needs triage Pending check for reproducibility or if it fits our roadmap feature: filesystem labels Nov 27, 2017
@ruedigerkupper
Author

@ArnisR: That would explain the size of the file cache: The external disk hosted my photo library and was indexed by previewgenerator. So how would I proceed to reduce the file cache manually? Is there a way to simply clear the table and regenerate it?

@ArnisR

ArnisR commented Nov 27, 2017

@ruedigerkupper: I can't say. There are several other kinds of records in that table concerning the NC system, not only files, so I wouldn't do it. I saw a thread where someone posted a PHP file that compares the actual files with the records in the database and deletes the records for non-existent files. But nobody confirmed it works, and my knowledge of MySQL is too limited to draw a conclusion.

I didn't risk it. Since most of the overhead came from experiments with previews, I dealt only with those. I put NC in maintenance mode and made a backup of the database, deleted all content of the appdata_.../preview/ folder, then with phpMyAdmin selected (with search) all records in the oc_filecache table whose "path" column contains appdata_your-NCspecificpath-here/preview/ and deleted them. I have NC 12; I don't know whether NC 11 puts previews in a different folder (under each user).
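
For reference, a minimal SQL sketch of that phpMyAdmin deletion (maintenance mode on, database backed up first; 'appdata_your-NCspecificpath-here' stands in for your instance-specific appdata folder):

mysql> -- delete preview entries from the file cache (sketch; verify the path prefix first)
mysql> DELETE FROM oc_filecache
    ->  WHERE path LIKE 'appdata_your-NCspecificpath-here/preview/%';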

Then I modified previewgenerator - nextcloud/previewgenerator#78

Switch maintenance mode off and run preview:generate-all. In my case it took almost 3 days to finish generating them for ~85 K photos on an i3 CPU.

At this moment I have problem with this - #7269

@ruedigerkupper
Author

Thanks, ArnisR! I have meanwhile solved my problem, and it turned out to be much easier than that:

  • Investigating the oc_filecache table from the mysql prompt showed that it only had about 4,000 rows -- nothing that could account for 20 GB of size!
  • However, the table space was largely empty:
mysql> show table status like "oc_filecache";
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-------------+----------------+---------------------+---------------------+------------+-----------+----------+----------------+---------+
| Name         | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free   | Auto_increment | Create_time         | Update_time         | Check_time | Collation | Checksum | Create_options | Comment |
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-------------+----------------+---------------------+---------------------+------------+-----------+----------+----------------+---------+
| oc_filecache | InnoDB |      10 | Dynamic    | 4106 |           5119 |    21020672 |               0 |     35749888 | 19579011072 |       20409581 | 2017-11-26 23:42:15 | 2017-11-27 18:56:42 | NULL       | utf8_bin  |     NULL |                |         |
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-------------+----------------+---------------------+---------------------+------------+-----------+----------+----------------+---------+

Note "Data_length" (=21 MB) and "Data_free" (=19 GB)!! So, the removed files had actually been cleaned from the file cache, but the InnoDB table was so badly fragmented that it occupied 1000 times the disk space needed for its data.

Going through the mysql reference, I came across https://dev.mysql.com/doc/refman/5.7/en/innodb-file-defragmenting.html. It states that an InnoDB table can be defragmented by issuing a "no-op" ALTER TABLE command:

mysql> ALTER TABLE oc_filecache FORCE;

And -- magically -- this reduced the table to a file size of 8 MB (!!!):

mysql> show table status like "oc_filecache";
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------+----------+----------------+---------+
| Name         | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------+----------+----------------+---------+
| oc_filecache | InnoDB |      10 | Dynamic    | 4168 |            381 |     1589248 |               0 |       671744 |   2097152 |       20409581 | 2017-11-28 00:40:41 | NULL        | NULL       | utf8_bin  |     NULL |                |         |
+--------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------+----------+----------------+---------+

Summary

So let's get this straight:

  • After removing external storage, Nextcloud left me with a table that occupied 20 GB of disk space, but which contained only 8 MB of data.
  • How the table got so badly fragmented is unclear, but obviously it can happen.
  • The problem could be solved by a simple mysql command for defragmenting the table (ALTER TABLE oc_filecache FORCE;).
  • But the regular nextcloud cron job does not perform such defragmentation, nor does nextcloud provide a command for doing it manually.
  • Instead, the user is required to read the mysql reference (a steep learning curve -- until yesterday, I had never worked with mysql ;-) )

Feature request

Nextcloud should defragment the oc_filecache table regularly, probably as part of the regular maintenance cron job.
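
To make the request concrete, here is a hedged sketch of how such a job could first detect fragmentation before deciding to rebuild (the 10x threshold is illustrative, not anything Nextcloud uses):

mysql> -- report the table only when free space dwarfs the data (illustrative threshold)
mysql> SELECT table_name,
    ->        data_length / 1024 / 1024 AS data_mb,
    ->        data_free   / 1024 / 1024 AS free_mb
    ->   FROM information_schema.tables
    ->  WHERE table_schema = 'nextcloud'
    ->    AND table_name   = 'oc_filecache'
    ->    AND data_free    > 10 * data_length;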

@ruedigerkupper ruedigerkupper changed the title from "File cache table excessively large (and does not shrink after data removal)" to "File cache table excessively large (and does not shrink after data removal) / Nextcloud should defragment file cache table" Nov 28, 2017
@MorrisJobke MorrisJobke mentioned this issue Dec 8, 2017
@LnLogN

LnLogN commented Dec 18, 2017

Same issue here: our internal nextcloud service has only a few hundred users, but the database was over 130 GB. We just upgraded from ownCloud 9 to Nextcloud 12.0.4 and manually cleared the oc_filecache table, leaving only the shared links (with the PHP script ArnisR mentioned). That left only about 400 MB; after a scan of all local directories the database size is only about 700 MB (only 2 TB of data locally), and the web interface functions normally (the service rescans mounted directories when a user enters one, and displays the files).
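
A hedged approximation of that cleanup in plain SQL (we used a script, so this is only a sketch; maintenance mode and a database backup are essential, and deleted rows are recreated by the next occ files:scan):

mysql> -- keep only filecache rows that are referenced by a share (approximation)
mysql> DELETE fc FROM oc_filecache fc
    ->   LEFT JOIN oc_share s ON s.file_source = fc.fileid
    ->  WHERE s.id IS NULL;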

With a closer look at the file system, we found that most of the database inserts come from different users mounting the same file systems over SMB (the FS has a few PB of data). The annoying bit is that when running the occ files:scan $user command, the service actually tries to scan all the mounted folders this user has, including the SMB-mounted folder, which takes forever. We couldn't afford to have users waiting for those scans, so we wrote a script to scan only local files.

So here are a few suggestions:

  1. please allow files:scan to scan only local directories, with something like occ files:scan --local, because local directory paths are much more functionally important than mounted paths (a related workaround with an existing option is sketched after this list)
  2. store all external paths in a separate temporary filecache table, maybe external_filecache, so we can clear that table without affecting the function of the web service.
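
A hedged sketch related to suggestion 1: recent occ releases accept a --path argument to files:scan, which can restrict a scan to a user's local files instead of walking every mount (verify the option exists on your version; "alice" is a hypothetical user id):

sudo -u www-data php occ files:scan --path="/alice/files"   # scan only this user's local subtree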

@rullzer
Member

rullzer commented Dec 19, 2017

Maybe an explicit occ command is fine. But I'm really against doing this in a regular cron job, as it basically rebuilds the table. So on larger instances that is not a good idea.

@cgrima

cgrima commented Dec 27, 2017

Same issue here, with an 80 GB oc_filecache.ibd, while my entire data set is roughly 4 TB. @ArnisR, @LnLogN, could you give a link to the PHP script you are mentioning?

@ArnisR, could you give an idea on how long the defragmenting took for you?

For info, I could get the oc_filecache.ibd growth rate from recent backups:
Sept. 1, 2017: 46GB
Nov.1, 2017: 61GB
Dec. 1, 2017: 69GB
Dec. 27, 2017: 85GB
In the meantime, my data grew from ~3.5 TB to ~4 TB, with a mix of small and >1 GB files. 7 users.

@pprotschka

Same issue here. My database was 61 GB with only 400 GB of data. I had the root of an external Ubuntu server mounted... ~73,918,234 rows! lol. Anyway, is there a way to prevent the indexing of external shares?

@nextcloud-bot nextcloud-bot added the stale Ticket or PR with no recent activity label Jun 20, 2018
@McKay1717

McKay1717 commented Nov 16, 2018

Same issue here: 20 GB of DB for 200 GB of local data.
Nextcloud version 14.0.3.0

@nextcloud-bot nextcloud-bot removed the stale Ticket or PR with no recent activity label Nov 16, 2018
@helmut72

This comment was marked as duplicate.

@helmut72

MySQL regularly stops working because of this:

mysql> show table status like "oc_filecache";
+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
| Name         | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation   | Checksum | Create_options        | Comment |
+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
| oc_filecache | InnoDB |      10 | Compressed | 76443704 |            134 | 10273595392 |               0 |  11548246016 |  82837504 |       84169982 | 2018-12-11 12:45:45 | 2019-01-10 09:51:44 | NULL       | utf8mb4_bin |     NULL | row_format=COMPRESSED |         |
+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
1 row in set (0,00 sec)

@chaosgrid

*bump

I'm quite surprised that there is no "clean up index entries for files that have actually been deleted on disk" functionality in Nextcloud, since this is obviously vital to keep Nextcloud running smoothly.
In my case, I moved the data directory, and that resulted in some files (appdata and __groupfolder files) being duplicated in the filecache, because a new storage entry was generated. I reset the original storage entry, but now I have stale filecache entries. I will probably delete those manually now, and then I will also have to clean up tables like oc_share and oc_activity for the deleted filecache entries...

I wonder: we also have SMB-mounted external storages, and for those Nextcloud automatically cleans up the filecache for files that get deleted (and I did not activate the notify feature, as far as I'm aware)... why does this not work for local files?
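
For the stale-entry situation described above, a hedged diagnostic sketch: list filecache rows whose storage row no longer exists (roughly the orphans that occ files:cleanup targets) before deleting anything by hand:

mysql> -- filecache rows pointing at a storage that is gone
mysql> SELECT fc.fileid, fc.path
    ->   FROM oc_filecache fc
    ->   LEFT JOIN oc_storages s ON s.numeric_id = fc.storage
    ->  WHERE s.numeric_id IS NULL;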

@da3dsoul

da3dsoul commented Feb 1, 2019

Is this fixed? I have local storage mapped using the local filesystem, and I don't just avoid deleting files from inside nextcloud -- I delete them from a Windows machine over Samba, directly on the filesystem. A folder that gets hundreds of files added and deleted daily has only 226 entries, as I'd expect, so that's nice to see.

I run this image, which I don't think has any cleanup scripts: https://github.com/crazy-max/docker-nextcloud
I'm on NC 15.2. I do open the directory reasonably often in nextcloud, so if there's a script in there, that might explain it.

EDIT: Okay, it just cleans them up when I enter each folder, but if I don't, then it gets bigger.

@paulrolandw

paulrolandw commented May 17, 2019

I am now having the same-ish issue.

18G /var/lib/mysql/next/oc_filecache.ibd

18 GB for 5 users (testing), and it keeps increasing rapidly; that's a bit much. We are only using nextcloud with external SMB shares, nothing local. This 18 GB accumulated over 3 months.

Also, I have 34 million rows, so I doubt defragmenting will do anything in my case; there isn't much free data:

mysql> show table status like "oc_filecache" \G;
*************************** 1. row ***************************
Name: oc_filecache
Engine: InnoDB
Version: 10
Row_format: Compressed
Rows: 35592262
Avg_row_length: 349
Data_length: 12424568832
Max_data_length: 0
Index_length: 5660049408
Data_free: 2097152
Auto_increment: 40670072
Create_time: 2019-02-27 11:13:12
Update_time: 2019-05-17 13:38:23
Check_time: NULL
Collation: utf8mb4_bin
Checksum: NULL
Create_options: row_format=COMPRESSED
Comment:
1 row in set (0.00 sec)

@helmut72

My instance with an SMB connection only runs flawlessly without cron, but with AJAX. No DB table blows up.

@nate2014jatc

I just came across this thread after my database locked up and killed my server because it kept getting thrown into recovery mode.

MariaDB [nextcloud]> show table status like "oc_filecache" \G;
*************************** 1. row ***************************
            Name: oc_filecache
          Engine: InnoDB
         Version: 10
      Row_format: Compressed
            Rows: 11700392
  Avg_row_length: 413
     Data_length: 4843307008
 Max_data_length: 0
    Index_length: 1800232960
       Data_free: 1884815360
  Auto_increment: 25774142
     Create_time: 2019-10-27 02:04:54
     Update_time: NULL
      Check_time: NULL
       Collation: utf8mb4_bin
        Checksum: NULL
  Create_options: row_format=COMPRESSED
         Comment: 
Max_index_length: 0
       Temporary: N
1 row in set (0.002 sec)

@jgrete

jgrete commented Apr 4, 2020

The same problem with NC 18.0.3... after years, no solution?

show table status like "oc_filecache" \G;
*************************** 1. row ***************************
Name: oc_filecache
Engine: InnoDB
Version: 10
Row_format: Compressed
Rows: 43111070
Avg_row_length: 1640
Data_length: 70715596800
Max_data_length: 0
Index_length: 8240365568
Data_free: 3670016
Auto_increment: 68154221
Create_time: 2020-01-21 13:39:53
Update_time: 2020-04-04 12:21:39
Check_time: NULL
Collation: utf8mb4_bin
Checksum: NULL
Create_options: row_format=COMPRESSED
Comment:
Max_index_length: 0
Temporary: N
1 row in set (0.000 sec)

I was voting for NC on every occasion, but things like this are just ridiculous...

@tanguy-opendsi

The same problem on nextcloud 17.0.2

@nate2014jatc

BUMP:
Has there really been no discussion or proposal of a fix for this?
Perhaps add a note in the setup guide(s) stating that this might be an issue, and offer a (janky, improper) "solution" of running an ALTER operation on the table every XX days?

@skjnldsv skjnldsv removed the 0. Needs triage Pending check for reproducibility or if it fits our roadmap label Apr 10, 2020
@cls-nebadje

cls-nebadje commented Nov 9, 2021

Got this problem too. I already increased the VM disk size a month ago. It doubled in a month from 8 GB to 16 GB without adding any files. No external drive removed; just a nextcloud instance sitting there. Was on 21.0.4 or so, just upgraded to 21.0.5, and am now upgrading to 22.2.0.

Basically this problem arose this summer, with more or less the same files that have been on this server all along.

Edit:
It doesn't seem to be oc_filecache, AFAICS:

mysql> show table status like "oc_filecache";
+--------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
| Name         | Engine | Version | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation   | Checksum | Create_options        | Comment |
+--------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
| oc_filecache | InnoDB |      10 | Compressed | 350257 |            159 |    55934976 |               0 |     66527232 |   4194304 |         519199 | 2021-09-09 13:19:05 | 2021-11-09 11:52:31 | NULL       | utf8mb4_bin |     NULL | row_format=COMPRESSED |         |
+--------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------+----------+-----------------------+---------+
1 row in set (0.00 sec)

It seems that mysql's binlog keeps growing on my machine:

root@nextcloud:/var/lib/mysql# du -ch binlog.*
[...]
13G	total

I'll continue to investigate and leave this note here, since this can be a culprit as well.

Edit 2:
Yep, the binlog seems to grow with the latest nextcloud releases, at least on my machine.

root@nextcloud:/var/lib/mysql# ls -l binlog.0* | wc -l
148

I lowered the binlog expiration time from 30 days to one week:

nano /etc/mysql/mysql.conf.d/mysqld.cnf
# expire logs after 7 days:
binlog_expire_logs_seconds = 604800

Restart the mysql service and voilà:

root@nextcloud:/var/lib/mysql# ls -l binlog.0* | wc -l
43

This lowered the footprint drastically:

root@nextcloud:/var/lib/mysql# du -ch binlog.*
[...]
3.6G	total
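
For reference, a hedged companion to the expiry setting: binlogs that are already on disk can also be purged immediately from the mysql prompt (standard MySQL syntax; keep them if you rely on binlogs for point-in-time recovery):

mysql> PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);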

@paradeiser

Same problem:
binlog files are eating up all my diskspace until it generates an 'Internal Server Error'.

Now I set 'binlog_expire_logs_seconds = 172800';
these two days generated 44 GB of binlog files 😮

If I check these files using 'less', I see mostly ics calendar data. 🧐

@obel1x

obel1x commented Apr 17, 2022

OK, now my honest opinion, after checking the issues here and other issues of this product and having a look at the filecache structure.

That table looks like the most important table of nextcloud, doing the most work on larger file collections. Despite being so important, its structure is inefficient and badly normalized, and the handling in nextcloud lacks features, which leads to performance, resource, and stability problems (like deadlocking, swapping files out, timeouts). I even wonder how larger workloads can be handled by this.

For example: my table has about 1,299,481 entries (which I consider a small system). The data size itself is ~355 MB (MariaDB, dynamic rows). Yesterday the index size was about 655 MB, so the indices are larger than the table itself (and the database is defragmented and optimized). That by itself is an indication of bad table management. A look at the queries used and the memory/buffers/temp files needed on the database side also indicates this.
Now, I don't want to complain about it too much, as nextcloud is my favourite piece of software in many ways.

So what I am suggesting: do the maintainers see a chance to refactor the table management of the filecache in NC completely? Until now my time has been very short, but I may be able to help, having experience with refactoring database structures and code in other systems. I am sure doing this would be a big benefit for nextcloud in getting accepted by more companies...

I just don't want to start something and get a "we won't do this". And maybe it will take time and help, but I would like to think about getting into this.

I would be happy to hear some opinions from the maintainers here.

@paradeiser

This comment was marked as duplicate.

@ruedigerkupper
Author

Please consider opening a new issue for the binlog problem. This issue is about oc_filecache, and your problem seems to be unrelated to that (correct me if that's wrong).
Please do not hijack other reports; it only makes it more difficult to find solutions. Thanks!

@paradeiser

Thanks @ruedigerkupper, you are right - I just replied to @cls-nebadje's comment about the same topic.
I opened a new issue about binlog files growing excessively big and fast - and hid my posting here.

@flotpg

flotpg commented Apr 30, 2022

same issue here - also have excessive writes which already killed a SSD: #22419

@flotpg

flotpg commented May 1, 2022

14 hours ago I executed truncate oc_filecache;
[screenshot: CleanShot 2022-05-01 at 13 29 20]

This seems to have solved the issue. My oc_filecache was over 300 GB; now it's 21 MB.

@helmut72

helmut72 commented May 1, 2022

@flotpg This deletes all favourites of the users.

@ruedigerkupper
Author

Please see above:
#7312 (comment)

@flotpg

flotpg commented May 1, 2022

Thanks @ruedigerkupper

@J0WI
Contributor

J0WI commented May 4, 2022

OPTIMIZE TABLE or mysqloptimize should also do the trick.
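
For reference, on InnoDB OPTIMIZE TABLE is mapped to the same kind of rebuild as the ALTER TABLE ... FORCE shown earlier, plus an index statistics update:

mysql> OPTIMIZE TABLE oc_filecache;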

@obel1x

obel1x commented Jun 28, 2022

OPTIMIZE TABLE or mysqloptimize should also do the trick.

No, they won't -- and truncating oc_filecache is not a solution either, as the table will be populated again when the files are accessed or a scan runs.
Edit: So truncate is a quick fix for reducing the size, but when using remote files or removable storages it will grow again and never shrink automagically. Truncating also has disadvantages, as mentioned above.

IMO the big writes and reads are also related to the bad structure of the table, which builds up large index tables. What helped for me was to remove these indexes:
index "fs_size" on table "oc_filecache"
index "fs_id_storage_size" on table "oc_filecache"
index "fs_storage_path_prefix" on table "oc_filecache"

After that, run OPTIMIZE TABLE. NC will then complain about those indices missing, but I checked and I am fine with them missing.
(In fact there is exactly one strange query in the jobs, selecting files with "size < 0", that shows up in my slow log -- but it is not a good idea to index the size column, so it's OK.)

That cut my table size in half without deleting data, massively reducing writes and reads all the time, at the cost of a few (very few) slower reads in special cases.

Maybe you want to try this and report.
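
A hedged sketch of the statements behind that suggestion (MySQL/MariaDB; back up first -- and note that occ db:add-missing-indices should be able to recreate the indices if you change your mind):

mysql> ALTER TABLE oc_filecache
    ->   DROP INDEX fs_size,
    ->   DROP INDEX fs_id_storage_size,
    ->   DROP INDEX fs_storage_path_prefix;
mysql> OPTIMIZE TABLE oc_filecache;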

@tclavier

When I extract the 10 biggest tables of my instance I get:

+---------------+---------------------------------+---------+--------------+
| database_name | table_name                      | used_mb | allocated_mb |
+---------------+---------------------------------+---------+--------------+
| nextcloud     | oc_filecache                    |  707.13 |       709.63 |
| nextcloud     | oc_cards                        |  140.88 |       148.38 |
| nextcloud     | oc_recognize_face_detections    |  109.56 |       114.56 |
| nextcloud     | oc_activity                     |   59.76 |        63.76 |
| nextcloud     | oc_facerecog_faces              |   31.07 |        34.07 |
| nextcloud     | oc_mail_recipients              |   21.53 |        25.53 |
| nextcloud     | oc_mail_messages                |   17.04 |        19.04 |
| nextcloud     | oc_news_items                   |   15.66 |        17.66 |
| nextcloud     | oc_filecache_extended           |   13.27 |        15.27 |
| nextcloud     | oc_recognize_queue_faces        |   13.22 |        17.22 |

When I activate the slow query log I see many queries on oc_filecache; I think the bottleneck of nextcloud is the oc_filecache table.

The most expensive query (it takes a long time and is called many times) is

SELECT
  ... 
FROM
  `oc_filecache` `filecache`
  LEFT JOIN `oc_filecache_extended` `fe` ON `filecache`.`fileid` = `fe`.`fileid`
WHERE
  `name` IN ('?')

In your opinion, is it possible to move part of oc_filecache and oc_filecache_extended into Redis?

@obel1x

obel1x commented Nov 20, 2022

When I activate the slow query log I see many queries on oc_filecache; I think the bottleneck of nextcloud is the oc_filecache table.

The most expensive query (it takes a long time and is called many times) is

SELECT
  ... 
FROM
  `oc_filecache` `filecache`
  LEFT JOIN `oc_filecache_extended` `fe` ON `filecache`.`fileid` = `fe`.`fileid`
WHERE
  `name` IN ('?')

In your opinion, is it possible to move part of oc_filecache and oc_filecache_extended into Redis?

No. As long as the structure is that bad, moving the data from one database to another would not improve the situation.
The query above is also a strange one; I have not seen it before. The name column is searched for '?', which makes the DB load all the data of the table. Basically this is what the table is for -- but because the structure is so bad, it is too large and has too many indices to perform well.
Maybe sometime I will have time to clean up that part of the nextcloud server code so that file handling in the database works efficiently.

Maybe you want to reduce the size by deleting the indices I mentioned above and have a look at what happens.

@tclavier

The name column is not searched for a literal '?'; that's a generalization of many queries with different names.

In your opinion, what would the target new design be?

@obel1x

obel1x commented Nov 20, 2022

The table should be normalized -- read about the basics of table normalization. It should be split up, and character fields should only be used as values to search for. The path should be split out into another table, and so on... just normalization, as a clean design would have it. I can't explain it in detail; it depends a bit on the usage. But the way it's done now, it's not working well in any way.
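
Purely as an illustration of that suggestion (this is not nextcloud's actual schema, and oc_paths/path_id are hypothetical names): each directory path would be stored once and referenced by id instead of being repeated in full on every row.

mysql> -- hypothetical normalized layout; illustration only
mysql> CREATE TABLE oc_paths (
    ->   path_id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ->   path    VARBINARY(4000) NOT NULL,
    ->   UNIQUE KEY uniq_path (path(190))
    -> );
mysql> -- filecache rows would then carry path_id + name instead of the full path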

@SlavikCA

the table should be normalized

Normalization always makes performance slower. It never makes anything faster.

That's because with normalization you'll need to use JOINs, meaning extra tables, extra lookups, extra indexes.

@tclavier

I'm not comfortable with the code, but I feel that the code that uses this table could be split into subdomains. A DDD approach seems to me more effective in the long run. In the short term, I feel replacing frequent queries with a cache could be a good idea.

@obel1x

obel1x commented Nov 21, 2022

the table should be normalized

Normalization always makes performance slower. It never makes anything faster.

This is in fact wrong. Normalization is the reason for relational databases as we use them today, but of course it is important to understand the benefits and use it wisely. Regarding the file structure, normalization would be a good approach. Look at the table and index sizes in use now: they are clearly not good, causing the database to do a lot of work it shouldn't, like constantly swapping to disk and falling back to full table scans. I have had some badly designed tables in my life and fixed them with better design; this one clearly needs a change. It's just that I am sadly not used to MySQL and nextcloud...

In the short term, I feel replacing frequent queries with a cache could be a good idea.

This is what indices are for, and they would be the general solution to this problem, but you can't do that with the current structure. Indices are caches of parts of a table, designed for faster reads by keeping only the relevant, smaller parts in memory where possible. They work if they are tailored tightly to the needs, but right now they are duplicated and as large as the data itself. This is why they are currently slowing everything down and causing high disk usage, and why you are faster after deleting them, as I stated before (see my suggestion above on deleting those indices).

This is why I would honestly like to see changes here: do the normalization, clean up the indices, and go through the basic queries to improve nextcloud's performance and reduce wasted space.

@obel1x

obel1x commented Feb 16, 2023

Just to leave a reference here: it looks like this is related to the same behaviour as #6899. See my proposal there to reduce deadlocks by changing the indices on the oc_filecache table. Can you check #6899 (comment) and leave a thumbs up there if it works?

It would not really defragment, but it would keep the index size from getting that large, as it uses default indices, so the table would not grow that much.

@joshtrichards
Member

Perhaps a check with a warning could be added for the fragmentation scenario (https://dev.mysql.com/doc/refman/8.0/en/innodb-file-defragmenting.html), somewhat like the one that exists for missing indexes?

@max-nextcloud
Contributor

max-nextcloud commented Oct 18, 2023

If you are using external storages, please make sure to configure an authentication mechanism that matches your use case:

  • Users use the same storage -> use the same global credentials.
  • Users each get a different storage -> use per-user credentials.

If two users connect to the same service using different credentials, Nextcloud will assume they are seeing different storages. This will of course blow up your filecache. It will also prevent collaboration on files: there is no way for Nextcloud to tell that the two users are seeing the same data.

See https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage/auth_mechanisms.html#considerations-for-shared-storage-label
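
A hedged diagnostic sketch for that situation: every distinct credential set gets its own row in oc_storages, so many similar-looking rows for one SMB host usually mean per-user credentials are multiplying the filecache (the exact id format varies by version):

mysql> SELECT numeric_id, id FROM oc_storages WHERE id LIKE 'smb::%';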

@da3dsoul

It's been a long time since I've used it, so this may not be relevant anymore. I only had a single user and still had this issue.

@PancakecatfromSpace

Good evening. I'm having the exact same issue, but in my case it resulted in my snap instance breaking in a very weird way:
https://help.nextcloud.com/t/repeated-database-errors-after-recovering-from-backup/194242?u=pusheen
From what I've read, the problem appears to be known, and it is currently treated as a "nice to have" feature:
#13472 (comment)
I must say that I find this rather frustrating; it has been a known issue for over 8 years now. Mounting external storage may turn your instance into a ticking time bomb that can go off without warning. All backups made after mounting an external filesystem are affected, so recovery is extremely difficult even if proper precautions were taken.
