Implement a lotus-shed migration command to migrate existing indexes to chain indexer #12408

Open
3 of 9 tasks
akaladarshi opened this issue Aug 21, 2024 · 3 comments
Labels
kind/feature Kind: Feature
Milestone

Comments

@akaladarshi
Contributor

Checklist

  • This is not brainstorming ideas. If you have an idea you'd like to discuss, please open a new discussion on the lotus forum and select the category as Ideas.
  • I have a specific, actionable, and well motivated feature request to propose.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.

Following the discussion in #12293 and the changes in PR, the decision was made to remove the fragmented indexes (msg, txhash, events) and replace them with a single index (ChainIndexer). The existing indexes therefore need to be migrated to ChainIndexer.

Describe the solution you'd like

Create a lotus-shed migration command to migrate all the existing indexes to ChainIndexer.

Describe alternatives you've considered

No response

Additional context

No response

@akaladarshi akaladarshi added the kind/feature Kind: Feature label Aug 21, 2024
@akaladarshi akaladarshi changed the title Implement a lotus-shed migration command to migrate indexes to chain indexers Implement a lotus-shed migration command to migrate indexes to chain indexer Aug 21, 2024
@akaladarshi akaladarshi changed the title Implement a lotus-shed migration command to migrate indexes to chain indexer Implement a lotus-shed migration command to migrate existing indexes to chain indexer Aug 21, 2024
@rjan90 rjan90 added this to the DX-Streamline milestone Aug 21, 2024
@rvagg
Member

rvagg commented Aug 22, 2024

Right now @aarshkshah1992 and I are thinking that lotus-shed is the best place for this to live, for a few reasons:

  • We anticipate the migration to be really slow for anyone with a non-trivial amount of existing data; and we had quite a bit of grief from users during the 1.28 upgrade because we had a migration in there that took quite a long time for some users and they didn't know what was going on.
  • We also expect that most people who have this turned on don't even need the depth of data that they have, so migrating the whole thing is likely going to be pointless for their use. The difference between starting from scratch and migrating is probably not going to be huge.
  • This is something we can warn about with big flashing lights in the release notes -- IF YOU WANT CONTINUITY THEN DO THIS FIRST, OTHERWISE IT'LL START FROM SCRATCH.
  • The migration can happen at any time, and you should even be able to re-run it to populate your db, even after you've started a new db. So we could decide to keep the existing db files in there for now (perhaps delete in a future upgrade) so users have the opportunity to decide to re-populate it after the upgrade if they didn't see the notice. Users with large dbs, like archival node operators, can run the shed command as a separate job during their upgrade process and manage it accordingly.
    • It will need to be idempotent, such that you can run it repeatedly and it'll just fill up the gaps, so you can do it before upgrade to prime it, then run it at upgrade time to get it finished up.
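
To make the idempotency requirement concrete, here is a minimal sketch of a gap-filling backfill. Everything in it (`Epoch`, `Index`, `Backfill`, the `indexEpoch` callback) is a hypothetical stand-in for illustration, not a Lotus or lotus-shed API; a real command would presumably check the new database rather than an in-memory map.

```go
package main

// Epoch and Index are hypothetical types for illustration only;
// they are not Lotus types.
type Epoch int64

// Index stands in for the new index: the set of epochs already indexed.
type Index map[Epoch]bool

// Backfill indexes every epoch in [from, to] that is not already present
// and returns how many epochs it added. Running it again over the same
// range adds nothing, so repeated runs only fill the gaps.
func Backfill(idx Index, from, to Epoch, indexEpoch func(Epoch) error) (int, error) {
	added := 0
	for e := from; e <= to; e++ {
		if idx[e] {
			continue // already indexed by an earlier run
		}
		if err := indexEpoch(e); err != nil {
			return added, err // safe to retry: finished epochs are recorded
		}
		idx[e] = true
		added++
	}
	return added, nil
}
```

With this shape, a run before the upgrade does the bulk of the work, and a second run at upgrade time only touches the epochs that arrived in between.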

All that being said, we may end up deciding this isn't a great idea and the migration should be inline during the upgrade. Perhaps if we couple it with new GC settings then migration doesn't have to be expensive at all, because we only migrate the data newer than the point where a GC would delete your data anyway.

So for now, lotus-shed is a good place to work on this, we may end up moving it in to the main daemon process later if we decide that's a better strategy.

@akaladarshi
Contributor Author

@rvagg

In a call with @aarshkshah1992, we decided to go with a combined migration-and-backfill command.

It will start from the chain head and backfill into the new ChainIndexer database. The command will backfill from the main store, not from the existing DB, which makes sure the new indexes contain correct data (nothing that has been GC'ed or pruned).

@aarshkshah1992
Contributor

@rvagg Basically, we want to use the chain store/chain state to "migrate" rather than using the existing Indices to migrate.

This is because the existing Indices might have a lot of entries for which the corresponding state has already been GC'd, and also because the old DDLs don't map 1:1 nicely to the new DDL.

Using the chainstore/chainstate as the source of truth for migrating/backfilling the Indices makes more sense and will get us a consistent Index.
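
A rough sketch of the head-first backfill described above, to show the shape of the walk. `Tipset`, its fields, and `BackfillFromHead` are hypothetical stand-ins, not the Lotus types, and stopping at the first tipset whose state is missing is an assumption about how GC'd history would be handled.

```go
package main

// Tipset is a hypothetical stand-in for a chain store entry;
// it is not the Lotus type.
type Tipset struct {
	Height   int64
	Parent   *Tipset // nil at genesis
	HasState bool    // false once the state has been garbage-collected
}

// BackfillFromHead walks parent links from head down to minHeight and
// calls index for each tipset whose state is still available. It stops
// at the first tipset with missing state (an assumption made for this
// sketch) and returns the number of tipsets indexed.
func BackfillFromHead(head *Tipset, minHeight int64, index func(*Tipset)) int {
	n := 0
	for ts := head; ts != nil && ts.Height >= minHeight; ts = ts.Parent {
		if !ts.HasState {
			break // older state is gone; nothing consistent to index
		}
		index(ts)
		n++
	}
	return n
}
```

Because the walk reads only from the chain, the resulting index never contains entries whose state no longer exists, which is the consistency property argued for here.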

Projects
Status: 🐱 Todo
Development

No branches or pull requests

4 participants