-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iteratively execute synchronous ingest processors #84250
Conversation
Hi @danhermann, I've created a changelog YAML for you. |
ef3138f
to
115df83
Compare
Pinging @elastic/es-data-management (Team:Data Management) |
Hi @danhermann, I've updated the changelog YAML for you. |
@elasticmachine update branch |
Adding a brief description of how CompoundProcessor works since I was initially thrown off: |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/bwc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I ran a couple of rally tests to make sure that we don't see a performance regression with this. For all tests I used a one-node Elasticsearch cluster and a separate node running rally.
The median throughput improved by about 4.5%.
I ran that multiple times (since the first run on a fresh ES was always slowest), and on the non-first runs I'm consistently seeing the |
* upstream/master: (40 commits) Fix BuildTests serialization (elastic#85827) Use urgent priority for node shutdown cluster state update (elastic#85838) Remove Task classes from HLRC (elastic#85835) Remove unused migration classes (elastic#85834) Remove uses of Charset name parsing (elastic#85795) Remove legacy versioned logic for DefaultSystemMemoryInfo (elastic#85761) Expose proxy settings for GCS repositories (elastic#85785) Remove SLM classes from HLRC (elastic#85825) TSDB: fix the time_series in order collect priority (elastic#85526) Remove ILM classes from HLRC (elastic#85822) FastVectorHighlighter should use ValueFetchers to load source data (elastic#85815) Iteratively execute synchronous ingest processors (elastic#84250) Remove TransformClient from HLRC (elastic#85787) Mute XPackRestIT deprecation/10_basic/Test Deprecations (elastic#85807) Unmute Lintian packaging test (elastic#85778) Add a highlighter unit test base class (elastic#85719) Remove NIO Transport Plugin (elastic#82085) [TEST] Remove token methods from HLRC SecurityClient (elastic#85515) [Test] Use thread-safe hashSet for result collection (elastic#85653) [TEST] Mute BuildTests.testSerialization (elastic#85801) ... # Conflicts: # server/src/test/java/org/elasticsearch/search/aggregations/timeseries/TimeSeriesIndexSearcherTests.java
Thanks @masseyke for pushing this across the finish line. It had a positive impact on the notoriously hard-to-pin-down |
Iteratively executes ingest processors that are not asynchronous. This resolves the issue of stack overflows that could occur in large ingest pipelines. As a secondary effect, it dramatically reduces the depth of the stack during ingest pipeline execution which reduces the performance cost of gathering a stack trace for any exceptions instantiated during pipeline execution. The aggregate performance effect of these changes has been observed to be a 10-15% improvement in some larger pipelines.
The primary changes required were adding iterative execution loops to
CompoundProcessor
andForEachProcessor
. Most of the other changes were consequences of changing those two processors.Fixes #84274.