Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some Fixes for Scatter Charts #2828

Merged
merged 7 commits into from
May 17, 2022
Merged

Some Fixes for Scatter Charts #2828

merged 7 commits into from
May 17, 2022

Conversation

oleibman
Copy link
Collaborator

@oleibman oleibman commented May 13, 2022

Chart issues have been pouring in recently. Fix #2762. It implements "no joins" for scatter charts, as well as having the reader and writer handle "point size", "line width", and "color" for markers. A new boolean property scatterLines, with setter and getter, is added to DataSeriesValues to handle joins (default is true which means scatter plot points are joined by lines). All font properties for the chart title are handled (support may be incomplete for color and font name), which makes this a partial solution for #2700.

With these changes, sample 32readwriteScatterChart*.xlsx now appears the same as its source.

The code for reading charts has not yet been converted to be namespace aware. Having a tiny island of aware code in a sea of unaware makes no sense to me, so some of the new code is likewise unaware. I hope to be able to get to it eventually, but, among other considerations, it is difficult to generate suitable test cases.

Fix #2297. Fix #2298 (PR #2801 already addressed that for writing charts, this PR extends support to reading). Fix #2299.

This is:

- [x] a bugfix
- [ ] a new feature
- [ ] refactoring
- [ ] additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change
  • Documentation is updated as necessary

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

Chart issues have been pouring in recently. This is a partial response to issue PHPOffice#2762. It implements "no joins" for scatter charts, as well as having the reader and writer handle "point size", "line width", and "color" for markers. A new boolean property `scatterLines`, with setter and getter, is added to DataSeriesValues to handle joins (default is true which means scatter plot points *are* joined by lines). Some, but not yet all, default font properties for the chart title are handled (color and, surprisingly, font name present challenges).

With these changes, sample 32readwriteScatterChart1.xlsx now looks closer to its source. There are still some differences (x-axis changes), but I think this change is already large enough. I can work on the other problems later.

The code for reading charts has not yet been converted to be namespace aware. Having a tiny island of aware code in a sea of unaware makes no sense to me, so some of the new code is likewise unaware. I hope to be able to get to it eventually, but, among other considerations, it is difficult to generate suitable test cases.
Essentially the same as the corresponding Samples, but with formal assertions.
Having added code to support default font attributes as well as element-specific font attributes for chart captions, there was duplicated code between the default and specific sections. I hope that this PR makes the code easier to follow.
XML layout for these in new files differs from what program was expecting. Not sure if program expectations were wrong, or if this is a change to Excel since initial development.
Handle theoretical case where Chart title has text but no font information.
For Bezier, need to specify `<c:smooth>` tag in addition to already supplied `<c:scatterStyle val="smoothMarker">`

For X-Axis, scatter needs to supply both X and y axis as `<c:valAx>` rather than `<c:catAx>` for X.
@oleibman oleibman merged commit 9aa1708 into PHPOffice:master May 17, 2022
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request May 19, 2022
Taking up where PHPOffice#2828 left off. Most of the following changes are demonstrated in 32readwriteChartWithImages1:
- Adds support for "scheme" colors (because rgb, theme, and index colors just weren't enough for Excel) for DataSeriesValues. See issue PHPOffice#2299.
- For chart titles (including axis labels), rather than a font name, Excel supplies a 3-fold series of font names for Latin, East Asian, and Complex Scripts. New properties `latin`, `eastAsian`, and `complexScript` are added to the Font class. I frankly have no idea how, or even if, you can set these in Excel; my test case (sample 32readwriteScatterChart7) is a result of manually editing the XML.
- Add support for subscript/superscript to chart titles. This requires a new property `baseLine` in Font (positive=superscript negative=subscript baseline value says how high/low).
- Support for underscore with different scheme color than its text, using a new string property `uSchemeClr` in Font.
- Support for extra options for strikethrough, using a new string property `strikeType` in Font.
- Support for extra options for underscore type, using the existing string property `underline` in Font.
- I do not anticipate that any of the new Font properties will be used except for chart titles.
- If no default font overrides are found for a Rich Text element in chart titles, and no explicit font overrides are found for a Run under such an element, the font element of the Run is set to null.
- PhpSpreadsheet will always write a tag `a:pPr` and, underneath that, an empty tag `a:defRPr`, for default font settings for chart titles and axis labels. Combined with the previous bullet item, this will prevent PhpSpreadsheet from inadvertently overriding the Excel defaults (18 point bold Calibri for chart title, 10 point bold Calibri for axis labels).
- Axis labels will now be written to XML in the same manner as chart titles. Among other considerations, this means that they can now have colors. Fix PHPOffice#2700. Supersedes PR PHPOffice#2701. Demonstrated in sample 32readwriteStockChart5.
@oleibman oleibman mentioned this pull request May 19, 2022
7 tasks
oleibman added a commit that referenced this pull request May 21, 2022
* More Chart Fixes

Taking up where #2828 left off. Most of the following changes are demonstrated in 32readwriteChartWithImages1:
- Adds support for "scheme" colors (because rgb, theme, and index colors just weren't enough for Excel) for DataSeriesValues. See issue #2299.
- For chart titles (including axis labels), rather than a font name, Excel supplies a 3-fold series of font names for Latin, East Asian, and Complex Scripts. New properties `latin`, `eastAsian`, and `complexScript` are added to the Font class. I frankly have no idea how, or even if, you can set these in Excel; my test case (sample 32readwriteScatterChart7) is a result of manually editing the XML.
- Add support for subscript/superscript to chart titles. This requires a new property `baseLine` in Font (positive=superscript negative=subscript baseline value says how high/low).
- Support for underscore with different scheme color than its text, using a new string property `uSchemeClr` in Font.
- Support for extra options for strikethrough, using a new string property `strikeType` in Font.
- Support for extra options for underscore type, using the existing string property `underline` in Font.
- I do not anticipate that any of the new Font properties will be used except for chart titles.
- If no default font overrides are found for a Rich Text element in chart titles, and no explicit font overrides are found for a Run under such an element, the font element of the Run is set to null.
- PhpSpreadsheet will always write a tag `a:pPr` and, underneath that, an empty tag `a:defRPr`, for default font settings for chart titles and axis labels. Combined with the previous bullet item, this will prevent PhpSpreadsheet from inadvertently overriding the Excel defaults (18 point bold Calibri for chart title, 10 point bold Calibri for axis labels).
- Axis labels will now be written to XML in the same manner as chart titles. Among other considerations, this means that they can now have colors. Fix #2700. Supersedes PR #2701. Demonstrated in sample 32readwriteStockChart5.

* Fix Some Chart Corruption

Fix #2817, where @bridgeplayr gives an excellent description of the problem and how it should be solved.

* Fix Bubble Charts

Sample produced corrupt output - see issue #2763. After a lot of research, solution was just re-ordering of parameters in a single function call.

Bubble 3D had not been supported at all. It is now.

Surface Charts remain corrupted.
@bridgeplayr
Copy link

Thank you @oleibman and @MarkBaker for untangling and addressing these wide-ranging issues. I'm looking forward to updating to the next release of phpspreadsheet.

@MarkBaker
Copy link
Member

It's Owen to thank; he's done all the work on this

@oleibman oleibman deleted the scatter1a branch May 25, 2022 08:12
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request May 26, 2022
Continuing the work from PHPOffice#2828, PHPOffice#2841, PHPOffice#2846, and PHPOffice#2852. This is probably my last change in this area for a while.

Bubble charts can have bubbles of different sizes. Phpspreadsheet had not supported this. Openpyxl comes with sample code to generate such a chart. I was especially drawn to that solution because its namespace usage would have been unexpected before 2852. And it turned out to come with other surprises - use of absolute paths in the .rels files (PhpSpreadsheet expected only relative), use of a one-cell anchor to place the chart (PhpSpreadsheet expected two-cell anchor or absolute positioning), plaintext in the legend (Phpspreadsheet expected RichText), no cached values for chart data. Excel handles the file okay, and this PR makes sure PhpSpreadsheet does as well. This file is now Samples/Templates/32readwriteBubbleChart2, and is used for both generating a sample output file and in formal tests. A new sample in the 33* series demonstrates how to code these.
@oleibman oleibman mentioned this pull request May 26, 2022
7 tasks
oleibman added a commit that referenced this pull request May 29, 2022
Continuing the work from #2828, #2841, #2846, and #2852. This is probably my last change in this area for a while.

Bubble charts can have bubbles of different sizes. Phpspreadsheet had not supported this. Openpyxl comes with sample code to generate such a chart. I was especially drawn to that solution because its namespace usage would have been unexpected before 2852. And it turned out to come with other surprises - use of absolute paths in the .rels files (PhpSpreadsheet expected only relative), use of a one-cell anchor to place the chart (PhpSpreadsheet expected two-cell anchor or absolute positioning), plaintext in the legend (Phpspreadsheet expected RichText), no cached values for chart data. Excel handles the file okay, and this PR makes sure PhpSpreadsheet does as well. This file is now Samples/Templates/32readwriteBubbleChart2, and is used for both generating a sample output file and in formal tests. A new sample in the 33* series demonstrates how to code these.
@bridgeplayr bridgeplayr mentioned this pull request Jun 1, 2022
8 tasks
MarkBaker added a commit that referenced this pull request Jul 9, 2022
Note that this will be the last 1.x branch release before the 2.x release. We will maintain both branches in parallel for a time; but users are requested to update to version 2.0 once that is fully available.

### Added

- Added `removeComment()` method for Worksheet [PR #2875](https://github.com/PHPOffice/PhpSpreadsheet/pull/2875/files)
- Add point size option for scatter charts [Issue #2298](#2298) [PR #2801](#2801)
- Basic support for Xlsx reading/writing Chart Sheets [PR #2830](#2830)

  Note that a ChartSheet is still only written as a normal Worksheet containing a single chart, not as an actual ChartSheet.

- Added Worksheet visibility in Ods Reader [PR #2851](#2851) and Gnumeric Reader [PR #2853](#2853)
- Added Worksheet visibility in Ods Writer [PR #2850](#2850)
- Allow Csv Reader to treat string as contents of file [Issue #1285](#1285) [PR #2792](#2792)
- Allow Csv Reader to store null string rather than leave cell empty [Issue #2840](#2840) [PR #2842](#2842)
- Provide new Worksheet methods to identify if a row or column is "empty", making allowance for different definitions of "empty":
  - Treat rows/columns containing no cell records as empty (default)
  - Treat cells containing a null value as empty
  - Treat cells containing an empty string as empty

### Changed

- Modify `rangeBoundaries()`, `rangeDimension()` and `getRangeBoundaries()` Coordinate methods to work with row/column ranges as well as with cell ranges and cells [PR #2926](#2926)
- Better enforcement of value modification to match specified datatype when using `setValueExplicit()`
- Relax validation of merge cells to allow merge for a single cell reference [Issue #2776](#2776)
- Memory and speed improvements, particularly for the Cell Collection, and the Writers.

  See [the Discussion section on github](#2821) for details of performance across versions
- Improved performance for removing rows/columns from a worksheet

### Deprecated

- Nothing

### Removed

- Nothing

### Fixed

- Xls Reader resolving absolute named ranges to relative ranges [Issue #2826](#2826) [PR #2827](#2827)
- Null value handling in the Excel Math/Trig PRODUCT() function [Issue #2833](#2833) [PR #2834](#2834)
- Invalid Print Area defined in Xlsx corrupts internal storage of print area [Issue #2848](#2848) [PR #2849](#2849)
- Time interval formatting [Issue #2768](#2768) [PR #2772](#2772)
- Copy from Xls(x) to Html/Pdf loses drawings [PR #2788](#2788)
- Html Reader converting cell containing 0 to null string [Issue #2810](#2810) [PR #2813](#2813)
- Many fixes for Charts, especially, but not limited to, Scatter, Bubble, and Surface charts. [Issue #2762](#2762) [Issue #2299](#2299) [Issue #2700](#2700) [Issue #2817](#2817) [Issue #2763](#2763) [Issue #2219](#2219) [Issue #2863](#2863) [PR #2828](#2828) [PR #2841](#2841) [PR #2846](#2846) [PR #2852](#2852) [PR #2856](#2856) [PR #2865](#2865) [PR #2872](#2872) [PR #2879](#2879) [PR #2898](#2898) [PR #2906](#2906) [PR #2922](#2922) [PR #2923](#2923)
- Adjust both coordinates for two-cell anchors when rows/columns are added/deleted. [Issue #2908](#2908) [PR #2909](#2909)
- Keep calculated string results below 32K. [PR #2921](#2921)
- Filter out illegal Unicode char values FFFE/FFFF. [Issue #2897](#2897) [PR #2910](#2910)
- Better handling of REF errors and propagation of all errors in Calculation engine. [PR #2902](#2902)
- Calculating Engine regexp for Column/Row references when there are multiple quoted worksheet references in the formula [Issue #2874](#2874) [PR #2899](#2899)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 participants