Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update LanguageHelper.php - Extension language translation failing #42441

Closed
wants to merge 6 commits into from

Conversation

BrainforgeUK
Copy link
Contributor

@BrainforgeUK BrainforgeUK commented Dec 1, 2023

Caters for multi-line text.

Pull Request for Issue #42416 .

Summary of Changes

Allow embedded newlines in text (i.e. between quotes ").
Using them to structure very large areas of text makes maintenance easier.
In some cases using linewrap in IDE slightly helps, but one is left with trying to read / unravel a massive jumble of text.

Newlines in text worked up to v4.4.0
Security related change in v4.4.1 removed this functionality.

This change detects and removes the embedded newlines before using PHP ini parsing as before.
The security related change introduced in v4.4.1 is maintained.

Testing Instructions

Install this plugin.
plg_system_bflanguagetest.zip

Go to plugin admin page.

Actual result BEFORE applying this Pull Request

With v4.4.1 the plugin name and description not translated.
The language labels are displayed.

Expected result AFTER applying this Pull Request

Name and description translated.

Link to documentations

Please select:

  • Documentation link for docs.joomla.org:

  • [] No documentation changes for docs.joomla.org needed

  • [X ] Pull Request link for manual.joomla.org:

  • [] No documentation changes for manual.joomla.org needed

Line breaks in character string have worked for a very long time.
The documentation says they are not allowed - that needs to be corrected.

Caters for multi-line text.
Improved fallback to Joomla 4.1.1 way of doing things.
See comment on issue joomla#17198
Removed unnecessary parse_ini_string().
@BrainforgeUK
Copy link
Contributor Author

Looking at the PHP examples Joomla language handling is only interested in character strings.
https://www.php.net/manual/en/function.parse-ini-string.php

So we can forget all the other stuff.
We can also forget about using parse_ini_string.

I have removed call to parse_ini_string() as it is pointless.
But kepy kept it in the $unexpectedFileContents handling section ... just in case someone doing something unexpected.
But if they are ... will Joomla language handling work in that scenario anyway? I doubt it!

Added extra double-quote check and comment to clarify reason.
Corrected typing error.
@chrisdavenport
Copy link
Contributor

Regardless of your view on whether changing to RAW is a B/C break or not, the change has caused us (and many others) a great deal of pain as almost all of our websites would be "broken" (in the sense of showing untranslated strings) if we were to proceed with the 4.4.1 update. We are in the process of removing multi-line language strings, but this will take some time and we have had to put a freeze on rolling out the security update until this work is completed.

Going forwards, the loss of multi-line support is very inconvenient because editing long single lines is tricky. Automatic line-wrapping in editors helps a tiny bit, but the line wraps are never really in the right places. So we need to come up with a solution that allows us to have multi-line language strings, but without compromising the security issue (whatever that is/was). This PR is one possibility. Another would be to modify our build tools to process the language files during the build process. I would prefer the former on the grounds that we should try to make Joomla websites and third-party extensions as easy to use and maintain as possible (so build tools should not be required).

It would be useful to get a steer from the maintainers as to whether this, or a similar PR, stands a chance of being accepted. We can then decide on how to proceed with our website updates. The only downside with this PR, as far as I can see, is the potential performance impact, so it would be useful to get some metrics on that.

@BrainforgeUK
Copy link
Contributor Author

Thanks for your support on this - I am sure in the coming weeks many more like you will have similar comments about this change in J4.4.1 'causing a great deal of pain'.

Agree if someone has a similar PR with less performance impact then go for it. In the meantime the choice is go with my PR for now or reverse the Joomla change as a matter of urgency - the latter requires a proper assessment of the security risk in this particular language translation use scenario ... my guess the security risk is zero.

@brianteeman
Copy link
Contributor

There are lots of things in life these days that "cause a great deal of pain". Writing a very simple script to correct a language file is not one of them.

@ghost
Copy link

ghost commented Dec 2, 2023

There are lots of things in life these days that "cause a great deal of pain". Writing a very simple script to correct a language file is not one of them.

It would actually be helpful if instead of snarky comments like this talking down on anyone who had written a language file that was parsed without error, people could talk about the issue and work toward a solution. But, nope, it seems like more people with contributor badges would rather take the stance of "LOL, you're wrong, bad person".

I'm going to do what the security team won't and fully disclose the security issue this release aimed to address, which if you know how the functions that were changed work, you already know this information.

The PHP INI parser in its default mode supports string interpolation and using environment variables in strings, meaning this is a legal string:

MY_COMPOSER_PATH="My Composer path is: ${COMPOSER_HOME}"

By changing to the raw mode, PHP does not do this string interpolation. That seems to be the "sensible information" [sic] that the patch aims to address, but the behavior change introduces a lot of side effects. People seem to like to point fingers about the multi-line strings, but #42425 is another example of how changing the parsing mode breaks things. No finger pointing here, just stating the fact that normal versus raw parsing mode has different behavior.

Now, why would having environment variables in language strings be an issue? Well, there are really only two ways to get language strings into a Joomla application:

  • Install an extension
  • The overrides area in the Joomla admin

It is also important to note that the language parser cannot be triggered on its own with untrusted user input, it has to be given a path on the filesystem. Which makes me question the high impact and severity ratings on the security advisory, because the scope here is already severely limited to "requires a privileged user to make a change with malicious side effects".

If the security patch was issued because of the first scenario, then this is basically the security team saying "we can't even trust users to review their extensions before installing them" because those same extensions can do a lot more crazy things than access environment variables in language strings.

If the security patch was issued because of the second scanario, then this is basically the security team admitting that there is not adequate filtering/sanitization/validation of user input (as the overrides area is the only spot that a language string can be manipulated without being able to install extensions) and they used a sledgehammer to push a nail into a wall, ignoring the fact that there's now a 3 meter hole in the wall as a result.

So instead of shunning people for using native language features to improve their workflows, now that everyone is on a level playing field in terms of information sharing, work toward a solution that satisfies everyone's needs.

@BrainforgeUK
Copy link
Contributor Author

Thanks for that security issue research - you went deeper than me.

Never thought of 'using environment variables in strings' ... interesting idea but in a language translation .ini would there be any practical to use?

Until someone comes up with a better performance alternative I think my PR solves the immediate issues.

Be interesting to see if there are any people out there 'using environment variables in strings' in language .ini files!
My PR, or an alternative, would need to cater for that, but for now I suggest leave asis.

@ghost
Copy link

ghost commented Dec 2, 2023

For translations, not really needed. PHP supports referencing environment variables in its INI config files, so it’s a feature for anyone using the native parser.

@bembelimen
Copy link
Contributor

Hi @BrainforgeUK I appreciate your resentments regarding the security fix but looking at your PR, we have to weightening up if it is worth not fixing a few lines of language strings compared to heavy performance suffering for the end user.

I run some tests between current method and your method with 100,000 iterations: test.zip

Result:

grafik

So just for the sake of not changing a few translations we should not implement such an heavy performance impact.

Until someone comes up with a better performance alternative

Alternative is easy: fix the strings

@laoneo
Copy link
Member

laoneo commented Dec 3, 2023

Thanks for your effort fixing the regression which came up with the security fix. Unfortunately the performance impact is too big with this pr. So I'm closing this for now. If somebody suffers from this issue with a 3rd party extension, it is always possible to manually apply the patch till the extension developers have fixed their code.

@laoneo laoneo closed this Dec 3, 2023
@BrainforgeUK
Copy link
Contributor Author

Thanks for the performance information.
I have a fix for that - can you reopen the PR or shall I submit a new one?
Sorry been doing other things, so did not get this in earlier.
I'll wait 30mins and see if you are still around to reopen. Thanks.

@laoneo
Copy link
Member

laoneo commented Dec 3, 2023

Please open a new one as it sounds like a new approach.

@BrainforgeUK
Copy link
Contributor Author

New PRs for J4.4 and J5.0

#42455

#42456

@SniperSister
Copy link
Contributor

Let me suggest a potential alternative approach: Instead of changing the parsing mode, the attack vector could also be closed by validating the user inputs in the language override form to avoid the injection of placeholders that are replaced by environment variable contents.

We discussed this option during a JSST meeting but came to the conclusion that the raw mode is the more straightforward and "complete" approach as it would also cover scenarios where language overrides are created by 3rd party extensions (and yes, I know of at least 2 extensions that allow that).

So, the alternate solution to our own parsing logic would be the restoration of the "normal" parsing mode, combined with strong input filtering. I won't block such an approach if such a PR is properly tested and includes automated tests to avoid future regressions.

@BrainforgeUK
Copy link
Contributor Author

Sounds unecessary complexity. The only user inputs available here are through the admin language override form - unless a developer adds additional language mechanisms ... suppose worth protecting that scenario.

My PR #42456 keeps the current security fix and restores backward compatibility.

@brianteeman
Copy link
Contributor

@SniperSister please see my email yesterday

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature PR-4.4-dev RMDQ ReleaseManagerDecisionQueue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants