hrefTranslate attribute #301

Closed
3 of 5 tasks
dtapuska opened this issue Aug 23, 2018 · 40 comments
Labels
Progress: in progress Resolution: unsatisfied The TAG does not feel the design meets required quality standards Review type: CG early review An early review of general direction from a Community Group Topic: HTML Venue: WHATWG

Comments

@dtapuska

dtapuska commented Aug 23, 2018

Bonjour TAG,

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

There has been some debate in whatwg/html#2945, specifically about whether this feature is useful or creates new problematic scenarios. We believe this feature has useful benefits in surfacing content to users.

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our Github repo for each point of feedback
  • open a single issue in our Github repo for the entire review
  • leave review feedback as a comment in this issue and @-notify [github usernames]
@torgo torgo self-assigned this Sep 18, 2018
@plinss plinss added this to the 2018-10-02-telcon milestone Sep 18, 2018
@torgo torgo changed the title TAG Review Request: hrefTranslate attribute hrefTranslate attribute Oct 30, 2018
@torgo
Member

torgo commented Oct 30, 2018

We're trying to make some progress on this at our Paris f2f. Unfortunately we haven't made progress on this review. @dtapuska can you send us an explainer reference? Please see our updated explainer explainer.

@torgo torgo added the Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review label Oct 30, 2018
@dtapuska
Author

@torgo The explainer is indicated in the first comment and is here: https://github.com/dtapuska/html-translate

@kenchris

Per the TAG design principles (https://w3ctag.github.io/design-principles/#casing-rules), attribute names should be lowercased and concatenated.

@dbaron
Member

dbaron commented Oct 31, 2018

I agree that I'm not a fan of reusing the lang attribute for any of this -- the lang attribute is describing the language of the contents of the element, which in this case is the text of the link.

Assuming that the feature is desirable, both options 2 and 3 seem plausible, and it also seems like they could be combined.

I suppose that what's less clear to me here is why this should be something that the page chooses rather than the UA -- the explainer doesn't really make that clear to me.

@kenchris

Would hreftranslate without a value do anything? For instance, would it translate to the default UA language?

@kenchris

I would like to understand some of the advantages of this. Is it that the browser/CDN can cache translations, or that pages can be pretranslated to minimize bandwidth usage/caching and thus improve the time to interactive? Could the explainer talk more about that?

@hadleybeeman
Member

hadleybeeman commented Oct 31, 2018

Hi all. We've discussed this at our TAG face-to-face in Paris.

We are mainly wondering: when do you imagine this being a decision that you'd want to give to a referring page, and not to the user agent? Currently, if I prefer viewing web pages in French, I express that preference with an HTTP header (Accept-Language, for example), and the server sends me French content or translates for me where it can.

As a user, I expect the UA to manage these preferences and transactions for me. When would it be useful to let the referring site make that decision for me?

Could you provide some more real world examples of where you see this user need cropping up? Thanks so much!
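For readers less familiar with the header-based negotiation described above, here is a minimal sketch of how a server might choose a language from Accept-Language. The function names and the simple primary-subtag matching are assumptions made for illustration; real servers often use more complete matching rules.

```javascript
// Parse an Accept-Language header value into language tags ordered by
// preference. Entries look like "fr-CA", "fr;q=0.9", or "*"; a missing q
// parameter means a weight of 1.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { tag: tag.trim().toLowerCase(), q: Number.isNaN(q) ? 0 : q, index };
    })
    // Drop empty entries and anything explicitly refused with q=0.
    .filter((entry) => entry.tag !== '' && entry.q > 0)
    // Higher weight first; ties keep the order used in the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((entry) => entry.tag);
}

// Pick the first language the user accepts that the site can serve, matching
// either the full tag ("fr-ca") or its primary subtag ("fr").
// pickLanguage is a hypothetical helper, not part of any standard API.
function pickLanguage(header, available, fallback) {
  const offered = available.map((lang) => lang.toLowerCase());
  for (const tag of parseAcceptLanguage(header)) {
    if (tag === '*') return offered[0] ?? fallback;
    if (offered.includes(tag)) return tag;
    const primary = tag.split('-')[0];
    if (offered.includes(primary)) return primary;
  }
  return fallback;
}
```

With a header such as "fr-CA,fr;q=0.9,en;q=0.5" and a site that offers English and French, this sketch selects French, which is the behaviour the comment above expects the user agent and server to arrange without involving the referring page.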

@dtapuska
Author

It is most beneficial when:

  1. The landing page wants to present text in alternate languages and wants to present a machine-translated page for those languages. Think of how the main Wikipedia site shows the text for each language in that language itself. So even if a non-English speaker lands on an English site, they can identify the text that indicates their native language. A site that is comfortable with machine translation can then cause the page to actually be translated into that language.

  2. The explainer describes the scenario where searching for text in a language different from any of your configured language preferences may return a page that should be translated for the user. For example, think about "Translate this page" on Google Search: it currently uses server-side translation, and for better fidelity (because pages use JavaScript to generate content) it could use client-side translation.
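To make the two scenarios concrete, here is a hedged sketch of how a linking page might generate such a link. The helper names are hypothetical, and the lowercase attribute spelling follows the casing comment earlier in this thread; the attribute is only a hint, and browsers that do not recognise it ignore it like any other unknown attribute.

```javascript
// Escape text for safe use inside an HTML attribute value or text node.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an anchor whose hreftranslate value asks the browser to offer a
// translation of the destination into targetLang.
// renderTranslatedLink is a hypothetical helper for illustration only.
function renderTranslatedLink(href, label, targetLang) {
  const attrs = [`href="${escapeHtml(href)}"`];
  if (targetLang) {
    attrs.push(`hreftranslate="${escapeHtml(targetLang)}"`);
  }
  return `<a ${attrs.join(' ')}>${escapeHtml(label)}</a>`;
}
```

For example, renderTranslatedLink('https://example.com/', 'Example', 'hi') produces an ordinary link to https://example.com/ that also carries hreftranslate="hi".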

@dbaron
Member

dbaron commented Oct 31, 2018

One other concern here is that it seems to assume that client side translation is available in all browsers, when that's not actually the case, and I don't think there's a reasonable path to it being the case. It's not clear to me what the recommended fallback is, or how it's supposed to fall back to it.

@dtapuska
Author

Sorry, I should add to the explainer some things that I assumed were common knowledge. Since this is a new attribute, it is easy to feature detect (i.e., HTMLAnchorElement.prototype.hasOwnProperty('hrefTranslate');).
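A minimal sketch of that check wrapped in a function, so it can also be called where no DOM exists. The function name and the optional parameter are assumptions added for illustration; the property test itself is the one quoted above.

```javascript
// Return true when the given anchor prototype exposes hrefTranslate.
// In a browser the default argument is the real HTMLAnchorElement prototype;
// where that global is missing, the default is undefined and the result is
// false. supportsHrefTranslate is a hypothetical helper name.
function supportsHrefTranslate(
  anchorProto = globalThis.HTMLAnchorElement?.prototype
) {
  return Boolean(anchorProto) &&
    Object.prototype.hasOwnProperty.call(anchorProto, 'hrefTranslate');
}
```

Passing the prototype in explicitly keeps the check testable; page code would normally call supportsHrefTranslate() with no arguments.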

@dbaron
Member

dbaron commented Oct 31, 2018

Sure, it can be feature detected... but what's the expectation of what pages do when it's not there? Seems like it might be worth talking about that.

@hadleybeeman
Member

Thanks for the response, @dtapuska. I'll admit, I'm still a little concerned.

The web fundamentally works by the user expressing their preferences through a user agent. (I tell my browser that I want sites in French.) The user agent then negotiates with sites accordingly.

If I (or my UA) have set some preferences with a search engine, for example, ("I'd like my Google results to be displayed in French") -- that is still a relationship I have with that site. The same origin policy is another expression of this mentality; I have told this site something about me that I don't expect it would (or could) share with another.

To explicitly expect this search engine to tell the site I'm clicking on that I'm probably going to want the site in French -- isn't how the web works. It's a violation of my deal with the search engine (since it is leaking information about me) and prevents my user agent from doing its job -- which is to tell that second site how I'd like the content.

Have I misunderstood what you're trying to do here?

@plinss plinss removed this from the 2018-12-18-telcon milestone Jan 7, 2019
@alice

alice commented Oct 3, 2019

@chrishtr

... it's also in the new explainer

Would you mind clarifying which specific part of the new explainer you're referring to, just for easy access?

Also, almost all users don't understand browser settings very well, and are often influenced by default settings or other factors when choosing a UI language.

Could you elaborate on why a UA couldn't solve this problem by making those settings easier to access (for example, prompting a user who frequently searches via the address bar in language A but has their UA language set as language B to set their UA language to A)? Do the other factors alluded to rule out a purely UA-based solution?

@chrishtr

chrishtr commented Oct 3, 2019

Would you mind clarifying which specific part of the new explainer you're referring to, just for easy access?

Sure thing. The list of Pros of the API includes the points I'm making here, plus more.

The Problem Statement section also goes into more detail about the problems caused by sites trying to help out the user with translation without any UA coordination. In particular, it's not just rendering quality.

Could you elaborate on why a UA couldn't solve this problem by making those settings easier to access (for example, prompting a user who frequently searches via the address bar in language A but has their UA language set as language B to set their UA language to A)? Do the other factors alluded to rule out a purely UA-based solution?

To your first question:

If a user frequently searches via the address bar, then the UA could for sure learn something about their preferences. But almost all interaction with sites, other than the very special case of search, happens in the page itself and the UA (by design) has a limited idea of what is going on.

To your second:

A hypothetical UA that reads all of the content of your websites and observes what you do could build machine learning models that predict your language preferences. To some extent browsers today attempt to do that, by observing the detected language of visited pages and keeping statistics.
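As a rough illustration of the "keeping statistics" idea, the sketch below tallies the detected language of visited pages and reports a language once it clearly dominates. The class name, thresholds, and decision rule are all assumptions made for this example and do not describe any actual browser.

```javascript
// Tally the detected language of each visited page and report a language
// once it dominates. Name and thresholds are illustrative only.
class LanguageStats {
  constructor({ minVisits = 20, minShare = 0.6 } = {}) {
    this.minVisits = minVisits; // visits required before suggesting anything
    this.minShare = minShare;   // share of visits the top language must reach
    this.counts = new Map();
    this.total = 0;
  }

  recordVisit(detectedLang) {
    this.counts.set(detectedLang, (this.counts.get(detectedLang) ?? 0) + 1);
    this.total += 1;
  }

  // Return a language worth suggesting to the user, or null when there is
  // too little data, no dominant language, or it is already configured.
  suggestion(configuredLang) {
    if (this.total < this.minVisits) return null;
    let best = null;
    let bestCount = 0;
    for (const [lang, count] of this.counts) {
      if (count > bestCount) {
        best = lang;
        bestCount = count;
      }
    }
    if (best === configuredLang) return null;
    return bestCount / this.total >= this.minShare ? best : null;
  }
}
```

A tally like this only sees page languages, which is the limitation raised in the next paragraph: it says nothing about the language the user actually typed or spoke inside a page.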

But to rely on the UA and only the UA for this is much too limiting, and actually works against open-ness, decentralization and choice. There is room for smart sites to offer suggestions to the browser to help the user experience, with the user's permission.

@alice

alice commented Oct 3, 2019

@chrishtr Thank you for the response!

But almost all interaction with sites, other than the very special case of search, happens in the page itself and the UA (by design) has a limited idea of what is going on.

It would be nice to have an example in the explainer of how this would be used by a non-search website, in that case - the Motivation section seems to be based exclusively around search.

(Also, given the explainer is now longer than a single page, a table of contents would be helpful!)

@chrishtr

chrishtr commented Oct 3, 2019

It would be nice to have an example in the explainer of how this would be used by a non-search website, in that case - the Motivation section seems to be based exclusively around search.

Sure. An example: Facebook routinely surfaces web page links users might be interested in exploring. It would set hrefTranslate if its relationship with the user and understanding of the web page indicated it would be a good idea to translate it.

@hadleybeeman
Member

We've discussed this on our telecon today. We'd like to invite you for a call, @dtapuska (and anyone else relevant), to talk through what's changed and what you want to accomplish here. We'll be in touch offline.

@dtapuska
Author

@hadleybeeman Looking back on the feedback you've provided, I'm wondering whether you think our proposal is to share a user's preferences, stored on some logged-in account, with the next site. That is, is your assumption that the website, by setting the attribute, is leaking something about user preferences? This is not the scenario we are describing at all. The scenario we are talking about is an ephemeral user interacting with a website in some language that can't be determined by the browser, where the website is just surfacing the language that the user interacted in.

We've tried to articulate that it is impossible for the browser to solely determine the interaction language (as we've given examples of interaction with speech, and we could provide more). It is unclear to me how you'd iterate on your suggestions to make the browser "smarter" to know what accept-language to send on the subsequent request (or to translate the page into).

@torgo
Member

torgo commented Nov 5, 2019

Scheduled for call next week and hopefully you can dial in for that @dtapuska. Alice is going to contact you...

@dbaron
Member

dbaron commented Nov 13, 2019

This was discussed in the call today, for which minutes should soon be available.


I wanted to follow up on one point I raised in the discussion today. @chrishtr made the good point that one of the motivating factors here is that (a) many users don't know how to configure browsers or even understand the difference between the browser and the website and (b) the sites they're using (say, Google or Facebook) might be able to figure out through the text that the user types or interacts with what language the user wants to speak. I found this to be the most convincing argument for work in the problem space of having the site suggest the user's desired language.

However, this particular proposal results in each site being allowed to fix things for outbound links from their site, but the user's browser still not knowing the user's preferred language, and thus not being able to suggest the correct translation for all sites. This can push the user towards a more walled-garden view of the web -- if Google is the only site that knows their preferred language, then they can only browse the web starting from Google. If Facebook is the only site that knows the user's preferred language, then they can only browse the web starting from Facebook. But if the underlying mechanism was instead a way for the site to suggest to the browser that it knows something about the user's preferred language, then the browser might be able to ask the user if they prefer another language (perhaps even by asking in both its currently configured language and the newly proposed language).

During the call I suggested the first idea that came to mind, that the site could just directly suggest to the browser what they think the user's preferred language is, and the browser could choose to act or not act on that information. Now I realize that the browser could, if it wanted, do something with the hrefTranslate information and make the same suggestion to the user based on that information. (Would it make sense for the explainer to suggest this possibility?)


There was one other issue @hadleybeeman asked me to raise at the beginning of our discussion and I said I'd rather move on to the other points -- but I never actually managed to come back to it. I think we were also somewhat concerned that many of the fallback options here require javascript. I think it's also worth thinking about the different sorts of fallback that might be desired. I feel like we discussed this issue before but I don't see it above, so I'll state it here (possibly again).

In particular, if the linking site's first preference is to link to https://example.com/ with hrefTranslate=hi... it seems different sites might have different preferences for what should happen as fallback if the browser doesn't support client-side translation. Option (1) might be that the linking site would prefer the user just go to https://example.com/ without any translation or other intervention. Option (2) would be that the linking site would prefer to link to https://translate.google.com/translate?sl=en&tl=hi&u=https%3A%2F%2Fexample.com%2F in order to use server-side translation. Currently the mechanism proposed here defaults to (1), even though I suspect more sites probably want (2), whereas doing fallback type (2) requires Javascript. Perhaps that's OK; it at least keeps the markup "cleaner" from a certain perspective.

At the very least, it seems like the explainer should mention the question of fallback for browsers that don't support the feature, point out that the attribute should only be implemented in implementations that support client-side translation so that feature detection works, and perhaps give an example of feature detection.
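The two fallback options can be sketched as follows. This is an illustration under stated assumptions, not a recommended implementation: the function names and the shape of the returned object are hypothetical, and the translation URL simply mirrors the one quoted above.

```javascript
// Build a server-side translation URL in the shape quoted above.
// URLSearchParams percent-encodes the target URL when it is serialized.
function serverTranslateUrl(targetUrl, sourceLang, targetLang) {
  const url = new URL('https://translate.google.com/translate');
  url.searchParams.set('sl', sourceLang);
  url.searchParams.set('tl', targetLang);
  url.searchParams.set('u', targetUrl);
  return url.href;
}

// Decide what a link should carry. `supported` is the result of feature
// detection; `fallback` is 'plain' (option 1) or 'server' (option 2).
// resolveLink and its return shape are hypothetical.
function resolveLink({ href, sourceLang, targetLang, supported, fallback = 'plain' }) {
  if (supported) {
    // The browser can act on the hint itself: keep the original URL.
    return { href, hreftranslate: targetLang };
  }
  if (fallback === 'server') {
    // Option 2: route the user through server-side translation instead.
    return { href: serverTranslateUrl(href, sourceLang, targetLang), hreftranslate: null };
  }
  // Option 1: link to the untranslated page with no intervention.
  return { href, hreftranslate: null };
}
```

Option 1 needs no script at all, since an unsupported attribute is ignored; only option 2 requires code like this to rewrite the link, which is the JavaScript dependency noted above.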

I'd also note the explainer currently says:

Automatically fallback to server-side translation if client side isn’t supported

... which isn't really true, since that's not the fallback that happens automatically.

@dtapuska
Author

I've updated the explainer. Changes are:

  • table of contents added
  • news aggregation use case (facebook) added
  • feature detection called out and example given
  • privacy section now calls out that UAs should prompt, and references Chrome's privacy whitepaper.

If you would like further changes, please let me know. I hope to send an intent to ship for Chrome shortly.

@dbaron
Member

dbaron commented Nov 26, 2019

We discussed this in our teleconference today. Based on the discussion we had two weeks ago, we were expecting to see somewhat more change to the explainer. I think a number of TAG members were inclined to close this again as "unsatisfied", but I wanted to say explicitly that we were expecting a bit more change to the explainer so that you had another chance to revise it in response to that feedback.

Currently, the explainer seems to be internally inconsistent on a number of points:

  • whether hrefTranslate is a hint to the user-agent or a command that the site can expect to be reliably followed
  • whether the translation done by a browser implementing hrefTranslate must be client-side, or might instead be server side. (I think this is mainly that the sentence "Note that this translation service could be server-side or client-side, depending on which is available." disagrees with the opposite point made in a number of other places in the explainer.)

We were also hoping to see a little bit more explanation in the "Privacy Considerations" section of the explainer, which is currently three sentences. It seems worth explaining what some of the issues that people might be worried about are and where this proposal does or doesn't have issues that are worth thinking about. For example, translation generally has the issue that the user has to trust the translation service with their browsing history. But beyond that, it seems worth comparing the privacy characteristics of browser-mediated ("client-side") translation (where there's less direct transfer of information from one website to another, but where the website the user is visiting can probably figure out by looking at the DOM or layout results that this user in particular has translated the page to a particular language) with the privacy characteristics of server-mediated translation (where the origin model is broken and the referring site directly invokes the translation service by its URL, but the translated page probably then knows less about the user). I think there was a good bit more discussed when we talked about this two weeks ago, and I don't remember all the details, but I do recall that our conclusion then was that the explainer could say a bit more about the privacy and security considerations to make it clear how the relevant issues had (or hadn't) been considered.

Along similar lines, it also feels worth expanding on the statement that there will be a permission prompt, with a clearer explanation of what it is that the browser needs to ask permission for, and why user consent is needed.

I think it also may be worth expanding on the ecosystem concerns that I raised in the first part of #301 (comment) .

@dtapuska
Author

  1. It definitely is only a hint. I thought that was clear in the document, where it is presented as such on a number of occasions. I've adjusted the text where it said "reliably invoke the client side translation" to "reliably invoke the request for client-side". That is the only spot where I could see some misunderstanding. Please let me know if there are others from your point of view.

  2. The explainer made reference to server-based and client-based translation and then a subsection of client-based which is server or client based. I've changed the wording to client-based online and offline and provided a section of definitions to help clarify this.

  3. Sorry, I was travelling for two weeks and some things got lost. I know we wanted to expand this, so I went back to the notes and raw meeting notes from the call; the only reference I found was me confirming that I'd do that. When I went to look at things, I saw that most of what I recall us discussing was relevant to an implementor of a client-based translation service. I've hopefully detailed it a little better for you. Feedback is welcome.

@dbaron
Member

dbaron commented Dec 4, 2019

@torgo @hadleybeeman and I are looking at this at a breakout at our Cupertino meeting.

OK, thanks for those updates; the explainer does now appear to be clearer.

It seems like there are still some things we fundamentally disagree about here, but there are also a number of ways this proposal and its explainer have become clearer during this review process, which we appreciate.

It doesn't appear that there's a lot more we can do right now, and it sounds like you're interested in going ahead and experimenting with this in origin trials or something similar, so we think it's best to close this issue for now. However, if you have feedback from origin trials that we would be interested in, feel free to ask us to reopen this issue, or file a new issue.

@dbaron dbaron closed this as completed Dec 4, 2019
@dtapuska
Author

dtapuska commented Dec 5, 2019

For the record, we've already conducted origin trials for this. We are interested in proceeding to ship it.

9 participants