Cookies consent page #1
Comments
Isn't this an old issue that has already been solved? Last time I built a YouTube scraping tool, I already applied this workaround. Indeed. https://stackoverflow.com/a/66940841 Obviously, rejecting cookies should be preferred...
@theAkito Sure. The issue here is not really how to "solve" it, but how to implement it (ethically speaking). Even if it's scoped to a requests session, users have to be aware of the implicit "consent" (in my opinion). EDIT: Sure, rejecting would be the best option, but I didn't succeed in baking a reject cookie.
I would go against anything that would restrict or complect the end-user experience. For example, the argument for consenting shouldn't be there. I know some programs, for example ones related to Let's Encrypt, that require the user or administrator to specifically consent to an EULA or something. It's just terrible user experience. I think the only reasonable option that makes sense at all is to put a big fat disclosure -- which we put up only for cosmetics, since everyone using Google should have the two brain cells that tell them the obvious, which is that Google & therefore YouTube is a black hole for user data -- into the README or the product's description & then apply & at least try to reject the consent. That's it. In no way should the user experience be diminished by a CLI argument or manually required configuration change, forcing the user to waste time on something useless, because everyone ought to know what Google & YouTube is. If someone is using this tool to get something off YouTube, it ought to be absolutely obvious that data is sucked into a black hole either way. If we are going to ask people for consent regarding stuff like this, we might as well start explaining how to breathe air, eat food & drink water. It's stuff everyone has to know. Period.
Thanks for taking the time to share your thoughts. I agree, you're totally right. The "big fat disclosure" in the README seems a reasonable way to address this issue. I'll try to understand how the reject cookie works.
By analysing both "Accept all" and "Reject all" POST requests, I was able to bake a valid "Reject all" cookie. "Accept all" POST parameters:
"Reject all" POST parameters:
We can see that set_eom is set to true and both set_ytc and set_apyt are missing from a "Reject all" request. By declaring a global requests session variable and making a POST request to get a valid "Reject all" consent cookie before making any other requests, we can address this issue:
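A minimal sketch of the approach described above, not the project's actual code. It assumes the "Accept all" form fields are already available as a dict, and that the consent form is posted to `https://consent.youtube.com/save`; that URL is an assumption to verify against the live consent page. The helper names `build_reject_payload` and `reject_consent` are hypothetical.

```python
# Assumed endpoint of the consent form; verify against the live consent page.
CONSENT_SAVE_URL = "https://consent.youtube.com/save"


def build_reject_payload(accept_params):
    """Derive "Reject all" form fields from the "Accept all" ones.

    Mirrors the difference observed between the two POST requests:
    set_eom becomes "true" and set_ytc / set_apyt are dropped.
    The input dict is left unmodified.
    """
    payload = {
        key: value
        for key, value in accept_params.items()
        if key not in ("set_ytc", "set_apyt")
    }
    payload["set_eom"] = "true"
    return payload


def reject_consent(session, accept_params, url=CONSENT_SAVE_URL):
    """POST the "Reject all" form once so the session stores the resulting cookies.

    `session` is expected to behave like requests.Session (it needs .post()).
    """
    response = session.post(url, data=build_reject_payload(accept_params))
    response.raise_for_status()
    return session
```

With requests, this would be called once as `session = reject_consent(requests.Session(), params)` before any other request, and the same `session` object reused for everything afterwards.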
If you loop over the requests session's CookieJar, you can see that we have just 4 cookies (versus at least 6 when we "Accept all"):
BTW, the SOCS cookie seems to contain the consent rejection (needs confirmation). Obviously, the POST parameters are susceptible to change in the future (I have no idea how often). This applies especially to the bl parameter, as it seems to contain a server version with a date (currently boq_identityfrontenduiserver_20230514.09_p0). EDIT: The bl value and the others are of course inside the cookie consent web page (hidden inputs), so maybe we can extract them from the page to be "sure" that we have the right value for each parameter.
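The hidden-input idea could be sketched roughly as follows, using only the standard library. This is an illustration under assumptions: the class and function names are hypothetical, and the real consent page may contain several forms with overlapping field names, in which case the parsing would need to be scoped to the "Reject all" form (here a later field simply overwrites an earlier one with the same name).

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing value attribute is stored as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_inputs(html):
    """Return a dict of all hidden input fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

The resulting dict (including whatever `bl` value the page currently carries) could then be used as the POST data instead of hard-coded parameters.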
Hi,
First, thanks for this tool, really useful.
As reported on HN by European users, there is a YouTube cookie consent page that blocks channel_id retrieval (first) and, consequently, all other requests.
I already faced this issue, and adding a cookie to a requests session indicating that consent has been given can "solve" it.
In order to respect the initial goal of this consent page, we can ask users to give their consent through a CLI argument like so:
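As a rough illustration of such an opt-in flag, here is a sketch using argparse. The flag name `--accept-cookies` and the function `build_parser` are hypothetical and not taken from the project.

```python
import argparse


def build_parser():
    """Build an example CLI parser with an explicit consent opt-in."""
    parser = argparse.ArgumentParser(
        description="Example downloader CLI (illustrative only)"
    )
    # Hypothetical flag: consent is never assumed, the user must pass it.
    parser.add_argument(
        "--accept-cookies",
        action="store_true",
        help="explicitly consent to YouTube's cookies for this run",
    )
    return parser
```

The consent cookie would then only be added to the session when `build_parser().parse_args().accept_cookies` is true.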
It's just a suggestion; it could also be a question prompted in the CLI during download, but this requires knowing that the user is in Europe (or it could apply to all users, but that can be annoying if it's not really needed after all).
I tried to analyse the "Reject all" selection behavior, but the CONSENT cookie's content is still PENDING+{RANDOM NUMBER} (perhaps not random from Google's POV, but I couldn't explain this value), so from my point of view only "Accept all" is "working".
Do you have any thoughts about this?
Kind regards,