I'm trying to figure out a way to reliably prevent bad actors from abusing __data.json in my SvelteKit website, especially for parts of the site that are not behind a login, for example the product list on an e-commerce website.
My main concern is requests made directly to this endpoint instead of going through the actual website. It could be abused for web scraping, or as a passthrough proxy to the API or database (depending on which backend supplies the data), without any protection.
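To make the concern concrete, here is a minimal sketch of how a page URL maps to its data endpoint, assuming SvelteKit appends `/__data.json` to the page's pathname and leaves the query string untouched. The helper name `toDataUrl` and the `example.com` URLs are made up for illustration.

```javascript
// Hypothetical helper: derive the data endpoint URL for a given page URL.
// Assumption: the endpoint is the page pathname plus `/__data.json`,
// with the query string carried over unchanged.
function toDataUrl(pageUrl) {
  const url = new URL(pageUrl);
  // Drop a trailing slash so `/` becomes `/__data.json`, not `//__data.json`.
  url.pathname = url.pathname.replace(/\/$/, '') + '/__data.json';
  return url.href;
}

console.log(toDataUrl('https://example.com/products?page=2'));
console.log(toDataUrl('https://example.com/'));
```

Anyone who knows this pattern can request the JSON directly, without rendering the page or parsing any HTML.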
Of course, a website can be scraped by bots, and any data we send to the browser, however securely, can be scraped too. But a dynamically generated website, like one built with Svelte, requires a bot capable of executing JS, which is generally slower than downloading a purely static site. So it is some form of deterrent.
Another deterrent with a static website is that the HTML has to be parsed. Again, it's not super difficult, but it takes at least some effort to properly locate the elements you want to extract.
I am well aware that a 100% secure method doesn't exist (especially if we serve data to unregistered users), but the lack of a good-enough approach is a bit disappointing.
I have read discussion #8847, but the solutions provided there are not very helpful:
Only return the data you need. That data is still going to be exposed in __data.json anyway.
Disable client-side navigation entirely. I was especially interested in the handle hook suggested there:
// src/hooks.server.js
/** @type {import('@sveltejs/kit').Handle} */
export async function handle({ event, resolve }) {
  // true when the request is for a `__data.json` endpoint
  // https://kit.svelte.dev/docs/types#public-types-requestevent
  if (event.isDataRequest) {
    return new Response(null, { status: 400 });
  }

  const response = await resolve(event);
  return response;
}
But this is a solution for a very specific set of pages, mainly ones that don't require any filtering or pagination passed via a GET request. And it can be extremely easily "broken" just by adding something after __data.json.
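One way to keep the hook from affecting every page is to reject data requests only for selected routes. The following is a rough sketch of that idea, not something taken from discussion #8847: the prefix list and the helper names `stripDataSuffix` and `shouldBlock` are hypothetical, and the helper strips a trailing `/__data.json` itself so it does not depend on whether `event.url.pathname` has already been normalized.

```javascript
// Hypothetical list of routes that never need client-side data loading.
const BLOCKED_PREFIXES = ['/about', '/contact'];

// Remove a trailing `/__data.json` so page and data paths compare equally.
function stripDataSuffix(pathname) {
  return pathname.replace(/\/__data\.json$/, '') || '/';
}

// Decide whether a request should be rejected: only data requests,
// and only for paths at or below one of the blocked prefixes.
function shouldBlock(pathname, isDataRequest, prefixes = BLOCKED_PREFIXES) {
  if (!isDataRequest) return false;
  const page = stripDataSuffix(pathname);
  return prefixes.some((p) => page === p || page.startsWith(p + '/'));
}

// In src/hooks.server.js this function would be exported as `handle`.
async function handle({ event, resolve }) {
  if (shouldBlock(event.url.pathname, event.isDataRequest)) {
    return new Response(null, { status: 400 });
  }
  return resolve(event);
}
```

Pages with filtering or pagination stay off the list and keep working, but their data remains as open as before, so this narrows the problem rather than solving it. Client-side navigation into the blocked routes would fail, which is why the original suggestion pairs this hook with disabling client-side navigation.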
I have also been thinking about using POST only instead of GET, but I feel this will fail the moment the user refreshes the page: whatever the user has been doing will reset to the default state. Storing the latest POST request in a cookie is not a good idea either, because if the user closes the page intentionally and returns later, the cookie will kick in and prevent them from seeing the entry page.
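The refresh problem comes down to where the state lives: a query string is part of the URL and is re-sent on reload, while a POST body is not. As a small sketch of the GET side, the helper below reads filter and pagination state from a `URL` object such as the `url` a SvelteKit `load` function receives. The name `readListState` and the `page` and `category` parameters are made up for illustration.

```javascript
// Hypothetical helper: read listing state from the query string.
// Missing or invalid values fall back to defaults.
function readListState(url) {
  const page = Number.parseInt(url.searchParams.get('page') ?? '1', 10);
  return {
    page: Number.isInteger(page) && page > 0 ? page : 1,
    category: url.searchParams.get('category') ?? 'all'
  };
}

const state = readListState(new URL('https://example.com/products?page=3&category=shoes'));
console.log(state.page, state.category);
```

Because the state is rebuilt from the URL on every request, a refresh or a shared link lands on the same view, which is exactly what a POST-only design gives up.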
So what other options are there? It's hard for me to believe that every website using SvelteKit exposes an open API to all the data it serves to the frontend. But maybe I'm just naive.