KAN.initialize_from_another_model() error: Expected input and other to have the same dtype, but got input's dtype Float and other's dtype Double #146
Comments
Hi, it seems that in your input tensor the first input is zero for all samples. This can lead to a singular problem. You could remove the first column and create a KAN that takes only the second dimension as input.
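A minimal sketch of that suggestion, assuming the inputs are a 2-D `torch` tensor of shape (samples, features). The helper name `drop_constant_columns` is hypothetical and not part of pykan; it only removes columns whose values never vary, using the first rows of the tensor posted in this issue as sample data.

```python
import torch

def drop_constant_columns(x: torch.Tensor):
    """Return x without constant columns, plus the boolean mask of kept columns.

    Hypothetical helper for illustration; not a pykan function.
    """
    # A column is constant when its max equals its min over all samples.
    keep = x.max(dim=0).values != x.min(dim=0).values
    return x[:, keep], keep

# First three rows of the input shown in the issue: the first column is all zeros.
x = torch.tensor([[0.0, -0.2120],
                  [0.0, -0.0247],
                  [0.0,  0.2150]], dtype=torch.float64)

x_reduced, keep = drop_constant_columns(x)
print(keep.tolist())           # [False, True]
print(tuple(x_reduced.shape))  # (3, 1)
```

With the reduced input, the model would presumably be built with a single input dimension (for example `width=[1,3,1]`); that width is an assumption based on the comment above, not something confirmed in this thread.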
Hi,
I made changes based on the similar issue #129, but the problem persists. However, I did confirm that setting the dtype to float32 solves the problem. Thanks to
Please give #148 a try; it's designed to address the issue you're facing. Note that #129 only resolved the problem with
input is tensor([[ 0.0000, -0.2120],
[ 0.0000, -0.0247],
[ 0.0000, 0.2150],
...,
[ 0.0000, 0.7221],
[ 0.0000, -0.6781],
[ 0.0000, 0.3832]], dtype=torch.float64);
model = KAN(width=[2,3,1], grid=3, k=3)
model.train(dataset, opt="LBFGS", steps=20)
train loss: 5.90e-01 | test loss: 6.14e-01 | reg: 1.03e+01 : 100%|██| 20/20 [00:08<00:00, 2.37it/s]
# initialize a more fine-grained KAN with G=10
model2 = KAN(width=[2,3,1], grid=10, k=3)
# initialize model2 from model
model2.initialize_from_another_model(model, dataset['train_input'])
RuntimeError Traceback (most recent call last)
Cell In[7], line 4
2 model2 = KAN(width=[2,3,1], grid=10, k=3)
3 # initialize model2 from model
----> 4 model2.initialize_from_another_model(model, dataset['train_input'])
File E:\jupyter\KAN\pykan-master\pykan-master\kan\KAN.py:196, in KAN.initialize_from_another_model(self, another_model, x)
193 another_model(x.to(another_model.device)) # get activations
194 batch = x.shape[0]
--> 196 self.initialize_grid_from_another_model(another_model, x.to(another_model.device))
198 for l in range(self.depth):
199 spb = self.act_fun[l]
File E:\jupyter\KAN\pykan-master\pykan-master\kan\KAN.py:275, in KAN.initialize_grid_from_another_model(self, model, x)
273 model(x)
274 for l in range(self.depth):
--> 275 self.act_fun[l].initialize_grid_from_parent(model.act_fun[l], model.acts[l])
File E:\jupyter\KAN\pykan-master\pykan-master\kan\KANLayer.py:253, in KANLayer.initialize_grid_from_parent(self, parent, x)
251 x_pos = parent.grid
252 sp2 = KANLayer(in_dim=1, out_dim=self.size, k=1, num=x_pos.shape[1] - 1, scale_base=0., device=self.device)
--> 253 sp2.coef.data = curve2coef(sp2.grid, x_pos, sp2.grid, k=1, device=self.device)
254 y_eval = coef2curve(x_eval, parent.grid, parent.coef, parent.k, device=self.device)
255 percentile = torch.linspace(-1, 1, self.num + 1).to(self.device)
File E:\jupyter\KAN\pykan-master\pykan-master\kan\spline.py:137, in curve2coef(x_eval, y_eval, grid, k, device)
135 # x_eval: (size, batch); y_eval: (size, batch); grid: (size, grid); k: scalar
136 mat = B_batch(x_eval, grid, k, device=device).permute(0, 2, 1)
--> 137 coef = torch.linalg.lstsq(mat.to('cpu'), y_eval.unsqueeze(dim=2).to('cpu')).solution[:, :, 0] # sometimes 'cuda' version may diverge
138 return coef.to(device)
RuntimeError: torch.linalg.lstsq: Expected input and other to have the same dtype, but got input's dtype Float and other's dtype Double.
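The error can be reproduced outside pykan. The sketch below is not the library's code: it uses random stand-in tensors to show that `torch.linalg.lstsq` rejects a float32 matrix paired with a float64 right-hand side, and that casting the data to float32 first (the workaround reported in this thread) avoids the mismatch. The `dataset` dict here is a stand-in that contains only the `'train_input'` key seen in the traceback.

```python
import torch

# Stand-in for the float32 matrix passed as `input` to lstsq.
mat = torch.rand(4, 3, dtype=torch.float32)
# Stand-in for the float64 right-hand side passed as `other`.
y = torch.rand(4, 1, dtype=torch.float64)

try:
    torch.linalg.lstsq(mat, y)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(err)  # reports the Float vs. Double dtype mismatch

# Workaround: bring the data to float32 before it reaches the solver.
coef = torch.linalg.lstsq(mat, y.to(torch.float32)).solution
print(coef.dtype)  # torch.float32

# Applied to a dataset dict, cast every tensor once, before training.
dataset = {'train_input': torch.rand(5, 2, dtype=torch.float64)}
dataset = {name: t.to(torch.float32) for name, t in dataset.items()}
print(dataset['train_input'].dtype)  # torch.float32
```

Casting once, before the model first sees the data, keeps every later call (training, grid initialization) on a single dtype rather than patching individual call sites.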
In example_1_function fitting, input is tensor([[-0.0075, 0.5547],