Consider using a trie #212

Closed
ChunelFeng opened this issue Feb 12, 2021 · 14 comments

@ChunelFeng
Member

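Friend, you can't rely on divide-and-conquer or hashing alone. The URLs differ little in length, and the first several characters are identical for the vast majority of them.
This situation is a very good fit for storing the URLs in a trie (prefix tree).
It lowers the storage cost and improves lookup efficiency at the same time.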
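
To illustrate the suggestion, here is a minimal sketch of a character-level trie holding a few URLs. The names `TrieNode`, `Trie`, `insert`, `contains`, and `node_count`, as well as the sample URLs, are assumptions made for this example and do not come from the repository. It only shows how shared prefixes collapse into shared nodes.

```python
class TrieNode:
    """One character position in the trie (hypothetical helper)."""
    __slots__ = ("children", "is_end")

    def __init__(self):
        self.children = {}   # maps a character to the next TrieNode
        self.is_end = False  # True if a stored URL ends at this node


class Trie:
    """Character-level prefix tree (illustrative, not the article's code)."""

    def __init__(self):
        self.root = TrieNode()
        self.node_count = 1  # counts the root as well

    def insert(self, word):
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                # Only characters past the shared prefix create new nodes.
                nxt = TrieNode()
                node.children[ch] = nxt
                self.node_count += 1
            node = nxt
        node.is_end = True

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_end


# Sample data (assumed): three URLs that share a long common prefix.
urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/a",
]
trie = Trie()
for u in urls:
    trie.insert(u)

total_chars = sum(len(u) for u in urls)
print(trie.node_count, "nodes for", total_chars, "characters")
print(trie.contains("http://example.com/b"))  # a stored URL
print(trie.contains("http://example.com/"))   # only a prefix, not stored
```

A lookup costs time proportional to the URL length, regardless of how many URLs are stored. Whether memory actually drops depends on the node representation: a dict-per-node trie like this one carries noticeable per-node overhead, so real savings usually need a more compact layout such as a compressed (radix) trie.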

@yanglbme
Member

@ChunelFeng Good idea. I'll add it. Thanks for the feedback.

@KnightHONG

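A file of several hundred GB can't be loaded into memory all at once, so how can it be split into separate 4 GB files?
And if a file of several hundred GB can be loaded into memory a little at a time, why split it into 4 GB files and then load those 4 GB files into memory one by one for processing?
Wouldn't it be faster to just load it a little at a time and hash it directly?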
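
For context, the splitting step being asked about can be sketched as follows, assuming the article partitions by a hash of each URL and that the goal is something like finding duplicate or common URLs. The function name `partition_by_hash`, the bucket file names, and the sample data are hypothetical. The file is streamed one line at a time, so it is never fully in memory, and every copy of a given URL lands in the same bucket file.

```python
import hashlib
import os
import tempfile


def partition_by_hash(src_path, out_dir, num_buckets):
    """Stream src_path line by line and route each URL to a bucket file.

    Hypothetical helper: only one line is held in memory at a time.
    """
    paths = [os.path.join(out_dir, f"bucket_{i}.txt") for i in range(num_buckets)]
    outs = [open(p, "w", encoding="utf-8") for p in paths]
    try:
        with open(src_path, encoding="utf-8") as src:
            for line in src:
                url = line.strip()
                if not url:
                    continue
                # Use a stable hash; Python's built-in hash() for str is
                # randomized per process, so it is avoided here.
                digest = hashlib.sha256(url.encode("utf-8")).digest()
                idx = int.from_bytes(digest[:8], "big") % num_buckets
                outs[idx].write(url + "\n")
    finally:
        for f in outs:
            f.close()
    return paths


# Tiny demonstration with assumed sample data.
with tempfile.TemporaryDirectory() as tmp:
    sample = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.com/a",  # duplicate of the first URL
        "http://example.org/c",
    ]
    src = os.path.join(tmp, "urls.txt")
    with open(src, "w", encoding="utf-8") as f:
        f.write("\n".join(sample) + "\n")

    bucket_paths = partition_by_hash(src, tmp, num_buckets=4)
    buckets = []
    for p in bucket_paths:
        with open(p, encoding="utf-8") as f:
            buckets.append(f.read().split())

for i, b in enumerate(buckets):
    print(i, b)
```

Under these assumptions, the difference from reading consecutive 4 GB chunks is that a consecutive chunk gives no guarantee about where the other copies of a URL are, whereas hash buckets can each be processed independently because equal URLs always share a bucket.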

@newBe1

newBe1 commented Feb 25, 2022 via email

@KnightHONG

I'm asking the person who wrote this article.

@KnightHONG

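Why has nobody replied to me? For a file of hundreds of GB, why not load 4 GB of content at a time and hash it, instead of first splitting it into 4 GB files and then hashing those one by one? Isn't that a waste of time?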

@zhrgithub

zhrgithub commented Feb 25, 2022 via email

@zhrgithub

zhrgithub commented Feb 25, 2022 via email

@xiaopan1916

xiaopan1916 commented Jul 18, 2022 via email

@zhFuture

zhFuture commented Jul 18, 2022 via email

@yongroot

yongroot commented Jul 18, 2022 via email

@nielongguang

nielongguang commented Jul 18, 2022 via email

@onions1111

onions1111 commented Jul 18, 2022 via email

@zgczjc

zgczjc commented Jul 18, 2022 via email

@doocs doocs deleted a comment from fireinrain Jul 18, 2022
@biggerboy

These are all auto-replies.
