Fix malformed CAR panics and excessive memory usage #312
Commits on Jun 30, 2022
Commit 4257e9c
Commit 6c256c2
Commit 3143968
Commit c133105
Commit e36135f
Commit 8004dff
Commit f5b91b9
Commit 6dc2ea1
Commit 67ff54f
Commit 4e24d90
Commit 2eea288
feat: Refactor indexes to put storage considerations on consumers
There is no way I can make a safe implementation of the parser by slurping things into memory; the indexes people use are just too big. So I made a new API which forces consumers to manage that. They can choose to use a bytes.Reader, an *os.File, an mmapped file, ...
Commit f8735e6
Commit 6e4e208
Commit 3eabc2d
Commit 04a85e7
Commit 77de9fe
Commit dfbe3f7
Commit 3940bf5
Commit ec34902
Commit 3c9f1d2
Commit 3a1b4e8
Fix testutil assertion logic and update index generation tests
Update index generation tests to assert that indices are identical. Fix a minor typo in the test utility name and a bug where the check did not use both index instances to assert they are identical. Also replace the lock with a wait group to make the assertion logic more readable.
Commit 8a5b330
Use a fixed code as the multihash code for `CarIndexSorted`
Previous changes added the `ForEach` interface to the `Index` type, which enables iteration through the index by multihash and offset. However, not all index types contain enough information to construct the full multihash. Namely, `CarIndexSorted` stores only the digest portion of the multihashes. To implement `ForEach` for this index type, use the `uint64` max value as the code in the multihash, and document that iterations over this index type should not rely on the returned code. The max value is chosen because it does not match any existing `multicodec.Code`, which avoids misleading users.
Commit fd7281b
Remove support for `ForEach` enumeration from car-index-sorted
This index type does not store enough information to satisfy `ForEach`. It contains only the digest of multihashes and not their code. Instead of offering partial functionality, simply return an error when `ForEach` is called on this index type, because there is no valid use for this index type and the user should be regenerating the index as the newer `car-multihash-index-sorted` anyway. Update tests to include samples of both types and assert IO operations and index generation for both formats.
Commit e6d416c
Commits on Jul 1, 2022
feat: add Reader#Inspect() function to check basic validity of a CAR and return stats
Commit 2539ce2
Commit 708b0a2
Commit a36603e
Use streaming APIs to verify the hash of blocks in CAR `Inspect`
`go-cid` exposes a `Sum` API that calculates the CID from a `[]byte` payload. `go-multihash` now exposes `SumStream`, which can calculate the digest from an `io.Reader` as well as from `[]byte`. Unfortunately, the equivalent API does not exist in `go-cid`. To avoid copying the entire block into memory, implement CID calculation using the streaming multihash sum during inspection of the CAR payload.
Commit 965f1f3
Commits on Jul 2, 2022
Use consistent CID mismatch error in `Inspect` and `BlockReader.Next`
This reverts the earlier changes to make the message consistent. Note that the CID we expect is the one in the CAR payload, not the CID calculated for the block.
Commit a274e75
Benchmark `Reader.Inspect` with and without hash validation
Benchmark `Reader.Inspect` with and without hash validation using a randomly generated CARv2 file of size 10 MiB. Results from running the benchmark in parallel locally on macOS with an `Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz`, repeated 10 times:
```
name                                    time/op
Reader_InspectWithBlockValidation-8     5.30ms ±48%
Reader_InspectWithoutBlockValidation-8   231µs ±42%

name                                    speed
Reader_InspectWithBlockValidation-8     2.08GB/s ±35%
Reader_InspectWithoutBlockValidation-8  46.8GB/s ±32%

name                                    alloc/op
Reader_InspectWithBlockValidation-8     10.7MB ± 0%
Reader_InspectWithoutBlockValidation-8  60.7kB ± 0%

name                                    allocs/op
Reader_InspectWithBlockValidation-8      4.54k ± 0%
Reader_InspectWithoutBlockValidation-8   2.29k ± 0%
```
Commit 641c0f8
Commits on Jul 4, 2022
Drop repeated package name from `CarStats`
Cosmetic refactor to rename `car.CarStats` to `car.Stats`, which reads more fluently when using the API.
Commit 8696a19
Return error when section length is invalid `varint`
Return the potential error when reading the section length as a varint. Add a test to verify the error is indeed returned. Use `errors.New` instead of `fmt.Errorf` when no formatting is needed in the error message.
Commit bed1297
Commit a41506a
Commits on Jul 5, 2022
Revert changes to `index.Index` while keeping most of the security fixes
Revert the changes to the `index.Index` interface so that it is the same as the current go-car main branch head. Reverting the changes, however, means that unmarshalling indices remains dangerous and should not be done on untrusted files. Note that the `carv2.Reader` APIs are changed to return errors as well as readers when getting `DataReader` and `IndexReader`. This accommodates issues detected by fuzz testing while removing boilerplate code in the internal IO reader conversion. This is a breaking change to the current API but should be straightforward to roll out. Remove the index fuzz tests and change inspection to read only the index codec instead of the entire index.
Commit a971f7c
Commits on Jul 6, 2022
Revert changes to `insertionindex`
Revert changes to the serialization of `insertionindex`, postponed until the streaming index work stream.
Commit dec4ca1
Commit 80bb0d5
Commit d68cd32