Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ConsolidateBlocks further #10467

Merged
merged 18 commits into from
Aug 10, 2023
Merged

Conversation

raynelfss
Copy link
Contributor

Follow-up to #10365.

Summary

These commits aim to speed up the ConsolidateBlocks pass further by doing all matrix operations using Rust as a backend. For this, we added the function blocks_to_matrix to the _accelerate crate. The function receives all the matrix representations of each operation in a block and performs the matrix expansion and multiplication.

@raynelfss raynelfss requested a review from a team as a code owner July 21, 2023 18:15
@qiskit-bot qiskit-bot added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label Jul 21, 2023
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

@mtreinish mtreinish added performance Changelog: None Do not include in changelog Intern PR PR submitted by IBM Quantum interns and removed Community PR PRs from contributors that are not 'members' of the Qiskit repo labels Jul 21, 2023
@coveralls
Copy link

coveralls commented Jul 21, 2023

Pull Request Test Coverage Report for Build 5816430906

  • 47 of 51 (92.16%) changed or added relevant lines in 5 files are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.006%) to 87.26%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/convert_2q_block_matrix.rs 38 42 90.48%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 93.76%
crates/qasm2/src/lex.rs 4 90.63%
Totals Coverage Status
Change from base Build 5812842422: 0.006%
Covered Lines: 74290
Relevant Lines: 85136

💛 - Coveralls

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this further, Ray!

You might want to check your benchmarking results again with this - sometimes you might find unexpected results, like Rust taking more time. There's some overhead when moving from Rust-space to Python-space and vice-versa that we have to take care to optimise as well.

Comment on lines 24 to 27
#[pyo3(text_signature = "(matrix_list, /")]
pub fn blocks_to_matrix(
py: Python,
op_list: Vec<(PyReadonlyArray2<Complex64>, Vec<usize>)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor (and it's particularly unimportant because the / in the text signature implies we can't use named arguments), but the text signature doesn't match the function. I think in this case, PyO3 will probably actually populate the __text_signature__ attribute correctly without being told, but it's worth checking.

crates/accelerate/src/convert_2q_block_matrix.rs Outdated Show resolved Hide resolved
crates/accelerate/src/convert_2q_block_matrix.rs Outdated Show resolved Hide resolved
crates/accelerate/src/convert_2q_block_matrix.rs Outdated Show resolved Hide resolved
crates/accelerate/src/convert_2q_block_matrix.rs Outdated Show resolved Hide resolved
crates/accelerate/src/convert_2q_block_matrix.rs Outdated Show resolved Hide resolved
Small modifications based on previous reviews were made.
- Added `change_basis` function to switch the qubits in a 2-qubit gate.
@mtreinish mtreinish linked an issue Aug 1, 2023 that may be closed by this pull request
Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, Ray. Have you had chance to run the timings, to see what sorts of speed-ups we should expect from this?

Comment on lines 30 to 37
let input_matrix = op_list[0].0.as_array();
let mut matrix: Array2<Complex64> = match op_list[0].1.as_slice() {
[0] => kron(&identity, &input_matrix),
[1] => kron(&input_matrix, &identity),
[0, 1] => input_matrix.to_owned(),
[1, 0] => change_basis(input_matrix),
_ => unreachable!(),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor because we control the inputs, but perhaps it would be safer to detect if the input list is zero length, and return the 2q identity? Currently the code would panic if given an empty list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add that as one of the cases in that match, however, it is highly unlikely to ever happen due to the way _blocks_to_matrix calls this function. But better be safe than sorry.

Comment on lines +55 to +66
/// Switches the order of qubits in a two qubit operation.
fn change_basis(matrix: ArrayView2<Complex64>) -> Array2<Complex64> {
let mut trans_matrix: Array2<Complex64> = matrix.reversed_axes().to_owned();
for index in 0..trans_matrix.ncols() {
trans_matrix.swap([1, index], [2, index]);
}
trans_matrix = trans_matrix.reversed_axes();
for index in 0..trans_matrix.ncols() {
trans_matrix.swap([1, index], [2, index]);
}
trans_matrix
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clones the matrix, but we'd be able to mutate the one we're given if we took in an ArrayViewMut2, since we never need it in the other form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there's a problem with doing so, since we're receiving PyReadOnlyArray from Python, which doesn't implement the DataMut trait that's necessary to turn the views to mutable views by calling view_mut().
One way of avoiding that is by using the as_array_mut() method to obtain an ArrayViewMut2 right away. However, this is an unsafe operation, and I'm not sure we'd want to include unsafe code for this solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was kind of meaning that we'd take in a PyReadwriteArray2 at the top level instead, but really, I wouldn't worry about it - I think it'd need fairly significantly changes to how we're handling the iteration here to make it borrow-safe.

Comment on lines 39 to 50
let op_matrix = op_matrix.as_array();
let q_list = q_list.as_slice();
let result = match q_list {
[0] => Some(kron(&identity, &op_matrix)),
[1] => Some(kron(&op_matrix, &identity)),
[1, 0] => Some(change_basis(op_matrix)),
_ => None,
};
matrix = match result {
Some(result) => result.dot(&matrix),
None => op_matrix.dot(&matrix),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, very minor, but if you're interested in style, this could equally be written as something like

let op_matrix = op_matrix.as_array();
let left = match q_list.as_slice() {
    [0] => kron(&identify, &op_matrix),
    [1] => kron(&op_matrix, &identity),
    [1, 0] => change_basis(op_matrix),
    [0, 1] => op_matrix,
    _ => unreachable!(),
};
matrix = left.dot(&matrix);

This is mostly just stylistic, though, so there's no need to do it if you don't want to / think your way is neater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get rid of the extra q_list variable. However, when it comes to that [0,1] case in the match function, the way the current approach is made without a [0,1] case is to be able to continue without necessarily copying the op_matrix (since it's an ArrayView2 object, it would need to be returned as an ArrayOwned2.)

Just a funky rust way of avoiding copies here :v

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my mistake, I'd misread op_matrix.as_array() as being a clone operation (which tbh if it had been, I should have commented on it!).

@raynelfss
Copy link
Contributor Author

Thanks for the changes, Ray. Have you had chance to run the timings, to see what sorts of speed-ups we should expect from this?

@jakelishman It's a very minor speedup, when running a 1024-qubit, 128-depth circuit, the runtime with the current Qiskit release is an average of 51 seconds, with this speedup the runtime is an average of 43 seconds.

@jakelishman
Copy link
Member

I wouldn't call that minor! That's about 15%, which is pretty good going for a performance PR that isn't fundamentally changing the algorithm or the memory usage (significantly).

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ray - this looks good to me! Let's get this speed-up merged for sure.

Comment on lines +55 to +66
/// Switches the order of qubits in a two qubit operation.
fn change_basis(matrix: ArrayView2<Complex64>) -> Array2<Complex64> {
let mut trans_matrix: Array2<Complex64> = matrix.reversed_axes().to_owned();
for index in 0..trans_matrix.ncols() {
trans_matrix.swap([1, index], [2, index]);
}
trans_matrix = trans_matrix.reversed_axes();
for index in 0..trans_matrix.ncols() {
trans_matrix.swap([1, index], [2, index]);
}
trans_matrix
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was kind of meaning that we'd take in a PyReadwriteArray2 at the top level instead, but really, I wouldn't worry about it - I think it'd need fairly significantly changes to how we're handling the iteration here to make it borrow-safe.

Comment on lines 39 to 50
let op_matrix = op_matrix.as_array();
let q_list = q_list.as_slice();
let result = match q_list {
[0] => Some(kron(&identity, &op_matrix)),
[1] => Some(kron(&op_matrix, &identity)),
[1, 0] => Some(change_basis(op_matrix)),
_ => None,
};
matrix = match result {
Some(result) => result.dot(&matrix),
None => op_matrix.dot(&matrix),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my mistake, I'd misread op_matrix.as_array() as being a clone operation (which tbh if it had been, I should have commented on it!).

@jakelishman jakelishman added this pull request to the merge queue Aug 10, 2023
Merged via the queue into Qiskit:main with commit f6207c0 Aug 10, 2023
13 checks passed
SamD-1998 pushed a commit to SamD-1998/qiskit-terra that referenced this pull request Sep 7, 2023
* Initial commit: Added convert_2q_block_rust

* Feat: Perform final matrix multiplication in rust

* Feat: `blocks_to_matrix` performs all operations

* Fix: Prevent certain copying in Rust.

* Fix: Adapted to newest changes

* Fix: Small modifications to blocks_to_matrix

Small modifications based on previous reviews were made.

* Fix: Basis change without Matrix Multiplication.
- Added `change_basis` function to switch the qubits in a 2-qubit gate.

* Remove:`calculate_matrix` function

* Fix: Use swap to swap rows

* Fix: Skip initial matrix multiplication.

* Fix: Deal with empty list edege case
@raynelfss raynelfss deleted the consolidate_rust branch October 7, 2024 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog Intern PR PR submitted by IBM Quantum interns performance
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Speed up ConsolidateBlocks pass
5 participants