Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up search code and unify function returned values #91977

Merged
merged 2 commits into from
Dec 18, 2021

Conversation

GuillaumeGomez
Copy link
Member

@GuillaumeGomez GuillaumeGomez commented Dec 15, 2021

This PR is a cleanup: there is no changes in the search results or in the UI.

Depending if it was "literal search" or not, it was either returning booleans or integers. It's pretty bad so instead it all returns integers.

Another thing I did was to move the add and checks into a addIntoResults function to simplify things.

Last thing: I removed a loop in the sortResults function and moved its code directly into the first loop.

All these changes are done to make #90630 much smaller.

r? @jsha

@GuillaumeGomez GuillaumeGomez added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. A-rustdoc-search Area: Rustdoc's search feature A-rustdoc-js Area: Rustdoc's JS front-end labels Dec 15, 2021
@rust-highfive
Copy link
Collaborator

Some changes occurred in HTML/CSS/JS.

cc @GuillaumeGomez

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 15, 2021
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

for (var entry in results) {
if (hasOwnPropertyRustdoc(results, entry)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't have to be part of this PR, but: we are allowed to use ECMASCript 5 features, right? Seems like this should be results.forEach(function(entry) { ... }).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Just like you suggested, I'll do it in another PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we can't use forEach on dict/Object... T_T

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use Object.entries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but I read that the performance were less good than a simple for loop. To be investigated...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there was a post in 2018 saying it was slower: https://medium.com/hackernoon/5-techniques-to-iterate-over-javascript-object-entries-and-their-performance-6602dcb708a8

But it had a followup, also in 2018, saying Object.keys() is now faster: https://gists.cwidanage.com/2018/06/how-to-iterate-over-object-entries-in.html

I'm kinda surprised that there would be any significant difference given how similar the operations are and how optimized JS runtimes are.

Comment on lines 370 to 372
// Something wasn't found and this is a literal search so
// abort and return 1.
return 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment for this function (checkType) so it explains the type of its parameters and return value? And what the return value means? That is, what does it mean when this function returns 1 vs 0 vs something else?

Reading the code, it looks like this still treats 0 or 1 as special "truthy" or "falsy" values, but the function can also return a levenshtein distance, which could be 0, 1, or some other number. It seems like you've fixed the type problem (returning different types based on code path), but not the underlying conceptual problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I should return MAX_LEV_DISTANCE + 1 just like everywhere else.

src/librustdoc/html/static/js/search.js Show resolved Hide resolved
src/librustdoc/html/static/js/search.js Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member Author

Updated!

Copy link
Contributor

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you forgot to add a function comment on checkType?

@@ -173,19 +173,17 @@ window.initSearch = function(rawSearchIndex) {
}

function sortResults(results, isType) {
var result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this down to line 180 and combine it with the assignment:

var result = results[entry];

Comment on lines 431 to 434
// This function looks if the object (`obj`) has an argument with the given type (`val`).
//
// Returns the lev distance of the best match it found (if there are inputs), otherwise
// returns MAX_LEV_DISTANCE + 1`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This function looks if the object (`obj`) has an argument with the given type (`val`).
//
// Returns the lev distance of the best match it found (if there are inputs), otherwise
// returns MAX_LEV_DISTANCE + 1`.
// This function checks if the object (`obj`) has an argument with the given type (`val`).
//
// Returns a Levenshtein distance to the best match. If there is no match, returns MAX_LEV_DISTANCE + 1.

//
// `id` is the index in both `searchWords` and `searchIndex` arrays for this element.
//
// `index` is used as a value when sorting results.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is used as a value when sorting results" doesn't tell me much. :-) Can you explain more? What type is it? How is it used in sorting?

// * If it is a "literal search" (`isExact`), then `lev` must be 0.
// * If it is not a "literal search", `lev` must be <= `MAX_LEV_DISTANCE`.
//
// `fullId` is the key used of the object we use for the `res` map.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the type of fullId? What are the expected contents of the res map?

@GuillaumeGomez GuillaumeGomez force-pushed the unify-search-code branch 2 times, most recently from 303b924 to bdd6886 Compare December 16, 2021 19:08
@GuillaumeGomez
Copy link
Member Author

Updated! :)

for (var entry in results) {
if (hasOwnPropertyRustdoc(results, entry)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there was a post in 2018 saying it was slower: https://medium.com/hackernoon/5-techniques-to-iterate-over-javascript-object-entries-and-their-performance-6602dcb708a8

But it had a followup, also in 2018, saying Object.keys() is now faster: https://gists.cwidanage.com/2018/06/how-to-iterate-over-object-entries-in.html

I'm kinda surprised that there would be any significant difference given how similar the operations are and how optimized JS runtimes are.

Comment on lines 344 to 348
// This function checks if the object (`obj`) matches the given type (`val`) and its
// generics (if any).
//
// Returns a Levenshtein distance to the best match. If there is no match, returns
// `MAX_LEV_DISTANCE + 1`.
Copy link
Contributor

@jsha jsha Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you define the types of obj, val, and literalSearch? In other places in this file, such parameters are documented using jsdoc syntax, like so:

/**
 * Validate performs the following boolean logic. For example:
 * "File::open" will give IF A PARENT EXISTS => ("file" && "open")
 * exists in (name || path || parent) OR => ("file" && "open") exists in
 * (name || path )
 *
 * This could be written functionally, but I wanted to minimise
 * functions on stack.
 *
 * @param  {[string]} name   [The name of the result]
 * @param  {[string]} path   [The path of the result]
 * @param  {[string]} keys   [The keys to be used (["file", "open"])]
 * @param  {[object]} parent [The parent of the result]
 * @return {boolean}       [Whether the result is valid or not]
 */

We don't use jsdoc AFAIK, but it's a good syntax for formalizing information about parameters and return types. Also worth noting: I'm not sure why the types are wrapped in brackets. But I think it's okay to follow local style, and we can do a pass later to bring all jsdoc comments in line with the spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely not a big fan of the syntax but why not. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, but I think a syntax that exists and has tooling built for it is much better than no syntax. :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sorry, not possible. It's just impossible to read. I'll add the types but the comment is just not human readable...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel that way even for the properly formatted jsdoc style (from https://jsdoc.app/about-getting-started.html#adding-documentation-comments-to-your-code)?

/**
 * Represents a book.
 * @constructor
 * @param {string} title - The title of the book.
 * @param {string} author - The author of the book.
 */
function Book(title, author) {
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's better! That I can do.

@GuillaumeGomez
Copy link
Member Author

Updated with jsdoc comments.

Comment on lines 725 to 729
var allFound = 0;
for (it = 0, len = inputs.length; allFound === 0 && it < len; it++) {
allFound = checkType(type, inputs[it], true);
}
in_args = allFound;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop is pretty confusing to me. I tried to rewrite it with different variable names and a structure that makes its intended meaning clearer (at least I think so!).

Suggested change
var allFound = 0;
for (it = 0, len = inputs.length; allFound === 0 && it < len; it++) {
allFound = checkType(type, inputs[it], true);
}
in_args = allFound;
var firstNonZeroDistance = 0;
for (var it = 0, len = inputs.length; it < len; it++) {
var distance = checkType(type, inputs[it], true);
if (distance > 0) {
firstNonZeroDistance = distance;
break;
}
}
in_args = firstNonZeroDistance;

@GuillaumeGomez
Copy link
Member Author

@jsha Good idea, done as well.

@jsha
Copy link
Contributor

jsha commented Dec 17, 2021

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Dec 17, 2021

📌 Commit 080b926 has been approved by jsha

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 17, 2021
@camelid
Copy link
Member

camelid commented Dec 17, 2021

Doesn't have to be part of this PR, but: we are allowed to use ECMASCript 5 features, right?

I think at this point we can even use ES6, but I'm not certain.

@GuillaumeGomez
Copy link
Member Author

I think at this point we can even use ES6, but I'm not certain.

We mentioned it yes. Let's wait for the year to be over before making the change to ensure that we're not missing something.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 18, 2021
…askrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#91858 (pass -Wl,-z,origin to set DF_ORIGIN when using rpath)
 - rust-lang#91923 (Remove `in_band_lifetimes` from `rustc_query_impl`)
 - rust-lang#91925 (Remove `in_band_lifetimes` from `rustc_privacy`)
 - rust-lang#91977 (Clean up search code and unify function returned values)
 - rust-lang#92018 (Fix typo in "new region bound" suggestion)
 - rust-lang#92022 (Eliminate duplicate codes of expected_found_bool)
 - rust-lang#92032 (hir: Do not introduce dummy type names for `extern` blocks in def paths)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 8bb238b into rust-lang:master Dec 18, 2021
@rustbot rustbot added this to the 1.59.0 milestone Dec 18, 2021
@GuillaumeGomez GuillaumeGomez deleted the unify-search-code branch December 18, 2021 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-js Area: Rustdoc's JS front-end A-rustdoc-search Area: Rustdoc's search feature S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants