Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: resolve type errors/add type annotations #151

Merged
merged 5 commits into from
May 30, 2024
Merged

fix: resolve type errors/add type annotations #151

merged 5 commits into from
May 30, 2024

Conversation

jsstevenson
Copy link
Contributor

No description provided.

@jsstevenson jsstevenson marked this pull request as ready for review May 28, 2024 12:29
@jsstevenson jsstevenson requested a review from a team as a code owner May 28, 2024 12:29


def translate_db2api(namespace, alias):
def translate_db2api(namespace: str, alias: str) -> list[tuple[str, Optional[str]]]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to create an enum for namespace in a separate issue (if it doesn't already exist)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe -- I think that makes usage of the store function complicated because users might want to add variations under arbitrary/custom namespaces

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Maybe just an enum for the sources that are used frequently in seqrepo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or maybe a predefined union of constants? otherwise you have to annotate everything as namespace_enum | str and handle both kinds of types every time

src/biocommons/seqrepo/_internal/logging_support.py Outdated Show resolved Hide resolved
@@ -55,14 +56,14 @@ def translate_api2db(namespace, alias):
return [
("VMC", "GS_" + alias if alias else None),
]
if namespace == "ga4gh":
if namespace == "ga4gh" and alias is not None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we don't just return an empty list at the beginning if alias is none?

Copy link
Contributor Author

@jsstevenson jsstevenson May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had a hard time with this and have gone through it again. The call stacks that end up in this method are kind of messy, but I think this is the reason for the way it is right now:

  1. store and its descendant functions shouldn't ever be passing None for an alias because it's a non-nullable field within the DB schema

  2. find_aliases uses None for all fields as a wildcard indicator. So if I call this method with ("refseq", None), that means I want to be running a search for all aliases with namespace "NCBI" (if we return an empty list if alias is None, then this lookup would be erroneously turned into "get all aliases regardless of namespace").

The change here is to ensure that we don't get a TypeError on the next line if alias is None. Given the structure of the 3 lines above this condition, I'm not sure if it should look like

    if namespace == "ga4gh" and alias is not None:
        return [
            ("VMC", "GS_" + alias[3:]),
        ]

or

    if namespace == "ga4gh":
        return [
            ("VMC", "GS_" + alias[3:] if alias else None),
        ]

either way, I updated the typing on the store_alias side of the call stack above this method to ensure that it doesn't permit alias to be None on its end -- but it does need to be able to return a nonempty list even when alias is None sometimes

src/biocommons/seqrepo/fastadir/fastadir.py Show resolved Hide resolved
"""Creates a new sequence repository if necessary, and then opens it"""

self._root_dir = root_dir
self._db_path = os.path.join(self._root_dir, "db.sqlite3")
self._writing = None
self._db = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as other PR, should this be separated out in a different issue?

Copy link
Contributor Author

@jsstevenson jsstevenson May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of intended this PR to be that issue. This is a line that doesn't do anything except establish a flawed type annotation that becomes problematic for how the entire rest of the class is written. I originally addressed it in the other issue only because its existence makes pyright (checked as part of precommit) explode if you touch anything else in this module, but the only reason it needs to change at all is for type checking reasons, not because it affects function (since the variable is reassigned to a real value four lines down).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. I didn't pull anything up in my editor so couldn't see any warnings and it just appeared that you were deleting unused code since you were already in that module

src/biocommons/seqrepo/seqrepo.py Show resolved Hide resolved
korikuzma
korikuzma previously approved these changes May 30, 2024
@jsstevenson jsstevenson merged commit 09a6ede into main May 30, 2024
8 checks passed
@jsstevenson jsstevenson deleted the types branch May 30, 2024 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants