Really great job on kicking off the wordcount feature, Tejas! Excited to see you making progress so fast. Some suggestions on next steps:
It looks like the current script produces a CSV of word counts for an input list of keywords. I think what we're looking for is instead a CSV of word counts for all words present in the .tsv file (after they have been normalized with clean_and_filter). Let me know if you have questions about this.
Excellent to see type annotations! Can you also add docstrings please?
Rename the file to snake_case (I have a bad habit of camel-casing .ipynb files, but I think Python files should be lowercased; eventually we will move several of these functions into a library).
Again, great job!! Let me know if you have any questions or if these suggestions don't make sense.
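A minimal sketch of the suggested direction, counting every normalized word rather than a fixed keyword list, with type annotations and docstrings. The project's clean_and_filter is not shown in this thread, so `clean_and_filter_stub` below is a hypothetical stand-in (lowercase, keep letters and digits); `count_words` and `write_counts` are likewise illustrative names, not the script's actual functions.

```python
import csv
import re
from collections import Counter
from typing import Iterable, List


def clean_and_filter_stub(sentence: str) -> List[str]:
    """Hypothetical stand-in for the project's clean_and_filter.

    Lowercases the sentence and keeps runs of ASCII letters and digits.
    The real function may normalize differently.
    """
    return re.findall(r"[a-z0-9]+", sentence.lower())


def count_words(sentences: Iterable[str]) -> Counter:
    """Count every normalized word across all input sentences."""
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update(clean_and_filter_stub(sentence))
    return counts


def write_counts(counts: Counter, path: str) -> None:
    """Write (word, count) rows to a CSV file, most frequent first."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["word", "count"])
        writer.writerows(counts.most_common())
```

With something like this, the keyword list is no longer an input; the vocabulary comes from whatever the normalizer returns for the .tsv sentences.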
Nice! One thing to consider is whether it would be better to run clean_and_filter on each input sentence, though (with some refactoring). In other words, will keyword_set have separate entries for lowercase and uppercase words, for example?
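To illustrate the concern, here is a small sketch of what happens when words are counted without normalizing each sentence first. Whether keyword_set actually behaves this way depends on the script, which isn't shown here; plain lowercasing stands in for clean_and_filter as an assumption.

```python
from collections import Counter

sentence = "Three apples, three oranges"

# Splitting on whitespace without normalization keeps case and punctuation,
# so "Three" and "three" end up as separate entries.
raw_counts = Counter(sentence.split())

# Lowercasing each sentence first merges them into a single entry.
lowered_counts = Counter(sentence.lower().split())
```

In this sketch `raw_counts` holds "Three" and "three" separately (and "apples," with its trailing comma), while `lowered_counts` has one "three" entry with a count of 2.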
It might be helpful to also create a unit test to test some corner cases for this script, and also to document some shortcomings that we aren't currently addressing (such as not combining word stems, which is fine for now). For example:
input="""#sentence
Three apples, three oranges 3 pears & one more pear"""
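A unit test along these lines might look like the sketch below. It assumes `#sentence` is a header line followed by the sentence on the next line, and it uses a hypothetical `count_words` helper (lowercase, keep letters and digits) in place of the real script, so the expected values document this stand-in's behavior and not necessarily the project's.

```python
import re
import unittest
from collections import Counter


def count_words(text: str) -> Counter:
    """Hypothetical counter: skip '#' header lines, lowercase, keep alphanumerics."""
    counts: Counter = Counter()
    for line in text.splitlines():
        if line.startswith("#"):
            continue
        counts.update(re.findall(r"[a-z0-9]+", line.lower()))
    return counts


class WordCountCornerCases(unittest.TestCase):
    SAMPLE = "#sentence\nThree apples, three oranges 3 pears & one more pear"

    def test_case_is_merged(self):
        # "Three" and "three" should share one entry.
        self.assertEqual(count_words(self.SAMPLE)["three"], 2)

    def test_digits_are_not_spelled_out(self):
        # Known shortcoming: "3" is not merged with "three".
        self.assertEqual(count_words(self.SAMPLE)["3"], 1)

    def test_stems_are_not_combined(self):
        # Known shortcoming: "pear" and "pears" stay separate.
        counts = count_words(self.SAMPLE)
        self.assertEqual(counts["pear"], 1)
        self.assertEqual(counts["pears"], 1)

    def test_punctuation_is_dropped(self):
        counts = count_words(self.SAMPLE)
        self.assertNotIn("&", counts)
        self.assertEqual(counts["apples"], 1)
```

Saved as a test module, this can be run with `python -m unittest`. The two "known shortcoming" tests pin down current behavior, so they would need updating if stemming or number normalization is added later.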
`__main__` (link)
`sys.argv`, since you're using `argparse`
`black` (https://github.com/psf/black)