-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add map gender function to utils.py #84
feat: add map gender function to utils.py #84
Conversation
Thank you for the pull request!The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :) If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you! Maintainer checklist
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for taking this on @ikeadeoyin! 🚀
Added some comments to the PR! ✌️
PATH_TO_SCRIBE_ORG = os.path.dirname(sys.path[0]).split("Scribe-Data")[0] | ||
PATH_TO_SCRIBE_DATA_SRC = f"{PATH_TO_SCRIBE_ORG}Scribe-Data/src" | ||
sys.path.insert(0, PATH_TO_SCRIBE_DATA_SRC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I guess I'm a little confused as to why the above is needed. Does something not work without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem in loading the function into the file that was discussed on Matrix. I can take a bit more of a look into this and try to find a solution :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unable to import the map_genders
function in the utils.py
file without these lines. It gives an error message (ModuleNotFound: no module named scribe_data)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll work to figure this out after the merge, @ikeadeoyin :)
@@ -420,3 +420,21 @@ def check_and_return_command_line_args( | |||
python {all_args[0]} '["comma_separated_sets_in_quotes"]' | |||
""" | |||
) | |||
|
|||
def map_genders(wikidata_gender): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding #69 correctly, it seems that what we'd rather want here actually is that map_genders()
also works for all supported languages. There are other map_genders()
functions for other languages that exist elsewhere in the code, as pointed out in the issue. The changes here only address the map_genders()
for German.
So to make the function general and reconcile all supported languages, it could make sense to modify map_genders(wikidata_gender)
into something like map_genders(wikidata_gender, iso_code)
instead perhaps - where iso_code
is "en"
, "fr"
, "es"
, etc. We can then use the iso_code
for the language-specific checks, such as "Q499327"
here for German masculine.
Let me know if there are any questions regarding the above 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for the review, it was helpful in understanding the codebase. I can see that we have map_genders()
function for other languages.
I have tried implementing this feedback however on running update.py
file, I got a couple of files modified and a new file created(src/scribe_data/extract_transform/languages/German/nouns/nouns_queried.json
).
Can I commit these extra files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not commit the extra files like nouns_queried.json
, @ikeadeoyin :) These should have been deleted by the format_nouns.py
file in the German directory. The way it works is that the query results are brought down, they're formatted and saved in the final version, and then the original file is deleted. Let's just have this focus on the Python files 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright @andrewtavis,
Following the contribution guide, I have committed the Python files however, I have this error
Scribe-Data git:(add-map-gender-function) ✗ git pull --rebase upstream main
error: cannot pull with rebase: You have unstaged changes.
error: Please commit or stash them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ikeadeoyin 👋
Are the unstaged changes the *.json
files?
If so, I guess you could either delete them from your local machine or run git stash
to stash them so that your working directory is clean before pulling
src/scribe_data/utils.py
Outdated
elif wikidata_gender in ["neuter", "Q1775461"]: | ||
return "N" | ||
else: | ||
return "" # nouns could have a gender that is not valid as an attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
Add a newline at the end of file
There are some uncommon issues that can occasionally occur with the absence of an ending newline 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do this, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still have the weird sys import magic at the top of the files, but this will be worked out when I got through and convert all of the formatting processes over to being lexeme based rather than string based 😊 Thanks for all the work here, @ikeadeoyin! Great to be improving the packages structure like this 🚀
Contributor checklist
Description
Added the map_gender function to
utils.py
Related issue