
Bug fixes and manual import schema customization. #941 #945 #946 #944

Open: wants to merge 5 commits into base: master

Conversation

zstumgoren

This PR fixes the openpyxl bug (#941) and the db numeric type bug (#945), and adds support for field-level search in the manual import mgmt command (#946).

All changes were tested on a local development version of PANDA, both manually (through the web GUI) and with a new unit test (the full test suite is still green).

Note that the migration updates three models that inherit from the BaseUpload ABC:

  • DataUpload
  • Export
  • RelatedUpload

Additionally, I've verified in our production install that this patch works on a large upload exceeding the Postgres integer threshold; prior to this patch, that upload was failing with the error detailed here.
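For readers unfamiliar with the threshold mentioned above: PostgreSQL's `integer` type is a 4-byte signed value, so it tops out at 2147483647. The sketch below only illustrates that limit and why a wider column type is needed. It assumes the affected column stores something like a byte count (the thread does not name the column), and the function name and the 3 GiB figure are made up for illustration, not taken from PANDA.

```python
# Upper bound of PostgreSQL's 4-byte signed `integer` type.
PG_INTEGER_MAX = 2**31 - 1  # 2147483647
# Upper bound of PostgreSQL's 8-byte signed `bigint` type.
PG_BIGINT_MAX = 2**63 - 1


def smallest_pg_int_type(value):
    """Return the narrowest PostgreSQL integer type that can hold `value`.

    Hypothetical helper for illustration; not part of PANDA.
    """
    if -PG_INTEGER_MAX - 1 <= value <= PG_INTEGER_MAX:
        return "integer"
    if -PG_BIGINT_MAX - 1 <= value <= PG_BIGINT_MAX:
        return "bigint"
    raise OverflowError("value does not fit in a PostgreSQL bigint")


# Assumed example: a 3 GiB upload, measured in bytes.
upload_size = 3 * 1024**3
print(smallest_pg_int_type(upload_size))  # bigint
```

Any value above 2147483647 is rejected by an `integer` column, which is consistent with a large upload failing before the migration and succeeding after it.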

The openpyxl bug (#941) prevents a successful build of the stack, so combined with the db fix, it might be worth a new patch release (1.1.2) and announcement on the PANDA group. Changes in this PR are listed in the CHANGELOG under 1.1.2.

Let me know if you have questions or want a new PR with tweaks to the code/tests.

@zstumgoren zstumgoren changed the title Update openpyxl dependency. Fixes #941 Update openpyxl dependency and fix db column size bug. Fixes #941 #945 Dec 24, 2014
…schema overrides

* Add schema override option to manual_import command
* Update Dataset.import_data and utils code to support schema overrides
* Add test for schema override
* Update docs to reflect schema override option
@zstumgoren zstumgoren changed the title Update openpyxl dependency and fix db column size bug. Fixes #941 #945 Update openpyxl dependency and fix db column size bug. #941 #945 #946 Jan 12, 2015
@zstumgoren zstumgoren changed the title Update openpyxl dependency and fix db column size bug. #941 #945 #946 Bug fixes and manual import schema customization. #941 #945 #946 Jan 12, 2015
@zstumgoren
Author

Updating PR to include addition of schema customization support for manual import mgmt command (#946).

@zstumgoren
Author

@JoeGermuska Could I trouble you to test a theory for me by trying both of the manual tests below and letting me know the results?

  • Scenario 1: Use snake case for column names in the source data file and overrides file
  • Scenario 2: Use the original column names (title case with spaces) in both source data and overrides files

@JoeGermuska
Member

Your "Scenario 1" is what we did with the files you sent me, right?

I just tested Scenario 2 with the original file and got the desired behavior.

I made one comment on the docs in the PR issue, but otherwise, this tests out for me.

@zstumgoren
Author

Yep, Scenario 1 is the CA data; Scenario 2 is the Cook data.

* Note that override field names must precisely match field names in source data
* Note unexpected behavior of type inference on dollar-sign prefixed fields
* Flesh out workflow bits related to experimenting with slice of data
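The exact-match requirement noted above can be checked before running an import. The following is a minimal sketch, not PANDA's implementation: the helper name, the column names, and the layout of the overrides mapping are all hypothetical, since the thread does not show the real overrides file format.

```python
def unmatched_override_fields(source_columns, overrides):
    """Return override field names that do not exactly match a source column.

    Hypothetical helper for illustration; not part of PANDA.
    """
    known = set(source_columns)
    return [name for name in overrides if name not in known]


# Assumed header row from a source data file (title case with spaces).
source_columns = ["Property Address", "Sale Price"]

# Assumed overrides keyed by field name; the value layout is illustrative only.
overrides = {
    "Property Address": {"type": "unicode"},
    "sale_price": {"type": "float"},  # snake case: does not match "Sale Price"
}

print(unmatched_override_fields(source_columns, overrides))  # ['sale_price']
```

Running a check like this against a small slice of the data first fits the experiment-on-a-slice workflow mentioned in the commit notes.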
@zstumgoren
Author

@JoeGermuska manual_import mgmt cmd docs are updated per our discussion on c89d607; your upstream changes to FAQ and reqs.txt (#941) have been integrated.

Lmk if you have questions or need anything else in order to merge the PR.

2 participants