Part B Spend Clean #20

Merged: 6 commits, Feb 1, 2017

Contributor

@sgalletta213 sgalletta213 commented Jan 31, 2017

This commit contains the R project I made to clean the raw Part B spend
.xlsx file from CMS. It contains:

  1. the R project file
  2. the cleaning script
  3. the clean data file
  4. the data definition file
  5. a .gitignore for R-specific files, which also overrides the higher-level .gitignore that was omitting my .csv files
  6. an extra .gitignore in /data to preserve the file hierarchy that the script looks for

Note: This may be overkill for a data cleaning task. If so, let me know, and in the future I'll commit something lighter :)

sgalletta213 and others added 2 commits January 30, 2017 21:31
@jenniferthompson
Contributor

@sgalletta213 Wow - awesome! I'm working on some other tasks tonight but will check it out ASAP. In general - I think an R script + final CSV would be plenty, but I'm also a big fan of thoroughness. :) Thank you!

@jenniferthompson
Contributor

@sgalletta213 This looks awesome!! Thank you so much for doing it! The only thing I see: do you think it would be straightforward to remove the trailing whitespace from the drug names within your script? I think that would help when merging with other data sources.
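
For illustration, here is a minimal sketch of one way to trim whitespace in base R. It assumes the cleaned data is held in a data frame; the data frame `spend` and its column names are hypothetical and are not taken from the script in this pull request. `trimws()` is applied only to character columns, so numeric fields are left alone.

```r
# Hypothetical example data; the column names are illustrative only.
spend <- data.frame(
  drug_name = c("Drug A   ", "  Drug B", "Drug C"),
  total_spend = c(100.5, 200, 300.25),
  stringsAsFactors = FALSE
)

# Find the character columns, then strip leading and trailing whitespace from each.
is_chr <- vapply(spend, is.character, logical(1))
spend[is_chr] <- lapply(spend[is_chr], trimws)

# Write the result without row names so the CSV holds only the data columns.
out_file <- tempfile(fileext = ".csv")
write.csv(spend, out_file, row.names = FALSE)
```

Trimming before writing the CSV means later joins on the drug name compare clean strings rather than padded ones.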

- Fixed script to remove whitespace from character fields in part_b_spend_clean.csv
- Fixed data_definitions.csv to match variable names and order in part_b_spend_clean.csv
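
A check like the following could guard against the definitions file drifting out of sync with the data again. This is only a sketch: the in-memory data frames stand in for the two CSV files, and the `variable` and `definition` column names are assumptions, not the actual layout of data_definitions.csv.

```r
# Hypothetical stand-ins for the two files; in practice these would come from read.csv().
clean <- data.frame(
  drug_name = "Drug A",
  total_spend = 100.5,
  stringsAsFactors = FALSE
)
defs <- data.frame(
  variable = c("drug_name", "total_spend"),            # assumed column name
  definition = c("Name of the drug", "Total spending"), # assumed column name
  stringsAsFactors = FALSE
)

# identical() is TRUE only when the names agree and appear in the same order.
names_match <- identical(defs$variable, names(clean))
stopifnot(names_match)
```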
@sgalletta213
Contributor Author

Closing this pull request to remove whitespace and fix data_definitions

@sgalletta213 sgalletta213 reopened this Feb 1, 2017
@jenniferthompson
Contributor

Awesome!

@jenniferthompson jenniferthompson merged commit 6c81f3c into Data4Democracy:master Feb 1, 2017