Detect and remove IDs disguised in text features #415

Closed
44 commits
681d98a
starter code
TuanNguyen27 Sep 25, 2019
ef7cdfb
fix weird compilation error
TuanNguyen27 Sep 25, 2019
6a57693
fix some tests
TuanNguyen27 Sep 25, 2019
bb858c7
fix more errors resulting from removing moments calculation
TuanNguyen27 Sep 25, 2019
cae7fe4
Update ModelInsightsTest.scala
TuanNguyen27 Sep 25, 2019
aa55539
Update FeatureDistributionTest.scala
TuanNguyen27 Sep 25, 2019
7950924
add new rules to remove raw feature based on topK & starter code on t…
TuanNguyen27 Sep 27, 2019
8f3befe
fix scala style
TuanNguyen27 Sep 27, 2019
9ed2a02
more code
TuanNguyen27 Sep 27, 2019
90350d0
fix more style error
TuanNguyen27 Sep 28, 2019
8640d6d
adding isID as an exclusion criteria
TuanNguyen27 Sep 30, 2019
91285b6
fix scala style
TuanNguyen27 Oct 1, 2019
61d26b1
bunch of broken tests
TuanNguyen27 Oct 1, 2019
a7a0781
move IdDetect app to hw
TuanNguyen27 Oct 1, 2019
6c887f3
try modify titanic instead
TuanNguyen27 Oct 1, 2019
7829dfd
add app
TuanNguyen27 Oct 1, 2019
b611289
switch to a different metric
TuanNguyen27 Oct 7, 2019
a30eca6
remove extra calculations
TuanNguyen27 Oct 7, 2019
c0ceaa6
remove more stuff
TuanNguyen27 Oct 7, 2019
b3930dc
fix naming issue
TuanNguyen27 Oct 8, 2019
33afe00
Update IdDetectTest.scala
TuanNguyen27 Oct 8, 2019
b7f050b
Update FeatureDistributionTest.scala
TuanNguyen27 Oct 8, 2019
d79456e
finishing up RFF
TuanNguyen27 Oct 8, 2019
b99b395
update default so that tests will pass
TuanNguyen27 Oct 8, 2019
ac8757e
Update OpWorkflow.scala
TuanNguyen27 Oct 8, 2019
2a4ccbc
Update OpWorkflow.scala
TuanNguyen27 Oct 8, 2019
febdc13
Update OpTitanicSimple.scala
TuanNguyen27 Oct 8, 2019
0c016b8
Update RawFeatureFilter.scala
TuanNguyen27 Oct 8, 2019
93267ab
Merge branch 'master' into ID_detect
TuanNguyen27 Oct 8, 2019
d73f3c5
new transformer wip
TuanNguyen27 Oct 10, 2019
7c1f262
Merge branch 'ID_detect' of https://github.com/salesforce/Transmogrif…
TuanNguyen27 Oct 10, 2019
88b5867
Update FeatureDistributionTest.scala
TuanNguyen27 Oct 10, 2019
a133392
added transformer for map
TuanNguyen27 Oct 10, 2019
095a180
Update FeatureDistribution.scala
TuanNguyen27 Oct 10, 2019
7031d16
Merge branch 'master' into ID_detect
TuanNguyen27 Oct 10, 2019
1e767c1
more updates
TuanNguyen27 Oct 10, 2019
72ce224
more
TuanNguyen27 Oct 10, 2019
df5562e
fix unecessary changes
TuanNguyen27 Oct 10, 2019
c3cc3b0
more updates
TuanNguyen27 Oct 10, 2019
b022921
Delete IdDetectTest.scala
TuanNguyen27 Oct 10, 2019
9bad875
more fix
TuanNguyen27 Oct 10, 2019
9f7bc99
Update FeatureDistribution.scala
TuanNguyen27 Oct 10, 2019
126ddcd
fix unit tests
TuanNguyen27 Oct 10, 2019
7753280
Update SmartTextVectorizerTest.scala
TuanNguyen27 Oct 11, 2019
Merge branch 'master' into ID_detect
TuanNguyen27 committed Oct 10, 2019
commit 7031d16c8344d96d670d243d4738760da632def6
@@ -192,7 +192,7 @@ class FeatureDistributionTest extends FlatSpec with PassengerSparkFixtureTest wi
   it should "have toString" in {
     FeatureDistribution("A", None, 10, 1, Array(1, 4, 0, 0, 6), Array.empty).toString() shouldBe
       "FeatureDistribution(type = Training, name = A, key = None, count = 10, nulls = 1, " +
-      "distribution = [1.0,4.0,0.0,0.0,6.0], summaryInfo = [], cardinality = , rawFeatureType = )"
+      "distribution = [1.0,4.0,0.0,0.0,6.0], summaryInfo = [])"
   }

   it should "marshall to/from json" in {
You are viewing a condensed version of this merge commit.