Commit

only show the distribution for top-10 tokens
zzachw committed Nov 10, 2022
1 parent ae67753 commit 02ae89f
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion pyhealth/datasets/base_dataset.py
@@ -501,8 +501,10 @@ def task_stat(self) -> str:
                 f"{sum(num_events) / len(num_events):.4f}")
             lines.append(
                 f"\t\t- Number of unique {key}: {len(self.get_all_tokens(key))}")
+            distribution = self.get_distribution_tokens(key)
+            top10 = sorted(distribution.items(), key=lambda x: x[1], reverse=True)[:10]
             lines.append(
-                f"\t\t- Distribution of {key}: {self.get_distribution_tokens(key)}")
+                f"\t\t- Distribution of {key} (Top-10): {top10}")
         return "\n".join(lines)
 
     @staticmethod
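
For readers who want to try the top-10 selection from this change in isolation, here is a minimal sketch. It assumes `get_distribution_tokens` returns a plain dict mapping each token to its count (the diff only shows that the result supports `.items()` with sortable values). The helper name `top_k_tokens` and the sample counts are hypothetical and are not part of the commit.

```python
from collections import Counter


def top_k_tokens(distribution, k=10):
    """Return the k most frequent (token, count) pairs, most frequent first.

    Hypothetical helper: mirrors the sorted(...)[:10] expression in the diff,
    with the cutoff exposed as a parameter.
    """
    return sorted(distribution.items(), key=lambda x: x[1], reverse=True)[:k]


# Made-up token counts standing in for the output of get_distribution_tokens
# (assumed here to be a dict of token -> count).
distribution = {"428.0": 12, "401.9": 30, "250.00": 7, "272.4": 3}

top2 = top_k_tokens(distribution, k=2)

# collections.Counter offers the same selection through most_common.
top2_counter = Counter(distribution).most_common(2)
```

Slicing after a full sort is simple and fine for small vocabularies. `Counter.most_common` is a possible alternative when the mapping is large and only a few entries are needed.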