Tags: modin-project/modin

0.32.0

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.32.0

This release introduces initial support for the Polars API, a new native query compiler for small data,
more functions that can use dynamic partitioning, and several bug fixes.

Key Features and Updates Since 0.31.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#0000: Fix type hint (#7343)
  * FIX-#7113: Fix docstring overrides for subclasses. (#7354)
  * FIX-#7134: Use a separate docstring class for BasePandasDataset. (#7353)
  * FIX-#7329: Do not sort columns on df.update (#7330)
  * FIX-#7351: Add ipython method calls to non-lookup list (#7352)
  * FIX-#7355: Cpu count would be set incorrectly on a cluster (#7356)
  * FIX-#7357: Fix `NoAttributeError` on `DataFrame.copy` (#7358)
  * FIX-#7371: Fix inserting datelike values into a DataFrame (#7372)
  * FIX-#7373: Try a previous version of `motoserver/moto` service, pin to 5.0.13 (#7374)
  * FIX-#7379: Fix __imul__ performing addition instead of multiplication (#7380)
  * FIX-#7387: Limit the number of pytest workers for tests with Ray engine on Windows (#7388)
  * FIX-#7389: Fix uploading artifacts (#7390)
* Refactor Codebase
  * REFACTOR-#0000: Update copyright date (#7333)
* Documentation improvements
  * DOCS-#0000: Update RunLLM Ask AI widget script path (#7345)
  * DOCS-#7335: Fix broken links in Modin Usage Examples page (#7336)
  * DOCS-#7382: Add documentation on how to use Modin Native query compiler (#7386)
* New Features
  * FEAT-#4605: Add native query compiler (#7259)
  * FEAT-#7308: Interoperability between query compilers (#7376)
  * FEAT-#7331: Initial Polars API (#7332)
  * FEAT-#7337: Using dynamic partitioning in `broadcast_apply` (#7338)
  * FEAT-#7340: Add more granular lazy flags to query compiler (#7348)
  * FEAT-#7368: Add a new environment variable for using dynamic partitioning (#7369)
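
FIX-#7379 in the list above concerns in-place multiplication. A minimal regression check might look like the sketch below. It uses plain pandas so that it runs without a Modin installation, on the assumption that the same statements behave identically after switching the import to `modin.pandas`. The frame contents and variable names are illustrative and are not taken from the Modin test suite.

```python
import pandas as pd  # assumption: swap for `import modin.pandas as pd` to exercise Modin

# Values chosen so that multiplication and addition give different results.
df = pd.DataFrame({"a": [1, 2, 3]})
df *= 3  # dispatches to DataFrame.__imul__

result = df["a"].tolist()
# Multiplication yields [3, 6, 9]; an addition bug would have yielded [4, 5, 6].
assert result == [3, 6, 9]
```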

Contributors
------------
@MortalHappiness
@Retribution98
@YarShev
@ZhipengXue97
@anmyachev
@arunjose696
@devin-petersohn
@likawind
@sfc-gh-joshi
@sfc-gh-mvashishtha

0.31.0

anmyachev Anatoly Myachev
Modin 0.31.0

First release compatible with NumPy 2.0.

Key Features and Updates Since 0.30.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7138: Stop reloading modules for custom docstrings. (#7307)
  * FIX-#7263: Empty docstrings should not be inherited (#7264)
  * FIX-#7272: Remove HDK engine (#7275)
  * FIX-#7277: Remove Cudf storage format as unmaintained (#7290)
  * FIX-#7278: Make sure `enable_logging` decorator preserves type hints (#7279)
  * FIX-#7292: Prepare Modin code for NumPy 2.0 (#7293)
  * FIX-#7295: Unpin numexpr to allow versions >= 2.8.4 to match pandas (#7296)
  * FIX-#7309: Update versioneer with `versioneer install --vendor` (#7311)
  * FIX-#7320: Bump the github-actions group with 3 updates (#7319)
  * FIX-#7321: Using 'C' engine instead of 'pyarrow' for getting metadata in 'read_csv' (#7322)
* Performance enhancements
  * PERF-#7299: Avoid using `synchronize_labels` for `combine` function (#7300)
* Refactor Codebase
  * REFACTOR-#7271: Remove `instance_type` attribute of axis partitions (#7268)
  * REFACTOR-#7273: Remove deprecated functions from utils.py, accessor.py and io.py (#7274)
  * REFACTOR-#7285: Remove deprecated configs (#7286)
  * REFACTOR-#7294: Reduce access to `_modin_frame` methods from `_query_compiler` (#7297)
  * REFACTOR-#7313: Add similar methods as in #7294 for operating on columns (#7314)
* Update testing suite
  * TEST-#0000: Add a Dependabot config to auto-update GitHub action versions (#7318)
  * TEST-#7316: Run a subset of CI tests with python 3.10 and 3.11 on a scheduled basis (#7289)
* Documentation improvements
  * DOCS-#0000: Adds RunLLM widget to docs (#7326)
  * DOCS-#7287: Update Modin on Dask documentation (#7288)
* New Features
  * FEAT-#6574: UserWarning no longer displayed when Series/DataFrames are small (#7323)
  * FEAT-#7249: Add `reload_modin` feature (#7280)
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)
  * FEAT-#7283: Introduce MinRowPartitionSize and MinColumnPartitionSize (#7284)
  * FEAT-#7310: NumPy 2.0 support (#7312)

Contributors
------------
@Jayson729
@Retribution98
@YarShev
@anmyachev
@arunjose696
@kurtmckee
@sfc-gh-dpetersohn
@vsreekanti

0.30.1

anmyachev Anatoly Myachev
Modin 0.30.1

This release pins numpy<2.

Key Features and Updates Since 0.30.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)
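
The `numpy<2` pin constrains only the major version. As a rough illustration (this is not Modin's packaging code), the hypothetical helper below checks whether a plain release string satisfies that constraint using only the standard library. The function name and its handling of version strings are assumptions made for this sketch.

```python
def satisfies_numpy_pin(version: str) -> bool:
    """Return True if `version` is allowed by the constraint `numpy<2`.

    Hypothetical helper for illustration only. It reads the leading
    major component, so it expects strings such as "1.26.4" or "2.0.0".
    """
    major = int(version.split(".")[0])
    return major < 2


allowed = satisfies_numpy_pin("1.26.4")
rejected = satisfies_numpy_pin("2.0.0")
```

For real dependency resolution, a full version-specifier parser is the safer choice; the sketch only shows what the pin means.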

Contributors
------------

@anmyachev

0.29.1

anmyachev Anatoly Myachev
Modin 0.29.1

This release pins numpy<2.

Key Features and Updates Since 0.29.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@sfc-gh-dpetersohn

0.28.3

anmyachev Anatoly Myachev
Modin 0.28.3

This release pins numpy<2.

Key Features and Updates Since 0.28.2
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@sfc-gh-dpetersohn

0.27.1

anmyachev Anatoly Myachev
Modin 0.27.1

This release pins numpy<2.

Key Features and Updates Since 0.27.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#6968: Align API with pandas (#6969)
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@dchigarev
@sfc-gh-dpetersohn

0.30.0

anmyachev Anatoly Myachev
Modin 0.30.0

This release introduces support for the DataFrame API standard, a distributed implementation of right merge/join,
a more efficient implementation of internal operators that gives a performance boost to almost all distributed Modin functions,
improved compatibility with pandas on the pyarrow backend, and type hints for the pandas API to improve UX.

Key Features and Updates Since 0.29.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#0000: Fix badge in README.md (#7213)
  * FIX-#0000: Make merge tests more stable by sorting results (#7266)
  * FIX-#6967: Remove read_pickle_distributed/to_pickle_distributed functions as deprecated (#7258)
  * FIX-#7093: Make sure 'idxmax' and 'idxmin' can work with string columns (#7193)
  * FIX-#7102: Remove `enable_api_only` mode in modin logging (#7194)
  * FIX-#7103: Move lower-level functionality logging to debug (#7184)
  * FIX-#7143: Constructing a DataFrame from a Modin Series with tuple name should produce MultiIndex columns (#7214)
  * FIX-#7185: Add extra check for some config classes (#7189)
  * FIX-#7201: Update docs on how to enable Modin logs for high-level API and low-level API (#7209)
  * FIX-#7206: Make sure df.melt handles duplicate value_vars correctly (#7208)
  * FIX-#7219: Pin dataframe-api-compat>=0.2.7 (#7220)
  * FIX-#7221: Don't use 'use_legacy_dataset=False' for 'ParquetDataset' (#7222)
  * FIX-#7224: Importing modin.pandas.api.extensions overwrites re-export of pandas.api submodules (#7225)
  * FIX-#7233: Display property name in default_to_pandas error messages (#7269)
  * FIX-#7234: Deprecate HDK engine (#7235)
  * FIX-#7238: Fix docstring inheritance for `cached_property` and use it (#7239)
  * FIX-#7240: Allow `doc_checker.py` to work with `functools.cached_property` (#7241)
  * FIX-#7246: Pin pyarrow>=10.0.1 as pandas 2.2.* does (#7247)
  * FIX-#7248: Make sure '_validate_dtypes_sum_prod_mean' works correctly with datetime types (#7237)
  * FIX-#7250: Revert "PERF-#6666: Avoid internal reset_index for left merge" (#7251)
* Performance enhancements
  * PERF-#7227: Call 'modin_frame.combine()' for merge and join only when necessary (#7228)
  * PERF-#7230: Don't preserve bad partition for 'merge' (#7229)
* Refactor Codebase
  * REFACTOR-#7242: Add type hints for `modin/core/dataframe/algebra/` (#7243)
  * REFACTOR-#7260: Use `extract_dtype` internal function in more places (#7261)
* Update testing suite
  * TEST-#7049: Add some sanity tests with pyarrow-backed pandas dataframes (#7199)
  * TEST-#7191: Fix ASV after changing default branch (#7190)
* Documentation improvements
  * DOCS-#0000: Fix a typo with MODIN_CPUS number (#7198)
  * DOCS-#0000: Supplement Optimization Notes with a link to configs (#7197)
  * DOCS-#7217: Update docs as to when Modin operators work best (#7218)
  * DOCS-#7255: Update docs as to from_* functions (#7256)
* New Features
  * FEAT-#5394: Reduce amount of remote calls for Map operator (#7136)
  * FEAT-#5394: Reduce amount of remote calls for TreeReduce and GroupByReduce operators (#7245)
  * FEAT-#6492: Add `from_map` feature to create dataframe (#7215)
  * FEAT-#6498: Make Fold operator more flexible (#7257)
  * FEAT-#6808: Implement '__arrow_array__' for Series (#7200)
  * FEAT-#6890: Modin implementation of DataFrame API standard (#7216)
  * FEAT-#7139: Use ray-core instead of ray-default (#6955)
  * FEAT-#7187: Change "master" branch to "main" (#7188)
  * FEAT-#7202: Use custom resources for Ray (#7205)
  * FEAT-#7203: Make sure Modin works correctly with pandas, which uses pyarrow as a backend (#7204)
  * FEAT-#7207: Add the ability to assign a df to a columns selection without d2p (#7210)
  * FEAT-#7252: Add type hints for `base.py` (#7253)
  * FEAT-#7254: Support right merge/join (#7226)
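
FEAT-#7254 in the list above adds support for right merge/join. For readers unfamiliar with the semantics, the sketch below shows what a right merge returns. It is written against plain pandas, on the assumption that `modin.pandas` accepts the same call; the frames and column names are made up for illustration.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` accepts the same call

left = pd.DataFrame({"key": [1, 2], "a": [10, 20]})
right = pd.DataFrame({"key": [2, 3], "b": [200, 300]})

# A right merge keeps every row of `right`; values from `left` with no
# matching key are filled with NaN.
merged = left.merge(right, on="key", how="right")

keys = merged["key"].tolist()            # keys come from `right`
missing_a = merged["a"].isna().tolist()  # key 3 has no match in `left`
```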

Contributors
------------
@Retribution98
@YarShev
@anmyachev
@arunjose696
@noloerino
@sfc-gh-jkew

0.29.0

anmyachev Anatoly Myachev
Modin 0.29.0

This release introduces the `modin.pandas.testing` and `modin.pandas.arrays` modules, a faster implementation (range-partitioning) for the
`pivot_table`, `unique`, `drop_duplicates`, `nunique`, and `df.resample` functions, new functions to interact with Dask (`to/from_dask`),
a distributed implementation of `Series.case_when`, and an optimization for the `astype` function with a scalar dtype.
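
The `astype` optimization mentioned above applies to calls that pass one scalar dtype rather than a per-column mapping. The sketch below contrasts the two call shapes. It uses plain pandas, on the assumption that `modin.pandas` exposes the same signature, and the data is illustrative.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` exposes the same signature

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Scalar dtype: every column is cast to the same type.
scalar_cast = df.astype("float64")

# Mapping: only the listed columns are cast; "b" keeps its integer dtype.
mapping_cast = df.astype({"a": "float64"})

scalar_dtypes = scalar_cast.dtypes.astype(str).tolist()
mapping_dtypes = mapping_cast.dtypes.astype(str).tolist()
```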

Key Features and Updates Since 0.28.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#6227: Make sure `Series.unique()` with pyarrow dtype returns `ArrowExtensionArray` (#7042)
  * FIX-#6793: Use 'pandas_dtype' instead of 'np.dtype' for some more places in Modin code (#6794)
  * FIX-#7039: Pass scalar dtype as is to astype query compiler (#7152)
  * FIX-#7051: Update exception message for 'astype' function (#7052)
  * FIX-#7054: Update exception message for `shift` function (#7055)
  * FIX-#7056: Update exception message for `iloc/loc` functions (#7057)
  * FIX-#7058: Update exception message for `insert` function (#7059)
  * FIX-#7060: Fix 'pivot' when index or columns are of Index type (#7061)
  * FIX-#7062: Update exception message for `aggregate` function (#7063)
  * FIX-#7072: Replace MaterializationHook with the materialized object on serialization. (#7075)
  * FIX-#7088: Make sure `rank` raises `No axis named None...` exception (#7089)
  * FIX-#7115: Exclude Ray 2.10.0 from deps installation (#7116)
  * FIX-#7135: Fix appending a new row (#7172)
  * FIX-#7153: Fix 'Series.corr' with 'method != pearson' (#7158)
  * FIX-#7157: Make sure `quantile` function works with `numeric_only=True` (#7160)
  * FIX-#7170: Don't use `MinPartitionSize` configuration variable in remote context (#7177)
* Performance enhancements
  * PERF-#5296: Partition parquet file if it has too few row groups (#7016)
  * PERF-#7068: Provide shape_hint="column" for some more operations with Series (#7069)
  * PERF-#7123: Preserve shape_hint for dropna (#7124)
  * PERF-#7130: Preserve partition lengths in apply_full_axis with keep_partitioning=True (#7131)
  * PERF-#7132: Preserve partition lengths in apply_full_axis with keep_partitioning=False (#7133)
  * PERF-#7150: Reduce peak memory consumption (#7149)
* Refactor Codebase
  * REFACTOR-#3257: Move logging and caching to the `gen_data` internal function (#7046)
  * REFACTOR-#7105: Deprecate 'cfg.RangePartitioningGroupby' (#7161)
  * REFACTOR-#7106: Rename from/to_ray_dataset to from/to_ray (#7107)
  * REFACTOR-#7109: Remove the outdated aws_example.yaml file. (#7110)
* Update testing suite
  * TEST-#3622: Centralize tests in Modin (#7137)
  * TEST-#6016: Make sure `eval_general` doesn't expect exceptions by default (#6954)
  * TEST-#7064: Explicitly check for exceptions in `test_groupby.py` (#7065)
  * TEST-#7066: Explicitly check for exceptions in `test_io.py` (#7067)
  * TEST-#7073: Explicitly check for exceptions in `test_default.py` (#7074)
  * TEST-#7076: Explicitly check for exceptions in `test_map_metadata.py` (#7077)
  * TEST-#7082: Explicitly check for exceptions in 'test_series.py' (#7083)
  * TEST-#7084: Explicitly check for exceptions in 'test_indexing.py' (#7085)
  * TEST-#7086: Explicitly check for exceptions in `test_reduce.py` (#7087)
  * TEST-#7094: Rename 'raising_exceptions' argument of 'eval_general' testing function (#7095)
  * TEST-#7125: Explicitly install modin in ci tests (#7126)
  * TEST-#7165: Add codecov token to fix CI on master (#7175)
  * TEST-