Tags: pyrito/modin

0.15.0

Verified

This tag was signed with the committer’s verified signature.
RehanSD Rehan Sohail Durrani
Modin 0.15.0

This release includes updated support for pandas 1.4.2, new Batch and Logging APIs, and a plethora
of bug fixes and documentation improvements.

Key Features and Updates
------------------------

* Stability and Bugfixes
  * FIX-modin-project#4376: Upgrade pandas to 1.4.2 (modin-project#4377)
  * FIX-modin-project#3615: Relax some deps in development env (modin-project#4365)
  * FIX-modin-project#4370: Fix broken docstring links (modin-project#4375)
  * FIX-modin-project#4392: Align Modin XGBoost with xgb>=1.6 (modin-project#4393)
  * FIX-modin-project#4385: Get rid of `use-deprecated` option in `pip` (modin-project#4386)
  * FIX-modin-project#3527: Fix parquet partitioning issue causing negative row length partitions (modin-project#4368)
  * FIX-modin-project#4330: Override the memory limit to start ray 1.11.0 on Macs (modin-project#4335)
  * FIX-modin-project#4407: Align `insert` function with pandas in case of numpy array with several columns (modin-project#4408)
  * FIX-modin-project#4373: Fix invalid file path when trying `read_csv_glob` with `usecols` parameter (modin-project#4405)
  * FIX-modin-project#4394: Fix issue with multiindex metadata desync (modin-project#4395)
  * FIX-modin-project#4438: Fix `reindex` function that doesn't preserve initial index metadata (modin-project#4442)
  * FIX-modin-project#4425: Add parameters to groupby pct_change (modin-project#4429)
  * FIX-modin-project#4457: Fix `loc` in case when need reindex item (modin-project#4457)
  * FIX-modin-project#4414: Add missing f prefix on f-strings found at https://codereview.doctor (modin-project#4415)
  * FIX-modin-project#4461: Fix S3 CSV data path (modin-project#4462)
  * FIX-modin-project#4467: `drop_duplicates` no longer removes items based on index values (modin-project#4468)
  * FIX-modin-project#4449: Drain the call queue before waiting on result in benchmark mode (modin-project#4472)
  * FIX-modin-project#4518: Fix Modin Logging to report specific Modin warnings/errors (modin-project#4519)
  * FIX-modin-project#4481: Allow clipping with a Modin Series of bounds (modin-project#4486)
  * FIX-modin-project#4504: Support na_action in applymap (modin-project#4505)
  * FIX-modin-project#4503: Stop the memory logging thread after session exit (modin-project#4515)
  * FIX-modin-project#4531: Fix a makedirs race condition in to_parquet (modin-project#4533)
  * FIX-modin-project#4464: Refactor Ray utils and quick fix groupby.count failing on virtual partitions (modin-project#4490)
  * FIX-modin-project#4436: Fix to_pydatetime dtype for timezone None (modin-project#4437)
  * FIX-modin-project#4541: Fix merge_asof with non-unique right index (modin-project#4542)
* Performance enhancements
  * FEAT-modin-project#4320: Add connectorx as an alternative engine for read_sql (modin-project#4346)
  * PERF-modin-project#4493: Use partition size caches more in Modin dataframe (modin-project#4495)
* Benchmarking enhancements
  * FEAT-modin-project#4371: Add logging to Modin (modin-project#4372)
  * FEAT-modin-project#4501: Add RSS Memory Profiling to Modin Logging (modin-project#4502)
  * FEAT-modin-project#4524: Split Modin API and Memory log files (modin-project#4526)
* Refactor Codebase
  * REFACTOR-modin-project#4284: use variable length unpacking when getting results from `deploy` function (modin-project#4285)
  * REFACTOR-modin-project#3642: Move PyArrow storage format usage from main feature to experimental ones (modin-project#4374)
  * REFACTOR-modin-project#4003: Delete the deprecated cloud mortgage example (modin-project#4406)
  * REFACTOR-modin-project#4513: Fix spelling mistakes in docs and docstrings (modin-project#4514)
  * REFACTOR-modin-project#4510: Align experimental and regular IO modules initializations (modin-project#4511)
* Developer API enhancements
  * FEAT-modin-project#4359: Add __dataframe__ method to the protocol dataframe (modin-project#4360)
* Update testing suite
  * TEST-modin-project#4363: Use Ray from pypi in CI (modin-project#4364)
  * FIX-modin-project#4422: get rid of case sensitivity for `warns_that_defaulting_to_pandas` (modin-project#4423)
  * TEST-modin-project#4426: Stop passing is_default kwarg to Modin and pandas (modin-project#4428)
  * FIX-modin-project#4439: Fix flake8 CI fail (modin-project#4440)
  * FIX-modin-project#4409: Fix `eval_insert` utility that doesn't actually check results of `insert` function (modin-project#4410)
  * TEST-modin-project#4482: Fix getitem and loc with series of bools (modin-project#4483).
* Documentation improvements
  * DOCS-modin-project#4296: Fix docs warnings (modin-project#4297)
  * DOCS-modin-project#4388: Turn off fail_on_warning option for docs build (modin-project#4389)
  * DOCS-modin-project#4469: Say that commit messages can start with PERF (modin-project#4470).
  * DOCS-modin-project#4466: Recommend GitHub issues over [email protected] (modin-project#4474).
  * DOCS-modin-project#4487: Recommend GitHub issues over [email protected] (modin-project#4489).
* Dependencies
  * FIX-modin-project#4327: Update min pin for xgboost version (modin-project#4328)
  * FIX-modin-project#4383: Remove `pathlib` from deps (modin-project#4384)
  * FIX-modin-project#4390: Add `redis` to Modin dependencies (modin-project#4396)
  * FIX-modin-project#3689: Add black and flake8 into development environment files (modin-project#4480)
  * TEST-modin-project#4516: Add numpydoc to developer requirements (modin-project#4517)
* New Features
  * FEAT-modin-project#4412: Add Batch Pipeline API to Modin (modin-project#4452)
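
The entries above follow a regular `TYPE-modin-project#<issue>: <summary> (modin-project#<PR>)` shape, which makes them easy to process mechanically. The following is a minimal sketch, using only the Python standard library, of how such lines could be parsed and tallied by type. The names `ENTRY_RE`, `Entry`, and `parse_entry` are hypothetical helpers written for this illustration and are not part of Modin.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# Hypothetical pattern for lines such as
# "  * FIX-modin-project#4376: Upgrade pandas to 1.4.2 (modin-project#4377)".
ENTRY_RE = re.compile(
    r"^\s*\*?\s*"                                   # optional bullet marker
    r"(?P<kind>[A-Z]+)-modin-project#(?P<issue>\d+):\s*"
    r"(?P<summary>.+?)\s*"                          # free-text summary
    r"\(modin-project#(?P<pr>\d+)\)\.?\s*$"         # PR reference, optional period
)


class Entry(NamedTuple):
    kind: str      # e.g. FIX, FEAT, PERF, DOCS, TEST, REFACTOR
    issue: int     # issue number the change addresses
    summary: str   # human-readable description
    pr: int        # pull request that made the change


def parse_entry(line: str) -> Optional[Entry]:
    """Return an Entry for a single-issue changelog line, or None otherwise."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return Entry(m["kind"], int(m["issue"]), m["summary"], int(m["pr"]))


lines = [
    "  * FIX-modin-project#4376: Upgrade pandas to 1.4.2 (modin-project#4377)",
    "  * PERF-modin-project#4493: Use partition size caches more in Modin dataframe (modin-project#4495)",
    "  * FEAT-modin-project#4412: Add Batch Pipeline API to Modin (modin-project#4452)",
    "* Performance enhancements",  # section heading, not an entry
]

entries = [e for e in map(parse_entry, lines) if e is not None]
counts = Counter(e.kind for e in entries)
```

Section headings and entries that reference several issues at once do not match this pattern and are skipped; handling those would need a looser expression.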

Contributors
------------
@YarShev
@Garra1980
@prutskov
@alexander3774
@amyskov
@wangxiaoying
@jeffreykennethli
@mvashishtha
@anmyachev
@dchigarev
@devin-petersohn
@jrsacher
@orcahmlee
@naren-ponder
@RehanSD

0.8.3.post0

Verified

This tag was signed with the committer’s verified signature.
devin-petersohn Devin Petersohn
Modin 0.8.3.post0

This release contains minor fixes to database connections
and improves how data is inserted into databases. This release
is not intended for general use.

0.14.1

Verified

This tag was signed with the committer’s verified signature.
devin-petersohn Devin Petersohn
Modin 0.14.1

This release contains a few key bugfixes and a pandas version update.

Key Features and Updates
------------------------
* FIX-modin-project#4376: Upgrade pandas to 1.4.2 (modin-project#4377)
* FIX-modin-project#4390: Add redis to Modin dependencies (modin-project#4396)
* FIX-modin-project#3527: Fix parquet partitioning issue causing negative row length partitions (modin-project#4368)
* FIX-modin-project#4330: Override the memory limit to start ray 1.11.0 on Macs. (modin-project#4335)
* FIX-modin-project#4394: Fix issue with multiindex metadata desync (modin-project#4395)
* FIX-modin-project#4373: fix usage of 'read_csv_glob' with 'usecols' parameter (modin-project#4405)
* FIX-modin-project#4425: Add parameters to groupby pct_change. (modin-project#4429)

0.14.0

Modin 0.14.0

This release contains significant upgrades to the Developer API and to Modin's documentation,
along with some codebase refactoring, performance enhancements, and multiple bugfixes.

Key Features and Updates
------------------------

* Stability and Bugfixes
  * FIX-modin-project#4058: Allow pickling empty dataframes and series (modin-project#4095)
  * FIX-modin-project#4136: Fix exercise_3.ipynb example notebook (modin-project#4137)
  * FIX-modin-project#4105: Fix names of pandas options to avoid `OptionError` (modin-project#4109)
  * FIX-modin-project#3417: Fix read_csv with skiprows and header parameters (modin-project#3419)
  * FIX-modin-project#4142: Fix OmniSci enabling (modin-project#4146)
  * FIX-modin-project#4162: Use `skipif` instead of `skip` for compatibility with pytest 7.0 (modin-project#4163)
  * FIX-modin-project#4158: Do not print OmniSci logs to stdout by default (modin-project#4159)
  * FIX-modin-project#4177: Support read_feather from pathlike objects (modin-project#4177)
  * FIX-modin-project#4234: Upgrade pandas to 1.4.1 (modin-project#4235)
  * FIX-modin-project#3368: support unsigned integers in OmniSci backend (modin-project#4256)
  * FIX-modin-project#4057: Allow reading an empty parquet file (modin-project#4075)
  * FIX-modin-project#3884: Fix read_excel() dropping empty rows (modin-project#4161)
  * FIX-modin-project#4257: Fix Categorical() for scalar categories (modin-project#4258)
  * FIX-modin-project#4300: Fix Modin Categorical column dtype categories (modin-project#4276)
  * FIX-modin-project#4208: Fix lazy metadata update for `PandasDataFrame.from_labels` (modin-project#4209)
  * FIX-modin-project#3981, FIX-modin-project#3801, FIX-modin-project#4149: Stop broadcasting scalars to set items (modin-project#4160)
  * FIX-modin-project#4185: Fix rolling across column partitions (modin-project#4262)
  * FIX-modin-project#4303: Fix the syntax error in reading from postgres (modin-project#4304)
  * FIX-modin-project#4308: Add proper error handling in df.set_index (modin-project#4309)
  * FIX-modin-project#4056: Allow an empty parse_date list in `read_csv_glob` (modin-project#4074)
  * FIX-modin-project#4312: Fix constructing categorical frame with duplicate column names (modin-project#4313).
  * FIX-modin-project#4314: Allow passing a series of dtypes to astype (modin-project#4318)
  * FIX-modin-project#4310: Handle lists of lists of ints in read_csv_glob (modin-project#4319)
* Performance enhancements
  * FIX-modin-project#4138, FIX-modin-project#4009: remove redundant sorting in the internal '.mask()' flow (modin-project#4140)
  * FIX-modin-project#4183: Stop shallow copies from creating global shared state. (modin-project#4184)
* Benchmarking enhancements
  * FIX-modin-project#4221: add `wait` method for `PandasOnRayDataframeColumnPartition` class (modin-project#4231)
* Refactor Codebase
  * REFACTOR-modin-project#3990: remove code duplication in `PandasDataframePartition` hierarchy (modin-project#3991)
  * REFACTOR-modin-project#4229: remove unused `dask_client` global variable in `modin/pandas/__init__.py` (modin-project#4230)
  * REFACTOR-modin-project#3997: remove code duplication for `broadcast_apply` method (modin-project#3996)
  * REFACTOR-modin-project#3994: remove code duplication for `get_indices` function (modin-project#3995)
  * REFACTOR-modin-project#4331: remove code duplication for `to_pandas`, `to_numpy` functions in `QueryCompiler` hierarchy (modin-project#4332)
  * REFACTOR-modin-project#4213: Refactor `modin/examples/tutorial/` directory (modin-project#4214)
  * REFACTOR-modin-project#4206: add assert check into `__init__` method of `PandasOnDaskDataframePartition` class (modin-project#4207)
  * REFACTOR-modin-project#3900: add flake8-no-implicit-concat plugin and refactor flake8 error codes (modin-project#3901)
  * REFACTOR-modin-project#4093: Refactor base to be smaller (modin-project#4220)
  * REFACTOR-modin-project#4047: Rename `cluster` directory to `cloud` in examples (modin-project#4212)
  * REFACTOR-modin-project#3853: interacting with Dask interface through `DaskWrapper` class (modin-project#3854)
  * REFACTOR-modin-project#4322: Move is_reduce_fn outside of groupby_agg (modin-project#4323)
* Pandas API implementations and improvements
  * FEAT-modin-project#3603: add experimental `read_custom_text` function that can read custom line-by-line text files (modin-project#3441)
  * FEAT-modin-project#979: Enable reading from SQL server (modin-project#4279)
* Developer API enhancements
  * FEAT-modin-project#4245: Define base interface for dataframe exchange protocol (modin-project#4246)
  * FEAT-modin-project#4244: Implement dataframe exchange protocol for OmnisciOnNative execution (modin-project#4269)
  * FEAT-modin-project#4144: Implement dataframe exchange protocol for pandas storage format (modin-project#4150)
  * FEAT-modin-project#4342: Support `from_dataframe` for pandas storage format (modin-project#4343)
* Update testing suite
  * TEST-modin-project#3628: Report coverage data for `test-internals` CI job (modin-project#4198)
  * TEST-modin-project#3938: Test tutorial notebooks in CI (modin-project#4145)
  * TEST-modin-project#4153: Fix condition of running lint-commit and set of CI triggers (modin-project#4156)
  * TEST-modin-project#4201: Add read_parquet, explode, tail, and various arithmetic functions to asv_bench (modin-project#4203)
* Documentation improvements
  * DOCS-modin-project#4077: Add release notes template to docs folder (modin-project#4078)
  * DOCS-modin-project#4082: Add pdf/epub/htmlzip formats for doc builds (modin-project#4083)
  * DOCS-modin-project#4168: Fix rendering the examples on troubleshooting page (modin-project#4169)
  * DOCS-modin-project#4151: Add info in troubleshooting page related to Dask engine usage (modin-project#4152)
  * DOCS-modin-project#4172: Refresh Intel Distribution of Modin paragraph (modin-project#4175)
  * DOCS-modin-project#4173: Mention strict channel priority in conda install section (modin-project#4178)
  * DOCS-modin-project#4176: Update OmniSci usage section (modin-project#4192)
  * DOCS-modin-project#4027: Add GIF images and chart to Modin README demonstrating speedups (modin-project#4232)
  * DOCS-modin-project#3954: Add Dask example notebooks (modin-project#4139)
  * DOCS-modin-project#4272: Add bar chart comparisons to quick start guide (modin-project#4277)
  * DOCS-modin-project#3953: Add docs and notebook examples on running Modin with OmniSci (modin-project#4001)
  * DOCS-modin-project#4280: Change links in jupyter notebooks (modin-project#4281)
  * DOCS-modin-project#4290: Add changes for OmniSci notebooks (modin-project#4291)
  * DOCS-modin-project#4241: Update warnings and docs regarding defaulting to pandas (modin-project#4242)
  * DOCS-modin-project#3099: Fix `BasePandasDataSet` docstrings warnings (modin-project#4333)
  * DOCS-modin-project#4339: Reformat I/O functions docstrings (modin-project#4341)
  * DOCS-modin-project#4336: Reformat general utilities docstrings (modin-project#4338)
* Dependencies
  * FIX-modin-project#4113, FIX-modin-project#4116, FIX-modin-project#4115: Apply new `black` formatting, fix pydocstyle check and readthedocs build (modin-project#4114)
  * TEST-modin-project#3227: Use codecov github action instead of bash form in GA workflows (modin-project#3226)
  * FIX-modin-project#4115: Unpin `pip` in readthedocs deps list (modin-project#4170)
  * TEST-modin-project#4217: Pin `Dask<2022.2.0` as a temporary fix of CI (modin-project#4218)

Contributors
------------

@prutskov, @amyskov, @paulovn, @anmyachev, @YarShev, @RehanSD, @devin-petersohn,
@dchigarev, @Garra1980, @mvashishtha, @naren-ponder, @jeffreykennethli, @dorisjlee, @Rubtsowa

0.13.3

Verified

This tag was signed with the committer’s verified signature.
vnlitvinov Vasily Litvinov
Modin 0.13.3

This release contains a few key bugfixes and a pandas version update.

Key Features and Updates
------------------------
* Stability and Bugfixes
  * Stop shallow dataframe copies from creating global shared state (modin-project#4184)
  * Make PandasOnRayDataframeColumnPartition conformant to partition interface (modin-project#4231)
  * Fix lazy metadata update for PandasDataFrame.from_labels (modin-project#4209)
  * Fix Categorical() for scalar categories (modin-project#4258)
  * Fix some cases when assigning a scalar to a subset of dataframe or series. (modin-project#4160)
  * Align read_excel() behaviour on empty rows with pandas 1.3+ (modin-project#4161)
  * Allow reading an empty parquet file. (modin-project#4075)
  * Pin Dask<2022.2.0 as a temporary fix. (modin-project#4218)
  * Add proper error handling in df.set_index. (modin-project#4309)
* Documentation improvements
  * Clarify OmniSci activation in its usage section. (modin-project#4192)
* Upgrade pandas to 1.4.1 (modin-project#4235)

Contributors
------------
@mvashishtha @anmyachev @prutskov @devin-petersohn @naren-ponder @YarShev @Garra1980

0.13.2

Verified

This tag was signed with the committer’s verified signature.
vnlitvinov Vasily Litvinov
Modin 0.13.2

This release contains documentation polishing and small user experience
improvements.

Key Features and Updates
------------------------
* Mention strict channel priority in conda install section (modin-project#4178)
* Refresh Intel Distribution of Modin paragraph (modin-project#4175)
* Add info in troubleshooting page related to Dask engine usage (modin-project#4152)
* Do not print OmniSci logs to stdout by default (modin-project#4159)
* Fix rendering the examples on troubleshooting page (modin-project#4169)
* Use skipif instead of skip for compatibility with pytest 7.0 (modin-project#4163)

Contributors
------------
@RehanSD, @YarShev, @dchigarev, @prutskov, @Garra1980

0.13.1

Verified

This tag was signed with the committer’s verified signature.
RehanSD Rehan Sohail Durrani
Modin 0.13.1

This release contains a few key bugfixes and updates to the documentation.

Key Features and Updates
------------------------
* Stability and Bugfixes
  * FIX-modin-project#4058: Allow pickling empty dataframes and series (modin-project#4095)
  * FIX-modin-project#4105: Fix names of pandas options to avoid `OptionError` (modin-project#4109)
  * FIX-modin-project#4142: Fix OmniSci enabling (modin-project#4146)
* Documentation improvements
  * DOCS-modin-project#4082: Add pdf/epub/htmlzip formats for doc builds (modin-project#4083)
  * DOCS-modin-project#4079: Fix link to `PandasDataframe` in docs (modin-project#4080)

Contributors
------------
@prutskov, @paulovn, @YarShev, @RehanSD, @devin-petersohn,
@mvashishtha

0.13.0

Verified

This tag was signed with the committer’s verified signature.
RehanSD Rehan Sohail Durrani
Modin 0.13.0

This release contains significant upgrades to Modin's documentation,
support for pandas 1.4, new algebra and partitioning layer APIs, and some bugfixes.

Key Features and Updates
------------------------
* Stability and bugfixes
  * Support for subscripting Resampler (1a1edfd)
  * Fix groupby with column name for `by` (a04d7b7)
  * Workaround for groupby with `sort=False` with categorical keys (c67a7c5)
  * Align default value of `REDIS_PASSWORD` with Ray's `DEFAULT_REDIS_PASSWORD` (f79cb85)
  * Fix groupby dictionary aggregation when `by` and columns to aggregate overlap (d42c070)
  * Fix `read_csv` when callables are provided for `skip_rows` parameter (7c84758)
  * Ensure address is not passed to `ray.init` when running Ray in local mode (02a23d4)
  * Ensure that `groupby.indices` returns positional indices (e9c06f2)
  * Fix setting of categorical values (0e36e22)
  * Ensure `df.__getitem__` respects step attribute of slice (7e85c5d)
  * Ensure data argument is delivered to the Dataframe in experimental cloud mode (2f7da1f)
  * Fix assigning to a Series with a single item (0d9d14e)
  * Fix the default to pandas in pd.DataFrame.sparse.from_spmatrix (ab2855b)
  * Fix `apply` result type inference (ac17ca1)
  * Exclude "scripts" from setup package (6224aba)
  * Fix assigning a Categorical to a column (cb4e727)
  * Ensure `df.to_csv` propagates metadata (e.g. index) (154697b)
  * Update `pyarrow` requirement in environment files (b55b08d)
* Performance enhancements
  * Optimize `__getitem__` flow for .loc/.iloc (0947ee8)
  * Delay instantiation of lazy `dtypes` on transpose (cd8db0c)
* Benchmarking enhancements
  * Update benchmarks for groupby that are more representative (0582aa2)
* Refactor Codebase
  * Update CODEOWNERS to reflect repository after refactor (cde6390)
  * Remove duplicate import of `FactoryDispatcher` in Modin experimental pandas IO (2cfabaf)
  * Update Modin to incorporate dataframe algebra (58bbcc3)
* Pandas API implementations and improvements
  * Add support for `storage_options` argument to `read_csv_glob` (7c33afe)
  * Add support for `dropna` argument for `groupby.indices` and `groupby.groups` (144a613)
  * Ensure relabeling Modin Frame does not lose partition shape (3c740db)
  * Update `Series.values` to default to `to_numpy()` (67228ef)
  * Add support for `modin.pandas.show_versions` and `python -m modin --versions` (efe717f)
  * Upgrade pandas support to 1.4 (39fbc57)
* OmniSci enhancements
  * Update benchmarks for groupby that are more representative (9396f23)
  * Update documentation on Native + OmniSci (edc1608)
  * Add support for `getArrowTable()` (6882ec2)
  * Fix segfault during `init` when only OmniSci is present (8c8a6a3)
  * Optimize `append` with default arguments (67013f9)
  * Fix OmniSci engine enabling for IO functions (9d1a334)
* XGBoost enhancements
* Developer API enhancements
  * Add parameter for minimum partition size (1be66d1)
  * Improve documentation for `read_csv_glob` and ensure warning raised if wildcard not in `filepath_or_buffer` (be10ba9)
  * Expand virtual partitioning utility (8d1004f)
* Update testing suite
* Documentation improvements
  * Improve documentation on pandas on Ray execution (b76dc57)
  * Reformat documentation to match pandas documentation theme (cc96f5d)
  * Improve documentation on pandas on Python execution (d590de0)
  * Improve System view in architecture documentation (6d51921)
  * Improve documentation on using pandas on Dask (003f338)
  * Improve documentation on pandas on Dask execution (61bf043)
  * Add documentation on using pandas on Python (195b668)
  * Improve Modin Out of Core documentation (cf426c4)
  * Improve documentation on OmniSci on native execution (689faee)
  * Improve documentation on IO (ffa67c7)
  * Add documentation on factories and parsers (6ca66db)
  * Improve documentation for experimental pandas on Ray execution (20abddd)
  * Improve documentation for `modin.core.dataframe.base` and `modin.core.dataframe.pandas` (cf1e541)
  * Update troubleshooting documentation and add FAQs (cc95ae2)
  * Improve README introduction and installation sections (a632d1f)
  * Update copyright year (7da1dc8)
  * Update a link to `pandas.read_json` (0315823)
  * Improve documentation for Modin vs. Dask (34732cb)
  * Fix links to the contributing page (81a06d6)
  * Remove broken links from supported apis (c04502d)
  * Change docs copyright statement to 'Modin Developers' (ed2a7a4)
  * Rename Developer page to Development in docs (406af7c)
  * Improve "Getting Started" section (4a62bba)
  * Update Modin tutorials (76707bf)
  * Add back quickstart notebook (4dd97ab)
  * Fix links in README and update README and FAQs (5d84042)
  * Update Modin module layout in architecture docs (7fcafa7)
  * Update documentation with new algebra operators and `ModinDataframe` (4b70725)
  * Add usage guide to documentation (4511566)
  * Build docs with Python 3.8 (01c1876)
* Dependencies
  * Update PyArrow to 6.0 and OmniSci to 5.10.1 (018515f)

Contributors
------------
@anmyachev, @prutskov, @Rubtsowa, @vnlitvinov, @dchigarev, @YarShev, @amyskov,
@mvashishtha, @dorisjlee, @devin-petersohn, @jeffreykennethli, @RehanSD,
@novichkovg, @Lozovskii-Aleksandr, @naren-ponder, @ahallermed, @fexolm,
@adityagp, @susmitpy, @ienkovich

0.12.1

Verified

This tag was signed with the committer’s verified signature.
devin-petersohn Devin Petersohn
Modin 0.12.1

This release contains an update to the pandas version and a few bugfixes.

Key Features and Updates
------------------------
* Update supported pandas version to 1.3.5 (b79989a)
* Improvements to groupby
  * Fix `groupby` for case `by` is `None` (40d45c8)
  * Fix handling of dictionary aggregation (29f927b)
  * Return positional indices for Groupby property (c66324d)
* Fix slicing dataframes with `step` property (5651844)
* Fix assignment of data to category column (23dd3f8)

Contributors this release
-------------------------

@Rubtsowa, @prutskov, @dchigarev, @amyskov, @vnlitvinov, @mvashishtha,
@YarShev, @devin-petersohn

0.12.0

Verified

This tag was signed with the committer’s verified signature.
RehanSD Rehan Sohail Durrani
Modin 0.12.0

This release contains a refactor of the codebase that significantly
improves the maintainability of the code, along with a plethora of bugfixes.

This release also introduces a Slack community for Modin users to interact
with Modin developers. Please join us at our [Slack](https://modin.org/slack.html)
to continue the conversation!

Key Features and Updates
------------------------
* Stability and bugfixes
  * Support allowing callables and scalars together in .loc/.iloc (25ea7fd)
  * Ensure .loc with slice and scalar column returns Series (9492878)
  * Fix Modin OmniSci Docker example (b853c51)
  * Ensure Modin OmniSci + Modin Ray Docker containers install packages from conda-forge (032afd6)
  * Determine return type (Series or DataFrame) from one element Series (17ad1f0)
  * Update cloud examples (648b6a0)
  * Fix Modin OmniSci memory leak during `read_csv` (8581ba1)
  * Use `floor` for casting `float` to `int` for OmniSci 5.8.0 (c67a936)
  * Fix .loc on empty DataFrame (2260431)
  * Ensure Modin on Ray does not duplicate writes to disk on `to_csv` when workers die (6178a57)
  * Add support for `storage_options` argument in `read_*` functions except `read_excel` (77a00cc)
  * Ensure Modin Ray correctly raises exceptions when `to_parquet` or `to_csv` fail (8d67cd3)
  * Ensure Modin Ray does not hang when workers crash on `to_csv` (73bf061)
  * Remove platform specific code from `setup.py` to ensure distributions are pure Python (b186e40)
* Refactor Codebase
  * Update import of public index classes to import from `pandas.core.indexes.api` module (488357a)
  * Replace `try...finally` with pytest fixtures (c349a94)
  * Restructure project files (b37bcf8)
  * Use `fsspec` to open files (b8a9c07)
  * Add LGTM Service to CI (b193fef)
  * Remove extraneous `*NUM_THREADS` environment variables from CI (b925625)
  * Update documentation + code + comment language to reflect new project structure (7a81588)
  * Update language to reflect new project structure and add implementation to BaseDataframeAxisPartition (7ab2d90)
  * De-dupe `read_fwf` and `read_csv` code (2f824f8)
  * Reformat entire codebase with `black` and `flake8` (75f698c)
* Pandas API implementations and improvements
  * Add support for `{true|false}_values` for `read_csv` for Modin OmniSci (9cd93f2)
  * Implement `explode` for Series and DataFrame (ddd4afe)
  * Support reading gzipped fwf (a80cb3b)
  * Add support for `to_parquet` Modin Ray (643596d)
  * Add support for creating an `sqlalchemy` connection with arbitrary arguments (ece98a6, 4a42e04)
  * Add support for `set_index` with different input types (cab37f2)
* XGBoost enhancements
  * Support new DMatrix parameters (4d7f6d4)
* Developer API enhancements
  * Throw custom errors when optional dependencies are missing (53bb047)
  * Improve Modin OmniSci quickstart (167957b)
* Update testing suite
* Documentation improvements
* Dependencies
  * Add fsspec (dependency for IO) to dependencies (44e3f10)
  * Make `botocore` import optional (adc15c6)
  * Pin minimum `s3fs` dependency to fix `aiobotocore` issue (8acad95)
  * Update PyArrow to 5.0 and OmniSci to 5.8 (4121358)

Contributors
------------
@ienkovich, @vnlitvinov, @mvashishtha, @devin-petersohn, @dchigarev, @prutskov, @amyskov,
@gshimansky, @anmyachev, @YarShev, @Garra1980, @Rubtsowa, @jeffreykennethli, @RehanSD,
@dorisjlee, @naren-ponder