Insights: rapidsai/cudf
Overview
36 Pull requests merged by 19 people
-
Build and test with CUDA 12.5.1
#16259 merged
Jul 16, 2024 -
Introduce dedicated options for low memory readers
#16289 merged
Jul 16, 2024 -
Introduce version file so we can conditionally handle things in tests
#16280 merged
Jul 16, 2024 -
Fix logic in to_arrow for empty list column
#16279 merged
Jul 16, 2024 -
Replace is_bool_type with checking .dtype.kind
#16255 merged
Jul 16, 2024 -
Type & reduce cupy usage
#16277 merged
Jul 16, 2024 -
Replace is_datetime/timedelta_dtype checks with .kind checks
#16262 merged
Jul 16, 2024 -
Fix convert_dtypes with convert_integer=False/convert_floating=True
#15964 merged
Jul 15, 2024 -
Make nvcomp adapter compatible with new version macros
#16245 merged
Jul 15, 2024 -
API: Check for integer overflows when creating scalar from Python int
#16140 merged
Jul 15, 2024 -
MAINT: Adapt to NumPy 2 promotion changes
#16141 merged
Jul 15, 2024 -
Add multi-file support to `dask_cudf.read_json`
#16057 merged
Jul 15, 2024 -
Add low memory JSON reader for `cudf.pandas`
#16204 merged
Jul 12, 2024 -
Clean up state variables in MultiIndex
#16203 merged
Jul 12, 2024 -
Remove temporary functor overloads required by cuco version bump
#16242 merged
Jul 12, 2024 -
Update contains_tests.cpp to use public cudf::slice
#16253 merged
Jul 12, 2024 -
Expose sorted groupby parameters to pylibcudf
#16240 merged
Jul 12, 2024 -
Add docstring for from_dataframe
#16260 merged
Jul 12, 2024 -
Handle nans in groupby-aggregations in polars executor
#16233 merged
Jul 12, 2024 -
Improve the test data for pylibcudf I/O tests
#16247 merged
Jul 12, 2024 -
Assert valid metadata is passed in to_arrow for list_view
#16198 merged
Jul 12, 2024 -
Fix ArrowDeviceArray interface to pass address of event
#16058 merged
Jul 11, 2024 -
Expose reflection to check if casting between two types is supported
#16239 merged
Jul 11, 2024 -
Remove `cuco_noexcept.diff`
#16254 merged
Jul 11, 2024 -
Add Column.strftime/strptime instead of overloading `as_string/datetime/timedelta_column`
#16243 merged
Jul 11, 2024 -
Allow only scale=0 fixed-point values in fixed_width_column_wrapper
#16120 merged
Jul 11, 2024 -
Add custom name setter and getter for proxy objects in `cudf.pandas`
#16234 merged
Jul 11, 2024 -
New Decimal <--> Floating conversion
#15905 merged
Jul 11, 2024 -
Remove `mr` param from `write_csv` and `write_json`
#16231 merged
Jul 10, 2024 -
Migrate lists/extract to pylibcudf
#16071 merged
Jul 10, 2024 -
Promote IO support queries to cudf API
#16125 merged
Jul 10, 2024 -
Disable dict support for split-page kernel in the parquet reader.
#16128 merged
Jul 10, 2024 -
Add groupby_max multi-threaded benchmark
#16154 merged
Jul 10, 2024 -
Support `arrow:schema` in Parquet writer to faithfully roundtrip `duration` types with Arrow
#15875 merged
Jul 9, 2024 -
Migrate pylibcudf lists gathering
#16170 merged
Jul 9, 2024 -
Parallelize `gpuInitStringDescriptors` for fixed length byte array data
#16109 merged
Jul 9, 2024
25 Pull requests opened by 14 people
-
Deduplicate decimal32/decimal64 to decimal128 conversion function
#16236 opened
Jul 9, 2024 -
Remove hash_character_ngrams dependency from jaccard_index
#16241 opened
Jul 10, 2024 -
Short circuit some Column methods
#16246 opened
Jul 10, 2024 -
Update JNI build to support nvcomp4
#16250 opened
Jul 11, 2024 -
[WIP][RFC] Add sparse host buffer source
#16252 opened
Jul 11, 2024 -
Replace is_float/integer_dtype checks with .kind checks
#16261 opened
Jul 11, 2024 -
Implement support for scan_ndjson in cudf-polars
#16263 opened
Jul 11, 2024 -
Experimental gather prefetch
#16265 opened
Jul 12, 2024 -
[BUG]: Fix how args and kwargs are passed in `_fast_slow_function_call`
#16266 opened
Jul 12, 2024 -
Revert "Add custom name setter and getter for proxy objects in `cudf.pandas`"
#16267 opened
Jul 12, 2024 -
Preserve order in left join for cudf-polars
#16268 opened
Jul 12, 2024 -
[BUG] Make name attr of Index fast slow attrs
#16270 opened
Jul 12, 2024 -
Fix issue in horizontal concat implementation in cudf-polars
#16271 opened
Jul 12, 2024 -
Remove xml from sort_ninja_log.py utility
#16274 opened
Jul 12, 2024 -
Replace np.isscalar/issubdtype checks with is_scalar/.kind checks
#16275 opened
Jul 12, 2024 -
Update cudf::detail::grid_1d to use thread_index_type
#16276 opened
Jul 12, 2024 -
Added batch memset to memset data and validity buffers in parquet reader
#16281 opened
Jul 15, 2024 -
Revert "New Decimal <--> Floating conversion (#15905)"
#16283 opened
Jul 15, 2024 -
Make ColumnAccessor strictly require a mapping of columns
#16285 opened
Jul 15, 2024 -
Initial investigation into NumPy proxying in `cudf.pandas`
#16286 opened
Jul 16, 2024 -
Remove decimal/floating 64/128bit switches due to register pressure
#16287 opened
Jul 16, 2024 -
[JNI] Add setKernelPinnedCopyThreshold and setPinnedAllocationThreshold
#16288 opened
Jul 16, 2024 -
Add `drop_nulls` in `cudf-polars`
#16290 opened
Jul 16, 2024 -
Fix split_record for all empty strings column
#16291 opened
Jul 16, 2024 -
Fix tests for polars 1.2
#16292 opened
Jul 16, 2024
12 Issues closed by 4 people
-
[BUG] off-by-one errors in `cudf.date_range`
#12133 closed
Jul 16, 2024 -
[FEA] Add a low-memory JSON lines reader option based on byte range reads
#16122 closed
Jul 12, 2024 -
[BUG] skip_rows doesn't work properly in ChunkedParquetReader
#16273 closed
Jul 12, 2024 -
[BUG] Segfault in pylibcudf to_arrow interop when passing nested list and metadata
#16069 closed
Jul 12, 2024 -
[FEA] Reduce page faults when using managed memory
#13821 closed
Jul 11, 2024 -
[BUG] casting from `float32` to `Decimal64Dtype` is resulting in incorrect values
#14169 closed
Jul 11, 2024 -
[BUG] Column type for fixed_width_column_wrapper should be restricted.
#16092 closed
Jul 11, 2024 -
[BUG] `df.index.name = "indexer"` does not work as expected under `cudf.pandas`
#14524 closed
Jul 11, 2024 -
Remove memory resource parameter from `cudf::io::write_csv` and `cudf::io::write_json()` APIs
#16200 closed
Jul 10, 2024 -
[FEA] Create a multi-threaded nvbenchmark for groupby_max
#16134 closed
Jul 10, 2024 -
[FEA] Support `arrow:Schema` in Parquet writer for faithful roundtrip with Arrow via Parquet
#15847 closed
Jul 9, 2024 -
[FEA] Parallelize gpuInitStringDescriptors when Parquet input type is FIXED_LEN_BYTE_ARRAY
#14113 closed
Jul 9, 2024
15 Issues opened by 11 people
-
[BUG] `strings::split_record` throws exception on input having one empty row
#16284 opened
Jul 15, 2024 -
[BUG] Integer promotion fixes needed for NumPy 2 for comparison operators
#16282 opened
Jul 15, 2024 -
[BUG] error: subprocess-exited-with-error,error: metadata-generation-failed
#16278 opened
Jul 13, 2024 -
[FEA] Refactor Column/NamedColumn split in cudf-polars
#16272 opened
Jul 12, 2024 -
[DOC] cudf/source/user_guide/10min.ipynb gives warning on docs build as dask_cudf is missing
#16264 opened
Jul 11, 2024 -
[FEA] Support parquet row group skipping in Polars physical engine
#16257 opened
Jul 11, 2024 -
[FEA] Optionally inform users when a Polars query falls back to the CPU
#16256 opened
Jul 11, 2024 -
[Story] Enabling prefetching of unified memory
#16251 opened
Jul 11, 2024 -
[FEA] Enable using `num_rows` and `skip_rows` with `ParquetReader`
#16249 opened
Jul 11, 2024 -
[BUG] libcudf JSON reader crash with compressed data
#16248 opened
Jul 10, 2024 -
[FEA] Reduce time required to import cudf_polars
#16244 opened
Jul 10, 2024 -
[FEA] Updates to groupby_max multithreaded benchmark
#16237 opened
Jul 10, 2024 -
[FEA] Port the logic at page_data.cu:282 to use `thread_group`s and avoid the magic 32 multiples.
#16235 opened
Jul 9, 2024
52 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add TPC-H inspired examples for Libcudf
#16088 commented on
Jul 16, 2024 • 71 new comments -
Report number of rows per file read by PQ reader when no row selection and fix segfault in chunked PQ reader when skip_rows > 0
#16195 commented on
Jul 16, 2024 • 27 new comments -
Return `cudf::detail::host_vector` from `make_host_vector` and add a `make_device_uvector` overload
#16206 commented on
Jul 16, 2024 • 20 new comments -
Migrate CSV reader to pylibcudf
#16011 commented on
Jul 16, 2024 • 17 new comments -
Add tests for `pylibcudf` binaryops
#15470 commented on
Jul 16, 2024 • 16 new comments -
Deprecate Arrow support in I/O
#16132 commented on
Jul 12, 2024 • 13 new comments -
Update vendored thread_pool implementation
#16210 commented on
Jul 12, 2024 • 11 new comments -
Migrate expressions to pylibcudf
#16056 commented on
Jul 16, 2024 • 10 new comments -
Remove size constraints on source files in batched JSON reading
#16162 commented on
Jul 16, 2024 • 8 new comments -
Adds write-coalescing code path optimization to FST
#16143 commented on
Jul 16, 2024 • 6 new comments -
JSON tree algorithms refactor I: CSR data structure for column tree
#15979 commented on
Jul 16, 2024 • 5 new comments -
Implement polars string Replace and ReplaceMany
#16039 commented on
Jul 15, 2024 • 4 new comments -
Migrate lists/sorting to pylibcudf
#16179 commented on
Jul 15, 2024 • 4 new comments -
DOC: use intersphinx mapping in pandas-compat ext
#15846 commented on
Jul 15, 2024 • 3 new comments -
Hide visibility of non public symbols
#15982 commented on
Jul 15, 2024 • 3 new comments -
Migrate lists/filtering to pylibcudf
#16184 commented on
Jul 15, 2024 • 3 new comments -
Migrate lists/set_operations to pylibcudf
#16190 commented on
Jul 15, 2024 • 3 new comments -
Add skiprows and nrows to parquet reader
#16214 commented on
Jul 12, 2024 • 2 new comments -
Add support for axis=None in reductions
#16229 commented on
Jul 12, 2024 • 2 new comments -
Add environment variable to log cudf.pandas fallback calls
#16161 commented on
Jul 12, 2024 • 1 new comment -
Migrate Parquet reader to pylibcudf
#16078 commented on
Jul 16, 2024 • 1 new comment -
[FEA] Support axis=None in reductions
#12335 commented on
Jul 10, 2024 • 0 new comments -
Support quantile in cudf_polars
#16093 commented on
Jul 16, 2024 • 0 new comments -
Support min_by group by aggregate
#16163 commented on
Jul 11, 2024 • 0 new comments -
[WIP] Improve pyarrow-free remote-IO performance
#16166 commented on
Jul 15, 2024 • 0 new comments -
Migrate lists/filling to pylibcudf
#16189 commented on
Jul 15, 2024 • 0 new comments -
Improve performance of hash_character_ngrams using warp-per-string kernel
#16212 commented on
Jul 12, 2024 • 0 new comments -
Ensure cudf::ast::expressions api doesn't use detail types
#16217 commented on
Jul 16, 2024 • 0 new comments -
Support Literals in groupby-agg
#16218 commented on
Jul 15, 2024 • 0 new comments -
Refactor mixed_semi_join using cuco::static_set
#16230 commented on
Jul 16, 2024 • 0 new comments -
[FEA] Make `cudf.pandas` not perform redundant CPU<->GPU transfers if there is no in-place write operations
#15670 commented on
Jul 11, 2024 • 0 new comments -
[FEA] Implement a more accurate float to decimal conversion that supports rounding instead of truncation
#16155 commented on
Jul 11, 2024 • 0 new comments -
[BUG] FLOAT32 rounding more inaccurate than necessary
#14528 commented on
Jul 11, 2024 • 0 new comments -
[FEA] Produce and Consume ArrowDeviceArray struct from cudf::table / cudf::column
#14926 commented on
Jul 11, 2024 • 0 new comments -
[BUG] cudf-cuda11 not working in Databricks DBR 13.3 ML LTS on GPU instance
#16041 commented on
Jul 12, 2024 • 0 new comments -
[BUG] `cudaErrorInvalidDevice` when reading a parquet file with chunked reader when skip_rows > 0 and pass_read_limit > 0
#16186 commented on
Jul 12, 2024 • 0 new comments -
[FEA] Support Polars pearson correlation expression
#16220 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Support rolling operations in Polars engine (window functions)
#16176 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Support Polars `over` expression for rolling windows
#16227 commented on
Jul 15, 2024 • 0 new comments -
[BUG] Support string to datetime conversion in Polars engine
#16174 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Support Polars `drop_nulls`
#16219 commented on
Jul 15, 2024 • 0 new comments -
Deprecate windowslinetermination from libcudf read_csv
#15985 commented on
Jul 15, 2024 • 0 new comments -
[BUG] StringMethods - Jaccard-index fails with long strings
#16157 commented on
Jul 15, 2024 • 0 new comments -
[QST] Running cudf terribly slow
#15976 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Support Polars datetime `round` expression
#16226 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Improve support or failure modes for numpy and other libraries with C APIs in cudf.pandas
#15397 commented on
Jul 15, 2024 • 0 new comments -
[FEA] Deduplicate `convert_data_to_decimal128()` function
#16194 commented on
Jul 16, 2024 • 0 new comments -
Occupancy improvement for Hash table build
#15700 commented on
Jul 15, 2024 • 0 new comments -
JSON reader validation of values
#15968 commented on
Jul 10, 2024 • 0 new comments -
Add libcudf example with large strings
#15983 commented on
Jul 10, 2024 • 0 new comments -
Experimental support for configurable prefetching
#16020 commented on
Jul 16, 2024 • 0 new comments -
Migrate lists/count_elements to pylibcudf
#16072 commented on
Jul 15, 2024 • 0 new comments