Tags: rikhuijzer/DataFramesFork.jl

v1.3.4

[Diff since v1.3.3](JuliaData/DataFrames.jl@v1.3.3...v1.3.4)

**Closed issues:**
- add expandgrid (JuliaData#3027)
- `stack` not catching invalid value of keyword `variable_eltype` (JuliaData#3042)
- Appending `DataFrame`s after `CSV.read` fails for different length `String` columns (JuliaData#3044)
- make `clipboard(df)` work (JuliaData#3045)

**Merged pull requests:**
- add allcombinations (JuliaData#3031) (@bkamins)
- allow scalars in subset and subset! as conditions (JuliaData#3032) (@bkamins)
- Fix handling of variable_eltype in stack (JuliaData#3043) (@bkamins)

v1.3.3

[Diff since v1.3.2](JuliaData/DataFrames.jl@v1.3.2...v1.3.3)

**Closed issues:**
- Add shuffle, shuffle! functions (JuliaData#2048)
- Add `groupindices` as special source argument in minilanguage (JuliaData#2683)
- Update the broadcasted getproperty when Julia 1.7 is out (JuliaData#2804)
- Better error for disallowmissing function (JuliaData#2945)
- Could it be useful to add the ungroup keyword to the filter function? (JuliaData#2954)
- Error message for "Number of returned columns does not match" (JuliaData#2959)
- Function to insert columns (JuliaData#2972)
- Allow functions in DataFrames.jl to pick how many threads they use (JuliaData#2992)
- `first(gdf::GroupedDataFrame, n::Int)` should give a `GroupedDataFrame`? (JuliaData#2993)
- unstack fails without an id column (JuliaData#2994)
- Some error on Julia 1.7.1 (JuliaData#2996)
- groupby docs error? (JuliaData#2997)
- permutedims with CategoricalArray (JuliaData#3003)
- `d[:a, ]` changed the original data.frame (JuliaData#3014)
- Add keyword argument `source` in `mapreduce` to match `reduce` (JuliaData#3016)
- Fix describe documentation (JuliaData#3018)
- Flag to disable threading for debug purposes (JuliaData#3019)
- Make indexing of eachrow and eachcol return the object of the same type on a view of the parent (JuliaData#3023)
- subset(df) with no conditions should return unaltered DataFrame (JuliaData#3024)
- Keyword arg `cols` and `source` for `mapreduce` (JuliaData#3028)
- `outerjoin`: keyword argument `matchmissing` not correctly passed (JuliaData#3039)

**Merged pull requests:**
- allow no rowkey in unstack (JuliaData#2995) (@bkamins)
- allow function in allowduplicates in unstack (JuliaData#2998) (@bkamins)
- Use `julia-actions/cache`; also for the `docs` job (JuliaData#2999) (@rikhuijzer)
- Fix typo in `groupby` docstring (JuliaData#3000) (@nalimilan)
- Implementation of eachindex, proprow, and groupindices (JuliaData#3001) (@bkamins)
- Handle Base.CanonicalIndexError introduced in Julia 1.8 (JuliaData#3002) (@bkamins)
- make permutedims more flexible (JuliaData#3004) (@bkamins)
- add `first`, `last` methods with `n` for gdf (JuliaData#3006) (@ericphanson)
- doc: remove superfluous word (JuliaData#3007) (@Mo-Gul)
- Improved error when column numbers do not match in transformations (JuliaData#3009) (@bkamins)
- add reverse!, shuffle, shuffle!, permute!, and invpermute! (JuliaData#3010) (@bkamins)
- Add fillcombinations function (JuliaData#3012) (@bkamins)
- Fix docstrings of fast row-wise transformation (JuliaData#3015) (@bkamins)
- add insertcols (JuliaData#3020) (@bkamins)
- add ungroup keyword argument to filter (JuliaData#3021) (@bkamins)
- make broadcasting assignment consistent with ! (JuliaData#3022) (@bkamins)
- handle empty args in subset (JuliaData#3025) (@bkamins)
- avoid categorical promotion (JuliaData#3026) (@bkamins)
- Update LICENSE.md (JuliaData#3029) (@bkamins)
- Use cycle notation to speed up `permute!` (JuliaData#3035) (@LilithHafner)
- Make indexing of eachrow return the object of the same type on a view of the parent (JuliaData#3037) (@bkamins)
- Fix keyword argument syntax in `DataFrame` docstring (JuliaData#3038) (@nalimilan)
- make sure we correctly pass matchmissing in joins (JuliaData#3040) (@bkamins)

v1.3.2

[Diff since v1.3.1](JuliaData/DataFrames.jl@v1.3.1...v1.3.2)

**Closed issues:**
- Variance in runtime reduction functions (JuliaData#2956)
- use of map in ByRow (JuliaData#2957)
- Replace and Missing Values (JuliaData#2976)
- Subset and Missing Values (JuliaData#2977)
- copying of columns in select! and transform! (JuliaData#2978)
- Unexpected Behavior of Combined Column Selection (JuliaData#2980)

**Merged pull requests:**
- Add a note about df.col .= v broadcasting changes (JuliaData#2971) (@bkamins)
- Update working_with_dataframes.md (JuliaData#2973) (@alfaromartino)
- Clean up join code (JuliaData#2975) (@bkamins)
- Add links to docs, rephrase a bit (JuliaData#2979) (@nalimilan)
- fix aliasing detection in sort! (JuliaData#2981) (@bkamins)
- make sure ByRow invokes generic map (JuliaData#2982) (@bkamins)
- make sure we use source column only once (JuliaData#2983) (@bkamins)
- Update subset to handle large number of selectors better (JuliaData#2989) (@bkamins)

v1.3.1

[Diff since v1.3.0](JuliaData/DataFrames.jl@v1.3.0...v1.3.1)

**Closed issues:**
- Decide if we want to rename All to Cols (JuliaData#2203)
- Creating new columns on a `view` should fill in missings everywhere else. (JuliaData#2211)
- Consider allowing to sort! a SubDataFrame (JuliaData#2300)
- Locate the problem in disallowmissing error (JuliaData#2965)
- Arrow Notation within Column Selection is Inconsistent (JuliaData#2969)

**Merged pull requests:**
- fix: change "dont" to "don't" (JuliaData#2962) (@Mo-Gul)
- better disallowmissing error message (JuliaData#2966) (@bkamins)
- fix issues with parameter type printing in doctests (JuliaData#2967) (@bkamins)
- update docs for join on (left, right) tuple (JuliaData#2968) (@visr)
- fix getindex with vector of Pairs (JuliaData#2970) (@bkamins)

v1.3.0

[Diff since v1.2.2](JuliaData/DataFrames.jl@v1.2.2...v1.3.0)

**Closed issues:**
- Port pqr benchmarks (JuliaData#298)
- Memory efficiency of join (JuliaData#1334)
- Selections.jl + DataFrames.jl (JuliaData#1936)
- Add support for All, Between and Not broadcasting (JuliaData#2171)
- `filter(df, :x => f)` would be useful to have (JuliaData#2187)
- allow selector => fun1 => fun2 in select and combine (JuliaData#2207)
- add a `leftjoin!` (or `match!` or `merge!` or whatever it should be called) (JuliaData#2259)
- Provide a syntax to perform row aggregations fast (JuliaData#2439)
- Investigate performance of aggregations (JuliaData#2440)
- Rework the manual (JuliaData#2595)
- Add `after` keyword argument to `insertcols!` (JuliaData#2613)
- control fill value for missing cells in `unstack` (JuliaData#2698)
- Allow selecting columns based on predicate on column contents (JuliaData#2747)
- Fast row aggregation in DataFrames.jl (JuliaData#2768)
- Add a method to add/insert empty columns (JuliaData#2783)
- Assignment to SubDataFrame (JuliaData#2785)
- DataFrameMacros.jl and DataFramesMeta.jl (JuliaData#2793)
- DataFrames not threadsafe (JuliaData#2795)
- Better documentation for `combine(gd, fun => :x)` (JuliaData#2830)
- AsTable in combine seems to require at least one column (JuliaData#2832)
- implement `Tables.materializer(::Type{<:AbstractDataFrame})`? (JuliaData#2833)
- Should ByRow use map or not (JuliaData#2834)
- Error for `unstack`ing an empty dataframe (JuliaData#2841)
- The `test/show.jl` tests fail when Julia is started with `julia --color=no` (JuliaData#2846)
- Faster count (JuliaData#2849)
- AsTable docstring doesn't mention it can be used as a target for select etc. (JuliaData#2850)
- delete! in DataFrames.jl (JuliaData#2853)
- Import nrow and ncol from DataAPI.jl (JuliaData#2855)
- Support the Case of `Matrix{Any}` as Data and `Vector{Any}` as Header (JuliaData#2858)
- Allow DataFrame(matrix, names, copycols=false) (JuliaData#2860)
- Displaying `DateTime` columns (JuliaData#2861)
- update docs to CSV.jl 0.9 (JuliaData#2864)
- Better error messages when frame is empty (JuliaData#2867)
- Add "Filtering" section to the documentation User Guide. (JuliaData#2871)
- Add documentation for transformation functions without the Split-Apply-Combine strategy to User Guide. (JuliaData#2872)
- Make Cols more flexible (JuliaData#2875)
- In src => fun => dst allow transformation function in dst (JuliaData#2876)
- Ambiguity error between CategoricalArrays and SentinelArrays (JuliaData#2883)
- ByRow and transform not working (JuliaData#2884)
- Avoid mixing standard and scientific floats in output (JuliaData#2885)
- Updating ClassImbalance.jl; Needed help debugging (JuliaData#2886)
- mixing `:x => :y` and `:x => f => :y` syntax in vector to `select` errors (JuliaData#2888)
- Trimming variables in a data frame (JuliaData#2891)
- renamecols function for transform (JuliaData#2893)
- `TableOperations.joinpartitions` doesn't work properly (JuliaData#2895)
- Correct `isiterable(DataFrame)` (JuliaData#2896)
- Strange behaviour with non-ASCII column names (JuliaData#2901)
- `tf` keyword argument from PrettyTables.jl does not work in DataFrames.jl `show` function. (JuliaData#2903)
- Aggregate function with multiple output columns of different types (JuliaData#2905)
- Recommend PooledArrays to pool data (JuliaData#2908)
- update DataFramesMeta.jl docs (JuliaData#2910)
- Add contributing opportunities to the contributing guide (JuliaData#2912)
- Default `show` truncates too soon (JuliaData#2913)
- DataFrames logo banner (JuliaData#2917)
- Regenerate precompile statements for 1.3 release (JuliaData#2921)
- subset doesn't accept a vector of transformations (JuliaData#2924)
- Printing of data frames in try-catch (JuliaData#2925)
- Modifying transformations with grouped dataframes (JuliaData#2927)
- Improve filter docs (JuliaData#2930)
- Improve sort docs (JuliaData#2931)
- DataFrames errors on loading with `--depwarn=error` (JuliaData#2935)
- Add `AsTable([:a, :b]) => AsTable` (JuliaData#2939)
- Grouped describe fails or "clashes" with StatsBase (JuliaData#2952)

**Merged pull requests:**
- Add standard deviation and 25% and 75% quantiles to `describe` :detailed (JuliaData#2459) (@nalimilan)
- Support adding columns to views (JuliaData#2794) (@bkamins)
- Add multi-threading support description to the manual (JuliaData#2823) (@bkamins)
- feat: `unstack` receives kwarg `fillvalue` (JuliaData#2828) (@pstorozenko)
- feat: `insertcols!` receives kwarg `after` (JuliaData#2829) (@pstorozenko)
- explain that fun => target does not work in general (JuliaData#2836) (@bkamins)
- more careful test of ByRow for PooledArray (JuliaData#2837) (@bkamins)
- fix transformation minilanguage docs (JuliaData#2838) (@bkamins)
- add Tables.materializer for types methods (JuliaData#2839) (@bkamins)
- Fix typo math => match (JuliaData#2840) (@Nosferican)
- Fix empty unstack on empty data frame (JuliaData#2842) (@bkamins)
- Bk/add leftjoin! (JuliaData#2843) (@bkamins)
- Fix tests broken by Julia Base changes (JuliaData#2844) (@bkamins)
- Disable color testing when color is not supported (JuliaData#2847) (@bkamins)
- Improve docstring of AsTable (JuliaData#2851) (@bkamins)
- Fix three uses of "data table" (JuliaData#2852) (@nalimilan)
- deprecate delete!, define deleteat! (JuliaData#2854) (@bkamins)
- use nrow and ncol from DataAPI.jl (JuliaData#2856) (@bkamins)
- Fix signature of constructor in docstring (JuliaData#2857) (@nalimilan)
- make DataFrame constructor more flexible (JuliaData#2859) (@bkamins)
- fix transpose error message and clean up code (JuliaData#2862) (@bkamins)
- Update to latest GA for docs (JuliaData#2863) (@quinnj)
- update docs following CSV.jl 0.9 release (JuliaData#2865) (@bkamins)
- code cleanup to improve error messages (JuliaData#2868) (@bkamins)
- Add fast reductions (JuliaData#2869) (@bkamins)
- fix: typo (JuliaData#2873) (@kunzaatko)
- fix: do not copy syntax is with ! (JuliaData#2874) (@kunzaatko)
- Allow constructing Matrix from empty dataframe (JuliaData#2878) (@jakobnissen)
- Fix typo in NEWS.md (JuliaData#2880) (@bkamins)
- Allow predicate in Cols (JuliaData#2881) (@bkamins)
- Improve docstring for names() (JuliaData#2882) (@xluo127)
- avoid not specialized Pair issue (JuliaData#2889) (@bkamins)
- Specify why leftjoin! needs at most one match (JuliaData#2894) (@rikhuijzer)
- allow transformation destination to be a function (JuliaData#2897) (@bkamins)
- improve docs alignment (JuliaData#2898) (@bkamins)
- improve missings documentation (JuliaData#2899) (@bkamins)
- add filter and subset to documentation (JuliaData#2900) (@bkamins)
- Try to detect unicode normalization issues in column names (JuliaData#2904) (@bkamins)
- Faster computation of quantiles in `describe` (JuliaData#2909) (@nalimilan)
- add info about PooledArrays (JuliaData#2911) (@bkamins)
- Add more guidance for new contributors (JuliaData#2914) (@bkamins)
- Update Querying frameworks DataFramesMeta.jl docs (JuliaData#2915) (@pdeffebach)
- hardening haskey (JuliaData#2916) (@bkamins)
- Add broadcasting of selectors to the minilanguage (JuliaData#2918) (@bkamins)
- Add general fast aggregation for wide tables with collect (JuliaData#2920) (@bkamins)
- Fix tests of names (JuliaData#2922) (@bkamins)
- Update ci.yml in preparation of Julia 1.6 LTS (JuliaData#2923) (@bkamins)
- Allow passing multiple columns to subset (JuliaData#2926) (@bkamins)
- docs: fix typo and add some newlines in tutorial (JuliaData#2932) (@rfourquet)
- mention ClipData.jl (JuliaData#2933) (@Datseris)
- Correctly handle functors when auto-generating column names (JuliaData#2934) (@bkamins)
- plan for a change in broadcasting rules in Julia 1.7 (JuliaData#2937) (@bkamins)
- Change join tests to reduce memory consumption (JuliaData#2938) (@bkamins)
- Improve Docstrings for `sort` and `sort!` (JuliaData#2940) (@Chandu-4444)
- Add examples for `issorted` docstrings. (JuliaData#2941) (@Chandu-4444)
- Add row indexing to filter docstring and examples. (JuliaData#2942) (@nathanrboyer)
- Reduce test memory usage (JuliaData#2943) (@bkamins)
- Add `reverse` prototype (JuliaData#2944) (@Chandu-4444)
- Define sort! for AbstractDataFrame and fix issues of kwargs in sorting functions (JuliaData#2946) (@bkamins)
- Make transformation docstring more precise (JuliaData#2948) (@bkamins)
- Catch OutOfMemoryError (JuliaData#2949) (@bkamins)
- clean up source code (JuliaData#2950) (@bkamins)
- Add `view` kwarg to `first` and `last` (JuliaData#2951) (@Chandu-4444)
- Generate precompile statements for Julia 1.7 (JuliaData#2955) (@bkamins)

v1.2.2

[Diff since v1.2.1](JuliaData/DataFrames.jl@v1.2.1...v1.2.2)

**Closed issues:**
- Add method to filter on `Bool` column symbols (JuliaData#2465)
- Enable documenter doctests (JuliaData#2702)
- Extend => renaming syntax (JuliaData#2728)
- add a keyword to specify group order in groupby (JuliaData#2762)
- `subset` with grouped data frame has worse compile times than `transform` (JuliaData#2806)
- Performance Issues with filter and subset (JuliaData#2821)
- Extremely slow GroupBy behaviour on a small table (JuliaData#2822)
- Are there any Julia alternatives to the to_dict function in pandas? (JuliaData#2824)

**Merged pull requests:**
- making a new top level section to work with DataFrames (JuliaData#2717) (@RohitRathore1)
- make sort kwarg in groupby more flexible (JuliaData#2812) (@bkamins)
- add a link to JuliaCon2021 tutorial to docs (JuliaData#2817) (@bkamins)
- review of the DataFrames.jl tutorial (JuliaData#2825) (@bkamins)
- correct signature of merge for AbstractIndex (JuliaData#2826) (@bkamins)

v1.2.1

[Diff since v1.2.0](JuliaData/DataFrames.jl@v1.2.0...v1.2.1)

**Closed issues:**
- "transform" function not available (JuliaData#2815)
- How to change the value of a cell to a different data type? (JuliaData#2816)
- `dropmissing!` creates weird memory bugs/errors on 1.7 and 1.6 (JuliaData#2819)

**Merged pull requests:**
- Document GroupedDataFrame consistency check (JuliaData#2811) (@bkamins)
- fix delete! for versions of Julia 1.6.2 or earlier (JuliaData#2820) (@bkamins)

v1.2.0

[Diff since v1.1.1](JuliaData/DataFrames.jl@v1.1.1...v1.2.0)

**Closed issues:**
- Add `matchmissing = :notequal` option (JuliaData#2650)
- Implement `pushfirst!` to allow appending rows in the beginning of a DataFrame (JuliaData#2678)
- Review comparisons with R/Python (JuliaData#2737)
- Slow sorts in columns with Union{<:Any, missing} even if no missing values in the column (JuliaData#2745)
- Display complex numbers - alignment (JuliaData#2754)
- Slow row aggregation in presence of missings (JuliaData#2757)
- Convert column from string to float (JuliaData#2761)
- Improve SubDataFrame creation for AbstractVector{Bool} (JuliaData#2765)
- Flatten in case column contains string and array (JuliaData#2766)
- Question: Small Delimited file into DataFrame (JuliaData#2772)
- `transform(df, :x => AsTable)` should probably work (JuliaData#2779)
- missing method `combine(gd::GroupedDataFrame, ::Matrix)` (JuliaData#2781)
- Sync with DataAPI.jl 1.7 release (JuliaData#2788)
- inconsistency of groupby() for -0.0 (JuliaData#2790)
- Clean up precompile statements (JuliaData#2792)
- Test failures when using `julia --color=no` (JuliaData#2796)
- Differently typed columns when using `DataFrame(myVector)` vs `DataFrame(x = myVector)` (JuliaData#2798)
- DataFrame(table) != DataFrame(table, copycols=true) (JuliaData#2799)
- html dataframe representation includes invalid placement of <p> tag (JuliaData#2800)
- subset!(gd::GroupedDataFrame, ...) should make sure `gd` still works after (JuliaData#2808)

**Merged pull requests:**
- Matchmissing == :notequal (JuliaData#2724) (@pstorozenko)
- Update comparisons with data.table info (JuliaData#2725) (@eloualiche)
- Run `findall(rows)` only if `rows` are not all true (JuliaData#2727) (@pstorozenko)
- Fix type instability in sort for few columns case and fix issorted bug (JuliaData#2746) (@bkamins)
- Cover corner case of compacttype (wide name and CategoricalValue) (JuliaData#2751) (@bkamins)
- Update docs URLs in README (JuliaData#2752) (@ViralBShah)
- reviewed and fixed (JuliaData#2755) (@RohitRathore1)
- Alignment of complex numbers (JuliaData#2756) (@ronisbr)
- audit more master -> main (JuliaData#2758) (@Moelf)
- make "Edit on GitHub" point to main branch (JuliaData#2759) (@Moelf)
- Mark outdated docs (JuliaData#2760) (@pfitzseb)
- update NEWS.md (JuliaData#2763) (@bkamins)
- add _findall for AbstractVector{Bool} and use it in internal functions (JuliaData#2769) (@bkamins)
- Explicit loop in `_findall` to avoid allocations (JuliaData#2771) (@pstorozenko)
- add information how DelimitedFiles can be used (JuliaData#2773) (@bkamins)
- Put longer type into th title argument in HTML show (JuliaData#2774) (@mortenpi)
- Deprecate AbstractVector in hcat (JuliaData#2777) (@bkamins)
- remove escape in Char (JuliaData#2778) (@bkamins)
- allow :col => AsTable and :col => cols (JuliaData#2780) (@bkamins)
- allow Matrices in transformations of GroupedDataFrame (JuliaData#2782) (@bkamins)
- Use latest Documenter.jl (JuliaData#2786) (@bkamins)
- Fix float grouping (JuliaData#2791) (@bkamins)
- Use standard Tables.Schema constructor instead of constructing directly (JuliaData#2797) (@quinnj)
- move summary outside of a <table> in text/html (JuliaData#2801) (@bkamins)
- Add some clarifying comments on copycols for Tables.jl inputs (JuliaData#2805) (@quinnj)
- up DataAPI.jl to 1.7 and CategoricalArrays.jl to 0.10.0 (JuliaData#2807) (@bkamins)
- improve subset! for GroupedDataFrame (JuliaData#2809) (@bkamins)
- update precompilation and .gitignore (JuliaData#2810) (@bkamins)

v1.1.1

[Diff since v1.1.0](JuliaData/DataFrames.jl@v1.1.0...v1.1.1)

**Closed issues:**
- DataFrames with many columns are too slow (because of show()) (JuliaData#2739)
- Unable to install DataFrames: error regarding ComposedFunction (JuliaData#2748)

**Merged pull requests:**
- Optimize `completecases` to process only missingable columns (JuliaData#2726) (@pstorozenko)
- fix performance issue in multirow split-apply-combine (JuliaData#2749) (@bkamins)
- use dict to cache eltype names (JuliaData#2750) (@bkamins)

v1.1.0

[Diff since v1.0.2](JuliaData/DataFrames.jl@v1.0.2...v1.1.0)

**Merged pull requests:**
- require AbstractVector from subset selectors (JuliaData#2744) (@bkamins)