Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate $ as bitwise xor, replace by xor(), as discussed in #18696 #18977

Merged
merged 3 commits into from
Nov 14, 2016

Conversation

davidavdav
Copy link
Contributor

These two commits implement the replacement of $ (bitwise exclusive or) by xor(). The first commit includes all changes that are enough to pass the Julia unit tests. The second commit implements more changes in Base, found by grepping for ' $ '.

There still are 3851 lines in base with a $, but it is not directly clear if any of them is a xor or not (pun intended). This would require a Julia parser perhaps. It would be rather handy to be able to find all usages of $ as xor in the code.

@vchuravy
Copy link
Sponsor Member

I am not a big fan of this change. The shorthand form is really useful for implementing algorithm that require bit-twiddling . I see where the need for deprecation is coming from, but could we atleast then also add a Unicode infix alias? \veebar or \oplus should be the right ones.

@davidavdav
Copy link
Contributor Author

Sure, we can add one of these as a replacement, why not? This PR make searching for xor-usage of $ a little bit easier, although there is no guarantee it is complete.

@@ -904,7 +904,8 @@ function _ispermutationvalid_permute!{Ti<:Integer,Tp<:Integer}(perm::AbstractVec
n = length(perm)
checkspace[1:n] = 0
for k in perm
(0 < k <= n) && ((checkspace[k] $= 1) == 1) || return false
checkspace[k] = xor(checkspace[k], 1)
Copy link
Contributor

@tkelman tkelman Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changes the control flow order, doesn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. I didn't think about the shortcut behaviour.

@carlobaldassi
Copy link
Member

could we atleast then also add a Unicode infix alias? \veebar or \oplus should be the right ones.

\veebar seems the correct one. At which point, we might as well have \vee, \wedge etc.

Though it may be worth noting that although having an infix substitute goes a long way, full "feature parity" will only be achieved when #15964 is fixed (cf. this and following).

@davidavdav
Copy link
Contributor Author

It looks like already is an operator. (Where is this defined?)

 = xor

should be enough, then?

But anyhow, this PR is not really about the addition of alternative operators. There is some discussion about that in #18696 and references therein. It looks like once there is consensus about the unicode operators, it is easy enough to add them. (And if people feel that should happen in this PR, I am happy to add them.)

Deprecation of $ (with the purpose of freeing it for some syntactic function in the future) needs to happen first, I believe. I've just gone through all $s in base, and at the moment it already appears to have quite varied semantics.

@tkelman
Copy link
Contributor

tkelman commented Oct 18, 2016

It looks like already is an operator. (Where is this defined?)

(add-dots '(+ - |\|| ⊕ ⊖ ⊞ ⊟ |++| ∪ ∨ ⊔ ± ∓ ∔ ∸ ≂ ≏ ⊎ ⊻ ⊽ ⋎ ⋓ ⧺ ⧻ ⨈ ⨢ ⨣ ⨤ ⨥ ⨦ ⨧ ⨨ ⨩ ⨪ ⨫ ⨬ ⨭ ⨮ ⨹ ⨺ ⩁ ⩂ ⩅ ⩊ ⩌ ⩏ ⩐ ⩒ ⩔ ⩖ ⩗ ⩛ ⩝ ⩡ ⩢ ⩣))))

@nalimilan
Copy link
Member

Deprecation of $ (with the purpose of freeing it for some syntactic function in the future) needs to happen first, I believe. I've just gone through all $s in base, and at the moment it already appears to have quite varied semantics.

I disagree. It would be much easier to just replace $ with , and mention this in the deprecation warning, than to remove all uses of $ first in Base and in packages, and only then introduce .

@KristofferC
Copy link
Sponsor Member

You need const if you want to do the definition as above.

@simonbyrne
Copy link
Contributor

It might also be worth adding \xor tab completion (I'm never going to remember \veebar)

@davidavdav
Copy link
Contributor Author

It might also be worth adding \xor tab completion (I'm never going to remember \veebar)

That is a good idea.

@JeffBezanson
Copy link
Sponsor Member

I've just gone through all $s in base, and at the moment it already appears to have quite varied semantics.

I'm aware of 2; xor (infix) and interpolation (prefix). Are there more?

@davidavdav
Copy link
Contributor Author

I'm aware of 2; xor (infix) and interpolation (prefix). Are there more?

I won't doubt you on Julia parsing or semantics :-)

Just grepping for $ I see a lot of $(...) constructions which probably are just interpolation, there is the odd end-of-string match re, sometimes the operator itself (as in export $)---I was just looking for the infix meaning. I hope to have found them all.

Copy link
Sponsor Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are using \veebar we should maybe remove the .\veebar version?

(define prec-plus (append! '($)
(add-dots '(+ - |\|| ⊕ ⊖ ⊞ ⊟ |++| ∪ ∨ ⊔ ± ∓ ∔ ∸ ≂ ≏ ⊎ ⊻ ⊽ ⋎ ⋓ ⧺ ⧻ ⨈ ⨢ ⨣ ⨤ ⨥ ⨦ ⨧ ⨨ ⨩ ⨪ ⨫ ⨬ ⨭ ⨮ ⨹ ⨺ ⩁ ⩂ ⩅ ⩊ ⩌ ⩏ ⩐ ⩒ ⩔ ⩖ ⩗ ⩛ ⩝ ⩡ ⩢ ⩣))))

@@ -252,7 +252,7 @@ const hasha_seed = UInt === UInt64 ? 0x6d35bb51952d5539 : 0x952d5539
function hash(a::Associative, h::UInt)
h = hash(hasha_seed, h)
for (k,v) in a
h $= hash(k, hash(v))
h = h ⊻ hash(k, hash(v))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add ⊻= in

'(:= => ~ $=)))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, sure, thanks, but is this something for #15964, or is that about a more general approach?

Should I also deprecate $= there then, to really free $ for future syntactical purposes?

@simonbyrne
Copy link
Contributor

simonbyrne commented Oct 19, 2016

I don't think we should include ∧ and ∨ in this PR: see #6582 (comment) and #17472 for some discussion of this issue.

@simonbyrne
Copy link
Contributor

Also, I think we should keep \veebar, there's no problem with having 2 aliases

@davidavdav
Copy link
Contributor Author

davidavdav commented Oct 24, 2016

All right, I rebased everything a few times---things should be in line with master-as-we-write. I kept three two commits, so there is a choice what to include (either one or two or all three commits)

  • 63bbaef, deprecates $ and replaces infix-$ occurrences in base with xor()
  • e296b3e, introduces and ⊻=, and replaces most of the above xor()s in base with these. The result of these two commits compares most favourably with master, as it includes mainly $ diffs.

e4a340c, also introduces and as synonyms to | and &, as some people in this discussion suggested. This change is still under debate, I gather, so this commit can easily be cherry-unpicked.

I hope this rebase satisfies everyone who have taken part in this discussion.

@tkelman
Copy link
Contributor

tkelman commented Oct 24, 2016

please separate the third commit to a different pull request

@davidavdav
Copy link
Contributor Author

please separate the third commit to a different pull request

No problem, I removed the third commit from this PR and updated the comment above accordingly.

@@ -449,7 +449,7 @@ const uni_ops = Set{Symbol}([:(+), :(-), :(!), :(¬), :(~), :(<:), :(>:), :(√)
const expr_infix_wide = Set{Symbol}([
:(=), :(+=), :(-=), :(*=), :(/=), :(\=), :(^=), :(&=), :(|=), :(÷=), :(%=), :(>>>=), :(>>=), :(<<=),
:(.=), :(.+=), :(.-=), :(.*=), :(./=), :(.\=), :(.^=), :(.&=), :(.|=), :(.÷=), :(.%=), :(.>>>=), :(.>>=), :(.<<=),
:(&&), :(||), :(<:), :(=>), :($=)])
:(&&), :(||), :(<:), :(=>), :($=), :(⊻=)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess $= should be marked somehow as "remove from this list when deprecation is removed," right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right there. I left $= in there because it may still be used and it should be shown correctly in expressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment as Tony mentioned? Something like

# $= was previously used for xor, issue #18977

@@ -8,7 +8,7 @@
;; be an operator.
(define prec-assignment
(append! (add-dots '(= += -= *= /= //= |\\=| ^= ÷= %= <<= >>= >>>= |\|=| &=))
'(:= => ~ $=)))
'(:= => ~ $= ⊻=)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⊻= can be under add-dots now right? I think $= wasn't added there mostly because $ has other meanings that were making it complicated to parse correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure---I have to admit that my lisp is not up to scratch. From the code I understand that the add-dots operators add a dotted version as well. ⊻= already works element wise. But there is no broadcasting .⊻ indeed. Should that be added?

@@ -78,7 +78,7 @@
; operators that are special forms, not function names
(define syntactic-operators
(append! (add-dots '(= += -= *= /= //= |\\=| ^= ÷= %= <<= >>= >>>= |\|=| &=))
'(:= --> $= => && |\|\|| |.| ... ->)))
'(:= --> $= ⊻= => && |\|\|| |.| ... ->)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkelman I suppose this ⊻= should then also move one line up to add-dots, right?

@@ -2060,6 +2060,7 @@
'&= lower-update-op
'.&= lower-update-op
'$= lower-update-op
'⊻= lower-update-op
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkelman ...and here I should then add a line with '.⊻= ...

@davidavdav
Copy link
Contributor Author

Weekly rebase, to keep up with master.

  • 5e853bb deprecates $ and replaces infix-$ occurrences in base with xor()
  • ce94dff introduces and ⊻=, and replaces most of the above xor()s in base with these. The result of these two commits compares most favourably with master, as it includes mainly $⟶⊻ diffs.

This rebase includes the latest changes suggested in a review by @tkelman.

Incidentally, I am planning another PR that will introduce .&, .| and .⊻ broadcasting operators, which will automatically enable the use of .&=, .|= and .⊻=.

xor.() already automagically works in this commit thanks to #15032.

@davidavdav
Copy link
Contributor Author

Weekly rebase, to keep up with master.

@simonbyrne
Copy link
Contributor

Can you add an item to NEWS.md?

@simonbyrne
Copy link
Contributor

Can you resolve the conflict with NEWS.md? Then I think this is ready to go.

@simonbyrne simonbyrne added this to the 0.6.0 milestone Nov 10, 2016
This includes:
 - occurrances of $ (as infix operator) and $= in base
 - occurances of $ in test
 - the definition in the manual
 - the entry in REPL help

This commit does not introduce an alternative,
that will happen in the next commit.
Please note that `⊻=` is in the add-dots group in the parser, preparing for `.⊻=`.
However, this commit does not implement `.⊻` yet, this should happen in
a separate commit with `.&` and `.|` together.
@davidavdav
Copy link
Contributor Author

The build failure on linux is not related to this PR, I think.

@musm
Copy link
Contributor

musm commented Nov 14, 2016

although I'm against this change, it seems strange to just deprecate $, while still keeping & and |. If this change passes, It would make sense to change all three for consistency, otherwise it's confusing that & behaves as you expect, but $ doesn't.

@simonbyrne simonbyrne merged commit 2751493 into JuliaLang:master Nov 14, 2016
@simonbyrne
Copy link
Contributor

Done! Thanks for your patience, modifying that many files in one go is always a bit tricky.

@simonbyrne
Copy link
Contributor

@musm The distinction is that & and | have a long history as bitwise and/or (dating back to C at least). The use of $ for xor on the other hand was very much nonstandard (C uses ^, which we already use for raising to a power), and so I don't think changing this affect anyone's expectations.

simonbyrne added a commit to JuliaLang/Compat.jl that referenced this pull request Nov 14, 2016
simonbyrne added a commit to JuliaLang/Compat.jl that referenced this pull request Nov 14, 2016
@musm
Copy link
Contributor

musm commented Nov 14, 2016

fair point. Doesn't someone have to update the syntax highlighting on github etc. for this change? It used to be that $ got highlighted.

@davidavdav davidavdav deleted the deprecate-dollar branch November 19, 2016 13:47
@mgkuhn
Copy link
Contributor

mgkuhn commented Jan 27, 2017

In the computing literature, I encounter ⊕ (U+2295 CIRCLED PLUS) far more commonly as the XOR operator [same as addition over GF(2^n)] than ⊻ (U+22BB). How about supporting both?

@nalimilan
Copy link
Member

In general it's not a good idea to support multiple ways of doing the same thing in a language (an exception to this is ASCII vs. Unicode operators for practical reasons, but at least this is a consistent rule). You can define it for yourself if you really find better.

@tkelman
Copy link
Contributor

tkelman commented Jan 27, 2017

U+22BB is called xor in unicode. I see ⊕ used for dilation/Minkowski sum pretty often, it's context-dependent and a useful operator to leave available since it doesn't have a single unambiguous interpretation.

@carlobaldassi
Copy link
Member

Besides what @nalimilan and @tkelman wrote, I would guess that the reason why ⊕ is commonly used for xor in the literature is that for 0/1 variables xor is the same as "addition modulo 2". However, when xor'ing 2 integers you expect a bitwise xor, so ⊕ doesn't seem right to me in that context.

@davidavdav davidavdav restored the deprecate-dollar branch January 27, 2017 15:29
@davidavdav davidavdav deleted the deprecate-dollar branch January 27, 2017 15:35
@mgkuhn
Copy link
Contributor

mgkuhn commented Jan 27, 2017

@tkelman I would only suggest to define ⊕ as carry-less addition (aka XOR) for Unsigned or Integer, as the operator makes no sense for other types (e.g. floating point). That excludes conflicts with non-integer uses of the same symbol, e.g. the ones in relativity you mention.

@carlobaldassi If you cut off the carry-over bits in n-bit modular integer arithmetic, the resulting operations are called binary Galois Field arithmetic, often abbreviated as GF(2^n). The GF(2^n) addition is exactly the n-bit XOR over unsigned integers. Modulo 2 arithmetic is just a special case n=1 of binary Galois Field arithmetic, commonly labeled GF(2).

Galois Field arithmetic is widely used in cryptography and in error-correction algorithms, as it has similar algebraic properties as integers, without the hardware cost of having to wait for carry-over ripple in your adder. Intel and AMD CPUs since 2010 have included a carry-less multiplication instruction for it.

I would suggest to define in Julia the operators ⊕ and ⊗ for carry-less addition (aka XOR) and carry-less multiplication (CLMUL) between unsigned integers.

@carlobaldassi
Copy link
Member

@mgkuhn that was exactly my point: addition on GF(n) and xor are only equivalent for n=2. When you do e.g. xor(a::UInt, b::UInt) you expect bitwise xor instead, treating the UInts as bit masks. So ⊕ is not suited for that. For operations on GF(n), you'd better implement your own integer type with n as a type parameter, at which point you can just overload + and *.

@StefanKarpinski
Copy link
Sponsor Member

There are also many other possible meanings of ⊕ and ⊗ in other areas (e.g. ⊗ as a tensor product). You're free to define these operators in your own projects, but it's inappropriate for the base language to force specific meanings on symbols that have multiple standard meanings.

dpsanders pushed a commit to dpsanders/Compat.jl that referenced this pull request Feb 1, 2017
fcard pushed a commit to fcard/julia that referenced this pull request Feb 28, 2017
…Lang#18696 (JuliaLang#18977)

* Deprecate $ for xor() (JuliaLang#18696)

This includes:
 - occurrances of $ (as infix operator) and $= in base
 - occurances of $ in test
 - the definition in the manual
 - the entry in REPL help

This commit does not introduce an alternative,
that will happen in the next commit.

* Add ⊻ as alternative to xor, and `⊻=` as alternative to `$=`.

Please note that `⊻=` is in the add-dots group in the parser, preparing for `.⊻=`.
However, this commit does not implement `.⊻` yet, this should happen in
a separate commit with `.&` and `.|` together.

* Add deprecation comment in show.jl and deprecation item in NEWS.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet