Mypyc: Compile type-annotated Python to C (github.com/python)
234 points by mvolfik 12 days ago | 94 comments

I am always keeping an eye on mypyc, typed_python (LLVM Python compiler)[0], and Nuitka[1].

I guess that because Python is extremely dynamic, we may never have a full everything-works compiler, but I’m excited about the possibility of this becoming some kind of intermediate step where different parts of the program get compiled when possible.

[0] https://github.com/APrioriInvestments/typed_python [1] https://github.com/Nuitka/Nuitka

We can have a compiler that does everything. It's just a matter of whether you have to stick the python interpreter in the compiled binary or not, or how much of it you have to use and whether you can only use the parts required. This is how a lot of Scheme compilers work, even though you still have `eval` and similar things.

Nuitka only removes interpreter overhead (just 30%). It's still quite slow. To get real performance improvements, we'd need memory optimizations such as a modern JIT's hidden classes and shapes, which store data directly on the object instead of inside a dictionary. https://mathiasbynens.be/notes/shapes-ics
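
To make the "hidden classes and shapes" idea concrete, here is a toy sketch in Python. It only illustrates the concept from the linked article and is not how any real JIT or CPython is implemented; the names `Shape` and `Obj` are made up for this example.

```python
class Shape:
    """Maps attribute names to slot indexes; shared by all objects built the same way."""

    def __init__(self, names=()):
        self.names = tuple(names)
        self.index = {name: i for i, name in enumerate(self.names)}
        self.transitions = {}  # attribute name -> next Shape

    def with_attr(self, name):
        # Reuse the same successor shape so that objects gaining the same
        # attributes in the same order end up sharing one Shape.
        if name not in self.transitions:
            self.transitions[name] = Shape(self.names + (name,))
        return self.transitions[name]


EMPTY = Shape()


class Obj:
    """Stores values in a flat list; the Shape says which slot holds which name."""

    def __init__(self):
        self.shape = EMPTY
        self.slots = []

    def set(self, name, value):
        i = self.shape.index.get(name)
        if i is None:
            self.shape = self.shape.with_attr(name)
            self.slots.append(value)
        else:
            self.slots[i] = value

    def get(self, name):
        return self.slots[self.shape.index[name]]


p, q = Obj(), Obj()
for o, (x, y) in ((p, (1, 2)), (q, (10, 20))):
    o.set("x", x)
    o.set("y", y)
print(p.shape is q.shape, p.get("y"), q.get("x"))
```

Because both objects share one `Shape`, a JIT could cache "attribute `y` lives in slot 1" once and skip the dictionary lookup on later accesses.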

Have you considered Kotlin and Graal? It's obviously not Python, but Kotlin feels syntactically like Python meets Java, and since it compiles to byte code, you can do AoT compilation in Graal.

Edit: apparently GraalPython is a thing.

Syntactically, sure. But D is semantically a better combination of Python and Java. With `mixin`, you can `eval` arbitrary strings at compile time. You can call methods and use variables that are dynamically generated, like Python's `__getitem__` with D's `opDispatch`. You can generate code based on names and member variables using D's traits. You can use Python-like refcounting with `RefCounted!`. You can use Python's generators/iterators using D's lazy ranges, which are just as fast as manual for loops.[0] You can bind to Python with little effort using PyD. Just like Python, D has great C interop.

D compiles quickly and has much nicer syntax than C or C++.

[0]: https://forum.dlang.org/post/[email protected]

There is a new book about D for Python programmers.

"D, the Best Programming Language, for Former Python Developers: (Learn D Programming for Python Developers)"


Nim seems like a much more compelling alternative.

It generally feels pretty pythonic (At least as much as anything typed can), and it certainly scratches the "compiles to efficient machine code with no dependencies" itch.

The main benefit of Python is the ecosystem.

For ML and scientific work, yes. Aside from that, Java has a very good ecosystem. I personally think it's better than Python's, but they're both good.

In what concerns bindings to C, C++ and Fortran libraries, almost every language has them.

Kotlin Native also compiles to platform binaries.

Kotlin Native is going through a reboot after they realised that making a memory model incompatible with JVM semantics wasn't such a great idea after all.

Who would have guessed....

Have any links so that I can read up on this? I found this from last July[1].

[1] https://blog.jetbrains.com/kotlin/2020/07/kotlin-native-memo...

"Why the Kotlin/Native memory model cannot hold."



I feel like it should be the agenda of the typed python syntax to allow writing annotated python code that can be compiled into a form that is as fast as equivalent c code.

Cython is also mentioned downthread.

typed_python is new to me. I'll check it out. I too am keeping an eye on this space. I think that compiling or transpiling Python may be the solution to both of the major problems I have with Python: performance and distribution. Exciting times.

I'd add Pythran to that list. It's a python to cpp compiler for numerical python. It achieves impressive speed ups often with very little adjustment of the numerical code. It's highly undervalued IMO, you get speed similar or better than highly optimized cython or c code with very little or no adjustments.

I compared it to Cython, Numba and Julia for DSP, which I wrote about here: https://jochenschroeder.com/blog/articles/DSP_with_Python2/

If you have a Python2 codebase, Shedskin also gives excellent speedups for numerical codes. The only thing that didn't see as good a speed boost was string operations, although that might be fixed.


If Tcl can be compiled (to a large degree, and without type annotations) to machine code (AOT) using TclQuadCode there's every hope for Python !

SELF, Lisp and Smalltalk are just as dynamic.

There isn't any Python dynamic feature that those languages lack, yet they solved the problem a couple of decades ago, and the SELF JIT is the genesis of HotSpot.

What Python lacks is JIT love and being too much attached to CPython.

As a bit of background info, mypyc is “not really” ready for broader use yet. The devs are planning a soft-launch: https://github.com/mypyc/mypyc/issues/780

It is quite promising though, if it becomes more robust and compatible. I also believe they have still only scratched the surface of possible optimizations.

Yes, this. Actually, I first shared it here because I thought it was cool and could work quite cleanly since mypy works well, but when I actually tried compiling one of my Advent of Code solutions with it, what I got was a goto-stuffed mess. I know I can't expect nice C code, but I certainly didn't expect gotos.

As for the performance gain: 13.5 s with Python, 9 s compiled. It was a naive implementation of AoC 2020/23, so a lot of array cutting, concatenation, etc. So this isn't really math, but rather a lot of RAM I/O.

There's nothing wrong with gotos in compiled code. At the end of the day, machine code is really just a bunch of gotos with other instructions in between.

The reason goto is considered bad is that it can make code hard to follow for humans. Since this is an intermediate step in compilation, that's not an issue here.
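
The same point can be seen without leaving Python: CPython's own bytecode is built from jump instructions. A small sketch using the standard `dis` module (exact opcode names vary between CPython versions, so none are assumed here):

```python
import dis


def countdown(n: int) -> int:
    # An ordinary structured loop with no goto in sight.
    while n > 0:
        n -= 1
    return n


# Collect the opcodes whose name mentions a jump.
jumps = [ins.opname for ins in dis.get_instructions(countdown) if "JUMP" in ins.opname]
print(jumps)
```

The structured `while` loop is lowered to conditional and unconditional jumps, which is the bytecode-level equivalent of the gotos in mypyc's generated C.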

Yes and it works.

What is the difference between cython and mypyc? I think they should answer the question why anyone would want this over cython on the readme.

Not having worked with cython, the difference seems to be that cython requires using special types in its annotations as well as not supporting specializing the standard types like ‘list’.

Mypyc aims to be compatible with the standard Python type annotations and still be able to optimize them. So in theory, you don’t need to modify your existing type-annotated program. In practice I believe there are limitations currently.
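
As a rough illustration of what "standard type annotations" means here, the function below is ordinary annotated Python that runs unchanged under the interpreter. The file name `fib.py` is hypothetical; the mypyc documentation describes compiling such a module with `mypyc fib.py`, but whether a given program compiles cleanly depends on the current limitations.

```python
from typing import List


def fib_list(n: int) -> List[int]:
    """Return the first n Fibonacci numbers using only standard annotations."""
    result: List[int] = []
    a, b = 0, 1
    for _ in range(n):
        result.append(a)
        a, b = b, a + b
    return result


print(fib_list(10))
```

Nothing in the source is specific to a compiler: there are no special types or decorators, which is the contrast with Cython's own type syntax.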

Cython has first class treatment for Numpy arrays. Can Mypyc generate machine optimized code for chomping Numpy arrays element-wise?

I don’t think I want my toolchain to have first class knowledge of specific libraries...

Python is married to Numpy for scientific computing.

In my opinion it's this sort of short-sighted thinking that has cursed the Python project. "Everyone uses CPython" leads to "let's just let third party packages depend on any part of CPython" which leads to "Can't optimize CPython because it might break a dependency" which leads to "CPython is too slow, the ecosystem needs to invest heavily in c-extensions [including numpy]" which leads to "Can't create alternate Python implementations because the ecosystem depends concretely on CPython"[^1] and probably also the mess that is Python package management.

I'm not sure that the Numpy/Pandas hegemony over Python scientific computing will last. Eventually the ecosystem might move toward Arrow or something else. In this case it's probably not such a big deal because Arrow's mainstream debut will probably predate any serious adoption of Cython, but if it didn't then the latter would effectively preclude the former--Arrow becomes infeasible because everyone is using Cython/Numpy and Cython/Arrow performance is too poor to make the move, and since no one is making the move it's not worth investing in an Arrow special case in Cython and now no one gets the benefits that Arrow confers over Numpy/Pandas.

[^1]: Yes, Pypy exists and its maintainers have done yeoman's work in striving for compatibility with the ecosystem, and still (last I checked) you couldn't do such exotic things as "talking to a Postgres database via a production-ready (read: 'maintained, performant, secure, tested, stable, etc') package".

You are mixing up "how things are implemented" with "stuff that data scientists interact with."

Arrow is a low-level implementation detail, like BLAS. "Using" Arrow in data science in Python would mean implementing an Arrow-backed Pandas (or Pandas-like) DataFrame.

Your rank-and-file data scientist doesn't even know that Arrow exists, let alone that you can theoretically implement arrays, matrices, and data frames backed by it.

If you want to break the hegemony of Numpy, you will have to reimplement Numpy using CFFI instead of the CPython C API. There is no other way, unless you get everyone to switch to Julia.

Scientists are typically not trained computer scientists. They neither care about nor appreciate these technical arguments. They have two datasets, A and B, and want their sum, expressed in a neat, tidy form.

C = A + B

Python with Numpy serves exactly that need. We all have our griefs with the status quo, but Python needs data processing acceleration from somewhere. In my view, Python needs to implement a JIT to alleviate 95% of the need for Numpy.
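
For readers who have not seen it, here is roughly what `C = A + B` has to do when spelled out in plain Python, which is the loop a JIT would need to make fast. This is a sketch with a made-up `Dataset` class; NumPy performs the equivalent loop in compiled code rather than like this.

```python
class Dataset:
    """Toy stand-in for an array type; not how NumPy is implemented."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other: "Dataset") -> "Dataset":
        if len(self.values) != len(other.values):
            raise ValueError("datasets must have the same length")
        # The element-wise loop runs in the interpreter, one item at a time.
        return Dataset(x + y for x, y in zip(self.values, other.values))


A = Dataset([1.0, 2.0, 3.0])
B = Dataset([10.0, 20.0, 30.0])
C = A + B
print(C.values)
```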

Scientists aren't the only players at the scientific computing table these days. There's increasing demand to bring data science applications to market, which implies engineering requirements in addition to the science requirements.

> In my view, Python needs to implement a JIT to alleviate 95% of the need for Numpy.

Numpy is just statically typed arrays. This seems like best case for AOT compilers, no? I'm all for JIT as well, but I don't have faith in the Python community to get us there.

JIT works great here too. It would see iteration and the associated mathematical calculations as a hotspot, and optimize only those parts, which is easy since the arrays are statically typed and sized.

I say this as a Computer Scientist at NASA that tends to re-write the scientific code in straight C. But for many workloads, a JIT would make my team more productive, basically for free as a user.

Yes, JIT would work well also, and I would strictly prefer a JIT, but I don’t think we’re likely to see a JIT Python with good ecosystem compatibility in the next decade. Good luck to the people who are using Python these days, but I’m tired of fighting the same major problems we had 15 years ago. Other ecosystems solved those problems and they actually improve materially.

That is why I am so much into Julia, even with its adoption bumps.

The problem is not that Python lacks JITs, rather the community culture of rewriting code in C instead of contributing to JIT efforts.

Personally I just use a JVM/.NET based language, and if I need I can use the same C, C++ and Fortran libraries that Python uses anyway.

Julia was created to tackle problems in applied disciplines (physics, neuroscience, genetics, material engineering, etc.). I was expecting it not to be picked up by your everyday app developer or by the overly abstract functional programmer. As an afterthought, personally I think Julia can do much more than that; I would say it can do at least as much as Python is capable of today, but better.

The ecosystem is slowly expanding beyond applied disciplines, because when those people need to code something else, e.g. a Web site for their research data, then as usual they try to use the hammer they already know.

I'm really interested in Julia's performance for general purpose application development. It's great that it can work with large numerical arrays very efficiently, but what about large, diverse graphs of small objects like you commonly find in general purpose application development? I think I want a hybrid between Julia and JVM or something.

Does your team use Numba?

Where able, but its poor treatment of SWIG makes interfacing with standard tooling a royal pain. In many cases, I've rewritten Numba code in Cython or C for this very reason.


The hope is to create a new C API which doesn't expose CPython interpreter details, is easily exposed by interpreters other than CPython, and then port C-based APIs to it. Sadly it seems they aren't making much progress in 2020/2021. And I don't think it will eliminate Cython/Numpy overhead entirely, so Cython adding Numpy-specific features will still improve performance.

Also Pypy now has a compatibility shim for CPython extension modules. But last time I checked, it was slower than CPython for running one of my Numpy-based programs (corrscope), due to interfacing overhead.

Cython was around long before Python got type annotations so they kind of had to come up with their own thing. Cython will also happily compile Python WITHOUT type annotations, you just won't see much of a performance boost.

Even without types cython provides a neat way to embed your code and the interpreter into a native executable and has applications for distributing python programs on systems that are tricky for python like Android and WASM.

> Note the use of cython.int rather than int - Cython does not translate an int annotation to a C integer by default since the behaviour can be quite different with respect to overflow and division.

This seems like an important difference to me. Your regular type annotations can be used.
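
The overflow and division differences that the quoted Cython note refers to can be shown from plain Python. This sketch uses `ctypes.c_int32` as a stand-in for a 32-bit C `int` (an assumption; the width of C `int` is platform-dependent) and `int(a / b)` to mimic C's truncating division.

```python
import ctypes

big = 2**31  # one past the largest signed 32-bit value

# Python ints are arbitrary precision, so this does not overflow.
py_result = big + 1

# A 32-bit C integer wraps around; ctypes does no overflow checking.
c_result = ctypes.c_int32(big + 1).value

# Python's // floors toward negative infinity; C's / truncates toward zero.
py_div = -7 // 2
c_like_div = int(-7 / 2)

print(py_result, c_result, py_div, c_like_div)
```

Silently mapping `int` to a C integer would therefore change program results, which is why Cython asks for an explicit `cython.int`.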

Cython is great, but it (used to?) introduce its own language with its own type syntax.

But that's because Python didn't have type annotations. Now that it has them, cython can just use those instead of its own and developers will get the benefit of being able to compile to C using pure Python.

I am not qualified to make any technical arguments. There’s a strong security and tech-managerial argument for using the software that’s aligned to the reference implementation. Obviously cython is currently the better choice for risk-averse organizations that need compiled Python. But I think C-ish level people have a good reason to trust the stability, longevity, and security of a product built by the “most official” Python folks. There would need to be a deeply compelling technological reason to choose cython, not merely chasing a few wasted cycles or nifty features.

Obviously organizations that don’t manage human lives or large amounts of money can use ‘riskier’ tools without as much worry. This isn’t an argument against cython generally. But I worked at a hospital and wrote a lot of Python, and would not have been able to get the security team to support cython on their SELinux servers without a really good argument. Cython is just an unnecessary liability when your job manages identifiers and medical details on servers accessible on a fairly wide (albeit private) network.

Cython lets you use C structs to speed up memory access, and generally gives you lower-level access.

Note that GraalPython has the C structs memory layout too.

Actually spent the evening trying to compile black through mypyc. The tooling is there (black’s setup.py has a thing) but the most recent revisions of mypyc with black aren’t quite working for me.

The biggest issue right now seems to be miscompiles and the resulting errors being a bit inscrutable. It still leaves you wondering “am I wrong, or is the system wrong?” a bit.

But overall I think the techniques are really sound and I believe this is the most promising way forward for perf in Python.

IMHO it makes little sense to compile complete Python programs vs just compiling the slow parts. Some of the best reasons to choose Python are precisely the ones that preclude compilation, including:

- "batteries included" including a massive set of libraries (any one of which won't be supported by the compiler)

- dynamism which makes it easy to wrangle syntax to your needs (including the creation of domain-specific languages), but which destroys the performance improvement of compilation, even if the compiler can handle all the crazy introspection.
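
A small sketch of the kind of dynamism meant in the second point: the target of a call can be swapped at runtime, so a compiler cannot safely bind it ahead of time without guards. The names here are invented for illustration.

```python
class Greeter:
    def greet(self) -> str:
        return "hello"


def shout(self) -> str:
    return "HELLO"


g = Greeter()
before = g.greet()

# Rebinding the method on the class changes behaviour for existing instances.
Greeter.greet = shout
after = g.greet()


# Attributes can also be synthesised on demand.
class Dynamic:
    def __getattr__(self, name: str) -> str:
        return f"computed:{name}"


print(before, after, Dynamic().anything)
```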

I think this extends outside of Python. Performance and safety are trade offs, not absolutes, and the balance of needing safety or performance vs extensibility vs ease of development may result in dozens or hundreds of different trade off needs in different parts of a single application.

One consequence is that it never makes sense to use static typing or compilation as application-wide absolutes for any language or paradigm.

You should virtually never be writing whole applications in Rust, C, C++, Java, Haskell, etc. It is a huge sign of bad premature optimization and dogmatism. Compiling subsets in these languages and then exposing them through extension modules in other languages that don’t force those constant trade offs is almost always superior, and it’s very telling about poor engineering culture when this results in debates or vitriolic dismissiveness from people with prior dogmatic commitments to static typing or AOT compilation.

That's a weird take from someone named MLthoughts :p

But you seem to think everyone agrees that dynamic languages are more productive and that using (say) haskell is a trade-off for performance. For people used to (good) static type systems — for me that'd be OCaml — this is just not the case. Types do not impede, they help. I guess it might be a question of taste or habit, but don't make it a universal truth and accuse others of being biased when they disagree.

The “types do not impede, they help [clarity of thought / program structure / catching many important classes of bugs]” claim is hogwash.

I say that as someone who spent most of a decade doing Haskell and Scala professionally in large companies that built lots of developer tooling and workflows for them.

The most critical aspect of business software is to be able to drop into any particular local section of the code and make significant changes according to shifting business constraints. Anything that enforces a program structure that makes this harder to do, or requires a sequence of significant refactors around things like type class design / OOP interfaces and so forth, is strictly a loss for the business, not a win, even considering correctness, safety, and developer efficiency as critical success measures.

It’s often much worse than “a loss for the business” too, given that 99% of the time, those type class designs and OOP interfaces or nested inheritance models were premature abstractions and all the extensibility or well factored SOLID code (or equivalent ideas in FP) winds up being sheer debt that fails to be extensible in the ways that reality turned out to require but which nobody foresaw.

In a world with excellent foresight and ability to hit pause to refactor architectures, then baking in domain modeling constraints through type system designs would be great. Unfortunately that doesn’t map to the real world at all.

> The “types do not impede, they help [clarity of thought / program structure / catching many important classes of bugs]” claim is hogwash.

You say this, but then never address this claim in the text below. Could you expand on it?

You also do a lot of conflating typeclasses and OOP concepts like interfaces and inheritance, while those OOP concepts don't have much of anything to do with static typing and exist in object-oriented dynamic languages as well.

The "types prevent you from easily changing things to meet business needs" argument is one I've heard a lot but I'm not familiar with any concrete scenarios where that would be the case. Do you have any examples you can share from your time working with Haskell or Scala?

> “You say this, but then never address this claim in the text below. Could you expand on it?”

I believe I did answer this in my original comment, so I will just refer you back to that.

I disagree that there was any conflating going on. Type system designs enforced with static typing are a hallmark aspect of most of these design patterns around things like type classes, interfaces and inheritance. Of course similar things can exist in dynamically typed languages but they are not the same. For example “interfaces” in Python are just duck typing conventions (apart from built-in CPython data model properties). That duck typing interface is not at all similar to interfaces as a type system design pattern in a statically typed language. Any similarity is purely semantic.
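
To illustrate the distinction being drawn, here is a sketch of a Python duck-typed "interface" next to a `typing.Protocol`, which is the closest standard-library analogue to a declared interface. The class names are hypothetical. Note that the Protocol is checked statically by tools such as mypy; at runtime the duck-typed call works (or fails) regardless of any declaration.

```python
from typing import Protocol


class Quacker(Protocol):
    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    # No inheritance and no registration: it simply has the right method.
    def quack(self) -> str:
        return "Beep"


def make_noise(thing: Quacker) -> str:
    return thing.quack()


print(make_noise(Duck()), make_noise(Robot()))

try:
    make_noise(object())  # a type checker would flag this; at runtime it raises
except AttributeError as exc:
    print("runtime failure:", exc)
```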

As for examples, one example that I worked on heavily involved a Scala system for managing DAG dependencies in task execution. The system was set up using phantom typing and a bunch of sealed case classes such that for any logical type of Task that could exist in a DAG, the task had an “Active” and “Passive” variant, where the “Active” variant could only be obtained through a monadic validator processing a “Passive” variant.

The goal was to use the type system itself to encode the concept of “this task has passed through validation and it’s allowed to be processed.”
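
For readers unfamiliar with the pattern, the following is a loose Python rendering of the phantom-type idea described here. It is not the original Scala code; `Task`, `Passive`, `Active`, `new_task`, `validate`, and `run` are names assumed for illustration, and the guarantee only exists for a static checker such as mypy, since Python does not enforce it at runtime.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


class Passive: ...
class Active: ...


State = TypeVar("State")  # phantom parameter: never stored, only tracked by the checker


@dataclass(frozen=True)
class Task(Generic[State]):
    name: str


def new_task(name: str) -> "Task[Passive]":
    return Task(name)


def validate(task: "Task[Passive]") -> "Task[Active]":
    # The only sanctioned way to obtain an Active task.
    if not task.name:
        raise ValueError("task needs a name")
    return Task(task.name)


def run(task: "Task[Active]") -> str:
    return f"running {task.name}"


print(run(validate(new_task("load-data"))))
# run(new_task("load-data")) would be rejected by a type checker, not at runtime.
```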

Because this was designed at the type system level, it created huge problems and never added any real value in the sense of making it “logically impossible” to create invalid Tasks. Number one, it led to huge, painful boiler plate to create the case classes for every type of Task and specialize the type class with a “validator” function. Number two, we eventually realized there were many different aspects to “validation” that did not map well to the concept of “passing through a validator.” For example, some tasks depended on data that didn’t exist in the required location at a certain time, and hence needed retry logic to validate. Some situations involved re-running an already complete task (usually for resource usage observability reasons, or because an external data dependency changed). At any rate, baking validation status into a static type via the phantom type design was nothing but a headache. For all the beautiful code supposedly protecting us from processing invalid jobs, all that we got was difficult constant refactors.

Eventually we abandoned it and just used Luigi instead, and wrote all DAG management code in Python. It was the best decision we made. We lost zero safety and our defect rate did not get worse. Testing caught all the same bugs that compilation would have caught in Scala, and more, with less total code. And because the nature of the tasks in Luigi was just “whatever arbitrary Python you want” it was super easy to write effective validators, accommodate new use cases on the fly, and keep the code clean without dogmatic adherence to a precommitted type system design. Luigi happening to use some lightweight inheritance patterns was forgivable, given the dynamic typing flexibility.

Thanks for the example, always good to hear about some real-world experience with this stuff. I'm curious if you think that the rewriting of the system itself also helped improve the situation, as I've found that oftentimes if I rewrite something, I'm able to use what I learned from working on the original iteration to design things a bit more effectively. Not saying that if the initial implementation had been in Python it wouldn't have been better than the Scala one, just curious how much of a difference you think that made.

Certainly there’s a lot of credit due to “lessons learned” - but the key part is that the main lesson learned was to prioritize spot change flexibility over a “pluggable” model of extensibility enforced with a type system design and rigidity. Any smaller scale tactical improvements in code structure paled in comparison to that core property.

This isn't about compiling an entire program, this is about compiling the individual libraries that you may be consuming, if they already have type hint coverage. A "free" performance boost.

If I have a pure python, fully type hinted library I'm consuming, hats off to them, and they choose to use this, awesome.

> IMHO it makes little sense to compile complete Python programs

Which is why this compiles specified modules, which can freely call noncompiled modules, not “complete Python programs”.

> IMHO it makes little sense to compile complete Python programs vs just compiling the slow parts.

It makes sense for distribution of apps to end users, which is a particular pain point with Python.

Somewhat related, I had a devil of a time a little bit ago trying to ship a small Python app as a fully standalone environment runnable on "any Linux" (but for practical purposes, Ubuntu 16.04, 18.04, and 20.04). It turns out that if you don't want to use pip, and you don't want to build separate bundles for different OSes and Python versions, it can be surprisingly tricky to get this right. Just bundling the whole interpreter doesn't work either because it's tied to a particular stdlib which is then linked to specific versions of a bunch of system dependencies, so if you go that route, you basically end up taking an entire rootfs/container with you.

After evaluating a number of different solutions, I ended up being quite happy with pex: https://github.com/pantsbuild/pex

It basically bundles up the wheels for whatever your workspace needs, and then ships them in an archive with a bootstrap script that can recreate that environment on your target. But critically, it natively supports the idea of targeting multiple OS and Python versions, you just explicitly tell it which ones to include, eg:

    --platform=manylinux2014_x86_64-cp-38-cp38   # 20.04
    --platform=manylinux2014_x86_64-cp-36-cp36m  # 18.04
    --platform=manylinux2014_x86_64-cp-35-cp35m  # 16.04
Docs on this: https://pex.readthedocs.io/en/latest/buildingpex.html#platfo...
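
The platform strings above follow a regular pattern, so they can be generated rather than typed by hand. A minimal sketch, assuming the `manylinux2014_x86_64` platform and the tag layout shown in the example; the helper name `pex_platform_flag` is made up here and is not part of pex. The `m` ABI suffix applies to CPython builds before 3.8.

```python
def pex_platform_flag(major: int, minor: int,
                      platform: str = "manylinux2014_x86_64") -> str:
    """Build a --platform flag in the platform-impl-version-abi layout shown above."""
    version = f"{major}{minor}"
    # CPython dropped the "m" ABI flag in 3.8.
    abi_suffix = "m" if (major, minor) < (3, 8) else ""
    return f"--platform={platform}-cp-{version}-cp{version}{abi_suffix}"


for minor in (5, 6, 8):
    print(pex_platform_flag(3, minor))
```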

And you can see the tags in use for any package on PyPI which ships compiled parts, eg: https://pypi.org/project/numpy/#files

I don't know that this would be suitable for something like a game, but in my case for a small utility supporting a commercial product, it was perfect.

I recently just used pyinstaller and pip on an Ubuntu 16.04 build machine. Everything works for 16, 18, 20 and even some late Redhat versions with no work. Installed it on 3000 servers with paramiko under prefect. Aside from the odd individual server issue it all worked.

Pyinstaller was definitely one of the ones I evaluated— I don't have detailed notes, but it seems I wasn't able to get it working with one of my dependencies, possibly an issue with cffi.

> “if you don't want to use pip”

Why wouldn’t you want to use pip?

Pip is suitable for use by developers working in python, setting up python workspaces with python sources and python dependencies, but it's a UX fiasco for an end-user who just wants to run a black box application and not have to care.

In my particular case the "application" was in fact interactive bootstrap/install scripts for a large, proprietary blob which wouldn't have been suitable for publishing on PyPI, anyway. Setting up a separate, possibly authenticated PyPI instance, and then training end users how to use it, vs just shipping everything together in a single package? Total non-starter.

Interesting, sounds like a very unique use case. Is containerizing not a possible solution?

That would have worked, but it would have made the whole thing a lot bigger— even a featherweight base image would have added more than what pex was able to do. It complicates the usage side too, as then you need to be root to chroot/nspawn/docker/whatever your way into the container.

Definitely a complicating factor was that all of this was meant to be usable by non-nerds and in an environment with limited or possibly no internet access. It wouldn't have been acceptable to download your installer package at the hotel, and then get to site and invoke it only to discover that you were on the hook for a few GBs of transfer from docker.io.

This sounds a bit like a GUI application, so containers would bring their own problems. Also, you again force the end user to install Docker, etc.

I am slightly surprised about the general take on this I see here. I am definitely no python-hater and would take it a hundred times out of a hundred over, say, javascript, but as a language I feel like it is showing its age in a number of ways, and is becoming a little bloated as it tries to be all things to all people. What keeps me from ditching it is the best-in-class datascience ecosystem. Personally, I would prefer that the direction of travel be that I write less python, but keep the ecosystem. Whereas this project is facilitating writing python in other ecosystems.

Have you tried the Julia language? With relatively performant Python interoperability it seems to fit your needs.

Thanks, I have been meaning to check out Julia for some time and I think your comment may have finally spurred me into action.

Come say hi in https://julialang.zulipchat.com or https://discourse.julialang.org/, especially if you have questions about anything!

> Classes are compiled into extension classes without __dict__ (much, but not quite, like if they used __slots__)

Is there any way to say "no, I really want a __dict__ class here, please"?

> Is there any way to say "no, I really want a __dict__ class here, please"?

Write it in a module you aren't compiling, and import it, since this supports compiled modules using noncompiled ones.

I think defining __dict__ explicitly should work.
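
For reference, this is how the `__slots__` comparison plays out in plain CPython. It shows interpreter behaviour only; whether mypyc-compiled classes honour an explicit `__dict__` is not something this sketch demonstrates.

```python
class Slotted:
    __slots__ = ("x",)

    def __init__(self, x: int) -> None:
        self.x = x


class SlottedWithDict:
    # Listing "__dict__" in __slots__ restores per-instance dictionaries.
    __slots__ = ("x", "__dict__")

    def __init__(self, x: int) -> None:
        self.x = x


s = Slotted(1)
try:
    s.y = 2  # no __dict__, so unknown attributes are rejected
except AttributeError:
    print("Slotted rejects new attributes")

d = SlottedWithDict(1)
d.y = 2  # stored in the instance __dict__
print(hasattr(s, "__dict__"), d.__dict__)
```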

Here is one recent benchmark. Looks very promising. https://github.com/mypyc/mypyc-benchmark-results/blob/master...

There is also Pyccel https://github.com/pyccel/pyccel. When I last tried it, it worked on most small codes, but there were some bugs.

"The aim of Pyccel is to provide a simple way to generate automatically, parallel low level code. The main uses would be:

- Convert a Python code (or project) into a Fortran or C code.

- Accelerate Python functions by converting them to Fortran or C functions.

Pyccel can be viewed as:

- Python-to-Fortran/C converter

- a compiler for a Domain Specific Language with Python syntax"

It would be a dream-come-true to be able to compile Python (or some kind of very-close Python) down to a static binary. I want to run it like a Go binary.

You already could?[0] Or are you asking about something else?

[0]: https://stackoverflow.com/questions/39913847/is-there-a-way-...

I looked into this and it seems like no, these are not static binaries at all. They dynamically load stuff, and cython seems to be embedding very specific headers, including linux specific ones (asm/errno.h). I tried to build using musl-gcc but it was too different.

When I say build static like a Go binary, I mean that the binary contains everything, and is not allowed to dynamically load anything at all. Also, preferably it doesn't need a C standard library, and does system calls manually.

> When I say build static like a Go binary, I mean that the binary contains everything, and is not allowed to dynamically load anything at all.

Just code? Or are you talking about data (e.g. HTML files) too? A lot of python's ecosystem seems pretty built around the assumption that non-code assets won't be bundled into single files, so any magic single-file Python compiler would probably have to include the ability to distribute those assets separately with the executable.

> preferably it doesn't need a C standard library

I don't believe that golang even meets this standard--not with default options, at least. Also, why? How often have dynamically linked libc version differences really been a challenge for you in production?

> does system calls manually

This is generally considered to be a bad idea. Golang has moved away from this on several platforms, and people are making noise about doing it on Linux, too. A minority of compiled languages opt to go this route. Why do you want this?

It seems really interesting that the mypy team went to such lengths to create a binary version of their linter.

The big draw with mypyc has got to be direct integration with other source code in C.

Can anyone answer if it’s possible to replace PyPy’s VM backend with LLVM for AOT compilation? I wonder if that would result in any performance improvements.

CUDA is basically C with Fortran semantics, right? Wouldn't something like that be possible with Python?

Nope, CUDA is a polyglot runtime for GPGPU programming, using PTX bytecode, with support for C, C++ and Fortran from NVidia, and third party support for .NET, Java, Haskell, Julia.

Nowadays, CUDA hardware uses C++11 memory semantics, which are based on Java's memory model.

This is one of the reasons why Khronos's insistence on keeping OpenCL a "C99 dialect on GPUs" has not gained the love of most researchers, and now it is too late to win them back, despite SPIR.

Thanks - I think you got exactly what I meant, how out of date I am, and answered the real question.

Does the resulting code run as fast as native C?

Would love to see some benchmarks on this.

> Does the resulting code run as fast as native C?

The motivating use case is mypy, so I guess if someone wants to hand code mypy in native C we can assess this. But I would expect that not having to do that is as much of the motivation as speeding up mypy is.


The downvotes probably came from non-slavic readers. I read it as Murus too haha.

This sounds awesome.

What does it do?

It compiles type-annotated Python to C

Does the resulting code run as fast as native C?

> Mypyc is a compiler that compiles mypy-annotated, statically typed Python modules into CPython C extensions. Currently our primary focus is on making mypy faster through compilation -- the default mypy wheels are compiled with mypyc. Compiled mypy is about 4x faster than without compilation.

My wager is that it does not. It may if you have math intensive code, but if you have an algorithm that touches lots of python built in datatypes, access to those types will be the bottleneck.
