• Shortcuts : 'n' next unread feed - 'p' previous unread feed • Styles : 1 2

» Publishers, Monetize your RSS feeds with FeedShow:  More infos  (Show/Hide Ads)


Date: Tuesday, 08 Jul 2014 12:38

Hi all,

PyPy-STM is now reaching a point where we can say it's good enough to be a GIL-less Python. (We don't guarantee there are no more bugs, so please report them :-) The first official STM release:

This corresponds roughly to PyPy 2.3 (not 2.3.1). It requires 64-bit Linux. More precisely, this release is built for Ubuntu 12.04 to 14.04; you can also rebuild it from source by getting the branch stmgc-c7. You need clang to compile, and you need a patched version of llvm.

This version's performance can reasonably be compared with a regular PyPy, where both include the JIT. Thanks for following the meandering progress of PyPy-STM over the past three years --- we're finally getting somewhere really interesting! We cannot thank enough all contributors to the previous PyPy-STM money pot that made this possible. And, although this blog post is focused on the results from that period of time, I have of course to remind you that we're running a second call for donation for future work, which I will briefly mention again later.

A recap of what we did to get there: around the start of the year we found a new model, a "redo-log"-based STM which uses a couple of hardware tricks to not require chasing pointers, giving it (in this context) exceptionally cheap read barriers. This idea was developed over the following months and (relatively) easily integrated with the JIT compiler. The most recent improvements on the Garbage Collection side are closing the gap with a regular PyPy (there is still a bit more to do there). There is some preliminary user documentation.

Today, the result of this is a PyPy-STM that is capable of running pure Python code on multiple threads in parallel, as we will show in the benchmarks that follow. A quick warning: this is only about pure Python code. We didn't try so far to optimize the case where most of the time is spent in external libraries, or even manipulating "raw" memory like array.array or numpy arrays. To some extent there is no point because the approach of CPython works well for this case, i.e. releasing the GIL around the long-running operations in C. Of course it would be nice if such cases worked as well in PyPy-STM --- which they do to some extent; but checking and optimizing that is future work.

As a starting point for our benchmarks, when running code that only uses one thread, we get a slow-down between 1.2 and 3: at worst, three times as slow; at best only 20% slower than a regular PyPy. This worst case has been brought down --it used to be 10x-- by recent work on "card marking", a useful GC technique that is also present in the regular PyPy (and about which I don't find any blog post; maybe we should write one :-) The main remaining issue is fork(), or any function that creates subprocesses: it works, but is very slow. To remind you of this fact, it prints a line to stderr when used.

Now the real main part: when you run multithreaded code, it scales very nicely with two threads, and less-than-linearly but still not badly with three or four threads. Here is an artificial example:

    total = 0
    lst1 = ["foo"]
    for i in range(100000000):
        lst1.append(i)
        total += lst1.pop()

We run this code N times, once in each of N threads (full benchmark). Run times, best of three:

Number of threads Regular PyPy (head) PyPy-STM
N = 1 real 0.92s
user+sys 0.92s
real 1.34s
user+sys 1.34s
N = 2 real 1.77s
user+sys 1.74s
real 1.39s
user+sys 2.47s
N = 3 real 2.57s
user+sys 2.56s
real 1.58s
user+sys 4.106s
N = 4 real 3.38s
user+sys 3.38s
real 1.64s
user+sys 5.35s

(The "real" time is the wall clock time. The "user+sys" time is the recorded CPU time, which can be larger than the wall clock time if multiple CPUs run in parallel. This was run on a 4x2 cores machine. For direct comparison, avoid loops that are so trivial that the JIT can remove all allocations from them: right now PyPy-STM does not handle this case well. It has to force a dummy allocation in such loops, which makes minor collections occur much more frequently.)

Four threads is the limit so far: only four threads can be executed in parallel. Similarly, the memory usage is limited to 2.5 GB of GC objects. These two limitations are not hard to increase, but at least increasing the memory limit requires fighting against more LLVM bugs. (Include here snark remarks about LLVM.)

Here are some measurements from more real-world benchmarks. This time, the amount of work is fixed and we parallelize it on T threads. The first benchmark is just running translate.py on a trunk PyPy. The last three benchmarks are here.

Benchmark PyPy 2.3 (PyPy head) PyPy-STM, T=1 T=2 T=3 T=4
translate.py --no-allworkingmodules
(annotation step)
184s (170s) 386s (2.10x) n/a
multithread-richards
5000 iterations
24.2s (16.8s) 52.5s (2.17x) 37.4s (1.55x) 25.9s (1.07x) 32.7s (1.35x)
mandelbrot
divided in 16-18 bands
22.9s (18.2s) 27.5s (1.20x) 14.4s (0.63x) 10.3s (0.45x) 8.71s (0.38x)
btree 2.26s (2.00s) 2.01s (0.89x) 2.22s (0.98x) 2.14s (0.95x) 2.42s (1.07x)

This shows various cases that can occur:

  • The mandelbrot example runs with minimal overhead and very good parallelization. It's dividing the plane to compute in bands, and each of the T threads receives the same number of bands.
  • Richards, a classical benchmark for PyPy (tweaked to run the iterations in multiple threads), is hard to beat on regular PyPy: we suspect that the difference is due to the fact that a lot of paths through the loops don't allocate, triggering the issue already explained above. Moreover, the speed of Richards was again improved dramatically recently, in trunk.
  • The translation benchmark measures the time translate.py takes to run the first phase only, "annotation" (for now it consumes too much memory to run translate.py to the end). Moreover the timing starts only after the large number of subprocesses spawned at the beginning (mostly gcc). This benchmark is not parallel, but we include it for reference here. The slow-down factor of 2.1x is still too much, but we have some idea about the reasons: most likely, again the Garbage Collector, missing the regular PyPy's very fast small-object allocator for old objects. Also, translate.py is an example of application that could, with reasonable efforts, be made largely parallel in the future using atomic blocks.
  • Atomic blocks are also present in the btree benchmark. I'm not completely sure but it seems that, in this case, the atomic blocks create too many conflicts between the threads for actual parallization: the base time is very good, but running more threads does not help at all.

As a summary, PyPy-STM looks already useful to run CPU-bound multithreaded applications. We are certainly still going to fight slow-downs, but it seems that there are cases where 2 threads are enough to outperform a regular PyPy, by a large margin. Please try it out on your own small examples!

And, at the same time, please don't attempt to retrofit threads inside an existing large program just to benefit from PyPy-STM! Our goal is not to send everyone down the obscure route of multithreaded programming and its dark traps. We are going finally to shift our main focus on the phase 2 of our research (donations welcome): how to enable a better way of writing multi-core programs. The starting point is to fix and test atomic blocks. Then we will have to debug common causes of conflicts and fix them or work around them; and try to see how common frameworks like Twisted can be adapted.

Lots of work ahead, but lots of work behind too :-)

Armin (thanks Remi as well for the work).

Author: "Armin Rigo (noreply@blogger.com)" Tags: "stm"
Send by mail Print  Save  Delicious 
Date: Friday, 20 Jun 2014 23:31

We're pleased to announce the first stable release of PyPy3. PyPy3
targets Python 3 (3.2.5) compatibility.

We would like to thank all of the people who donated to the py3k proposal
for supporting the work that went into this.

You can download the PyPy3 2.3.1 release here:

http://pypy.org/download.html#pypy3-2-3-1

Highlights

  • The first stable release of PyPy3: support for Python 3!
  • The stdlib has been updated to Python 3.2.5
  • Additional support for the u'unicode' syntax (PEP 414) from Python 3.3
  • Updates from the default branch, such as incremental GC and various JIT
    improvements
  • Resolved some notable JIT performance regressions from PyPy2:
  • Re-enabled the previously disabled collection (list/dict/set) strategies
  • Resolved performance of iteration over range objects
  • Resolved handling of Python 3's exception __context__ unnecessarily forcing
    frame object overhead

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7.6 or 3.2.5. It's fast due to its integrated tracing JIT compiler.

This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows,
and OpenBSD,
as well as newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux.

While we support 32 bit python on Windows, work on the native Windows 64
bit python is still stalling, we would welcome a volunteer
to handle that.

How to use PyPy?

We suggest using PyPy from a virtualenv. Once you have a virtualenv
installed, you can follow instructions from pypy documentation on how
to proceed. This document also covers other installation schemes.

Cheers,
the PyPy team

Author: "Philip Jenvey (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Sunday, 08 Jun 2014 01:14
We're pleased to announce PyPy 2.3.1, a feature-and-bugfix improvement over our recent 2.3 release last month.

This release contains several bugfixes and enhancements among the user-facing improvements:
  • The built-in struct module was renamed to _struct, solving issues with IDLE and other modules
  • Support for compilation with gcc-4.9
  • A CFFI-based version of the gdbm module is now included in our binary bundle
  • Many issues were resolved since the 2.3 release on May 8

You can download the PyPy 2.3.1 release here:

http://pypy.org/download.html

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (pypy 2.3.1 and cpython 2.7.x performance comparison) due to its integrated tracing JIT compiler.

This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows, and OpenBSD, as well as newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux. 
We would like to thank our donors for the continued support of the PyPy project.

The complete release notice is here.

Please try it out and let us know what you think. We especially welcome success stories, please tell us about how it has helped you!

Cheers, The PyPy Team
Author: "mattip (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Friday, 09 May 2014 10:38
We’re pleased to announce PyPy 2.3, which targets version 2.7.6 of the Python language. This release updates the stdlib from 2.7.3, jumping directly to 2.7.6.

This release also contains several bugfixes and performance improvements, many generated by real users finding corner cases. CFFI has made it easier than ever to use existing C code with both cpython and PyPy, easing the transition for packages like cryptographyPillow(Python Imaging Library [Fork]), a basic port of pygame-cffi, and others.

PyPy can now be embedded in a hosting application, for instance inside uWSGI

You can download the PyPy 2.3 release here:

http://pypy.org/download.html

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (pypy 2.3 and cpython 2.7.x performance comparison; note that cpython's speed has not changed since 2.7.2) due to its integrated tracing JIT compiler.

This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows, and OpenBSD, as well as newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux. 

We would like to thank our donors for the continued support of the PyPy project.

The complete release notice is here

Cheers, The PyPy Team
Author: "mattip (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Tuesday, 15 Apr 2014 22:08
Work on NumPy on PyPy continued in March, though at a lighter pace than the previous few months. Progress was made on both compatibility and speed fronts. Several behavioral issues reported to the bug tracker were resolved. The most significant of these was probably the correction of casting to built-in Python types. Previously, int/long conversions of numpy scalars such as inf/nan/1e100 would return bogus results. Now, they raise or return values, as appropriate.

On the speed front, enhancements to the PyPy JIT were made to support virtualizing the raw_store/raw_load memory operations used in numpy arrays. Further work remains here in virtualizing the alloc_raw_storage when possible. This will allow scalars to have storages but still be virtualized when possible in loops.

Aside from continued work on compatibility/speed of existing code, we also hope to begin implementing the C-level components of other numpy modules such as mtrand, nditer, linalg, and so on. Several approaches could be taken to get C-level code in these modules working, ranging from reimplementing in RPython to interfacing with existing code with CFFI, if possible. The appropriate approach depends on many factors and will probably vary from module to module.

To try out PyPy + NumPy, grab a nightly PyPy and install our NumPy fork. Feel free to report comments/issues to IRC, our mailing list, or bug tracker. Thanks to the contributors to the NumPy on PyPy proposal for supporting this work.
Author: "Brian Kearns (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Wednesday, 09 Apr 2014 11:33

Hi all,

We now have a preliminary version of PyPy-STM with the JIT, from the new STM documentation page. This PyPy-STM is still not quite useful, failing to top the performance of a regular PyPy by a small margin on most benchmarks, but it's definitely getting there :-) The overheads with the JIT are still a bit too high. (I've been tracking an obscure bug since days. It turned out to be a simple buffer overflow. But if anybody has a clue about why a hardware watchpoint in gdb, set on one of the garbled memory locations, fails to trigger but the memory ends up being modified anyway... and, it turns out, by just a regular pointer write... ideas welcome.)

But I go off-topic :-) The main point of this post is to announce the 2nd Call for Donation about STM. We achieved most of the goals laid out in the first call. We even largely overachieved them in terms of raw performance, even if there are many cases that are unreasonably slow for now. So, after the successful research, we are launching a second proposal about the development part of the project:

  1. Polish PyPy-STM to get a consistently reasonable speed, 25%-40% slower than a regular JITted PyPy when running single-threaded code. Of course it is supposed to scale nicely as long as there are no user-visible conflicts.

  2. Focus on developing the Python-facing interface: both internal things (e.g. do dictionaries need to be more TM-friendly in general?) as well as directly visible things (e.g. some profiler-like interface to explore common conflicts in a program).

  3. Regular multithreaded code should benefit out of the box, but the final goal is to explore and tweak some existing non-multithreaded frameworks and improve their TM-friendliness. So existing programs using Twisted or Stackless, for example, should run on multiple cores without any major change.

See the full call for more details! I'd like to thank Remi Meier for getting involved. And a big thank you to everybody who contributed money on the first call. It took more time than anticipated, but it's there in good but rough shape. Now it needs a lot of polishing :-)

Armin

Author: "Armin Rigo (noreply@blogger.com)" Tags: "stm"
Send by mail Print  Save  Delicious 
Date: Tuesday, 08 Apr 2014 20:44

Hi all,

Here is one of the first full PyPy's (edit: it was r69967+, but the general list of versions is currently here) compiled with the new StmGC-c7 library. It has no JIT so far, but it runs some small single-threaded benchmarks by taking around 40% more time than a corresponding non-STM, no-JIT version of PyPy. It scales --- up to two threads only, which is the hard-coded maximum so far in the c7 code. But the scaling looks perfect in these small benchmarks without conflict: starting two threads each running a copy of the benchmark takes almost exactly the same amount of total time, simply using two cores.

Feel free to try it! It is not actually useful so far, because it is limited to two cores and CPython is something like 2.5x faster. One of the important next steps is to re-enable the JIT. Based on our current understanding of the "40%" figure, we can probably reduce it with enough efforts; but also, the JIT should be able to easily produce machine code that suffers a bit less than the interpreter from these effects. This seems to mean that we're looking at 20%-ish slow-downs for the future PyPy-STM-JIT.

Interesting times :-)

For reference, this is what you get by downloading the PyPy binary linked above: a Linux 64 binary (Ubuntu 12.04) that should behave mostly like a regular PyPy. (One main missing feature is that destructors are never called.) It uses two cores, but obviously only if the Python program you run is multithreaded. The only new built-in feature is with __pypy__.thread.atomic: this gives you a way to enforce that a block of code runs "atomically", which means without any operation from any other thread randomly interleaved.

If you want to translate it yourself, you need a trunk version of clang with three patches applied. That's the number of bugs that we couldn't find workarounds for, not the total number of bugs we found by (ab)using the address_space feature...

Stay tuned for more!

Armin & Remi

Author: "Armin Rigo (noreply@blogger.com)" Tags: "stm"
Send by mail Print  Save  Delicious 
Date: Monday, 31 Mar 2014 00:18
Here is what has been happening with NumPy in PyPy in October thanks to the people who donated to the NumPyPy proposal:

The biggest change is that we shifted to using an external fork of numpy rather than a minimal numpypy module. The idea is that we will be able to reuse most of the upstream pure-python numpy components, replacing the C modules with appropriate RPython micronumpy pieces at the correct places in the module namespace.

The numpy fork should work just as well as the old numpypy for functionality that existed previously, and also include much new functionality from the pure-python numpy pieces that simply hadn't been imported yet in numpypy. However, this new functionality will not have been "hand picked" to only include pieces that work, so you may run into functionality that relies on unimplemented components (which should fail with user-level exceptions).

This setup also allows us to run the entire numpy test suite, which will help in directing future compatibility development. The recent PyPy release includes these changes, so download it and let us know how it works! And if you want to live on the edge, the nightly includes even more numpy progress made in November.

To install the fork, download the latest release, and then install numpy either separately with a virtualenv: pip install git+https://bitbucket.org/pypy/numpy.git; or directly: git clone https://bitbucket.org/pypy/numpy.git; cd numpy; pypy setup.py install.

EDIT: if you install numpy as root, you may need to also import it once as root before it works: sudo pypy -c 'import numpy'

Along with this change, progress was made in fixing internal micronumpy bugs and increasing compatibility:
  • Fixed a bug with strings in record dtypes
  • Fixed a bug where the multiplication of an ndarray with a Python int or float resulted in loss of the array's dtype
  • Fixed several segfaults encountered in the numpy test suite (suite should run now without segfaulting)

We also began working on __array_prepare__ and __array_wrap__, which are necessary pieces for a working matplotlib module.

Cheers,
Romain and Brian
Author: "Romain Guillebert (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Wednesday, 26 Mar 2014 17:28

The Raspberry Pi aims to be a low-cost educational tool that anyone can use to learn about electronics and programming. Python and pygame are included in the Pi's programming toolkit. And since last year, thanks in part to sponsorship from the Raspberry Pi Foundation, PyPy also works on the Pi (read more here).

With PyPy working on the Pi, game logic written in Python stands to gain an awesome performance boost. However, the original pygame is a Python C extension. This means it performs poorly on PyPy and negates any speedup in the Python parts of the game code.

One solution to making pygame games run faster on PyPy, and eventually on the Raspberry Pi, comes in the form of pygame_cffi. pygame_cffi uses CFFI to wrap the underlying SDL library instead of a C extension. A few months ago, the Raspberry Pi Foundation sponsored a Cape Town Python User Group hackathon to build a proof-of-concept pygame using CFFI. This hackathon was a success and it produced an early working version of pygame_cffi.

So for the last 5 weeks Raspberry Pi has been funding work on pygame_cffi. The goal was a complete implementation of the core modules. We also wanted benchmarks to illuminate performance differences between pygame_cffi on PyPy and pygame on CPython. We are happy to report that those goals were met. So without further ado, here's a rundown of what works.

Current functionality

Invention screenshot:
Mutable mamba screenshot:

With the above-mentioned functionality in place we could get 10+ of the pygame examples to work, and a number of PyWeek games. At the time of writing, if a game doesn't work it is most likely due to an unimplemented transform or draw function. That will be remedied soon.

Performance

In terms of performance, pygame_cffi on PyPy is showing a lot of promise. It beats pygame on CPython by a significant margin in our events processing and collision detection benchmarks, while blit and fill benchmarks perform similarly. The pygame examples we checked also perform better.

However, there is still work to be done to identify and eliminate bottlenecks. On the Raspberry Pi performance is markedly worse compared to pygame (barring collision detection). The PyWeek games we tested also performed slightly worse. Fortunately there is room for improvement in various places.

Invention & Mutable Mamba (x86)
Standard pygame examples (Raspberry Pi)

Here's a summary of some of the benchmarks. Relative speed refers to the frame rate obtained in pygame_cffi on PyPy relative to pygame on CPython.

Benchmark Relative speed (pypy speedup)
Events (x86) 1.41
Events (Pi) 0.58
N2 collision detection on 100 sprites (x86) 4.14
N2 collision detection on 100 sprites (Pi) 1.01
Blit 100 surfaces (x86) 1.06
Blit 100 surfaces (Pi) 0.60
Invention (x86) 0.95
Mutable Mamba (x86) 0.72
stars example (x86) 1.95
stars example (Pi) 0.84

OpenGL

Some not-so-great news is that PyOpenGL performs poorly on PyPy since PyOpenGL uses ctypes. This translates into a nasty reduction in frame rate for games that use OpenGL surfaces. It might be worthwhile creating a CFFI-powered version of PyOpenGL as well.

Where to now?

Work on pygame_cffi is ongoing. Here are some things that are in the pipeline:

  • Get pygame_cffi on PyPy to a place where it is consistently faster than pygame on CPython.
  • Implement the remaining modules and functions, starting with draw and transform.
  • Improve test coverage.
  • Reduce the time it takes for CFFI to parse the cdef. This makes the initial pygame import slow.

If you want to contribute you can find pygame_cffi on Github. Feel free to find us on #pypy on freenode or post issues on github.

Cheers,
Rizmari Versfeld


Author: "Maciej Fijalkowski (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Wednesday, 12 Mar 2014 11:28
Hello everyone

There is an interview with Roberto De Ioris (from uWSGI fame) about embedding PyPy in uWSGI that covers recent addition of a PyPy embedding interface using cffi and the experience with using it. Read The full interview

Cheers
fijal
Author: "Maciej Fijalkowski (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Monday, 10 Mar 2014 19:38
More progress was made on the NumPy front in the past month. On the compatibility front, we now pass ~130 more tests from NumPy's suite since the end of January. Currently, we pass 2336 tests out of 3265 tests run, with many of the failures representing portions of NumPy that we don't plan to implement in the near future (object dtypes, unicode, etc). There are still some failures that do represent issues, such as special indexing cases and failures to respect subclassed ndarrays in return values, which we do plan to resolve. There are also some unimplemented components and ufuncs remaining which we hope to implement, such as nditer and mtrand. Overall, the most common array functionality should be working.

Additionally, I began to take a look at some of the loops generated by our code. One widely used loop is dot, and we were running about 5x slower than NumPy's C version. I was able to optimize the dot loop and also the general array iterator to get us to ~1.5x NumPy C time on dot operations of various sizes. Further progress in this area could be made by using CFFI to tie into BLAS libraries, when available. Also, work remains in examining traces generated for our other loops and checking for potential optimizations.

To try out PyPy + NumPy, grab a nightly PyPy and install our NumPy fork. Feel free to report comments/issues to IRC, our mailing list, or bug tracker. Thanks to the contributors to the NumPy on PyPy proposal for supporting this work.

Cheers,
Brian
Author: "Brian Kearns (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Tuesday, 18 Feb 2014 03:33

This is the 13th status update about our work on the py3k branch, which we
can work on thanks to all of the people who donated to the py3k proposal.

We're just finishing up a cleanup of int/long types. This work helps the py3k
branch unify these types into the Python 3 int and restore JIT compilation of
machine sized integers
.

This cleanup also removes multimethods from these types. PyPy has
historically used a clever implementation of multimethod dispatch for declaring
methods of the __builtin__ types in RPython.

This multimethod scheme provides some convenient features for doing this,
however we've come to the conclusion that it may be more trouble than it's
worth. A major problem of multimethods is that they generate a large amount of
stub methods which burden the already lengthy and memory hungry RPython
translation process. Also, their implementation and behavior can be somewhat
complicated/obscure.

The alternative to multimethods involves doing the work of the type checking
and dispatching rules in a more verbose, manual way. It's a little more work in
the end but less magical.

Recently, Manuel Jacob finished a large cleanup effort of the
unicode/string/bytearray types that also removed their multimethods. This work
also benefits the py3k branch: it'll help with future PEP 393 (or PEP 393
alternative
) work. This effort was partly sponsored by Google's Summer of
Code: thanks Manuel and Google!

Now there's only a couple major pieces left in the multimethod removal (the
float/complex types and special marshaling code) and a few minor pieces that
should be relatively easy.

In conclusion, there's been some good progress made on py3k and multimethod
removal this winter, albeit a bit slower than we would have liked.

cheers,
Phil

Author: "Philip Jenvey (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Tuesday, 11 Feb 2014 03:02
Work continued on the NumPy + PyPy front steadily in December and more lightly in January. The continued focus was compatibility, targeting incorrect or unimplemented features that appeared in multiple NumPy test suite failures. We now pass ~2/3 of the NumPy test suite. The biggest improvements were made in these areas:

- Bugs in conversions of arrays/scalars to/from native types
- Fix cases where we would choose incorrect dtypes when initializing or computing results
- Improve handling of subclasses of ndarray through computations
- Support some optional arguments for array methods that are used in the pure-python part of NumPy
- Support additional attributes in arrays, array.flags, and dtypes
- Fix some indexing corner cases that arise in NumPy testing
- Implemented part of numpy.fft (cffti and cfftf)

Looking forward, we plan to continue improving the correctness of the existing implemented NumPy functionality, while also beginning to look at performance. The initial focus for performance will be to look at areas where we are significantly worse than CPython+NumPy. Those interested in trying these improvements out will need a PyPy nightly, and an install of the PyPy NumPy fork. Thanks again to the NumPy on PyPy donors for funding this work.
Author: "Brian Kearns (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Sunday, 09 Feb 2014 23:16

Hi all,

A quick note about the Software Transactional Memory (STM) front.

Since the previous post, we believe we progressed a lot by discovering an alternative core model for software transactions. Why do I say "believe"? It's because it means again that we have to rewrite from scratch the C library handling STM. This is currently work in progress. Once this is done, we should be able to adapt the existing pypy-stm to run on top of it without much rewriting efforts; in fact it should simplify the difficult issues we ran into for the JIT. So while this is basically yet another restart similar to last June's, the difference is that the work that we have already put in the PyPy part (as opposed to the C library) remains.

You can read about the basic ideas of this new C library here. It is still STM-only, not HTM, but because it doesn't constantly move objects around in memory, it would be easier to adapt an HTM version. There are even potential ideas about a hybrid TM, like using HTM but only to speed up the commits. It is based on a Linux-only system call, remap_file_pages() (poll: who heard about it before? :-). As previously, the work is done by Remi Meier and myself.

Currently, the C library is incomplete, but early experiments show good results in running duhton, the interpreter for a minimal language created for the purpose of testing STM. Good results means we brough down the slow-downs from 60-80% (previous version) to around 15% (current version). This number measures the slow-down from the non-STM-enabled to the STM-enabled version, on one CPU core; of course, the idea is that the STM version scales up when using more than one core.

This means that we are looking forward to a result that is much better than originally predicted. The pypy-stm has chances to run at a one-thread speed that is only "n%" slower than the regular pypy-jit, for a value of "n" that is optimistically 15 --- but more likely some number around 25 or 50. This is seriously better than the original estimate, which was "between 2x and 5x". It would mean that using pypy-stm is quite worthwhile even with just two cores.

More updates later...

Armin

Author: "Armin Rigo (noreply@blogger.com)" Tags: "stm"
Send by mail Print  Save  Delicious 
Date: Tuesday, 10 Dec 2013 16:48
Since the PyPy 2.2 release last month, more progress has been made on the NumPy compatibility front. Initial work has been directed by running the NumPy test suite and targeting failures that appear most frequently, along with fixing the few bugs reported on the bug tracker.

Improvements were made in these areas:
- Many missing/broken scalar functionalities were added/fixed. The scalar API should match up more closely with arrays now.
- Some missing dtype functionality was added (newbyteorder, hasobject, descr, etc)
- Support for optional arguments (axis, order) was added to some ndarray functions
- Fixed some corner cases for string/record types

Most of these improvements went onto trunk after 2.2 was split, so if you're interested in trying them out or running into problems on 2.2, try the nightly.

Thanks again to the NumPy on PyPy donors who make this continued progress possible.

Cheers,
Brian
Author: "Brian Kearns (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Monday, 09 Dec 2013 13:32

One of the RaspberryPi's goals is to be a fun toolkit for school children (and adults!) to learn programming and electronics with. Python and pygame are part of this toolkit. Recently the RaspberryPi Foundation funded parts of the effort of porting of pypy to the Pi -- making Python programs on the Pi faster!

Unfortunately pygame is written as a Python C extension that wraps SDL which means performance of pygame under pypy remains mediocre. To fix this pygame needs to be rewritten using cffi to wrap SDL instead.

RaspberryPi sponsored a CTPUG (Cape Town Python User Group) hackathon to put together a proof-of-concept pygame-cffi. The day was quite successful - we got a basic version of the bub'n'bros client working on pygame-cffi (and on PyPy). The results can be found on github with contributions from the five people present at the sprint.

While far from complete, the proof of concept does show that there are no major obstacles to porting pygame to cffi and that cffi is a great way to bind your Python package to C libraries.

Amazingly, we managed to have machines running all three major platforms (OS X, Linux and Windows) at the hackathon so the code runs on all of them!

We would like to thank the Praekelt foundation for providing the venue and The Raspberry Pi foundation for providing food and drinks!

Cheers,
Simon Cross, Jeremy Thurgood, Neil Muller, David Sharpe and fijal.


Author: "Maciej Fijalkowski (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Saturday, 30 Nov 2013 09:57

The next PyPy sprint will be in Leysin, Switzerland, for the ninth time. This is a fully public sprint: newcomers and topics other than those proposed below are welcome.

Goals and topics of the sprint

  • Py3k: work towards supporting Python 3 in PyPy
  • NumPyPy: work towards supporting the numpy module in PyPy
  • STM: work towards supporting Software Transactional Memory
  • And as usual, the main side goal is to have fun in winter sports :-) We can take a day off for ski.

Exact times

For a change, and as an attempt to simplify things, I specified the dates as 11-19 January 2014, where 11 and 19 are travel days. We will work full days between the 12 and the 18. You are of course allowed to show up for a part of that time only, too.

Location & Accomodation

Leysin, Switzerland, "same place as before". Let me refresh your memory: both the sprint venue and the lodging will be in a very spacious pair of chalets built specifically for bed & breakfast: http://www.ermina.ch/. The place has a good ADSL Internet connexion with wireless installed. You can of course arrange your own lodging anywhere (as long as you are in Leysin, you cannot be more than a 15 minutes walk away from the sprint venue), but I definitely recommend lodging there too -- you won't find a better view anywhere else (though you probably won't get much worse ones easily, either :-)

Please confirm that you are coming so that we can adjust the reservations as appropriate. The rate so far has been around 60 CHF a night all included in 2-person rooms, with breakfast. There are larger rooms too (less expensive per person) and maybe the possibility to get a single room if you really want to.

Please register by Mercurial:

https://bitbucket.org/pypy/extradoc/
https://bitbucket.org/pypy/extradoc/raw/extradoc/sprintinfo/leysin-winter-2014

or on the pypy-dev mailing list if you do not yet have check-in rights:

http://mail.python.org/mailman/listinfo/pypy-dev

You need a Swiss-to-(insert country here) power adapter. There will be some Swiss-to-EU adapters around -- bring a EU-format power strip if you have one.

Author: "Armin Rigo (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Wednesday, 27 Nov 2013 15:00

We're pleased to announce PyPy 2.2.1, which targets version 2.7.3 of the Python language. This is a bugfix release over 2.2.

You can download the PyPy 2.2.1 release here:

http://pypy.org/download.html

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (pypy 2.2 and cpython 2.7.2 performance comparison) due to its integrated tracing JIT compiler.

This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows 32, or ARM (ARMv6 or ARMv7, with VFPv3).

Work on the native Windows 64 is still stalling, we would welcome a volunteer to handle that.

Highlights

This is a bugfix release. The most important bugs fixed are:

  • an issue in sockets' reference counting emulation, showing up notably when using the ssl module and calling makefile().
  • Tkinter support on Windows.
  • If sys.maxunicode==65535 (on Windows and maybe OS/X), the json decoder incorrectly decoded surrogate pairs.
  • some FreeBSD fixes.

Note that CFFI 0.8.1 was released. Both versions 0.8 and 0.8.1 are compatible with both PyPy 2.2 and 2.2.1.

Cheers, Armin Rigo & everybody

Author: "Armin Rigo (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
CFFI 0.8   New window
Date: Wednesday, 27 Nov 2013 14:53

Hi all,

CFFI 0.8 for CPython (2.6-3.x) has been released.

Quick download: pip install cffi --upgrade
Documentation: https://cffi.readthedocs.org/en/release-0.8/

What's new: a number of small fixes; ffi.getwinerror(); integrated support for C99 variable-sized structures; multi-thread safety.

--- Armin

Update: CFFI 0.8.1, with fixes on Python 3 on OS/X, and some FreeBSD fixes (thanks Tobias).

Author: "Armin Rigo (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Date: Sunday, 17 Nov 2013 10:32
We're pleased to announce PyPy 2.2, which targets version 2.7.3 of the Python language. This release main highlight is the introduction of the incremental garbage collector, sponsored by the Raspberry Pi Foundation.
This release also contains several bugfixes and performance improvements.
You can download the PyPy 2.2 release here:
http://pypy.org/download.html
We would like to thank our donors for the continued support of the PyPy project. We showed quite a bit of progress on all three projects (see below) and we're slowly running out of funds. Please consider donating more so we can finish those projects! The three projects are:
  • Py3k (supporting Python 3.x): the release PyPy3 2.2 is imminent.
  • STM (software transactional memory): a preview will be released very soon, as soon as we fix a few bugs
  • NumPy: the work done is included in the PyPy 2.2 release. More details below.

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (pypy 2.2 and cpython 2.7.2 performance comparison) due to its integrated tracing JIT compiler.
This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows 32, or ARM (ARMv6 or ARMv7, with VFPv3).
Work on the native Windows 64 is still stalling, we would welcome a volunteer to handle that.

Highlights

  • Our Garbage Collector is now "incremental". It should avoid almost all pauses due to a major collection taking place. Previously, it would pause the program (rarely) to walk all live objects, which could take arbitrarily long if your process is using a whole lot of RAM. Now the same work is done in steps. This should make PyPy more responsive, e.g. in games. There are still other pauses, from the GC and the JIT, but they should be on the order of 5 milliseconds each.
  • The JIT counters for hot code were never reset, which meant that a process running for long enough would eventually JIT-compile more and more rarely executed code. Not only is it useless to compile such code, but as more compiled code means more memory used, this gives the impression of a memory leak. This has been tentatively fixed by decreasing the counters from time to time.
  • NumPy has been split: now PyPy only contains the core module, called _numpypy. The numpy module itself has been moved to https://bitbucket.org/pypy/numpy and numpypy disappeared. You need to install NumPy separately with a virtualenv: pip install git+https://bitbucket.org/pypy/numpy.git; or directly: git clone https://bitbucket.org/pypy/numpy.git; cd numpy; pypy setup.py install.
  • non-inlined calls have less overhead
  • Things that use sys.set_trace are now JITted (like coverage)
  • JSON decoding is now very fast (JSON encoding was already very fast)
  • various buffer copying methods experience speedups (like list-of-ints to int[] buffer from cffi)
  • We finally wrote (hopefully) all the missing os.xxx() functions, including os.startfile() on Windows and a handful of rare ones on Posix.
  • numpy has a rudimentary C API that cooperates with cpyext
Cheers,
Armin Rigo and Maciej Fijalkowski
Author: "Armin Rigo (noreply@blogger.com)"
Send by mail Print  Save  Delicious 
Next page
» You can also retrieve older items : Read
» © All content and copyrights belong to their respective authors.«
» © FeedShow - Online RSS Feeds Reader