Tuesday, April 10, 2018

Improving SyntaxError in PyPy

For the last year, my half-time job has been to teach non-CS uni students to program in Python. While doing that, I have been trying to see what common stumbling blocks exist for novice programmers. There are many things that could be said here, but a common theme that emerges is hard-to-understand error messages. One source of such error messages, particularly when starting out, is SyntaxErrors.

PyPy's parser (mostly following the architecture of CPython) uses a regular-expression-based tokenizer with some cleverness to deal with indentation, and a simple LL(1) parser. Both of these components obviously produce errors for invalid syntax, but the messages are not very helpful. Often, the message is just "invalid syntax", without any hint of what exactly is wrong. In the last couple of weeks I have invested a little bit of effort to make them a tiny bit better. They will be part of the upcoming PyPy 6.0 release. Here are some examples of what changed.

Missing Characters

The first class of errors occurs when a token is missing. Often there is only one valid token that the parser expects at that point. This happens most commonly when the ':' after a control flow statement is left out (which is the syntax error I personally still make at least a few times a day). In such situations, the parser will now tell you which character it expected:

>>>> # before
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax (expected ':')
>>>>

Another example of this feature:

>>>> # before
>>>> def f:
  File "<stdin>", line 1
    def f:
        ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> def f:
  File "<stdin>", line 1
    def f:
         ^
SyntaxError: invalid syntax (expected '(')
>>>>
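
To illustrate the idea behind these messages, here is a minimal sketch in Python. It is not PyPy's parser code: the TRANSITIONS table, the state names, and the check and error_message functions are all made up for this example. It assumes a toy state machine in which each state lists the tokens it accepts, and it adds the hint only when exactly one token would have been valid:

```python
# Hypothetical toy grammar for "if <value> :". Each state maps an
# acceptable token to the state that follows it.
TRANSITIONS = {
    "start": {"if": "cond"},
    "cond": {"NAME": "colon", "NUMBER": "colon"},
    "colon": {":": "done"},
}


def error_message(allowed):
    """Build the message, naming the expected token if it is unique."""
    msg = "invalid syntax"
    if len(allowed) == 1:
        (only,) = allowed
        msg += " (expected %r)" % only
    return msg


def check(tokens):
    """Return an error message for the token list, or None if it is valid."""
    state = "start"
    for tok in tokens:
        allowed = TRANSITIONS.get(state, {})
        if tok not in allowed:
            return error_message(allowed)
        state = allowed[tok]
    if state != "done":
        # Input ended early: report what could have come next.
        return error_message(TRANSITIONS.get(state, {}))
    return None


print(check(["if", "NUMBER"]))
print(check(["if"]))
```

In this sketch the first call names the missing ':' because it is the only token accepted in that state, while the second call falls back to the plain message because two different tokens could follow "if".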

Parentheses

Another source of errors is unmatched parentheses. Here, PyPy has always had slightly better error messages than CPython:

>>> # CPython
>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: invalid syntax
>>>

>>>> # PyPy
>>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: unmatched ')'
>>>>

The same is true for parentheses that are never closed (the call to eval is needed to get the error, otherwise the REPL will just wait for more input):

>>> # CPython
>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: unexpected EOF while parsing
>>>

>>>> # PyPy
>>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: parenthesis is never closed
>>>>
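
If you want to compare such messages across interpreters from a script rather than at the prompt, one option is to catch the exception yourself. The following is a small sketch; syntax_error_info is a hypothetical helper name, and the exact message text (and possibly the reported position) depends on the interpreter and version you run it on:

```python
def syntax_error_info(source):
    """Return (msg, lineno, offset) for invalid source, or None if it compiles."""
    try:
        # Compiling as an expression avoids the REPL's "wait for more
        # input" behaviour, like the eval() call in the sessions above.
        compile(source, "<string>", "eval")
    except SyntaxError as exc:
        return exc.msg, exc.lineno, exc.offset
    return None


for snippet in ["(", ")", "1 + 1"]:
    print(repr(snippet), syntax_error_info(snippet))
```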

What I have now improved is the case of parentheses that are matched wrongly:

>>>> # before
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: closing parenthesis ']' does not match opening parenthesis '(' on line 1
>>>>
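
The bookkeeping behind all three parenthesis messages can be sketched with a stack of open brackets. This is only an illustration under simplifying assumptions, not PyPy's tokenizer: check_brackets and PAIRS are hypothetical names, and the sketch ignores brackets inside string literals and comments, which a real tokenizer has to skip:

```python
# Maps each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def check_brackets(source):
    """Return an error message for the first bracket problem, or None."""
    stack = []  # entries are (opening character, line number)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch in "([{":
                stack.append((ch, lineno))
            elif ch in PAIRS:
                if not stack:
                    return "unmatched %r" % ch
                opening, open_line = stack.pop()
                if opening != PAIRS[ch]:
                    return (
                        "closing parenthesis %r does not match "
                        "opening parenthesis %r on line %d"
                        % (ch, opening, open_line)
                    )
    if stack:
        return "parenthesis is never closed"
    return None


print(check_brackets("(1,\n2,\n]"))
```

Remembering the line number together with each opening bracket is what makes the "on line 1" part of the message possible.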

Conclusion

Obviously these are just some very simple cases, and there is still a lot of room for improvement (one huge problem is that only a single SyntaxError is ever shown per parse attempt, but fixing that is rather hard).

If you have a favorite unhelpful SyntaxError message you love to hate, please tell us in the comments and we might try to improve it. Reports of other kinds of non-informative error messages are also always welcome!

Tuesday, March 27, 2018

Leysin Winter Sprint 2018: review

Like every year, the PyPy developers and a couple of newcomers gathered in Leysin, Switzerland, to share their thoughts and contribute to the development of PyPy.

As always, we had interesting discussions about how we could improve PyPy, to make it the first choice for even more developers. We also made some progress with current issues, like compatibility with Python 3.6 and improving the performance of CPython extension modules, where we fixed a lot of bugs and gained new insights about where and how we could tweak PyPy.

We were very happy about the number of new people who joined us for the first time, and hope they enjoyed it as much as everyone else.

Topics

We worked on the following topics (and more!):
  • Introductions for newcomers
  • Python 3.5 and 3.6 improvements
  • CPyExt performance improvements and GC implementation
  • JIT: guard-compatible implementation
  • Pygame performance improvements
  • Unicode/UTF8 implementation
  • CFFI tutorial/overview rewrite
  • py3 test runners refactoring
  • RevDB improvements
The weather was really fine for most of the week, with only occasional snow and fog. We started our days with a short (and sometimes not so short) planning session and enjoyed our dinners in the great restaurants in the area. Some of us even started earlier and continued until late at night. The atmosphere was relaxed but also very productive. On our break day on Wednesday, we enjoyed the great conditions and went skiing and hiking.

Attendees

  • Arianna
  • Jean-Daniel
  • Stefan Beyer
  • Floris Bruynooghe
  • Antonio Cuni
  • René Dudfield
  • Manuel Jacob
  • Ronan Lamy
  • Remi Meier
  • Matti Picus
  • Armin Rigo
  • Alexander Schremmer
Leysin is easily reachable from Geneva Airport, so feel free to join us next time!


Cheers,
Stefan

Saturday, January 13, 2018

PyPy 5.10.1 bugfix release for Python 3.5

We have released PyPy3.5-v5.10.1, a bugfix release that addresses the following issues:
  • Fix time.sleep(float('nan')) which would hang on Windows
  • Fix missing errno constants on Windows
  • Fix issue 2718 for the REPL on Linux
  • Fix an overflow in converting int secs to nanosecs (issue 2717)
  • Fix the kwarg 'flag' to os.setxattr, which had no effect
  • Fix the winreg module for unicode entries in the registry on Windows
Note that many of these fixes are for our new beta version of PyPy3.5 on Windows. There may be more unicode problems in the Windows beta version, especially concerning directory and file names with non-ASCII characters.

On macOS, we recommend you wait for the Homebrew package to prevent issues with third-party packages. For other supported platforms our downloads are available now.
Thanks to those who reported the issues.

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7 and CPython 3.5. It’s fast (see the PyPy and CPython 2.7.x performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
This PyPy 3.5 release supports:
  • x86 machines on most common operating systems (Linux 32/64 bits, macOS 64 bits, Windows 32 bits, OpenBSD, FreeBSD)
  • newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux
  • big- and little-endian variants of PPC64 running Linux
  • s390x running Linux
Please update, and continue to help us make PyPy better.
Cheers
The PyPy Team

Monday, January 8, 2018

Leysin Winter sprint: 17-24 March 2018

The next PyPy sprint will be in Leysin, Switzerland, for the thirteenth time. This is a fully public sprint: newcomers and topics other than those proposed below are welcome.

(Note: this sprint is independent of the suggested April-May sprint in Poland.)

Goals and topics of the sprint

The list of topics is open, but here is our current list:

  • cffi tutorial/overview rewrite
  • py3 test runners are too complicated
  • make win32 builds green
  • make packaging more like cpython/portable builds
  • get CI builders for PyPy into mainstream projects (Numpy, Scipy, lxml, uwsgi)
  • get more of scientific stack working (tensorflow?)
  • cpyext performance improvements
  • General 3.5 and 3.6 improvements
  • JIT topics: guard-compatible, and the subsequent research project to save and reuse traces across processes
  • finish unicode-utf8
  • update www.pypy.org, speed.pypy.org (web devs needed)

As usual, the main side goal is to have fun with winter sports :-) We can take a day off (for skiing or anything else).

Exact times

Work days: starting March 18th (~noon), ending March 24th (~noon).

Please see announcement.txt for more information.