Thursday, February 04, 2021

Python's tug of war between beginner-friendly features and support for advanced users

Python is my favourite programming language. Since I discovered it in 2004, programming in Python has become my favourite hobby. I've tried to learn a few other languages and have never found one as friendly to beginners as Python. As readers of this blog know, these days I am particularly interested in tracebacks, and I am likely paying more attention than most to Python's improvements in this area: Python is becoming more and more precise in the information that it provides to users when something goes wrong. For example, consider these two cases from Python 3.7:

Given the message when we try to assign a value to None, we might have expected to see the same when trying to assign a value to the keyword "pass"; instead we get a not so useful "invalid syntax". Of course, if you've been reading this blog before, you won't be surprised that Friendly-traceback can provide a bit more useful information in this case.
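For readers who want to reproduce this kind of comparison on their own interpreter, here is a minimal sketch. The two statements, `None = 1` and `pass = 1`, are assumed stand-ins for the examples discussed above, and `syntax_error_message` is a hypothetical helper name. The wording of each message depends on the Python version in use, so no particular output is shown.

```python
import sys


def syntax_error_message(source):
    """Return the SyntaxError message for `source`, or None if it compiles."""
    try:
        # compile() parses the code without running it, so a SyntaxError
        # can be caught like any other exception.
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


print("Python", ".".join(map(str, sys.version_info[:3])))
# Assumed examples: assigning to None and to the keyword "pass".
for source in ("None = 1", "pass = 1"):
    print(repr(source), "->", syntax_error_message(source))
```

Running the same script under different Python versions is a quick way to see how the messages have changed from one release to the next.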

However, this is not the point of this post...  Let's see what kind of information Python 3.8 gives us for the first case.

As you can see, it is much more precise: this is a definite improvement.

Let's have a look at another case, using Python 3.8 again:

Again, the dreaded "invalid syntax". However, this has been significantly improved with the latest Python version, released yesterday.

Again, we get much better error messages, which will be so much more useful for beginners who do not use Friendly-traceback [ even though they should! ;-) ]

There have been a few other similar improvements in the latest release ... but this one example should suffice to illustrate the work done to make Python even friendlier to beginners. However, this is unfortunately not the whole story.

To make Python useful to advanced users who have to deal with large code bases, Python has introduced "optional" type annotations. This is certainly something that the vast majority of professional programmers find useful, unlike hobbyists like me. Let me illustrate this with an example inspired by a Twitter post I saw today. First, I'll use Python 3.8:

If you know Python and are not actively using type annotations, you likely will not be surprised by the above. Now, what happens if we try to do the same thing with Python 3.9+?

No exceptions are raised! Imagine you are a beginner who has written the above code: you would certainly not expect an error when doing the following immediately afterwards:

Unfortunately, Friendly-traceback cannot (yet!) provide any help with this.
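The version difference described above can be checked with a minimal sketch like the one below. It assumes the example was a subscripted built-in such as `list[1, 2, 3, 4]` (the expression quoted in the comments), which may not be exactly what the original example used. `inspect_subscripted_list` is a hypothetical helper name.

```python
import sys


def inspect_subscripted_list():
    """Evaluate list[1, 2, 3, 4] and describe what came back."""
    try:
        value = list[1, 2, 3, 4]  # assumed example, not an actual list literal
    except TypeError as exc:
        # Python 3.8 and earlier: the built-in list type is not subscriptable.
        return {"raised": True, "type_name": None,
                "is_list": False, "detail": str(exc)}
    # Python 3.9+ (PEP 585): the expression is accepted and produces a
    # generic alias object intended for type annotations, not a list.
    return {"raised": False, "type_name": type(value).__name__,
            "is_list": isinstance(value, list), "detail": repr(value)}


result = inspect_subscripted_list()
print(sys.version_info[:2], result)
```

On 3.9 and later, the point to notice is that `is_list` is false: a beginner who meant to write `[1, 2, 3, 4]` ends up holding an object that does not behave like a list, and the error only appears later, when that object is used.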

EDIT: this might be even more confusing.


Eventually, I'll make use of the following to provide some potentially useful information.

Ideally, I would really, really like it if it were possible to have truly "optional" type annotations, and a way to turn them off (and make their use generate an exception). Alas, I gather that this will never be the case, which I find most unfortunate.


KeithCu said...

Great article.

This is a big challenge across Python. Another example is the ":=" operator.

Try explaining that variation on equals to an 8 year old. And notice that even advanced codebases like Numpy and PyTorch have worked fine for many years without it.

Numpy isn't just a library, it's extended syntax for the language itself to specify how to do optimized operations on n-dimensional arrays. It takes a while to learn to read and write it.

It's been said that even the core developers only understand a small fraction of the full Python language and ecosystem. If more core people understood how complicated Python already is, they would have less inclination to keep adding more.

Another example is the multi-threaded support. There are a number of choices and I don't know if any one person can keep all of them in their head at once.

IMO, it's a no-brainer to remove the ":=" given the benefits versus costs, but removing the excess complexity around multi-threading is much harder. I personally like just using threadpools which release the GIL when blocking, and running pools of Python processes to take full advantage of multiple processors. With that, the GIL running conventional code is not a problem. However, other people have different use cases, so it's a challenge.

smitty1e said...

I submit, with zero support, that the need to keep adding stuff to the language has to do with keeping the training and documentation ecosystems alive.

Unknown said...

Since type annotations will henceforth always be treated as strings, not parsed, it seems like we don't actually need list and dict types to be subscriptable anymore? Can't the typing system just parse "list[x, y, z]" as a string without evaluating it?

laike9m said...

I'm wondering, why doesn't Python raise a compile time error (SyntaxError for example) for `list[1, 2, 3, 4]`?

André Roberge said...

@laike9m This is a valid syntax for type annotation. See PEP 585

laike9m said...

Sorry, I should have been more clear. I know the change in 585, but I thought Python could (theoretically) detect this wrong usage (not putting types, but 1, 2, 3, 4 in the generics)

Rho said...

I'm guessing that in a few years they will refactor the language again, like 2 vs 3. Although I would like to see f-strings with the ` symbol, like in JavaScript.