friendly_idle is done!
I've found a better solution for the remaining issue I had mentioned in the previous blog post.
I also found a fix for an "annoyance" mentioned by Raymond Hettinger on Twitter!
friendly_idle is now available. This is just a quick announcement. Eventually I plan to write a longer blog post explaining how I use import hooks to patch IDLE and to provide seamless support for friendly/friendly-traceback. Before I incorporated "partial" support for IDLE within friendly, I had released a package named friendly_idle ... but this is really a much better version.
When you launch it from a terminal, the only clue you get that this is not your regular IDLE is from the window title.
Beginning programmers are often surprised by floating point arithmetic inaccuracies. If they use Python, many will write posts saying that Python is "broken" when they see results like the following:
>>> 0.1 + 0.2
0.30000000000000004
This particular result is not limited to Python. In fact, it is so common that there exists a site with a name inspired by this example (0.30000000000000004.com/), devoted to explaining the origin of this puzzling result, followed by examples from many programming languages.
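One way to see exactly what is going on is to hand the float 0.1 to the Decimal constructor, which reveals the exact value the binary representation stores:

```python
from decimal import Decimal

# Passing the float directly (no quotes) exposes the exact binary
# value stored for 0.1 -- it is not quite one tenth.
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625
```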
Python provides some alternatives to standard floating point operations. For example, one can use the decimal module to perform fixed point arithmetic operations. Here's an example.
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 7
>>> Decimal(0.1) + Decimal(0.2)
Decimal('0.3000000')
>>> print(_)
0.3000000
While one can set the precision (number of decimals) with which operations are performed, printed values can carry extra zeros: 0.3000000 does not look as "nice" as 0.3.
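If the extra zeros are a concern, the standard decimal module offers a way to strip them; here is a minimal sketch using Decimal.normalize:

```python
from decimal import Decimal, getcontext

getcontext().prec = 7                 # same precision as above
total = Decimal(0.1) + Decimal(0.2)   # Decimal('0.3000000')
print(total.normalize())              # prints 0.3, trailing zeros removed
```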
Another alternative included with Python's standard library is the fractions module: it provides support for rational number arithmetic.
>>> from fractions import Fraction
>>> Fraction("0.1") + Fraction("0.2")
Fraction(3, 10)
>>> print(_)
3/10
However, the fractions module can yield some surprising results if one does not use string arguments to represent floats, as was mentioned by Will McGugan (of Rich and Textual fame) in a recent tweet.
>>> from fractions import Fraction as F
>>> F("0.1")
Fraction(1, 10)
>>> F(0.1)
Fraction(3602879701896397, 36028797018963968)
In the second case, 0.1 is a float which means that it carries some intrinsic inaccuracy. For the first case, some parsing is done by Python to determine the number of decimal places to use before converting the result into a rational number. A similar result can be achieved using the limit_denominator method of the Fraction class:
>>> F(0.1).limit_denominator(10)
Fraction(1, 10)
In fact, we do not have to be nearly as restrictive in the limit imposed on the denominator to achieve the same result:
>>> F(0.1).limit_denominator(1_000_000_000)
Fraction(1, 10)
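The large ratio shown for F(0.1) is not something the fractions module invents: it is the exact ratio stored for the float 0.1, which float.as_integer_ratio reveals directly:

```python
from fractions import Fraction

# The exact ratio stored in the binary representation of 0.1
num, den = (0.1).as_integer_ratio()
print(num, den)  # 3602879701896397 36028797018963968

# Fraction(0.1) simply reuses that exact ratio
assert Fraction(0.1) == Fraction(num, den)
```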
While we can achieve some "more intuitive" results for floating point arithmetic using special modules from Python, the notation that one has to use is not as simple as "0.1 + 0.2". As Raymond Hettinger often says: "There has to be a better way."
As readers of this blog already know, I created a Python package named ideas to facilitate the creation of import hooks and to enable easy experimentation with modified Python syntax. ideas comes with its own console that supports modified Python syntax. It can also be used with IPython (and thus with Jupyter notebooks).
Using ideas, one can "instruct" python to perform rational arithmetic. For example, suppose I have a Python file containing the following:
# simple_math.py
a = 0.2 + 0.1
b = 0.2 + 1/10
c = 2/10 + 1/10
print(a, b, c)
I can run this with Python, getting the expected "unintuitive" result:
> py simple_math.py
0.30000000000000004 0.30000000000000004 0.30000000000000004
Alternatively, using ideas, I can execute this file using rational arithmetic:
> ideas simple_math -a rational_math
3/10 3/10 3/10
Using a different import hook, I can have the result shown with floating point notation.
> ideas simple_math -a nicer_floats
0.3 0.3 0.3
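To give a rough idea of what such an import hook does behind the scenes (this is a simplified sketch, not the actual code from ideas), here is one way to rewrite float literals as Fraction calls using the standard tokenize module:

```python
import io
import tokenize
from fractions import Fraction

def rationalize(source):
    """Rewrite simple float literals like 0.1 as Fraction('0.1').
    A rough sketch only: it ignores complex literals and other corner
    cases that a real source transformer would need to handle."""
    new_tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and "." in tok.string:
            new_tokens.append((tokenize.NAME, f"Fraction('{tok.string}')"))
        else:
            new_tokens.append((tok.type, tok.string))
    return tokenize.untokenize(new_tokens)

namespace = {"Fraction": Fraction}
exec(rationalize("a = 0.2 + 0.1\n"), namespace)
print(namespace["a"])  # 3/10
```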
Instead of executing a script, let's use the ideas console, starting with "nicer_floats".
ideas> 0.1 + 0.2
0.3
ideas> 1/10 + 2/10
0.3

For "nicer_floats", I have also adopted Pyret's notation: a floating-point number immediately preceded by "~" is treated as an "approximate" floating-point number, i.e., one with the usual inaccuracy.
ideas> ~0.1 + 0.2
0.30000000000000004
And, as mentioned before, I can use ideas with IPython. Here's a very brief example:
IPython 8.0.0b1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from ideas.examples import rational_math

In [2]: hook = rational_math.add_hook()
The following initializing code from ideas is included:
from fractions import Fraction

In [3]: 0.1 + 0.2
Out[3]: Fraction(3, 10)
Given how confusing floating point arithmetic is to beginners, I think it would be nice if Python had an easy built-in way to switch modes and do calculations as done with ideas in the above examples. However, I doubt very much that this will ever happen. Fortunately, as demonstrated above, it is possible to use import hooks and modified interactive console to achieve this result.
At EuroSciPy in 2018, Marc Garcia gave a lightning talk which started by pointing out that scientific Python programmers like to alias everything, such as
import numpy as np
import pandas as pd
and suggested that they perhaps would prefer to use emojis, such as
import pandas as 🐼
However, Python does not support emojis as code, so the above line cannot be used.
A year prior, Thomas A Caswell had created a pull request for CPython that would have made this possible. This code would have allowed the use of emojis in all environments, including in a Python REPL and even in Jupyter notebooks. Unsurprisingly, this was rejected.
Undeterred, Geir Arne Hjelle created a project called pythonji (available on PyPI) which enabled the use of emojis in Python code, but in a much more restricted way. With pythonji, one can run modules ending with 🐍 instead of .py from a terminal. However, such modules cannot be imported, nor can emojis be used in a terminal.
When I learned about this attempt by Geir Arne Hjelle from a tweet by Mike Driscoll, I thought it would be a fun little project to implement with ideas. Below, I use the same basic example included in the original pythonji project.
And, it works in Jupyter notebooks too!
😉
In the past week, there has been an interesting discussion on Python-ideas about Natural support for units in Python. As I have taught introductory courses in Physics for about 20 of the 30 years of my academic career, I am used to stressing the importance of using units correctly, but I had never had the need to explore what kind of support for units was available in Python. I must admit to having been pleasantly surprised by many existing libraries.
In this blog post, I will give a very brief overview of parts of the discussion that took place, and is still taking place, on Python-ideas about this topic. I will then give a very brief introduction to two existing libraries that provide support for units, before showing some actual code inspired by the Python-ideas discussion.
But first, putting my Physics teacher hat on, let me show you some partial Python code that I find extremely satisfying, and which contains a line that is almost guaranteed to horrify programmers everywhere, as it seemingly reuses the variable "m" with a completely different meaning.
>>> g = 9.8[m/s^2]
>>> m = 80[kg]
>>> weight = m * g
>>> weight
<Quantity(784.0, 'kilogram * meter / second ** 2')>
>>> tolerance = 1.e-12[N]
>>> abs(weight - 784[N]) < tolerance
True
The discussion on Python-ideas essentially started with the suggestion that "it would be nice if Python's syntax supported units". That is, if you could basically do something like:
length = 1m + 3cm
# or even
length = 1m 3cm
and it just worked as "expected". Currently, identifiers in Python cannot start with a number, and writing "3cm" is a SyntaxError. So, in theory, one could add support for this type of construct without causing any backward incompatibility.
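It is easy to check that such a construct is rejected by today's parser, so grammar space is indeed available:

```python
# "3cm" is not valid Python today, so supporting unit suffixes
# would not break any existing code.
try:
    compile("length = 1m + 3cm", "<demo>", "exec")
except SyntaxError as error:
    print("SyntaxError:", error.msg)
```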
While I had never thought of it before, since nowadays I use Python only as a hobby, I consider correct handling of units to be an absolute requirement for any scientific calculation. Much emphasis is placed on adding type information to ensure correctness; to my mind, adding *unit* information to ensure correctness is even more important than adding type information.
During the course of the discussion on Python-ideas, other possible suggestions were made, some of which are actually supported by at least a couple of existing Python libraries. These suggestions included constructs like the following:
length = 1*m + 3*cm
speed = 4*m / 1*s # or speed = 4 * m / s
length = m(1) + cm(3)
speed = m_s(4)
length = 1_m + 3_cm
speed = 4_m_s
length = 1[m] + 3[cm]
speed = 4[m/s]
length = 1"m" + 3"m"
speed = 4"m/s"
density = 1.0[kg/m**3]
density = 1.0[kg/m3]
# No one suggested something like the following
density = 1.0[kg/m^3]
I will come back to looking at potential new syntax for units, as it is currently my main interest in this topic. But first, I want to highlight one other main point of the discussion on Python-ideas, namely: should the units be defined globally for an entire application, or locally according to the standard Python scopes?
My first thought was "of course, it should follow Python's normal scopes".
Thinking of the opposite argument: what happens if one uses units other than S.I. units in different modules, including those from external libraries? Take for example "mile", and have a look at its Wikipedia entry. If one uses units with the same name but different values in different parts of an application, any pretense of using quantities with units to ensure accuracy goes out the window. Furthermore, many unit libraries make it possible for users to define their own custom units. What happens if the same name is used for different custom units in different modules, and variables or functions relying on units from one module are used in a second module?
Still, as long as libraries do not, or cannot change unit definitions globally, and if they provide clear and well-documented access to the units they use, then the normal Python scopes would likely be the best choice.
[For a detailed discussion of these two points of view, have a look at the thread on Python-ideas mentioned above. There doesn't seem to be a consensus as to what the correct approach should be.]
There are many unit libraries available on PyPI. After a brief look at many of them, I decided to focus on only two: astropy.units and pint. These seemed to be the most complete ones currently available, with source code and good supporting documentation.
I will first look at an example that shows how equivalent description of units are easily handled in both of them. First, I use the units module from astropy:
>>> from astropy import units as u
>>> p1 = 1 * u.N / u.m**2
>>> p1
<Quantity 1. N / m2>
>>> p2 = 1 * u.Pa
>>> p1 == p2
True
Next, doing the same with pint.
>>> import pint
>>> u = pint.UnitRegistry()
>>> p1 = 1 * u.N / u.m**2
>>> p1
<Quantity(1.0, 'newton / meter ** 2')>
>>> p2 = 1 * u.Pa
>>> p1 == p2
True
In astropy, all the units are defined in a single module. Instead of prefacing the units with the name of the module, one can import units directly:
>>> from astropy.units import m, N, Pa
>>> p1 = 1 * N / m**2
>>> p2 = 1 * Pa
>>> p1 == p2
True
The same cannot be done with pint.
As I was reading posts from the discussion on Python-ideas, I was thinking that it might be fun to come up with a way to "play" with some code written in a more user-friendly syntax for units. After reading the following, written by Matt del Valle, I decided that I should definitely do it.
My personal preference for adding units to python would be to make instances of all numeric classes subscriptable, with the implementation being roughly equivalent to:
def __getitem__(self, unit_cls: type[T]) -> T:
    return unit_cls(self)
We could then discuss the possibility of adding some implementation of units to the stdlib. For example:
from units.si import km, m, N, Pa
3[km] + 4[m] == 3004[m]  # True
5[N]/1[m**2] == 5[Pa]  # True
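Built-in numbers cannot currently be subscripted, but the idea can be sketched with a small wrapper class. Everything below (Float, Metre, Kilometre) is hypothetical illustration code, not an existing library:

```python
class Metre:
    """Toy unit class: stores its value normalized to metres."""
    factor = 1  # conversion factor to metres

    def __init__(self, value):
        self.metres = value * self.factor

    def __add__(self, other):
        return Metre(self.metres + other.metres)

    def __eq__(self, other):
        return self.metres == other.metres

class Kilometre(Metre):
    factor = 1000

class Float(float):
    """A float whose __getitem__ attaches a unit, as in the proposal."""
    def __getitem__(self, unit_cls):
        return unit_cls(self)

print(Float(3)[Kilometre] + Float(4)[Metre] == Float(3004)[Metre])  # True
```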
My first thought was to create a custom package building on and depending on astropy.units, as I had looked at it before looking at pint and found it to have everything one might need. However, after reading its rather unusual license, I decided to take another approach: I chose to simply add a new example to my ideas library, making it versatile enough that it could be used with any unit library that uses the standard Python notation for multiplication, division, and powers of units, as both pint and astropy do. Note that my ideas library was created to facilitate quick experiments and is not meant to be used in production code.
First, here's an example that mimics the example given by Matt del Valle above, with what I think is an even nicer (more compact) notation.
python -m ideas -t easy_units
Ideas Console version 0.0.29. [Python version: 3.9.10]
>>> from astropy.units import km, m, N, Pa
>>> 3[km] + 4[m] == 3004[m]
True
>>> 5[N/m^2] == 5[Pa]
True
In addition to allowing '**' for powers of units (not shown above), I chose to also recognize as equivalent the symbol '^' which is more often associated with exponentiation outside of the (Python) programming world.
Let's do essentially the same example using pint instead, and follow it with a few additional lines to illustrate further.
Ideas Console version 0.0.29. [Python version: 3.9.10]
>>> import pint
>>> unit = pint.UnitRegistry()
>>> 3[km] + 4[m] == 3004[m]
True
>>> 5[N/m^2] == 5[Pa]
True
>>> pressure = 5[N/m^2]
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>
>>> pressure = 5[N/m*m]
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>
In the last example, I made sure that "N/m*m" did not follow the regular left-to-right order of operations, which would have resulted in unit cancellation as we first divide and then multiply by meters.
Using ideas with a "verbose" mode (-v or --verbose), one can see how the source is transformed prior to its execution. Furthermore, in the case of easy_units, sometimes a "prefix" is "extracted" from the code, ensuring that the correct names are used. Here's a very quick look.
python -m ideas -t easy_units -v
Ideas Console version 0.0.29. [Python version: 3.9.10]
>>> import pint
>>> un = pint.UnitRegistry()
===========Prefix============
un.
-----------------------------
>>> pressure = 5[N/m^2]
===========Transformed============
pressure = 5 * un.N/(un.m**2)
-----------------------------
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>
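To give a flavour of how such a transformation can be done (a deliberately simplified sketch, not the actual easy_units source), here is a regular-expression version that handles the simple case shown above:

```python
import re

def transform_units(line, prefix="un."):
    """Rewrite NUMBER[UNITS] as NUMBER * (UNITS), prefixing each unit
    name and converting '^' to '**'. A rough sketch only: the real
    transformation in ideas handles many more cases."""
    def replace(match):
        number, units = match.groups()
        units = units.replace("^", "**")
        units = re.sub(r"[A-Za-z]+", lambda u: prefix + u.group(0), units)
        return f"{number} * ({units})"
    return re.sub(r"(\d+(?:\.\d+)?)\[([^\]]+)\]", replace, line)

print(transform_units("pressure = 5[N/m^2]"))
# pressure = 5 * (un.N/un.m**2)
```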
In my previous post, I mentioned that, unlike IPython, friendly/friendly-traceback included values of relevant objects in a traceback. As I wrote in the update, Alex Hall pointed out that one could get this information by using a verbose mode in IPython. Here is the previous example when using the verbose mode.
UPDATE: Alex Hall pointed out that IPython can display the values of variables in the highlighted sections using %xmode verbose. He also suggested a different highlighting strategy when the problematic code spans multiple lines. I will go into more detail about these two issues in a future blog post.
======
I'm writing this blog post in the hope that some people will be encouraged to test friendly/friendly-traceback with IPython/Jupyter and make suggestions as to how it could be even more useful.
However, before you read any further...
Important clarification: IPython is a professionally developed program which is thoroughly tested, and is an essential tool for thousands of Python programmers. By contrast, friendly/friendly-traceback is mostly done by a hobbyist (myself) and is not nearly as widely used nor as reliable as IPython. Any comparison I make below should be taken in stride. Still, I can't help but draw your attention to this recent tweet from Matthias Bussonnier, the IPython project leader:
I don't believe that friendly/friendly-traceback is mature and stable enough to become part of IPython's distribution. However, it is because of this endorsement that I decided to see what I could do to improve friendly/friendly-traceback's integration with IPython.
The recent release of IPython included many traceback improvements. One of these changes, shown in the screen capture below, is something that I am happy to have implemented many months ago, as mentioned in this blog post. I have no reason to believe that my idea was the impetus for this change in IPython's formatting of tracebacks; still, I think it validates my initial idea.
However, other changes introduced in this latest IPython release, such as using colour instead of ^^ to highlight the location of the code causing a traceback, cover ground that I had only done for IDLE and not for other environments such as IPython/Jupyter. So, I felt that I had to catch up with what IPython has implemented and, if possible, do even better. Of course, this work is greatly facilitated by the fact that friendly-traceback uses Alex Hall's excellent stack_data (as well as some other of his packages on which stack_data depends), and stack_data is now also used by IPython to generate these tracebacks. So, in principle, there is no reason why I shouldn't be able to implement similar features in friendly/friendly-traceback.
Again, I must note that the way I use stack_data is a bit hackish, and definitely not as elegant as it is used within IPython.
Enough of a preamble, time to provide some actual examples.
Here is the result:
I could replicate this example using the friendly console but, instead, I will use the specific IPython integration to see what else we could do at this point.
Until recently, this was all the information that one could get. However, it is now possible to get more details, in a way similar to that provided by IPython, but with the addition of the values of various objects. (Note that the syntax shown below to obtain this information is subject to change; it is just a proof of concept.)
If the highlighting is not adequate, it can be changed by using either named colours (converted to lowercase with spaces removed) or hexadecimal values; the name of the function and its arguments are subject to change:
When IPython is used in a Jupyter notebook (or lab), I chose yet again a different way to present the result. First, let's have a look at a simple example using the Jupyter default.
In this example, only two frames are highlighted. Let's see the result, using friendly.
We get a basic error message with a button to click if we want more details. Since we only had two frames in the traceback, where() gives us all the relevant information.
What happens if we have more than two frames in the traceback? First, let's give an example with the Jupyter default.
What happens if we use friendly in this case? Below I show the result after clicking "more"
These new features are simple proofs of concept that have not been thoroughly tested. If you read this far, and hopefully tried it on your own, I would really appreciate getting your feedback regarding the choices I made and any improvement you might be able to suggest.