Wednesday, May 17, 2017

What if range did not exist?

Over the years, various proposals for new syntactic constructs have been put forward to supplement or replace the range() function for looping over integers. Some of them have been documented in PEPs, whereas various others arose during discussions on the python-dev, comp.lang.python, and python-ideas lists.

This lead Guido van Rossum to write in 2005:

I don’t think that anyone can reasonably disagree with this assessment, even if I must admit to have suggested something somewhat related not too long ago on the python-ideas list.

Nonetheless, can we imagine what would have happened if range() had not been invented so early in Python’s history?

Consider a situation, in an alternative universe, where Monty, initially known as the Monty Python Programming Language, has evolved without guidance from a BDFL. Instead, through the wisdom of the crowds, and by sheer serendipity, it has grown into something that strongly resembles Python, except that there is no range() function - or, at least, not yet.  Please note that any reference to actual Python related discussions are made simply as a starting point for people interested in exploring what really happened, as opposed to what we invented below. Also, the fictitious names chosen below have been added in after I wrote the invented dialogues, with no connection whatsoever to any living person other than me - save for one obvious person who never actually wrote the 3 words I attributed to him.

Excerpts from the Monty-Musings IRC channel logs

# The evolution of the Monty Python Programming Language was known to be greatly influenced by discussions taking place on the Monty-Musings IRC channel.  The logs of these discussions are scattered over the Internet and it can be difficult to piece them together. As a self-appointed historian, I have assembled some relevant contributions found in the IRC logs; these discussions have taken place over many years. You will find these below, together with some added comments which will hopefully be useful to understand the historical context.

The_OP: I find myself often iterating over the integers.  Each time I end up writing an iterator to do so.  I think it would be useful if such an iterator could be included as a builtin in Monty, so that we could write something like:
seq = [x for x in integers(n)]
and this would give us a list with integers from 0 to n-1.

The_Self_Appointed_Gate_Keeper: New builtins should only be added when there is a strong case to be made for it. Given the very weak motivation for this addition, there really is no point in continuing this discussion.

# As usual, everyone ignored comments from The_Self_Appointed_Gate_Keeper and the discussion continued. Nonetheless, since The_Self_Appointed_Gate_Keeper maintained such a long and consistent presence on the Monty-Musings IRC channel, we felt it necessary to mention this contribution as a typical example.

The_Mathematician: what you refer to as “integers” should really be named “Natural numbers”, although this would leave some ambiguity as to whether zero is included or not. Furthermore, integers are an infinite set; what you propose to have here is a half-closed interval [0, n), which should not be confused with a half-closed interval (0, n].
You should really be more careful when you use mathematical terms so as to avoid confusion.

The_Minimalist: Why not simply turn integers into iterables so that we could write, for example:

seq = [x for x in 5]   # seq = [0, 1, 2, 3, 4]

# Reference SPAM 276.   For those unaware of it, the normal process by which changes are made to Monty involves the writing of a Standard Proposal for an Addition to Monty, usually simply referred to as a SPAM.

The_Proofreader:  Turning integers into iterators would likely lead to potential problems since something like

x, = 1

would become syntactically valid.I really think we should avoid introducing syntax that is valid and can lead to very different results based to the addition of a possibly single misplaced comma.

# This observation likely contributed to the rejection of SPAM 276.

The_Zen_Fence_Sitter: The proposal needs to follow the “Explicit is better than implicit” principle, and The_Mathematician makes a good point.  I suggest instead to name this function sequence_of_integers_on_a_semi_closed_interval(n).  However, this would still leaves some ambiguity as to which end is closed and which end is open.   And furthermore, following “Special cases aren't special enough to break the rules”, I am not sure if the use case warrants the introduction of a new builtin.  Then again, given that “Although practicality beats purity”, perhaps there is a good case to be made for this addition.

The_Practical_One:  Given that slices already are half-closed, half-open, following Monty’s convention, it should be clear that it is the upper-half that is open.  However, I would make the suggestion that it should be possible to start at something else than zero. This could be written as semi_open_interval(lower, upper) when the lower argument is non-zero -- although I am not too keen on such a long name.

The_Proofreader: The comparison to slices leaves out the fact that a “slice” notation with a single index, like [n], refers to a single item.  Thus, with a single argument, the meaning of semi_open_interval(n) cannot be obtained by using a parallel with the slice notation. In fact, it would lead to confusion given that when only one argument is used, it is meant to represent the _second_ argument of two, with the first one having the value 0 by default, which would be very unusual for a Monty builtin.

The_Unusual_Syntax_Enthusiast: What about using some standard mathematical notation like
for x in [lower, upper)

# At this point, both The_Practical_One and The_Proofreader jumped in and pointed out the risk of confusion with the notation already existing for lists and for tuples, and the problems associated with parsing more complex expressions with parentheses. This lead to a rather long discussion between the three of them, with a few others occasionally  joining in, unable to resist adding comments to something clearly irrelevant.

The_Mathematician: I think that The_Unusual_Syntax_Enthusiast has a point. In the general situation, the solution is simple.  There are only 4 cases worth considering and, as Knuth has shown many years ago, they can be written as follows:

for x in $[s,e]$:
for x in $(s,e]$:
for x in $[s,e)$:
for x in $(s,e}$:

This does not require any new keyword or builtin function, just a small change to Monty’s syntax to add \$ as a useful symbol. Anyone with even the most basic knowledge of \TeX or \LaTeX would be able to quickly grasp the meaning of each of these 4 cases; it’s hard to think of anything that could possibly be more readable, especially if you keep your variable names short like we do in Mathematics.

By the way, it would be so much better if keywords would be prefixed by a backslash, like \for and \in as it would remove any limitation when choosing variable names as it is, most unfortunately, sometimes completely unavoidable to use multi-letter variable names. Furthermore, it would ensure total backwards compatibility when introducing new keywords. I cannot believe I would be the first one to suggest this obvious improvement to Monty.

# The_Mathematician, whose research on the Riemann Hypothesis had been at a standstill for years and had joined the Monty-Musings channel to maintain the delusion that she could contribute something useful to an important project, left the discussion at this point and, if rumours are to be trusted, went off to work on an alternative implementation of Monty done entirely using TeX macros.

The_Unusual_Syntax_Enthusiast: What about using a simple new notation like
for x in [lower .. upper]

Again, as was often the case following a suggestion from The_Unusual_Syntax_Enthusiast, a pointless and long discussion followed. Reporting on this discussion would not contribute to the understanding of what lead to the final result.

The_Teacher: I do like this basic idea of having such a built-in iterator.  When teaching Monty to beginners, I often use something like
for number in 0, 1, 2, 3, 4:
print(number)

This is flexible enough and works for short sequences of numbers, but is not very useful for longer sequences.  However, instead of a builtin with a meaning that is not immediately obvious, what about having something like

for 0 <= number < 4:
print number

It is both explicit and highly adaptable to other types of sequences. Its meaning would also be clear to my students.

# Historical note 1: A similar proposal gave rise to SPAM 284

# Historical note 2: the use of print both as a keyword and as a function in this contribution by The_Teacher helps us to identify the period at which it was made.  Monty went through what is now referred to as The Great Schism where the language originally known as the Monty Python Programming Language, and later referred to simply as Monty, introduced some backward incompatible changes, the most visible of which was removing print from the keyword list and making it a function. For a while, people referred to the two incompatible versions as Monty (the original version) and New Monty; this eventually changed to Monty Classic (the original version) and New Monty became referred to simply as Monty.
During the period of upheaval, some rather heated discussions took place which lead some people to describe the Monty-Musing IRC channel as Monty Python’s F-ing Circus. A small troupe of British comedians adopted a sanitized version of this name, and based their comedy act on short plays with nonsensical dialogues, no doubt inspired by the Monty-Musings IRC channel discussions. To this day, the Monty Python’s Flying Circus has essentially disappeared save for a tiny cult following, unlike the ubiquitous Monty programming language.

# Historical note 3. Monty programs are easily recognized by their “.my” extension. Monty enthusiasts, aka Montinistas, have been known to name packages meant to be shared widely with a name also starting with “my”, enjoying the apparent contradiction of having something named “My…” and meant to be shared with all.  For example, one of the best known of such packages is Mygame, the famous gaming framework; another one is Mycroft (https://github.com/MycroftAI), an open source artificial intelligence package.  It has been rumoured that Mynecraft’s name has been inspired by Monty’s tradition. Thus, it is understandable that Montinistas, from both sides of the Classic vs New debate, jointly became upset when Mypy (http://mypy-lang.org/) became public as it was seen clearly to be an attempt to do something perceived as going against the spirit of Monty and with a clear indication that it was not meant to be used except by a small elite group -- this, even though its name started with “My”. Eventually, a united front against Mypy, seen as a common enemy, contributed to the now near-unanimous adoption of New Monty as the true successor to the original Monty Python Programming Language.

The_Unusual_Syntax_Enthusiast: What about using an even simpler new notation and drop the square brackets
for x in lower .. upper

# Since this was clearly never going to be considered seriously, another very long and pointless side discussion ensued.

The_Grammarian: While the latest suggestion by The_Teacher is interesting, it does not conform to the established pattern which, in its simplest form starts with

for_stmt: 'for' looping_variable ...

whereas what is proposed does not follow. This could lead to some confusion and would require some non-trivial changes to the parser.

The_Unknown_Coder: What about something like this?

# According to the people we’ve been able to interview, The_Unknown_Coder had the unique habit of including screenshots of the code he ran, instead of using text like all other Montinistas. Since the Monty-Musing IRC Channels logs only preserved the text, we are unable to confirm that what we are presenting is faithfully accurate. We can only guess as to what code was submitted based from comments of other contributors and some minimal recollections from people we interviewed.  There is no evidence that The_Unknown_Coder made any positive contributions to Monty. However, during the discussion that we are documenting, The_Unknown_Coder did make several interventions which seemed to garner some interest from at least one respectable Montinista, even though there is little evidence that those musings had any influence on the final outcome.

The_Grammarian: What you are proposing would still require some difficult adjustments to the parser. It would be much better to use a new keyword instead of “in”, perhaps something like “between” or “over”.

The_Self_Appointed_Gate_Keeper: New keywords should only be added when there is a strong case to be made for it. Given the very weak motivation for this addition, there really is no point in continuing this discussion.

The_Minimalist: This suggestion, which requires one to write the looping variable twice seems wasteful compare to the suggestion from The_Teacher.

What about using a bare slice notation?

# Historical note: this last suggestion from The_Minimalist, and the discussion that followed, gave rise to SPAM 204. The fact that a suggestion by The_Minimalist gave rise to a SPAM was very unusual and is worth noting.

The_Unknown_Coder: (# Apparently in reply to The_Grammarian, although we cannot confirm that the following image reflects the code shown in the IRC channel.)
What about something like this instead, where the keyword is meant to be a short form of in_sequence?

The_Grammarian: That would work.  And I can see how your suggestion would naturally include both types of half-closed intervals, as well as open or closed intervals.

The_NASA_Engineer: In our rocket_launch.my script, we have the need for a sequence in reverse order. Could the proposal be made to accommodate this?

The_Old_Timer: We introduced some syntax to handle this case many years ago; it would be

for i in reversed(half_close_interval(n)):

The_OP: I would prefer a single iterator to handle all cases. Perhaps, like we do for slices, we can use -1 as an optional third argument, to indicate that we are using a reverse order.

The_Dutch_Guy: Call it range.

# Note: we dug through all the available records and, as far as we could tell, this terse statement, seemingly coming out of nowhere, was the only contribution from The_Dutch_Guy to this discussion.  It was also the only time that the word “range” was mentioned.

The_Unknown_Coder: My code can handle this case quite naturally.

The_Practical_One: Actually, using a third argument could allow one have integer sequences with steps greater than 1:

for i in half_close_sequence(0, 5, 2):
print(i)
Which would print the first 3 even integers.

The_Unknown_Coder: My code can handle such cases naturally as well:

The_English_Litt_Graduate: I don’t like it. There are too many symbols and not enough words.

# The discussion continued for many more months but, as far as we could tell, nothing of any significance was added until a suggestion to write a SPAM was made.

The_Friendly_Dev: Dear OP, I think this discussion has been fruitful and deserves to be recorded.  Please, write your idea following the template for a Standard Proposal for an Addition to Monty. When you have finished, upload your SPAM to the DorkCentre repository and post an announcement on 4chan asking people to vote. A month later, we’ll do a vote count, with a weight given to the various 4chan boards following the historical precedent established before most people were able to access 4chan; some people refers to this as Monty’s Electoral College. This vote will decide if your SPAM is accepted or rejected.

Oh, and make sure to submit in triplicate a signed Hacking Agreement for Monty, which we require for anything that someone contribute which may be incorporated in Monty.. Submitting a HAM is required before a SPAM can be made public and a request for votes is posted.

The_OP: Is it wise to ask known trolls to vote on a SPAM like this?

The_Old_Timer: Look, this is how we’ve always proceeded so far, and it has worked out well. You don’t want to have a Committee decide on what to add to the language, otherwise you’ll end up with things like SimpleBeanFactoryAwareAspectInstanceFactory as typical names in the standard library. At the other extreme, you don’t want to have a single person decide or you might end up with something like the Oystr language.  No, democracy using the 4chan weighted Board Electoral System with Trolls  is clearly and tautologically the BEST  way to decide on improvements to Monty. Just trust the wisdom of the crowds. In fact, some countries have adopted a similar practice, also calling it an Electoral College.

The_Friendly_Dev: One more thing, The_OP.  Make sure to include any contribution made by The_Dutch_Guy in your SPAM, however insignificant that contribution seems to you. For some reason, whenever The_Dutch_Guy writes anything, the folks on 4chan get really interested in the outcome and vote in large numbers, all converging towards a sensible solution. Nobody knows why, but it is so striking that it has become immortalized in the Zen of Monty.

# After a hugely popular and positive vote on 4chan, range() became a Monty builtin.