
Student Name: Joe Amenta
Mentor Name: Benjamin Peterson
Title: 3to2 tool for backporting Py3 code
Abstract: This proposal focuses on implementing a set of fixers "lib3to2" for code currently written for Python 3.x to convert that code into a format that can be run in a Python 2.x environment.

3to2 will be a tool that will encourage developers programming in Python to go forward confidently with developing in 3.x without worrying about backwards compatibility. Possible future improvements will backport code to earlier versions, as many third-party packages are supported as far back as version 2.3.
Public info:

Here is the proposal I submitted (When submitting my proposal, the "public info" box was not available):

 

Summary

I propose to create a tool, similar to 2to3 [1], that refactors code written for Python 3.x into code compatible with Python 2.x. This tool complements 2to3 for two reasons:

  1. Whereas 2to3 helps migrate code on a 2.x branch to a 3.x branch, 3to2 will encourage developers to learn the 3.x syntax without worrying that it is incompatible with existing code that, for whatever reason, has not yet been migrated to 3.x.

  2. Code written in Python 3.x is cleaner than in 2.x, and therefore preferable to maintain. Instead of modifying code for projects developed in version 2.x and converting with 2to3 every time to maintain maximum compatibility, developers can start taking advantage of the 3.x features while still keeping support for the 2.x branches of their projects.

[1] http://wiki.python.org/moin/SummerOfCode/2009

With both a 2to3 tool to ease the migration of large amounts of code to the newer standards and a 3to2 tool to encourage developers to get used to the newer standards, I feel confident that developers who rely on Python for a large portion of their code will feel comfortable moving forward with 3.x.

As 2to3 is written in such a way as to be extensible for more fixers, I plan to reuse much of the code for 2to3 that handles fixers, so as not to reinvent the wheel, and keep my focus on writing the fixers (lib3to2) themselves. This is important, because there are several potential problems that will have to be addressed before 3to2 can be considered comprehensive:

  • Python 3.0 came with completely reworked bytes / text / Unicode handling. The 2.x method of handling the difference was a unified str type, which causes more problems for 2to3 than for 3to2 ("Foo Bar" → 3to2 → u"Foo Bar" with high confidence), but the split must be addressed as a major portion of this proposal.

  • Versioning is a major issue with this proposal, because many application suites support Python versions at least as far back as 2.4 (with scattered support for 2.3). However, fixers for changes as far back as 2.4 are beyond the scope of this lib3to2 proposal, as the purpose is to bridge the 3.x → 2.x gap, with the understanding that 2.6 was the branch designed to make the 2.x → 3.x gap easier. The target version will be the 2.5 branch.

  • Function annotations, added in 3.0, have no effect on how core Python runs the code, so they can be safely removed. A command-line option, disabled by default, will permit storing them in a local dictionary named func_annotations, so that the feature remains usable on request without unnecessarily allocating an object in the namespace by default.
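
To illustrate the annotation bullet above, here is a minimal sketch of stripping annotations and collecting them in a dictionary. It is not the proposed fixer: a real lib3to2 fixer would presumably build on the 2to3 fixer framework, whereas this sketch uses the standard ast module (and ast.unparse, so it assumes Python 3.9 or later). The function name strip_annotations and the layout of the collected dictionary are assumptions made for illustration only.

```python
import ast


def strip_annotations(source):
    """Return (source without annotations, {function name: annotations}).

    Hypothetical helper: the name and the dictionary layout are
    illustrative assumptions, not part of the proposal.
    """
    tree = ast.parse(source)
    collected = {}
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        spec = node.args
        # Gather every kind of parameter that may carry an annotation.
        params = spec.posonlyargs + spec.args + spec.kwonlyargs
        params += [p for p in (spec.vararg, spec.kwarg) if p is not None]
        notes = {}
        for param in params:
            if param.annotation is not None:
                # Keep the annotation as source text, then remove it.
                notes[param.arg] = ast.unparse(param.annotation)
                param.annotation = None
        if node.returns is not None:
            notes["return"] = ast.unparse(node.returns)
            node.returns = None
        if notes:
            collected[node.name] = notes
    return ast.unparse(tree), collected


example = "def greet(name: str, times: int = 1) -> str:\n    return name * times\n"
stripped, annotations = strip_annotations(example)
print(stripped)
print(annotations)
```

The sketch keys the collected annotations by function name, so two functions with the same name in different scopes would collide. It also discards comments and original formatting, which a tree-based fixer in the style of 2to3 would be expected to preserve.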

Schedule

Start of program:

By May 23 (start coding date), I plan to be familiar with the following:

  • Python 3.x: the incompatibilities that code written for 3.x runs into when executed on a 2.x interpreter.

  • 2to3: the methods for implementing code fixers

  • Mercurial [2]: keeping my project up-to-date following the “Commit early, commit often” mantra.

  • Bitbucket [3]: hosting my project for review by others

[2] http://www.selenic.com/mercurial/wiki/

[3] http://www.bitbucket.org

Midterm evaluation:

By July 6 (midterm evaluation start date), I plan to have the following completed:

  • Fixers for the most common code problems, such as the print function, class/metaclass, int/long, and str/bytes.

  • Framework for command-line arguments to direct potentially ambiguous fixers.

  • Because Python 2.6 was designed to ease the transition to 3.x, a 2.6 interpreter should, by this point, be able to run a large portion of code converted using 3to2.
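
As an illustration of the str/bytes item in the list above, the following minimal sketch performs the "Foo Bar" → u"Foo Bar" rewrite from the summary on a piece of source text. It is only a sketch under stated assumptions: it uses the standard tokenize module rather than the 2to3 fixer framework, and the function name add_unicode_prefix is hypothetical.

```python
import io
import tokenize


def add_unicode_prefix(source):
    """Add a u prefix to every string literal that has no prefix yet.

    Hypothetical helper for illustration; literals that already carry
    a prefix (b, r, f, ...) are left untouched.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # An unprefixed literal starts directly with a quote character.
        if tok.type == tokenize.STRING and tok.string[0] in "\"'":
            tok = tok._replace(string="u" + tok.string)
        tokens.append(tok)
    return tokenize.untokenize(tokens)


print(add_unicode_prefix('title = "Foo Bar"\npayload = b"raw"\n'))
```

Bytes literals are passed through unchanged here. Since the b prefix was only introduced in Python 2.6, a fixer targeting the 2.5 branch would presumably also need to drop that prefix; that step is left out of the sketch.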

Final evaluation:

By August 17 (“pencils down” date), I hope to have:

  • A module that successfully converts code written for 3.x into functionally equivalent code compatible with the latest version of the 2.5 branch.

  • Command-line arguments that direct ambiguous fixers for the most current Python 3.x and 2.x versions.

  • Documentation thorough enough that any future developer can understand what each fixer does and how to write their own fixer modules, either to customize their experience or to bring the lib3to2 project up to date with the working versions of Python.

  • The major goal for this project is to successfully encourage developers to move on with 3.x without worrying about backwards incompatibility. No current stable GNU/Linux distribution listed on distrowatch.com [4] has a 3.x interpreter as the default. [5] 3to2 will allow for the newest tools to run on these systems without developers having to actively maintain two branches of their code until the end of support life of 2.x.

[4] http://distrowatch.com/search.php?pkg=Python&pkgver=3

[5] Exception: Source Mage GNU/Linux, which actively downloads and configures its packages based on the latest developer source versions. See http://www.sourcemage.org/New_to_Source_Mage_GNU/Linux
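
The command-line arguments mentioned in the schedule above could be wired up roughly as follows. This is a sketch only: the option names --keep-annotations and --target are hypothetical and chosen purely for illustration, since the proposal does not specify the interface. The defaults follow the proposal text, with annotations dropped unless requested and 2.5 as the target branch.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 3to2 command line."""
    parser = argparse.ArgumentParser(
        prog="3to2",
        description="Backport Python 3.x source files to Python 2.x.",
    )
    parser.add_argument("files", nargs="+", help="source files to convert")
    # Hypothetical flag: off by default, as the proposal describes.
    parser.add_argument(
        "--keep-annotations",
        action="store_true",
        help="store function annotations in a func_annotations dictionary",
    )
    # Hypothetical flag: the proposal names 2.5 as the target branch.
    parser.add_argument(
        "--target",
        default="2.5",
        choices=["2.5", "2.6"],
        help="Python 2.x branch the converted code should run on",
    )
    return parser


args = build_parser().parse_args(["--keep-annotations", "example.py"])
print(args.keep_annotations, args.target, args.files)
```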

About Me

My name is Joe Amenta. I am a 19-year-old sophomore majoring in Economics at Michigan State University. In the fall semester of 2008, I took an introductory programming course taught in Python. The dates of the 3.0 release schedule fell within the dates of my course: Python 3.0rc1 and Python 2.6rc1 were released early in the semester, and right at the end of the semester, Python 3.0 final was released. As the course was taught in 2.5 syntax, the code examples we were given would no longer work in Python 3.0, and there was little to no instruction about how the things that we were learning would become obsolete in the new version. I took it upon myself to learn what was new and report the results.

I started up a blog [6] with the following goals: Document the changes that would affect the specific things that we were learning, and report the results of converting our class code examples [7] to 3.0 syntax with 2to3. I was successful in these goals.

[6] http://python30.blogspot.com/

[7] http://web.cse.msu.edu/~cse231/Examples/CoursePack/Python/

I am enrolled in a physics course for this summer so that I can be eligible for admission into the MSU College of Engineering to major in Computer Science. The course is a 4-credit course and fully online. It will not affect the amount of time that I devote to the project proposed in this document. After I graduate with my Computer Science degree, I will finish my degree in Economics, to have expertise in a field where my computer knowledge will be applicable in the real world.

Contact Info

The most reliable way to contact me is through e-mail: amentajo@msu.edu

I can also be contacted via XMPP: joe.amenta@gmail.com

Urgent requests can be sent to me via phone. Contact me via the above electronic methods to get my number if you want that option.

My mailing address is likewise obtainable.

Things to do if I finish early:

  • Implement confidence rankings for fixers

  • Add support for directives

  • Write tests for some untested fixers in 2to3

The first two of these are current project ideas for the 2to3 tool for GSoC 2009 [8], so collaborating with the students who pick up those ideas would help if I get to this stage.

[8] http://wiki.python.org/moin/2to3