15-year Python flaw found in ‘over 350,000’ projects • The Register

At least 350,000 open source projects are believed to be potentially vulnerable to exploitation via a Python module flaw that has remained unfixed for 15 years.

On Tuesday, security firm Trellix said its threat researchers had found a vulnerability in Python's tarfile module, which provides a way to read and write compressed bundles of files known as tarballs. Initially, the bug hunters thought they had stumbled upon a zero-day.

It turned out to be a 5,500-day problem – the bug lived its best life for the past decade and a half awaiting extinction.

Identified as CVE-2007-4559, the vulnerability emerged on August 24, 2007, in a post on the Python mailing list by Jan Matejek, who was the maintainer of the Python package for SUSE at the time. It can be exploited to overwrite and potentially hijack files on a victim’s computer, when a vulnerable application opens a malicious tar archive via tarfile.

“The vulnerability is basically like this: if you tar a file named "../../../../../etc/passwd" and then get the administrator to untar it, /etc/passwd is overwritten,” Matejek explained at the time.
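As a rough illustration of why that member name is dangerous, the sketch below resolves it against an extraction directory. The directory name is a made-up assumption, and no archive is involved: this only shows the path arithmetic a naive extractor would perform.

```python
import posixpath

# Hypothetical directory where an administrator unpacks the archive.
extract_dir = "/home/admin/unpacked"
# Member name taken from Matejek's example.
member_name = "../../../../../etc/passwd"

# Joining and normalising shows where a naive extractor would end up writing.
target = posixpath.normpath(posixpath.join(extract_dir, member_name))
print(target)  # /etc/passwd
```

Each ".." climbs one level, and any surplus ".." at the root is simply dropped, so the result lands on /etc/passwd however deep the extraction directory is.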

The tarfile directory traversal flaw was reported on August 29, 2007 by Tomas Hoger, a software engineer at Red Hat.

But it had already been dealt with, more or less. A day earlier, Lars Gustäbel, maintainer of the tarfile module, committed a code change that adds a check_paths parameter, true by default, and a helper function to the TarFile.extractall() method that raises an error if a file path in a tar archive is not safe.

But the fix didn’t address the TarFile.extract() method, which according to Gustäbel “shouldn’t be used at all”, and left open the possibility that extracting data from untrusted archives could cause problems.

In a comment thread, Gustäbel explained that he no longer considers this a security issue. “tarfile.py does nothing wrong, its behavior conforms to the pax definition and the pathname resolution guidelines in POSIX,” he wrote.

“There is no known or possible practical exploit. I updated the documentation with a warning that it could be dangerous to extract archives from untrusted sources. This is the only thing to do IMO.”

Indeed, the documentation describes this footgun:

Warning: Never extract archives from untrusted sources without prior inspection. Files may be created outside of path, for example members that have absolute filenames starting with "/" or filenames with two dots "..".
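What "prior inspection" might look like in practice is sketched below. This is a minimal example under stated assumptions, not the check Gustäbel committed or the one Trellix ships: the helper names is_within_directory and find_unsafe_members are hypothetical, and the check covers only member names, not symlink or hard link members, which would need separate handling.

```python
import os
import tarfile


def is_within_directory(base_dir, member_name):
    """Return True if member_name resolves to a location inside base_dir."""
    base = os.path.realpath(base_dir)
    # os.path.join discards base when member_name is absolute, so names
    # starting with "/" are caught by the same comparison as "../" names.
    target = os.path.realpath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


def find_unsafe_members(archive_path, dest_dir):
    """List member names that would land outside dest_dir (hypothetical helper)."""
    with tarfile.open(archive_path) as tar:
        return [m.name for m in tar.getmembers()
                if not is_within_directory(dest_dir, m.name)]
```

A caller could refuse to extract an archive whenever find_unsafe_members() returns a non-empty list, rather than trying to sanitize the names.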

Yet here we are, with both extract() and extractall() still posing an arbitrary path traversal threat.

“The vulnerability is a path traversal attack in the extract and extractall functions in the tarfile module that allow an attacker to overwrite arbitrary files by adding the ‘..’ sequence to filenames in a tar archive,” Kasimir Schulz, a vulnerability researcher for Trellix, explained in a blog post.

The “..” sequence refers to the parent directory of the current path. So, using code like the six-line snippet below, says Schulz, the tarfile module can be told to read and modify a file’s metadata before it is added to the tar archive. The result is an exploit.

import tarfile

def change_name(tarinfo):
    tarinfo.name = "../" + tarinfo.name
    return tarinfo

with tarfile.open("exploit.tar", "w:xz") as tar:
    tar.add("malicious_file", filter=change_name)
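To see such an archive from the receiving side, the sketch below builds a similar one in memory and lists its contents without extracting anything. It is an illustrative variation on the snippet above, not Trellix's code: it uses TarInfo and addfile() so that no file on disk is needed, and the payload is a placeholder.

```python
import io
import tarfile

# Build a small archive in memory whose single member has a "../" prefix.
payload = b"example payload\n"
info = tarfile.TarInfo(name="../malicious_file")
info.size = len(payload)

buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    tar.addfile(info, io.BytesIO(payload))

# Read it back and list the member names; nothing is written to disk.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    names = tar.getnames()

print(names)  # ['../malicious_file']
```

The "../" prefix survives the round trip unchanged, which is why an extractor that trusts member names ends up writing one directory above the intended target.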

According to Schulz, Trellix has created a free tool called Creosote to scan for CVE-2007-4559. The software has already found the bug lurking in applications such as Spyder IDE, an open source scientific environment written for Python, and Polemarch, an IT infrastructure management service for Linux and Docker.
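The article does not describe how Creosote works internally. Purely as an illustration of the general idea of scanning source code for risky calls, the sketch below uses Python's ast module to flag calls to methods named extract or extractall. The function name find_extract_calls is hypothetical, and the heuristic is crude: it also matches unrelated objects, such as zipfile archives, and cannot tell whether the archive is trusted.

```python
import ast


def find_extract_calls(source):
    """Return (line, method) pairs for calls to .extract() or .extractall()."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in {"extract", "extractall"}):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


sample = (
    "import tarfile\n"
    "with tarfile.open('data.tar') as tar:\n"
    "    tar.extractall('out')\n"
)
print(find_extract_calls(sample))  # [(3, 'extractall')]
```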

The company estimates the tarfile flaw can be found “in over 350,000 open source projects and prevalent in closed-source projects”. It also points out that tarfile is a default module in any Python project and is present in frameworks created by AWS, Facebook, Google, and Intel, and in applications for machine learning, automation, and Docker containers.

Trellix says it is working on making the repaired code available to affected projects.

“Using our tools, we currently have patches for 11,005 repositories, ready for pull requests,” said Charles McFarland, a vulnerability researcher for Trellix, in a blog post. “Each patch will be added to a fork repository and a pull request made in time. This will help both individuals and organizations become aware of the problem and fix the problem with one click.

“Due to the size of the vulnerable projects, we plan to continue this process in the coming weeks. This is expected to reach 12.06% of all vulnerable projects, just over 70,000 projects upon completion.”

The remaining 87.94% of affected projects may wish to consider other possible options. ®