
How Python is Fighting Open Source's 'Phantom' Dependencies Problem (blogspot.com)

Since 2023 the Python Software Foundation has had a Security Developer-in-Residence (sponsored by the Open Source Security Foundation's vulnerability-finding "Alpha-Omega" project). And he's just published a new 11-page white paper about open source's "phantom dependencies" problem — suggesting a way to solve it.

"Phantom" dependencies aren't tracked with packaging metadata, manifests, or lock files, which makes them "not discoverable" by tools like vulnerability scanners or compliance and policy tools. So Python security developer-in-residence Seth Larson authored a recently-accepted Python Enhancement Proposal offering an easy way for packages to provide metadata through Software Bill-of-Materials (SBOMs). From the whitepaper:

Python Enhancement Proposal 770 is backwards compatible and can be enabled by default by tools, meaning most projects won't need to manually opt in to begin generating valid PEP 770 SBOM metadata. Python is not the only software package ecosystem affected by the "Phantom Dependency" problem. The approach using SBOMs for metadata can be remixed and adopted by other packaging ecosystems looking to record ecosystem-agnostic software metadata...

Within Endor Labs' [2023 dependencies] report, Python is named as one of the most affected packaging ecosystems by the "Phantom Dependency" problem. There are multiple reasons that Python is particularly affected:

- There are many methods for interfacing Python with non-Python software, such as through the C-API or FFI. Python can "wrap" and expose an easy-to-use Python API for software written in other languages like C, C++, Rust, Fortran, Web Assembly, and more.

- Python is the premier language for scientific computing and artificial intelligence, meaning many high-performance libraries written in system languages need to be accessed from Python code.

- Finally, Python packages have a distribution type called a "wheel", which is essentially a zip file that is "installed" by being unzipped into a directory, meaning there is no compilation step allowed during installation. This is great for being able to inspect a package before installation, but it means that all compiled languages need to be pre-compiled into binaries before installation...
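
To make the wheel point concrete (this sketch is an editorial illustration, not taken from the white paper): since a wheel is a zip archive, the standard-library `zipfile` module can list pre-compiled binaries inside one without installing it. The package name `demo`, the bundled library `libexample`, and the suffix list are all hypothetical and incomplete; a real audit would need more than a filename check.

```python
import io
import zipfile

# File suffixes that usually indicate a pre-compiled binary.
# Illustrative and incomplete; this is an assumption, not a standard list.
BINARY_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def find_bundled_binaries(wheel_file):
    """Return the archive members of a wheel that look like compiled binaries."""
    with zipfile.ZipFile(wheel_file) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # Also catch versioned shared objects such as 'libfoo.so.1'.
            if name.lower().endswith(BINARY_SUFFIXES) or ".so." in name.lower()
        )


def build_demo_wheel():
    """Build a tiny in-memory zip that mimics a wheel layout (hypothetical package 'demo')."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("demo/__init__.py", "")
        wheel.writestr("demo/_speedups.cpython-312-x86_64-linux-gnu.so", b"\x7fELF")
        wheel.writestr("demo.libs/libexample.so.1", b"\x7fELF")
        wheel.writestr("demo-1.0.dist-info/METADATA", "Name: demo\nVersion: 1.0\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in find_bundled_binaries(build_demo_wheel()):
        print(name)
```

Nothing in the demo archive's `METADATA` mentions `libexample`, which is roughly the situation the summary calls a "phantom" dependency: the binary ships in the wheel, but a scanner that only reads packaging metadata never sees it.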


When designing a new package metadata standard, one of the top concerns is reducing the amount of effort required from the mostly volunteer maintainers of packaging tools and the thousands of projects being published to the Python Package Index... By defining PEP 770 SBOM metadata as using a directory of files, rather than a new metadata field, we were able to side-step all the implementation pain...
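
As a rough sketch of what "a directory of files" could look like to a consuming tool (again an editorial illustration, not the white paper's code): the snippet below assumes the SBOM documents live in a `sboms/` directory inside the wheel's `.dist-info` directory, which is this editor's reading of PEP 770 and should be checked against the PEP itself. The package `demo`, the file name `bundled.cdx.json`, and the SBOM contents are hypothetical placeholders.

```python
import io
import json
import zipfile


def list_sbom_files(wheel_file):
    """Return archive members stored under a top-level '*.dist-info/sboms/' directory.

    The 'sboms' directory name is an assumption based on a reading of PEP 770.
    """
    found = []
    with zipfile.ZipFile(wheel_file) as wheel:
        for name in wheel.namelist():
            parts = name.split("/")
            in_sboms_dir = (
                len(parts) >= 3
                and parts[0].endswith(".dist-info")
                and parts[1] == "sboms"
                and parts[-1] != ""  # skip directory entries
            )
            if in_sboms_dir:
                found.append(name)
    return sorted(found)


def build_demo_wheel_with_sbom():
    """Build an in-memory zip mimicking a wheel that ships one SBOM document."""
    # Placeholder SBOM content describing a hypothetical bundled library.
    sbom = {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "components": [{"type": "library", "name": "libexample", "version": "1.2.3"}],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("demo/__init__.py", "")
        wheel.writestr("demo-1.0.dist-info/METADATA", "Name: demo\nVersion: 1.0\n")
        wheel.writestr("demo-1.0.dist-info/sboms/bundled.cdx.json", json.dumps(sbom))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_sbom_files(build_demo_wheel_with_sbom()):
        print(name)
```

Because the SBOMs are ordinary files at a predictable path, a tool only has to look in one directory; it does not need to parse a new metadata field.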

We'll be working to submit issues on popular open source SBOM and vulnerability scanning tools, and gradually, Phantom Dependencies will become less of an issue for the Python package ecosystem.

The white paper "details the approach, challenges, and insights into the creation and acceptance of PEP 770 and adopting Software Bill-of-Materials (SBOMs) to improve the measurability of Python packages," explains an announcement from the Python Software Foundation. And the white paper ends with a helpful note.

"Having spoken to other open source packaging ecosystem maintainers, we have come to learn that other ecosystems have similar issues with Phantom Dependencies. We welcome other packaging ecosystems to adopt Python's approach with PEP 770 and are willing to provide guidance on the implementation."

  • It sounds and looks like this just means "files at a specific path that weren't included in the package."

    Why is this not a bug?

    • No, it is system libraries bundled with a wheel without a proper listing in an SBOM, so CVE scanning tools won't find them and therefore can't give a security warning. It looks like a missing feature in the distribution format. To me the root cause seems to be that there are so many package managers that making a proper interoperable system is impossible. Each language has its own package manager. Linux has at least 3 commonly used package systems. macOS has some. Etc. When a Python package needs
  • FTA:
    " Python is the premier language for scientific computing and artificial intelligence, meaning many high-performance libraries written in system languages need to be accessed from Python code."

    Which has for a long time made me wonder why the entire program isn't written in a system language such as C++. Given how critical performance is to these paradigms and the amount spent on GPU hardware for acceleration, you'd think ditching a slow language such as Python would be a no-brainer even if maybe as a wrapp

    • TensorFlow was written in C++ and Python, so you can choose which language you want to use.

      Data scientists and machine learning specialists tend to know Python and not C++, so that's why all these libraries are written for use with Python.
      • by Viol8 ( 599362 )

        Frankly, if they can learn the maths around data science and AI then learning a new programming language such as C++ shouldn't be much of a problem for them (though the C++ steering committee seem to be doing their best to turn it into a dog's dinner of a write-only language).

        • It's not that C++ is a problem, it's that they all know Python, and they would rather put their effort into studying improvements in activation functions than learning C++. And actually C++ is not trivial, it does take time to learn.
          • by Viol8 ( 599362 )

            I guess if they or their organisation want to spend money on extra compute power to offset using Python instead of spending a few weeks getting up to speed on C++, that's up to them. Sure, C++ isn't trivial, but it's hardly Klingon either, and they wouldn't need to learn all the obscure dusty parts of its cupboard to use it in this arena.

  • by Anonymous Coward

    There is no reason, none whatsoever, for Python to impose this crap on its package managers. Yes, the government and big banks want SBOMs from their software vendors, but that sounds like a "them" problem, not a "Python" problem. They can use Python or not; it makes no difference to the Python maintainers. Just rolling over and doing what the feds want on a hobby project. Awesome.

    The bigger problem is that SBOMs in no way, shape, or form actually provide security, which is the goal. The security problems are fro
