You can't avoid the audit of your project packages (you just can't)...

Fantastic Vulnerabilities and Where To Find Them

You might have already heard about CVE, or Common Vulnerabilities and Exposures, an initiative sponsored by DHS (Department of Homeland Security) and CISA (Cybersecurity and Infrastructure Security Agency) and overseen by the MITRE Corporation. All CVEs are listed on this website maintained by MITRE. If you want to know more about CVEs, how they are discovered and registered, and how they are evaluated and approved, this document from Red Hat has a word or two about it. Also, PyPA maintains the advisory-database project for searching vulnerabilities in Python packages (this is where pip-audit comes into play).

Security is a limitless topic

It's not just about having an armored iptables/Netfilter setup, an Nginx well configured with HSTS and other security headers, a tuned Linux kernel, and a backend application demanding authenticated requests on 99.9% of your endpoints. From the packages that you download through APT to the PyPI packages that belong to your Python project, any one of them could have a vulnerability, or a CVE registered and documented.

In terms of security, we can go on forever. It's a topic that is always on the table, from Infrastructure to Software, and it will always be like that.

Before pip-audit

You might be just like me, reviewing your requirements.txt package by package, with 50 different packages in it, searching CVE listing websites like the one maintained by MITRE or even this one. What a pain, right? Reviewing all packages (one by one) and documenting the vulnerabilities (if any) on your JIRA card, because there was no other way. True, some services are available for searching PYSECs, but they are still not that pragmatic, in my opinion. I always felt more confident searching CVE lists manually. Even if you protect your API endpoints with all that you can, you can't avoid the audit of your project packages. You just can't.

After pip-audit

I think that everyone who cares about security was craving something like this. The stable release of pip-audit was announced by Dustin Ingram yesterday.

From time to time, I'm involved in security audits at the companies where I work (I was a SysAdmin before being a Software Developer), so I can guarantee that this comes from heaven:

# Release Dec 1st, the stable release: https://pypi.org/project/pip-audit/
# Announced by Dustin Ingram: https://twitter.com/di_codes/status/1466109133711724551
# Output of executing pip-audit
$ pip-audit
WARNING:pip_audit._service.pypi:Warning: pip 20.0.2 doesn't support the `cache dir` subcommand, unable to reuse the `pip` HTTP cache and using "/home/ivanleoncz/.pip-audit-cache" instead
\ Auditing webencodings (0.5.1)
Found 26 known vulnerabilities in 4 packages
Name                Version ID             Fix Versions
------------------- ------- -------------- ------------
django-filter       2.2.0   PYSEC-2021-64  2.4.0
djangorestframework 3.11.0  PYSEC-2020-263 3.11.2
pillow              7.0.0   PYSEC-2020-78  7.1.0
pillow              7.0.0   PYSEC-2020-76  7.1.0
pillow              7.0.0   PYSEC-2021-137 8.2.0
pillow              7.0.0   PYSEC-2021-138 8.2.0
pillow              7.0.0   PYSEC-2021-70  8.1.0
pillow              7.0.0   PYSEC-2021-331 8.3.0
pillow              7.0.0   PYSEC-2021-41  8.1.1
pillow              7.0.0   PYSEC-2020-80  7.1.0
pillow              7.0.0   PYSEC-2021-71  8.1.0
pillow              7.0.0   PYSEC-2021-69  8.1.0
pillow              7.0.0   PYSEC-2021-38  8.1.1
pillow              7.0.0   PYSEC-2021-139 8.2.0
pillow              7.0.0   PYSEC-2021-94  8.2.0
pillow              7.0.0   PYSEC-2021-39  8.1.1
pillow              7.0.0   PYSEC-2021-36  8.1.1
pillow              7.0.0   PYSEC-2020-77  7.1.0
pillow              7.0.0   PYSEC-2021-40  8.1.1
pillow              7.0.0   PYSEC-2021-37  8.1.1
pillow              7.0.0   PYSEC-2021-317 8.3.2
pillow              7.0.0   PYSEC-2021-35  8.1.1
pillow              7.0.0   PYSEC-2021-93  8.2.0
pillow              7.0.0   PYSEC-2021-42  8.1.1
pillow              7.0.0   PYSEC-2021-92  8.2.0
pip                 20.0.2  PYSEC-2021-437 21.1
Name          Skip Reason
------------- ----------------------------------------------------------------------------
pkg-resources Dependency not found on PyPI and could not be audited: pkg-resources (0.0.0)
# All project packages:
$ pip3 freeze
asgiref==3.4.1
CacheControl==0.12.10
certifi==2021.10.8
charset-normalizer==2.0.8
cyclonedx-python-lib==0.11.1
Django==3.2.9
django-filter==2.2.0
djangorestframework==3.11.0
html5lib==1.1
idna==3.3
lockfile==0.12.2
msgpack==1.0.3
packageurl-python==0.9.6
packaging==21.3
Pillow==7.0.0
pip-api==0.0.23
pip-audit==1.0.0
progress==1.6
pyparsing==3.0.6
pytz==2021.3
requests==2.26.0
requirements-parser==0.2.0
resolvelib==0.8.1
six==1.16.0
sqlparse==0.4.2
toml==0.10.2
types-setuptools==57.4.4
types-toml==0.10.1
urllib3==1.26.7
webencodings==0.5.1
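A report with 23 rows for pillow alone raises the obvious question: which version do I actually need? Here is a minimal sketch of one way to answer it, assuming the report rows have been copied into a string (only a few rows from the output above are used here). The names REPORT, version_key and minimum_safe_versions are mine, not part of pip-audit, and the parsing assumes plain dotted version numbers and a single fix version per row, as in this report.

```python
from collections import defaultdict

# A few rows copied from the report: name, installed version, advisory ID, fix version.
REPORT = """\
Name Version ID Fix Versions
django-filter 2.2.0 PYSEC-2021-64 2.4.0
djangorestframework 3.11.0 PYSEC-2020-263 3.11.2
pillow 7.0.0 PYSEC-2020-78 7.1.0
pillow 7.0.0 PYSEC-2021-331 8.3.0
pillow 7.0.0 PYSEC-2021-317 8.3.2
pip 20.0.2 PYSEC-2021-437 21.1
"""


def version_key(version: str) -> tuple:
    # Naive comparison key; assumes plain dotted versions such as "8.3.2".
    return tuple(int(part) for part in version.split("."))


def minimum_safe_versions(report: str) -> dict:
    """Map each package to the highest fix version listed for it."""
    fixes = defaultdict(list)
    for line in report.splitlines():
        fields = line.split()
        # Keep only rows shaped like "name version PYSEC-... fix".
        if len(fields) != 4 or not fields[2].startswith("PYSEC-"):
            continue
        name, _installed, _advisory, fix = fields
        fixes[name].append(fix)
    return {name: max(versions, key=version_key) for name, versions in fixes.items()}


for name, version in minimum_safe_versions(REPORT).items():
    print(f"{name}>={version}")
```

Taking the highest fix version per package gives the lowest version that clears every advisory listed for it, which is a handy starting point for updating requirements.txt.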

Final Words

I'm pretty sure that this tool will become a standard for all of us who develop software or maintain Python projects (open source or not). It's true that security depends heavily on the infrastructure level, from network traffic to web servers and operating systems, but it's common to see Software Developers who are not concerned with reviewing the security of their packages, even though they are really worried about how well the API is protected from a variety of attack vectors.

Your platform is also secured by having components without vulnerabilities (keep that in mind). If you don't audit your packages, it's time to make a change, and now you can do it more efficiently. Cheers.

Greatest Common Factor, Factors of a Number, and Juggling with List Comprehensions

It's always a good time to open up a terminal and express calculations through code, just as any scientist would do, with the exception that I'm no scientist (at all), although I kind of love Computer Science. I'm a Software Developer, so that is expected, I guess.

Being from another country, speaking another language and having had a different education at school when I was young, it feels a little bit different to read and identify mathematics content like Greatest Common Factor, Factors, and so on. Reading this article, I thought that it would be nice to write some code that finds the factors of a number and, even better, the GCF of a list of numbers. I'm still wondering what the best code would be in terms of efficiency and time complexity, though.

Factors


Well, I started with procedural code, 2 or 3 lines, but then remembered how powerful List Comprehensions are (good source btw): many iterations and data structures can be expressed in a single line.

Mission accomplished, with easy and elegant code:

def get_factors(n: int) -> list:
    return [i for i in range(1, n + 1) if n % i == 0]
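On the efficiency question: the comprehension tests every number from 1 to n. A possible shortcut, sketched below under the assumption that n is a positive integer, uses the fact that factors come in pairs (i, n // i), so only the candidates up to the square root of n need testing. The name get_factors_fast is mine, and math.isqrt requires Python 3.8 or newer.

```python
import math


def get_factors_fast(n: int) -> list:
    """Return the sorted factors of a positive integer n."""
    small, large = [], []
    # Each factor i <= sqrt(n) has a partner n // i >= sqrt(n).
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            small.append(i)
            if i != n // i:  # a perfect square's root pairs with itself
                large.append(n // i)
    return small + large[::-1]


print(get_factors_fast(20))  # [1, 2, 4, 5, 10, 20]
print(get_factors_fast(36))  # [1, 2, 3, 4, 6, 9, 12, 18, 36]
```

It is less elegant than the one-liner, but for n = 1,000,000 it tests 1,000 candidates instead of a million.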

GCF (Greatest Common Factor) - 1° Round


This takes me back to my childhood, when I was 12, I guess.
Easy to put on paper. To express the solution with code? Somewhat, at the beginning.
Here's the 1st approach, relying on the return value of the get_factors() function (a list of factors).

Mission accomplished, but not meaningful (why use a list of lists of factors?):

def get_gcf(list_of_factors: list) -> int:
    highest_number = max([l[-1] for l in list_of_factors])
    common_factors = list()
    for n in range(1, highest_number + 1):
        if all([True if factors[-1] % n == 0 else False for factors in list_of_factors]):
            common_factors.append(n)
    return common_factors[-1]
# In [24]: get_gcf([get_factors(20), get_factors(8), get_factors(6), get_factors(16)])
# Out[24]: 2
# In [25]: get_gcf([get_factors(6), get_factors(15)])
# Out[25]: 3
# In [26]: get_gcf([get_factors(9), get_factors(20), get_factors(25)])
# Out[26]: 1

GCF (Greatest Common Factor) - 2° Round


No need to deal with lists of factors. It's just about numbers, just as you would do on paper.
A couple of variables were renamed for the sake of readability, and the rest remains the same.
Mission accomplished and much better, isn't it?

def get_gcf(numbers: list) -> int:
    common_factors = list()
    for i in range(1, max(numbers) + 1):
        if all([True if n % i == 0 else False for n in numbers]):
            common_factors.append(i)
    return common_factors[-1]
# In [34]: get_gcf([get_factors(20)[-1], get_factors(8)[-1], get_factors(6)[-1], get_factors(16)[-1]])
# Out[34]: 2
# In [35]: get_gcf([20, 8, 6, 16])
# Out[35]: 2
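As for the "best code" question, a well-known alternative to trying every candidate is Euclid's algorithm, which repeatedly replaces the pair (a, b) with (b, a % b) until the remainder is zero. The sketch below is not the approach used above, just a comparison point. The names euclid and get_gcf_euclid are mine; math.gcd and functools.reduce come from the standard library.

```python
from functools import reduce
from math import gcd


def euclid(a: int, b: int) -> int:
    """Hand-written Euclid: gcd(a, b) == gcd(b, a % b), and gcd(a, 0) == a."""
    while b:
        a, b = b, a % b
    return a


def get_gcf_euclid(numbers: list) -> int:
    """Fold the two-number GCF over the whole list."""
    return reduce(gcd, numbers)


print(euclid(20, 8))                   # 4
print(get_gcf_euclid([20, 8, 6, 16]))  # 2
print(get_gcf_euclid([6, 15]))         # 3
print(get_gcf_euclid([9, 20, 25]))     # 1
```

The fold works because the GCF of a whole list equals the GCF of the first number with the GCF of the rest, so no list of common factors has to be built.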

It was an interesting task. It's a mix of childhood memories of mathematics from high school with my perspective nowadays as a Software Developer, scratching the surface of Computer Science, but passionate about finding answers through programming languages, which in my case is Python, most of the time.

I had fun today, indeed.

Thanks, Diary.

Quick Commands for Docker Containers

It's been a while since I started working with Docker, and some commands are on the top of my head while some aren't. These commands are commonly used in a normal container management routine (not including the networking layer). If you are getting started with Docker, or you are in a "can't remember that command..." situation, this might be handy.


List Running Containers


# --all also lists stopped containers; plain "docker ps" shows only the running ones
docker ps --all
# Output example
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# 183188078a50 ubuntu "bash" 4 minutes ago Created ubuntu
# 1b61fd72fefc web_web "bash -c 'pip instal…" 2 hours ago Up 2 hours 0.0.0.0:8000->8000/tcp django
# 58e2d4a52190 postgres:latest "docker-entrypoint.s…" 2 weeks ago Up 2 hours 5432/tcp postgres

Create Container


# Creates an interactive container (-it) with name (--name) ubuntu
docker create --name ubuntu -it ubuntu bash

Rename Container


docker ps --all
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# fc8c543eb199 web_web "bash -c 'pip3 insta…" 42 hours ago Exited (0) 39 hours ago django
# 35ba4e983483 ubuntu "bash" 4 weeks ago Exited (0) 4 weeks ago ubuntu_test_x11
# 58e2d4a52190 postgres:latest "docker-entrypoint.s…" 6 weeks ago Exited (0) 39 hours ago postgres
docker rename ubuntu_test_x11 ubuntu
docker ps --all
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# fc8c543eb199 web_web "bash -c 'pip3 insta…" 42 hours ago Exited (0) 39 hours ago django
# 35ba4e983483 ubuntu "bash" 4 weeks ago Exited (0) 4 weeks ago ubuntu
# 58e2d4a52190 postgres:latest "docker-entrypoint.s…" 6 weeks ago Exited (0) 39 hours ago postgres

Run Container


docker start ubuntu

List Container Processes


docker top ubuntu
# UID PID PPID C STIME TTY TIME CMD
# root 16396 16373 0 12:44 pts/0 00:00:00 bash

List Container Directory


docker exec -it ubuntu sh -c "ls -l /var/log"
# total 416
# -rw-r--r-- 1 root root 17514 Jun 16 20:43 alternatives.log
# drwxr-xr-x 1 root root 4096 Jun 16 20:43 apt
# -rw-r--r-- 1 root root 58592 Apr 16 00:11 bootstrap.log
# -rw-rw---- 1 root utmp 0 Apr 16 00:11 btmp
# -rw-r--r-- 1 root root 276622 Jun 16 20:43 dpkg.log
# -rw-r--r-- 1 root root 3328 Jun 16 20:39 faillog
# -rw-r--r-- 1 root root 1321 Jun 16 20:39 fontconfig.log
# -rw-rw-r-- 1 root utmp 30368 Jun 16 20:39 lastlog
# drwxr-xr-x 2 root adm 4096 Jun 16 20:39 nginx
# drwxrwxr-t 2 root postgres 4096 Jun 16 20:39 postgresql
# drwxr-xr-x 2 rabbitmq rabbitmq 4096 Apr 6 05:18 rabbitmq
# drwxr-xr-x 2 root root 4096 Dec 23 2019 sysstat
# -rw-rw-r-- 1 root utmp 0 Apr 16 00:11 wtmp

Copy From Container


# Pass the container name with the absolute path of the file,
# followed by the destination directory on the host machine.
docker cp ubuntu:/test.txt /root

Attach to Container


# For example, if the container is interactive, this will drop you inside the container
# with a shell ready to type commands inside the container's environment.
docker attach ubuntu
# The normal operation of commands inside the container is allowed,
# just as running on your machine terminal.
#
# ivanleoncz@ilex-an5:~ $ docker attach ubuntu_test_x11
# root@183188078a50:/# apt-get update
# Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
# Get:2 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
# Get:3 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [328 kB]
# ...
# root@183188078a50:/# apt-get install python3 python3-dev python3-pip python3-venv
# root@183188078a50:/# touch test.txt

Open BASH Inside Container


docker exec -it ubuntu bash
# This starts a new bash process inside the container, which is different from
# attaching your stdin, stdout and stderr to the container's main process.

Stop Container


docker stop ubuntu

Iterators: brief words about them

Generally, we all iterate through a variety of objects in Python, like:

Tuples
t = ('apple', 'banana', 'orange')
for fruit in t:
    print(fruit)
Lists
l = ['apple', 'banana', 'orange']
for fruit in l:
    print(fruit)
Dictionaries
d = {'apple': 400, 'banana': 850, 'orange': 660}
# Without the items() method, iterating over dict 'd' yields only its keys.
for fruit in d:
    print(fruit)
Strings
s = 'apple'
for letter in s:
    print(letter)
Sets
s = {"apple", "banana", "orange", "coconut", "mango"}
# A set is unordered. Therefore, the order in which the data
# is displayed by the iteration may differ from how it was
# initially defined.
#
# For more information: https://docs.python.org/3/tutorial/datastructures.html#sets
for fruit in s:
    print(fruit)
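
Since the dictionary example only walks over the keys, here is a short sketch of the other two views, items() and values(), using the same dict. The variable names value and total are just illustrative.

```python
d = {'apple': 400, 'banana': 850, 'orange': 660}

# items() yields (key, value) tuples, unpacked here in the loop header.
for fruit, value in d.items():
    print(fruit, value)

# values() iterates over the values only.
total = sum(d.values())
print(total)  # 1910
```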

But besides these objects, you can create your own iterable object! Cool, huh?

You might not run into this situation that often, but it's good to know how to build your own iterator. You will also learn a bit more about Python internals and how iteration works behind the curtains.

The __iter__ and __next__ methods

Data structure objects like the ones listed above are all iterable objects. You can get an iterator for any of them by using the iter() function, and then step through it with the next() function. Once all elements have been returned, the next call raises a StopIteration exception.

t = ('apple', 'banana', 'orange')
# From Python documentation:
# iter() : Get an iterator from an object.
t_iter = iter(t)
print(next(t_iter))
print(next(t_iter))
print(next(t_iter))
print(next(t_iter))
# [Output]
#
# $ python3 iterable.py
# apple
# banana
# orange
# Traceback (most recent call last):
# File "iterable.py", line 8, in <module>
# print(next(t_iter))
# StopIteration
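
This is also why a for loop never shows that traceback. As a rough sketch (not the interpreter's literal implementation), a for loop over the tuple behaves like the while loop below: it asks for an iterator once, calls next() repeatedly, and treats StopIteration as the signal to stop. The list named collected exists only to make the result easy to inspect.

```python
t = ('apple', 'banana', 'orange')

# Roughly what "for fruit in t: print(fruit)" does behind the curtains.
t_iter = iter(t)                # calls t.__iter__()
collected = []
while True:
    try:
        fruit = next(t_iter)    # calls t_iter.__next__()
    except StopIteration:
        break                   # the loop ends quietly instead of crashing
    collected.append(fruit)
    print(fruit)
```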


Your Iterator

Built as a class, your iterator should implement the __iter__() and __next__() methods: the first initializes the iterator object and returns it, and the second returns the current value and computes the next one:

class MyIterator:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        x = self.a
        self.a += 1
        return x

obj = MyIterator()
my_iter = iter(obj)
print(next(my_iter))
print(next(my_iter))
print(next(my_iter))
print(next(my_iter))
print(next(my_iter))

The problem here is that, without a condition, this iteration can go forever:

class MyIterator:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        x = self.a
        self.a += 1
        return x

obj = MyIterator()
my_iter = iter(obj)
for i in my_iter:
    print(i)
    if i == 10:
        print("This iteration will go forever if we don't stop it...")
        break

Depending on your code, you might want to express the stop condition externally, but in most cases the number of iterations supported by the iterator is defined in the class that provides it.
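
For the external case, one option besides a manual break is itertools.islice from the standard library, which takes a fixed number of items from any iterable. A small sketch, repeating the endless MyIterator class so that it runs on its own (first_ten is just an illustrative name):

```python
from itertools import islice


class MyIterator:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        x = self.a
        self.a += 1
        return x


# islice stops asking for values after 10 items, so the endless
# iterator is simply abandoned instead of running forever.
first_ten = list(islice(MyIterator(), 10))
print(first_ten)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```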


Max Number of Iterations

Just as with any class, you can define the __init__() method for the iterator class, and use it to set the limit of the iterator:

class MyIterator:
    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        if self.a <= self.limit:
            x = self.a
            self.a += 1
            return x
        else:
            raise StopIteration

obj = MyIterator(10)
my_iter = iter(obj)
for x in my_iter:
    print(f"Step: {x}")
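
For comparison, the same bounded sequence can also be produced by a generator function, where Python builds the __iter__() and __next__() machinery for you. This is only a sketch of an alternative, not a replacement for understanding the class above; count_up_to is a hypothetical name.

```python
def count_up_to(limit):
    """Yield 1, 2, ..., limit, like iterating over MyIterator(limit)."""
    a = 1
    while a <= limit:
        yield a      # pauses here and resumes on the next call to next()
        a += 1


for x in count_up_to(3):
    print(f"Step: {x}")

gen = count_up_to(1)
print(next(gen))  # 1
# One more next(gen) would raise StopIteration, just like the class version.
```

The class form is still worth knowing when the iterator needs extra methods or state that should be reachable from outside.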


Final Words

Hope you had fun while reviewing this topic and that it might help you some day. I decided to write it here because I went through a situation where implementing an iterator was necessary, and here's a record of something that I initially tried in order to understand how to build one.

From now on, every time you iterate through an iterable object, you can have an idea of what's going on with this object and how the data is being processed, and understand that there might be very specific scenarios where you would like to implement your own iterator.

For more examples and resources, here's a cool document from Python official documentation.
