Data Type Enforcement at Runtime Level in Python

Such a wild dream, isn't it?

Python is quite dynamic. Variables can reference strings, then lists, then sets... It's an intrinsic characteristic of the language.

I was there too, thinking that it could be possible. I never found a safe path to achieve it, and for a long time I thought it was just me. Some think that Type Hints could be a starting point for such a feature, without understanding what Type Hints and annotations actually are: metadata for tools and readers, which the interpreter does not check at runtime.
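A minimal sketch of that point, with a hypothetical function named shout that is not part of any library: the annotations are stored on the function, but the call itself is never checked against them.

```python
def shout(text: str) -> str:
    # The annotation says "str", but nothing checks it when the call happens.
    return text * 2


print(shout("ab"))            # abab
print(shout([1, 2]))          # [1, 2, 1, 2]  (a list went through unchallenged)
print(shout.__annotations__)  # {'text': <class 'str'>, 'return': <class 'str'>}
```

The hints are available for inspection through __annotations__, and that is all the interpreter does with them.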

Some people might think that Data Classes could provide a solution, and nope: Data Classes are just a more elegant way of creating classes, generating __init__ and __repr__ methods by default, among other functionalities, but they still do not take care of type enforcement at runtime level.

#!/usr/bin/python3.12

from dataclasses import asdict, dataclass

@dataclass
class User:
    username: str
    skills: list[str]  # Expecting a list of strings...


if __name__ == "__main__":
    user = User(username="ivanleoncz", skills="Python")  # ...but got a string.
    print(user)
    print(asdict(user))

"""
$ ./dataclasses_python_3.12.py
User(username='ivanleoncz', skills='Python')
{'username': 'ivanleoncz', 'skills': 'Python'}
"""

The best that you can get is type enforcement at instantiation level, with a Data Class or not. Even then, you can pass the expected data while creating the object and later change the instance variable to anything else, without any validation penalty:

#!/usr/bin/python3.12
from dataclasses import dataclass, field

@dataclass
class User:
    username: str
    skills: list[str] = field(default_factory=list)

    def validation(self) -> None:

        wrong_data_types = []
        for name, spec in self.__dataclass_fields__.items():
            # Inspect __dataclass_fields__ to obtain each field's declared type.
            # For generic aliases such as list[str], __origin__ is the plain
            # runtime class (list); for plain classes, use the type itself.
            # Note that the element types (str in list[str]) are not checked.
            data_type_detected = getattr(spec.type, "__origin__",
                                         spec.type)
            if not isinstance(getattr(self, name), data_type_detected):
                wrong_data_types.append(name)

        if wrong_data_types:
            raise AttributeError(f"data type violation detected: {wrong_data_types}")

    def __post_init__(self):
        self.validation()

if __name__ == "__main__":
    # username must be a str...
    user = User(username=123, skills=["docker", "aws", "azure"])

"""
$ ./dataclasses_instantiation_validation_python_3.12.py
Traceback (most recent call last):
  File "/tmp/test_2.py", line 30, in <module>
    user = User(username=123, skills=["docker", "aws", "azure"])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 5, in __init__
  File "/tmp/test_2.py", line 26, in __post_init__
    self.validation()
  File "/tmp/test_2.py", line 23, in validation
    raise AttributeError(f"data type violation detected: {wrong_data_types}")
AttributeError: data type violation detected: ['username']
"""
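The listing above only shows the failure at instantiation. The other half of the claim, that a later reassignment goes through unvalidated, can be sketched as follows. This is a condensed variant of the same idea, not the exact class above: it uses dataclasses.fields() and raises TypeError, both of which are choices made for this illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class User:
    username: str
    skills: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Runs once, right after the generated __init__.
        for spec in fields(self):
            # list[str] exposes its runtime class (list) through __origin__.
            expected = getattr(spec.type, "__origin__", spec.type)
            if not isinstance(getattr(self, spec.name), expected):
                raise TypeError(f"{spec.name} must be {expected.__name__}")


user = User(username="ivanleoncz", skills=["docker", "aws"])  # passes validation
user.username = 123     # no exception: __post_init__ is not called again
user.skills = "Python"  # no exception either
print(user)             # User(username=123, skills='Python')
```

The check lives in __post_init__, so it only runs when the object is created. Plain attribute assignment afterwards never reaches it.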

Pydantic does this kind of type enforcement gracefully at object creation but, by default, not when an attribute is reassigned later:

#!/usr/bin/python3.12

from pydantic import BaseModel
 
class User(BaseModel):
    """
    Any wrong data type (e.g. 'int' instead of 'str') when creating a User object (say, for "name")
    would result in an exception like:

        ValidationError: 1 validation error for User
        name
    """
    id: int
    name: str
  

if __name__ == "__main__":

    external_data = {
        'id': 48,
        'name': 'jack62',
    }

    user = User(**external_data)
    print("name (before): ", user.name)

    # Instance variable changed to other data type, without any constraint...
    user.name = 123
    print("name (after):  ", user.name)


"""
$ ./pydantic_python_3.12.py
name (before):  jack62
name (after):   123
"""

It's unlikely to happen in the near future, to be honest. And if it happens, it would be a major change to the specification of the language. Data type enforcement is a great advantage of statically typed, compiled languages (not our case here), where accidents such as assigning an integer to a char variable are not allowed by the compiler.

Keeping a variable consistent with regard to its data type is something we must do ourselves, by discipline, while developing in Python. IDEs and linters can help with that. And that's the best that you can have.
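As a small illustration of where such tools fit in, here is a sketch of the kind of reassignment they are meant to catch. Which tool you use is an assumption on my side (mypy and Pyright are common static type checkers); the point is only that the warning comes from the tool, before the program runs, and not from the interpreter.

```python
skills: list[str] = ["docker", "aws"]

# A static type checker is expected to reject the next line, because a str
# is being assigned to a name declared as list[str]. The interpreter itself
# runs it without complaint.
skills = "Python"

print(type(skills).__name__)  # str
```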

I shared something on this thread a couple of weeks ago:

If type enforcement at runtime level is something you really need in your project, you need a different programming language, and you have to consider the trade-offs of not using Python.

It would be nice to have such a thing in Python one day. I hope it will not remain a wild dream that we all kind of have.
