A module implementing custom literal suffixes using pure Python. custom-literals
mimics C++’s user-defined literals (UDLs) by defining literal suffixes that can
be accessed as attributes of literal values, such as numeric constants, string
literals and more.
(c) RocketRace 2022-present. See LICENSE file for more.
See the examples/ directory for more.
Function decorator syntax:

    from custom_literals import literal
    from datetime import timedelta

    @literal(float, int, name="s")
    def seconds(self):
        return timedelta(seconds=self)

    @literal(float, int, name="m")
    def minutes(self):
        return timedelta(seconds=60 * self)

    print(30 .s + 0.5.m) # 0:01:00

Class decorator syntax:

    from custom_literals import literals
    from datetime import timedelta

    @literals(float, int)
    class Duration:
        def s(self):
            return timedelta(seconds=self)
        def m(self):
            return timedelta(seconds=60 * self)

    print(30 .s + 0.5.m) # 0:01:00

Removing a custom literal:

    from custom_literals import literal, unliteral

    @literal(str)
    def u(self):
        return self.upper()

    print("hello".u) # "HELLO"

    unliteral(str, "u")
    assert not hasattr("hello", "u")

Context manager syntax (automatically removes literals afterwards):

    from custom_literals import literally
    from datetime import timedelta

    with literally(float, int,
        s=lambda x: timedelta(seconds=x),
        m=lambda x: timedelta(seconds=60 * x)
    ):
        print(30 .s + 0.5.m) # 0:01:00

Currently, three methods of defining custom literals are supported: the function decorator syntax @literal, the class decorator syntax @literals, and the context manager syntax with literally. (The last of these automatically unhooks the literal suffixes when the context is exited.) To remove a custom literal, use unliteral.
Custom literals are defined for literal values of the following types:
Type | Example | Notes |
---|---|---|
int | (42).x | The Python parser interprets 42.x as a float literal followed by an identifier. To avoid this, use (42).x or 42 .x instead. |
float | 3.14.x | |
complex | 1j.x | |
bool | True.x | Since bool is a subclass of int, int hooks may influence bool as well. |
str | "hello".x | F-strings (f"{a}".x) are also supported. The string will be formatted before the literal suffix is applied. |
bytes | b"hello".x | |
None | None.x | |
Ellipsis | ....x | Yes, this is valid syntax. |
tuple | (1, 2, 3).x | Generator expressions ((x for x in ...)) are not tuple literals and thus won’t be affected by literal suffixes. |
list | [1, 2, 3].x | List comprehensions ([x for x in ...]) may not function properly. |
set | {1, 2, 3}.x | Set comprehensions ({x for x in ...}) may not function properly. |
dict | {"a": 1, "b": 2}.x | Dict comprehensions ({x: y for x, y in ...}) may not function properly. |
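
The parsing notes in the table can be checked without installing anything. The following is a minimal sketch that uses only builtin attributes (real, __class__) in place of a custom suffix; the helper name parses is hypothetical and is not part of custom-literals.

```python
def parses(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# `42.` is tokenized as a float, so an identifier directly after it is a syntax error.
int_without_space = parses("42.real")
# Parentheses or a space make the attribute access unambiguous.
int_with_parens = parses("(42).real")
int_with_space = parses("42 .real")
# Float and Ellipsis literals accept a direct attribute access.
float_direct = parses("3.14.real")
ellipsis_direct = parses("....__class__")

print(int_without_space, int_with_parens, int_with_space)
print(float_direct, ellipsis_direct)
print(issubclass(bool, int))  # bool inherits attributes looked up on int
```

The same rules apply to custom suffixes, since they are ordinary attribute accesses as far as the parser is concerned.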
In addition, custom literals can be defined to be strict, that is, only allow the given literal suffix to be invoked on constant, literal values. This means that the following code will raise a TypeError:

    @literal(str, name="u", strict=True)
    def utf_8(self):
        return self.encode("utf-8")

    my_string = "hello"
    print(my_string.u)
    # TypeError: the strict custom literal `u` of `str` objects can only be invoked on literal values

By default, custom literals are not strict. This is because determining whether a suffix was invoked on a literal value relies on bytecode analysis, which depends on details of the CPython interpreter and is not guaranteed to be forwards compatible. Strict mode can be enabled by passing strict=True to the @literal, @literals or literally functions.
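
To give a rough idea of what the bytecode distinction looks like, here is a small sketch using the standard dis module. It only illustrates that CPython compiles literal and variable accesses to different opcodes; it is not the check that custom-literals performs, and the helper name opname_before_attribute is hypothetical. Exact opcode names can vary between CPython versions.

```python
import dis

def opname_before_attribute(source):
    """Return the name of the opcode that runs just before the first attribute load.

    Hypothetical helper for illustration only.
    """
    code = compile(source, "<example>", "eval")
    previous = None
    for instruction in dis.get_instructions(code):
        if instruction.opname == "LOAD_ATTR":
            return previous
        previous = instruction.opname
    return None

# A string literal is loaded as a constant; a variable is loaded by name.
literal_access = opname_before_attribute('"hello".upper')
variable_access = opname_before_attribute("my_string.upper")
print(literal_access, variable_access)
```

Because opcode names and layouts are an implementation detail, any check built on them has to be revisited when CPython changes, which is why strict mode is opt-in.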
Stability
This library relies almost entirely on implementation-specific behavior of the CPython
interpreter. It is not guaranteed to work on all platforms, or on all versions of Python.
It has been tested on common platforms (Windows, Ubuntu, macOS) using Python 3.7 through
3.10, but while changes that would break the library are quite unlikely, they are not
impossible either.
That being said, custom-literals does its absolute best to guarantee maximum stability of the library, even in light of possible breaking changes in CPython internals. The code base is well tested. In the future, the library may also expose multiple different backends for the actual implementation of builtin type patching. As of now, the only valid backend is forbiddenfruit, which uses the forbiddenfruit library.
Feature | Stability |
---|---|
Hooking with the forbiddenfruit backend | Quite stable, but may be affected by future releases. Relies on the ctypes module. |
Strict mode using the strict=True kwarg | Quite stable, but not forwards compatible. Relies on stack frame analysis and opcode checks to detect non-literal access. |
Type safety
The library code, including the public API, is fully typed. Registering and unregistering
hooks is type-safe, and static analysis tools should have nothing to complain about.
However, accessing custom literal suffixes is impossible to type-check. This is because
all major static analysis tools available for Python right now (understandably) assume
that builtin types are immutable. That is, the attributes and methods of builtin types
cannot be dynamically modified. This goes against the core idea of the library, which
is to patch custom attributes on builtin types.
Therefore, if you are using linters, type checkers or other static analysis tools, you
will likely encounter many warnings and errors. If your tool allows it, you should disable
these warnings (ideally on a per-diagnostic, scoped basis) if you want to use this library
without false positives.
Should I use this in production?
Emphatically, no! But I won’t stop you.
Nooooooo (runs away from computer)
I kind of disagree: yessss (dances in front of computer)
Why?
Python’s operator overloading allows for flexible design of domain-specific languages.
However, Python pales in comparison to C++ in this aspect. In particular, User-Defined
Literals (UDLs) are a powerful feature of C++ missing in Python. This library is designed
to emulate UDLs in Python, with syntactic sugar comparable to C++ in elegance.
But really, why?
Because it’s possible.
How? (please keep it short)
custom-literals works by patching builtin types with custom objects satisfying the descriptor protocol, similar to the builtin property decorator. The patching is done through a “backend”, which is an interface implementing functions to mutate the __dict__ of builtin types.
If strict=True mode is enabled, the descriptor will also traverse stack frames backwards to the invocation site of the literal suffix, and check the most recently executed bytecode opcode to ensure that the literal suffix was invoked on a literal value.
How? (I love detail)
Builtin types in CPython are implemented in C, and include checks to prevent mutation at runtime. This is why the following lines will each raise a TypeError:

    int.x = 42 # TypeError: cannot set 'x' attribute of immutable type 'int'
    setattr(int, "x", 42) # TypeError: cannot set 'x' attribute of immutable type 'int'
    int.__dict__["x"] = 42 # TypeError: 'mappingproxy' object does not support item assignment

However, these checks can be subverted in a number of ways. One method is to use CPython’s APIs directly to bypass the checks. For the sake of stability, custom-literals calls the curse() and reverse() functions of the forbiddenfruit library to implement these bypasses. Internally, forbiddenfruit uses the ctypes module to access the C API and use the ctypes.pythonapi.PyType_Modified() function to signal that a builtin type has been modified. Other backends may also be available in the future, but are not implemented at the moment. (As an example, there is currently a bug in CPython that allows mappingproxy objects to be mutated without using ctypes. This was deemed too fragile to be included in the library.)
Python’s @property decorator implements the descriptor protocol. This is a protocol that allows for custom code to be executed when accessing specific attributes of a type. For example, the following code will print 42:

    class MyClass:
        @property
        def x(self):
            print(42)

    MyClass().x

custom-literals patches builtin types with objects implementing the same protocol, allowing user-defined and library-defined code to be executed when invoking a literal suffix on a builtin type. It cannot, however, use @property directly, as elaborated below.
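
As a rough illustration of the idea, the sketch below defines a hand-written descriptor and attaches it to an ordinary int subclass instead of patching int itself. The names LiteralSuffix and Seconds are hypothetical and are not taken from the custom-literals source; the real library attaches its descriptors to builtin types through the backend.

```python
class LiteralSuffix:
    """Hypothetical descriptor: runs `fn` on the instance when the attribute is read."""

    def __init__(self, fn):
        self.fn = fn
        self.name = None

    def __set_name__(self, owner, name):
        # Called once when the owning class body is executed.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor object.
            return self
        # Accessed on an instance: compute the value from the instance.
        return self.fn(instance)


class Seconds(int):
    """Stand-in for a patched builtin; a plain subclass needs no patching."""

    ms = LiteralSuffix(lambda self: int(self) * 1000)


milliseconds = Seconds(3).ms
print(milliseconds)  # 3000
```

Accessing Seconds(3).ms runs the wrapped function without any call parentheses, which is the same effect a literal suffix relies on.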
The descriptor protocol is very flexible, and is used as the backbone of bound methods,
class methods, static methods, and more. It is defined by the presence of one
of the following methods*:
__set_name__
The descriptor methods can be invoked from an instance (some_instance.x
) or from
a clas