
Pydantic type conversion on assignment

pydantic-v2 validation python-3.x python

DevilsThumb · 1 year ago

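I want to use Pydantic to convert a plaintext string password into a hash of type bytes, using a specific hash function, when the value is assigned.

Here is a minimal example showing my current (non-working) approach. I don't yet have a deep understanding of Pydantic, though.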

import bcrypt
from pydantic import BaseModel, field_validator

def hash_password(password: str) -> bytes:
    return bcrypt.hashpw(password.encode('utf-8'), salt = bcrypt.gensalt())

class Password(BaseModel):
    hash_value: bytes
    
    @field_validator("hash_value")
    @classmethod
    def set_password(cls, plain_password: str) -> bytes:
        return hash_password(plain_password)
    
class Settings:
    DEFAULT_PASSWORD = "my_plain_password"
    
settings = Settings()
    
password_doc = Password(
    hash_value = settings.DEFAULT_PASSWORD
)

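At first I had accidentally declared hash_value as str, not realizing that hashpw returns bytes. That somehow worked: the hash_password function was called on assignment. However, all the implicit type conversions that took place made my hashed password invalid.

The problem now is that Pydantic expects a bytes value on assignment and implicitly converts the string settings.DEFAULT_PASSWORD to bytes before passing it to the set_password method, even though that method expects a string.

My error message: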

Traceback (most recent call last):
  File "xxx", line 20, in <module>
    password_doc = Password(
                   ^^^^^^^^^
  File "xxx", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
  File "xxx", line 13, in set_password
    return hash_password(plain_password)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xxx", line 5, in hash_password
    return bcrypt.hashpw(password.encode('utf-8'), salt = bcrypt.gensalt())
                         ^^^^^^^^^^^^^^^
AttributeError: 'bytes' object has no attribute 'encode'. Did you mean: 'decode'?

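Edit:

Many thanks to Dunes for the answer, which solved most of my problem. However, I noticed that the set_password method is called more often than I expected.

I am using Beanie models to store a config document that links to a password document: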

class Password(base.Password, Document):
    hash_value: bytes
    
    @field_validator("hash_value", mode="before")
    @classmethod
    def set_password(cls, plain_password: str | bytes) -> bytes:
        ...  # new implementation

class Config(base.Config, Document):
    password: Link[Password]
    
    def get_password(self) -> Link[Password]:
        return self.password

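When the Config document is retrieved with all links fetched, set_password is called as well: models.Config.find_one(fetch_links=True). In this case the method is called with the binary hash_value that is actually stored, so I have to return the value unchanged if it is binary.

Is this what is supposed to happen, or is there a bug in my code?

Edit 2

Beanie probably runs validation a second time automatically when the hashed password is loaded from the database, which makes perfect sense, since the database can be changed from outside the Python application.

Originally I did not want to allow a binary argument for set_password, because Pydantic was implicitly converting my input. In "before" mode that is no longer the case, and I should allow a binary argument for the validation of database values.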
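The conclusion of this edit can be sketched roughly as follows: a "before" validator that hashes str input and returns bytes unchanged, so that a hash loaded from storage is not hashed again. This is only a sketch under assumptions: hash_password here is a PBKDF2 stand-in from the standard library so the snippet runs without bcrypt, and the database reload is simulated with model_validate rather than an actual Beanie query.

```python
import hashlib
import os

from pydantic import BaseModel, field_validator


def hash_password(password: str) -> bytes:
    # Stand-in for the bcrypt-based hash_password in the question (assumption).
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt + digest


class Password(BaseModel):
    hash_value: bytes

    @field_validator("hash_value", mode="before")
    @classmethod
    def set_password(cls, value):
        # Plain-text input gets hashed.
        if isinstance(value, str):
            return hash_password(value)
        # bytes are assumed to be an already stored hash and pass through;
        # any other type is left for pydantic's own bytes validation to reject.
        return value


stored = Password(hash_value="my_plain_password")

# Simulate loading the document again: the validator runs a second time,
# this time with the bytes that were stored.
reloaded = Password.model_validate(stored.model_dump())
print(reloaded.hash_value == stored.hash_value)
```

The trade-off of this design is that the model can no longer tell a stored hash from a plain password supplied as bytes, so plain passwords should always be passed as str.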

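1 Answer

Dunes · 1 year ago

The problem is that, by default, field_validator creates "after" validators. That is, they run after Pydantic's own internal validation. Pydantic knows that you want bytes and that it has been passed a str. It knows how to convert between str and bytes, so it encodes the value automatically before passing the result to your validator.

If you add mode='before', your validator runs first. However, it must then be able to accept any input, and it must not crash if it is given an int, a list, or anything else.

For example: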
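A small way to observe this ordering is to record what a default ("after") validator actually receives. The Demo model and the received list below are hypothetical names used only for illustration.

```python
from pydantic import BaseModel, field_validator

received = []  # hypothetical helper: collects what the validator is given


class Demo(BaseModel):
    value: bytes

    @field_validator("value")  # no mode given, so this is an "after" validator
    @classmethod
    def record_input(cls, v):
        received.append(v)
        return v


# A str is passed in, but pydantic converts it to bytes first,
# so the validator sees bytes rather than str.
Demo(value="secret")
print(received)
```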

@field_validator("hash_value", mode="before")
@classmethod
def set_password(cls, plain_password: str | bytes) -> bytes:
    if isinstance(plain_password, bytes):
        try:
            plain_password = plain_password.decode("utf8")
        except UnicodeDecodeError as ex:
            # not strictly necessary as UnicodeDecodeError subclasses ValueError
            # but shows that you must handle possible errors and raise ValueErrors
            # when an input is invalid. And how to provide a supplementary
            # error message.
            raise ValueError("password is not a valid utf8 byte-string") from ex
    elif not isinstance(plain_password, str):
        raise ValueError("password must be a str or bytes")
    return hash_password(plain_password)
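To see how such a validator behaves inside a model, here is a runnable sketch with a few inputs. It assumes a PBKDF2-based stand-in for hash_password so it runs without bcrypt; the validator body follows the code above, slightly condensed.

```python
import hashlib
import os

from pydantic import BaseModel, ValidationError, field_validator


def hash_password(password: str) -> bytes:
    # Stand-in for the bcrypt-based hash_password in the question (assumption).
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt + digest


class Password(BaseModel):
    hash_value: bytes

    @field_validator("hash_value", mode="before")
    @classmethod
    def set_password(cls, plain_password: object) -> bytes:
        if isinstance(plain_password, bytes):
            try:
                plain_password = plain_password.decode("utf8")
            except UnicodeDecodeError as ex:
                raise ValueError("password is not a valid utf8 byte-string") from ex
        elif not isinstance(plain_password, str):
            raise ValueError("password must be a str or bytes")
        return hash_password(plain_password)


# A str reaches the validator unchanged and is hashed.
doc = Password(hash_value="my_plain_password")

# Invalid inputs surface as pydantic ValidationErrors instead of crashing.
errors = []
for bad_input in (12345, ["a", "b"], b"\xff"):
    try:
        Password(hash_value=bad_input)
    except ValidationError as exc:
        errors.append(exc.error_count())
print(len(errors))
```

Raising ValueError inside the validator is what lets Pydantic report the problem as a normal validation error on the hash_value field.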