# Python Usage

## Installation

```bash
poetry add ska-ser-skuid
```

If you are using [Pydantic](https://docs.pydantic.dev/) in your project and want to include
SKUID types directly in your models:

```bash
poetry add "ska-ser-skuid[pydantic]"
```

---

## Creating a SKUID

{func}`mint_skuid <ska_ser_skuid.mint_skuid>` is the primary function for generating
new identifiers.

```python
from ska_ser_skuid import EntityType, mint_skuid

skuid = mint_skuid(EntityType.SBD)
print(skuid)  # e.g.  sbd-6txs9jhxnk7
```

### Long form

You can pass `form="long"` (or the `Form.LONG` enum value) to get the four-part long form,
which exposes the generator ID and creation date as human-readable fields:

```python
from ska_ser_skuid import EntityType, mint_skuid

skuid = mint_skuid(EntityType.EB, form="long")
print(skuid)  # e.g.  eb-986-20260218-6txs9jhxnk7
```

Both forms carry identical information. A short SKUID can always be converted to
long form (see [Conversion functions](#conversion-functions) below) and vice versa.

---

## Setting a specific Generator ID

The generator ID is a 10-bit component of the SKUID that reduces the chance of collisions between separate instances producing IDs in the same millisecond. By default, it is derived from a uniform hash of `gethostname()`. In some situations, however, you may want to set the generator ID explicitly.

For example, if you know you will always have a fixed number of shards creating IDs, you can pass a generator ID when calling `mint_skuid` to ensure that your shards always produce identifiers distinct from one another. The value of `generator_id` can be any integer from 0 to 1023, inclusive.

```python
from ska_ser_skuid import mint_skuid

shard_id = 7
mint_skuid("txn", generator_id=shard_id, form="long")
# e.g. txn-7-20260225-7zdy52y0enp
```
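
When the shard number comes from deployment configuration, it can be worth checking it against the 0 to 1023 range before minting. The following is a minimal sketch, assuming a hypothetical `SHARD_ID` environment variable; the variable name and the helper are illustrative and not part of `ska_ser_skuid`.

```python
import os

MAX_GENERATOR_ID = 1023  # largest value that fits in 10 bits


def generator_id_from_env(environ=os.environ, var="SHARD_ID"):
    """Read a generator ID from a mapping and check its range.

    `SHARD_ID` is a hypothetical variable name used for illustration.
    """
    value = int(environ[var])
    if not 0 <= value <= MAX_GENERATOR_ID:
        raise ValueError(f"{var}={value} is outside 0..{MAX_GENERATOR_ID}")
    return value


# Example with an explicit mapping instead of the real environment:
shard_id = generator_id_from_env({"SHARD_ID": "7"})
print(shard_id)  # 7
```

The returned value can then be passed as `generator_id=` to `mint_skuid`.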

If you want to partition your IDs according to some non-integer parameter (say, a user ID), you can hash it to a 10-bit integer using {func}`make_generator_id <ska_ser_skuid.make_generator_id>`.

```python
from ska_ser_skuid import make_generator_id, mint_skuid

user_id = "user777"
mint_skuid("prj", generator_id=make_generator_id(user_id.encode()))
# e.g. prj-7zfyy31jdab
```
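
For intuition, reducing arbitrary bytes to a 10-bit value can be sketched with the standard library alone. This is an illustration only: the hash function used here (SHA-256) and the helper name `ten_bit_hash` are assumptions, and the values it produces will not necessarily match those of `make_generator_id`.

```python
import hashlib


def ten_bit_hash(data: bytes) -> int:
    """Map arbitrary bytes to an integer in 0..1023 (illustrative only)."""
    digest = hashlib.sha256(data).digest()
    # Take the first two bytes (16 bits) and keep only the low 10 bits.
    return int.from_bytes(digest[:2], "big") & 0x3FF


gid = ten_bit_hash(b"user777")
assert 0 <= gid <= 1023
# The mapping is deterministic, so the same input yields the same ID.
assert gid == ten_bit_hash(b"user777")
```

With only 1024 possible values, distinct inputs can map to the same generator ID, so hashing lowers the chance of collisions rather than giving each input its own ID.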

---

## Generating Scan IDs

Use {func}`get_scan_id <ska_ser_skuid.get_scan_id>` to generate 48-bit integer
scan IDs. Scan IDs are plain integers rather than SKUIDs [because SPEAD requires integers](https://confluence.skatelescope.org/display/SWSI/Scan+ID+-+number+of+bits).

Use {data}`LOW_SCAN <ska_ser_skuid.LOW_SCAN>` or
{data}`MID_SCAN <ska_ser_skuid.MID_SCAN>` to avoid collisions between
scan IDs on the two telescopes.

```python
from ska_ser_skuid import LOW_SCAN, MID_SCAN, get_scan_id

scan_id_low = get_scan_id(LOW_SCAN)
scan_id_mid = get_scan_id(MID_SCAN)

print(scan_id_low % 2)  # 0
print(scan_id_mid % 2)  # 1
```
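
Going by the parity shown in the example above, a consumer can tell which telescope a scan ID belongs to without extra metadata. The following is a minimal sketch that assumes this parity convention holds; `telescope_for_scan_id` is a hypothetical helper and not part of `ska_ser_skuid`.

```python
def telescope_for_scan_id(scan_id: int) -> str:
    """Return "low" for even scan IDs and "mid" for odd ones."""
    # Assumption: scan IDs are non-negative and fit in 48 bits.
    if not 0 <= scan_id < 2**48:
        raise ValueError(f"scan ID {scan_id} does not fit in 48 bits")
    return "low" if scan_id % 2 == 0 else "mid"


print(telescope_for_scan_id(123456))  # low
print(telescope_for_scan_id(123457))  # mid
```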

---

## Conversion functions

Three helper functions accept either a SKUID string **or** a `(prefix, snowflake)`
pair and return a transformed representation.

### `short_skuid`

```python
from ska_ser_skuid import short_skuid

# From a long-form SKUID
short_skuid("sbd-986-20260218-6txs9jhxnk7")
# sbd-6txs9jhxnk7

# From a (prefix, integer) pair
short_skuid("sbd", 7702948232484455)
# sbd-6txs9jhxnk7

# Passing a short form SKUID is a no-op
short_skuid("sbd-6txs9jhxnk7")
# sbd-6txs9jhxnk7
```

### `long_skuid`

```python
from ska_ser_skuid import long_skuid

# From a short-form string
long_skuid("sbd-6txs9jhxnk7")
# sbd-986-20260218-6txs9jhxnk7

# From a (prefix, integer) pair
long_skuid("sbd", 7702948232484455)
# sbd-986-20260218-6txs9jhxnk7

# Passing a long form SKUID is a no-op
long_skuid("sbd-986-20260218-6txs9jhxnk7")
# sbd-986-20260218-6txs9jhxnk7
```

### `int_skuid`

Converts a SKUID to a named tuple with `prefix` and `uid` fields. The integer `uid` is
suitable for storing in e.g. an `int64` database column.

```python
from ska_ser_skuid import int_skuid

result = int_skuid("sbd-6txs9jhxnk7")
print(result.prefix)  # sbd
print(result.uid)  # 7702948232484455
```
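
As a sketch of the storage pattern, the integer form can be written to an integer column and read back unchanged. This example uses an in-memory SQLite database and a hypothetical `blocks` table; the `prefix` and `uid` values are the ones from the example above, hard-coded so that the snippet runs without the library installed.

```python
import sqlite3

# Values as returned by int_skuid("sbd-6txs9jhxnk7") in the example above.
prefix, uid = "sbd", 7702948232484455
assert uid < 2**63  # fits in a signed 64-bit integer column

conn = sqlite3.connect(":memory:")
# `blocks` is a hypothetical table name used for illustration.
conn.execute("CREATE TABLE blocks (prefix TEXT NOT NULL, uid INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO blocks (prefix, uid) VALUES (?, ?)", (prefix, uid))

row = conn.execute("SELECT prefix, uid FROM blocks").fetchone()
print(row)  # ('sbd', 7702948232484455)
conn.close()
```

The retrieved pair can be passed to `short_skuid(prefix, uid)` or `long_skuid(prefix, uid)` to rebuild the string form.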

### Backwards compatibility

Prior to ADR-129, the SKUID specification called for a four-component SKUID consisting of a prefix, generator ID, date field, and suffix for uniqueness. The long form representation of the current SKUIDs is designed to be backwards-compatible with this format.

If you have code that expects to work with pre-ADR-129 SKUIDs, you can call `long_skuid()` on every ID you receive: it converts short-form SKUIDs to long form and passes long-form ones through unchanged.

```python
from ska_ser_skuid import long_skuid

api_post1 = {"id": "txn-7zjs6gbyfbp", "data": {}}
api_post2 = {"id": "txn-999-20260225-7zjs6gbyfbp", "data": {}}


def to_internal(api_data):
    return dict(api_data, id=long_skuid(api_data["id"]))


assert to_internal(api_post1) == to_internal(api_post2)
```

If you need to accept noncompliant IDs, for example older values with non-digit generator ID components, call `long_skuid` with `enforce_validation=False`. It still converts short SKUIDs to the long format, but it silently passes legacy IDs through instead of raising an error.

```python
from ska_ser_skuid import long_skuid

noncompliant_id = "sbd-legacy-20260107-123"
long_skuid(noncompliant_id, enforce_validation=False)
# 'sbd-legacy-20260107-123'
```

---

## Pydantic Integration

If your project uses [Pydantic](https://docs.pydantic.dev/), you can type your ID fields with `ShortSkuid` or `LongSkuid` to automatically
validate input data and normalise it to your preferred SKUID format.

If you have a field that can represent any type of SKUID (for example in a polymorphic status model)
you can use these types directly in your models. In this example, the model will accept either short or
long SKUIDs with any entity type prefix and normalise them to the short form.

```python
from pydantic import BaseModel

from ska_ser_skuid import ShortSkuid


class Status(BaseModel):
    entity_id: ShortSkuid
    status: str = "NEW"


s1 = Status(entity_id="sbd-8nv7sbhxm3e")
s2 = Status(entity_id="sbd-986-20260301-8nv7sbhxm3e")
assert s1 == s2
```

On the other hand, if you have fields that always represent SKUIDs for a particular entity type, you can use the type-specific variant of `ShortSkuid` or `LongSkuid`, e.g. `ShortSkuid[Literal[EntityType.SBD]]`. These normalise inputs in the same way as the generic types, but they also validate that each input SKUID has the matching entity type prefix. They additionally accept bare integers as input (for example, snowflake integers loaded from `bigint` columns in a database), because the prefix to apply is already known.

```python
from contextlib import suppress
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

from ska_ser_skuid import EntityType, ShortSkuid, mint_skuid


class Transaction(BaseModel):
    transaction_id: ShortSkuid[Literal[EntityType.TXN]] = Field(
        default_factory=lambda: mint_skuid(EntityType.TXN)
    )


transaction = Transaction()
print(transaction.transaction_id)  # e.g. txn-6txs9jhxnk7

with suppress(ValidationError):  # Raises for mismatched prefix.
    Transaction(transaction_id="sbd-6txs9jhxnk7")

# Accepts an integer and adds the known prefix:
txn_from_int = Transaction(transaction_id=1234)
print(txn_from_int.transaction_id)  # txn-16j
```

---

## Validation

The conversion utilities raise an `InvalidSkuidError` for invalid input unless you set `enforce_validation=False`, but in some situations you may want to check explicitly whether a SKUID is valid before accepting it. For that, use `is_valid_skuid`:

```python
from ska_ser_skuid import is_valid_skuid

is_valid_skuid("sbd-6txs9jhxnk7")  # True
is_valid_skuid("sbd-986-20260218-6txs9jhxnk7")  # True
is_valid_skuid("not-a-skuid")  # False
# False (invalid uppercase characters)
is_valid_skuid("sbd-986-20260218-ZZZZZZZZ")
```
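
One way to use such a check is to split a batch of incoming IDs into accepted and rejected lists before processing. In this sketch the validator is passed in as an argument so the helper stays independent of the library. Both `partition_ids` and the stand-in validator are hypothetical; in real code you would pass `is_valid_skuid` instead.

```python
from typing import Callable, Iterable


def partition_ids(ids: Iterable[str], is_valid: Callable[[str], bool]):
    """Split IDs into (accepted, rejected) using the given validator."""
    accepted, rejected = [], []
    for candidate in ids:
        (accepted if is_valid(candidate) else rejected).append(candidate)
    return accepted, rejected


# Stand-in validator for demonstration only. It is far looser than
# is_valid_skuid: it merely checks for a lowercase, hyphenated string.
def demo_validator(value: str) -> bool:
    return "-" in value and value == value.lower()


ok, bad = partition_ids(["sbd-6txs9jhxnk7", "NOT-A-SKUID"], demo_validator)
print(ok)  # ['sbd-6txs9jhxnk7']
print(bad)  # ['NOT-A-SKUID']
```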

### Parsing (advanced)

{class}`SnowflakeSkuid.parse <ska_ser_skuid.skuid.SnowflakeSkuid>` gives access to
the individual components of a SKUID. Treat SKUIDs as opaque identifiers wherever
possible; parse only when you genuinely need the constituent fields.

```python
from ska_ser_skuid.skuid import SnowflakeSkuid

sk = SnowflakeSkuid.parse("sbd-986-20260218-6txs9jhxnk7")
print(sk.entity_type)  # EntityType.SBD
print(sk.generator_id)  # 986
print(sk.timestamp_ms)  # Unix epoch milliseconds
print(sk.datetime)  # datetime(2026, 2, 18, ..., tzinfo=timezone.utc)
print(sk.snowflake_id)  # 7702948232484455
```