Incorrect padding in Market-1501 filenames on export #99

Open
@archibald1418

Description

According to the CVAT docs and the Datumaro docs, when exporting to the Market-1501 format, Datumaro should save annotations to files that follow a specific naming convention:

images_<any_subset_name>.txt

query/image_name_1.jpg
bounding_box_<any_subset_name>/image_name_2.jpg
bounding_box_<any_subset_name>/image_name_3.jpg

image_name = 0001_c1s1_000015_00.jpg

0001 - person id
c1 - camera id (there are 6 cameras in total)
s1 - sequence
000015 - frame number in sequence
00 - bounding box index (00 means this bounding box is the first of several)

However, person_id is not zero-padded on export.

Solution: format the person id with zero-padding, e.g. f'{pid:04d}'
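
For illustration, here is a minimal sketch of how a full file name could be assembled with zero-padded fields, following the pattern described above. The helper name `market1501_name` and its parameters are hypothetical and are not taken from the Datumaro code base; the field widths are inferred from the example name `0001_c1s1_000015_00.jpg`.

```python
def market1501_name(pid: int, camera: int, sequence: int,
                    frame: int, bbox: int = 0) -> str:
    """Hypothetical helper: build a Market-1501 style image name."""
    # Person id: 4 digits, frame number: 6 digits, bbox index: 2 digits.
    # Camera and sequence ids are left unpadded, as in the example name.
    return f"{pid:04d}_c{camera}s{sequence}_{frame:06d}_{bbox:02d}.jpg"


example = market1501_name(pid=1, camera=1, sequence=1, frame=15)
```

With these inputs, `example` matches the name used in the description above.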

Notes to consider

  • Fixing this in CVAT could lead to ValueError: Unknown format code 'd' for object of type 'float'. It seems that values of label attributes with {"input_type":"number"} turn into floats inside DatasetItem.attributes; this needs investigation. The likely culprit is Datumaro, because the float values appear after conversion operations during export, but I couldn't find exactly where.

  • The CVAT engine/test_rest_api.py import-export unit tests that use POST api/tasks/{id}/data (e.g. these ones) crash because they assert that frame filenames stay the same on import, which is not the case here. Solution for now: skip the Market-1501 format altogether, since it's the only format that has strict naming conventions.
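
The float problem from the first note can be reproduced in plain Python, and one possible workaround is to coerce the value to an integer before formatting. This is only a sketch: `pad_person_id` is a hypothetical helper, and whether such a coercion belongs in CVAT or in Datumaro is still an open question.

```python
def pad_person_id(pid, width=4):
    """Hypothetical helper: zero-pad a person id that may arrive as a float."""
    number = float(pid)
    if not number.is_integer():
        # Refuse to silently truncate ids such as 1.5.
        raise ValueError(f"person id must be a whole number, got {pid!r}")
    return f"{int(number):0{width}d}"


# Formatting a float with the 'd' code reproduces the error quoted above.
value = 1.0
try:
    _ = f"{value:04d}"
    message = ""
except ValueError as error:
    message = str(error)

padded = pad_person_id(value)
```

Rejecting non-integral values instead of rounding them is a deliberate choice in this sketch, since a silently truncated id would map two different people to the same file name prefix.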

Labels: bug
