Utility for managing metadata.
MetadataMagic
MetadataMagic is a command line utility for managing media files and their metadata. This includes cleaning up the naming of .json metadata as downloaded by applications such as gallery-dl and yt-dlp, as well as packaging media and metadata together into archive formats such as .cbz, .epub, and .mkv.
Installation
MetadataMagic can be downloaded from its PyPI package using pip:
pip3 install Metadata-Magic
If you are installing from source, the following Python packages are required:
Configuration
Configuration files for Metadata-Magic are stored in JSON format. Documentation for all the available configuration options can be found at /docs/configuration.md
A default config file can be found at /docs/metadata-magic.json
Metadata-Magic looks for config files in the following locations:
WINDOWS:
%APPDATA%\metadata-magic\config.json
%USERPROFILE%\metadata-magic\config.json
%USERPROFILE%\metadata-magic.json
(%USERPROFILE% usually refers to a user's home directory, e.g. C:\Users\<username>)
LINUX/MAC/UNIX-BASED:
/etc/metadata-magic.json
${HOME}/.config/metadata-magic/config.json
${HOME}/.metadata-magic.json
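As a rough illustration of the lookup described above, the following Python sketch builds the documented paths for a platform and returns the first one that exists. The function names are hypothetical, and checking the paths in the order they are listed is an assumption; the source does not state which location takes priority.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def candidate_config_paths(platform: str, env: Mapping[str, str]) -> list[Path]:
    """Return the documented config locations for the given platform."""
    if platform.startswith("win"):
        appdata = Path(env.get("APPDATA", ""))
        profile = Path(env.get("USERPROFILE", ""))
        return [
            appdata / "metadata-magic" / "config.json",
            profile / "metadata-magic" / "config.json",
            profile / "metadata-magic.json",
        ]
    home = Path(env.get("HOME", ""))
    return [
        Path("/etc/metadata-magic.json"),
        home / ".config" / "metadata-magic" / "config.json",
        home / ".metadata-magic.json",
    ]


def find_config(
    platform: str = sys.platform, env: Mapping[str, str] = os.environ
) -> Optional[Path]:
    """Return the first existing config file, or None if none is found.

    Assumption: earlier entries in the list win over later ones.
    """
    for path in candidate_config_paths(platform, env):
        if path.is_file():
            return path
    return None
```

Calling find_config() with no arguments uses the current platform and environment.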
Scripts
All scripts contain a [directory] field, which tells the script which directory to search. If left empty, [directory] defaults to the current working directory.
mm-archive
The mm-archive command archives the media files and .json metadata in a directory into a single media archive. Currently, mm-archive supports the .cbz, .epub, and .mkv formats. If the given directory contains .txt or .html files, the text files will be converted and archived into an .epub ebook file. If the given directory contains only .png, .jpeg/.jpg, or .gif image files and no text, the files will be archived into a .cbz comic book archive file. If the given directory contains only video files (.mkv, .webm, .mp4, .m4v, .avi), the video will be repackaged into a .mkv file with metadata.
mm-archive [directory] [OPTIONS]
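The format selection described above can be sketched in Python as follows. This is only an illustration of the stated rules, not MetadataMagic's actual code: choose_archive_format is a hypothetical name, and ignoring .json files when deciding is an assumption.

```python
from pathlib import Path
from typing import Iterable, Optional

TEXT_EXTS = {".txt", ".html"}
IMAGE_EXTS = {".png", ".jpeg", ".jpg", ".gif"}
VIDEO_EXTS = {".mkv", ".webm", ".mp4", ".m4v", ".avi"}


def choose_archive_format(filenames: Iterable[str]) -> Optional[str]:
    """Pick "epub", "cbz", or "mkv" from the file types in a directory."""
    exts = {Path(name).suffix.lower() for name in filenames}
    exts.discard(".json")  # assumption: metadata files do not affect the choice
    if exts & TEXT_EXTS:
        return "epub"  # any text file means an ebook
    if exts and exts <= IMAGE_EXTS:
        return "cbz"  # only images, no text
    if exts and exts <= VIDEO_EXTS:
        return "mkv"  # only video files
    return None  # nothing recognizable to archive
```

For example, a directory holding ch1.txt, cover.png, and ch1.json would map to "epub", since text is present alongside the image.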
Metadata for the archive will be pulled from existing .json metadata contained in the given directory, and you will be prompted for any metadata fields that could not be found in the existing .json files. Additionally, you can choose to override the automatic metadata generation for any field and enter metadata fields manually using the following options.
-s, --summary Manually input the media's summary/description
-d, --date Manually input the media's publication date
-a, --artists Manually input the artist(s), cover artist(s), and writer(s)
-p, --publisher Manually input the media's publisher
-u, --url Manually input the media's source URL
-t, --tags Manually input the media tags
-r, --rating Manually input the age rating for the media
-g, --grade Manually input the grade/user score
Finally, you can add the -x, --xxxxx option, which will delete the original media files once the archived version is created.
A note on generated MKVs
Besides title and sometimes creation date, there isn't really a formal or even community standard for metadata in the .mkv video container format. So for metadata, MetadataMagic simply attaches a .xml file using the ComicInfo.xml format: the same metadata format used for .cbz files. The original .json metadata file corresponding to the video is also added as an attachment, and will be untouched by other functions of MetadataMagic. All other functions in MetadataMagic will read and edit the included VideoInfo.xml file embedded in the .mkv when doing manipulations.
I'm aware this isn't ideal, since no other programs are configured to be able to read this metadata, and are unlikely to be able to read it in the future. But there is no metadata format for video that is widely supported by video players to my knowledge, so there isn't really a better option. At the very least, video players will recognize the title metadata.
Options specific to .epub
When creating an .epub ebook out of text files, you will be prompted with the option to edit how each of the original files will be structured within the new ebook. By default, each of the original files will be treated as its own section with an entry in the table of contents, with a title based on the original filename. You can then choose to alter this structure for your own needs:
ENTRY    TITLE           FILES
-----------------------------------------------
001      cover           [01] cover.html
002      contents        [02] contents.html
003      ch1-1           [03] ch1-1.txt
004      illustration    [04] illustration.png
005      ch1-2           [05] ch1-2.txt
006      ch2             [06] ch2.txt
* - Not Included in Table of Contents
OPTIONS:
T - Edit Chapter Title
C - Toggle Inclusion in Contents
G - Group Files
S - Separate Files
W - Commit Changes
Firstly, you can edit the title of any given entry, which will be the title used for that entry in the table of contents.
You can also toggle whether to include an entry in the table of contents. Any entry with this flag will still be included in the ebook, but it will not be listed as a separate entry in the contents.
Finally, you can choose to group entries together so they will be included as one entry in the final ebook. This is useful if, for example, you want to include an illustration at the start of a chapter but don't want the image to be treated as a separate page.
After altering the structure, it may look like this:
ENTRY    TITLE        FILES
----------------------------------------------------------------------------
001      Cover        [01] cover.html
002*     Contents     [02] contents.html
003      Chapter 1    [03] ch1-1.txt, [04] illustration.png, [05] ch1-2.txt
004      Chapter 2    [06] ch2.txt
* - Not Included in Table of Contents
OPTIONS:
T - Edit Chapter Title
C - Toggle Inclusion in Contents
G - Group Files
S - Separate Files
W - Commit Changes
You will also be prompted to generate a cover image:
Generate Cover Image? (Y/N):
If you choose YES, a generic cover image will be generated for the ebook including the book's title and author, based on the metadata.
mm-bulk-archive
The mm-bulk-archive command is used to bulk archive or extract a large number of files.
Bulk Archiving
mm-bulk-archive [directory] [--format-titles] [--description-length LENGTH]
This will archive every eligible file in a given directory into .cbz comic archives for images and .epub ebooks for text, replacing the original files. Files will only be archived if they have a corresponding .json metadata file, and that metadata will be used for the metadata of the newly created archives. Each individual text and image file will be turned into its own archive file.
Image files whose accompanying metadata has an excessively long description will be assumed to have a story in the description, and will be converted into .epub ebook files with the image as the cover and the description as the text. You can use the --description-length option to specify how long a description can be before it is used as .epub text. The default is 1000 characters.
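A minimal Python sketch of this threshold rule is shown below. The function name, the "description" key, and the strict greater-than comparison are all assumptions made for illustration; only the 1000-character default comes from the text above.

```python
DEFAULT_DESCRIPTION_LENGTH = 1000  # documented default, in characters


def bulk_format_for_image(
    metadata: dict, description_length: int = DEFAULT_DESCRIPTION_LENGTH
) -> str:
    """Decide whether an image should become a .cbz or an .epub.

    Assumptions: the description lives under the "description" key, and a
    description must be strictly longer than the limit to count as a story.
    """
    description = metadata.get("description") or ""
    return "epub" if len(description) > description_length else "cbz"
```

Passing a smaller description_length makes more images qualify as ebooks, which mirrors the effect of the --description-length option.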
If the --format-titles option is included, the titles of the archives will be automatically formatted to remove page number references and use proper capitalization.
NOTE: Video files will NOT be automatically formatted to .mkv files. While the conversion process used by the mm-archive command copies the video and audio streams exactly so there is no loss of quality, it does remux the video into a new container format in a way that is not totally reversible. My goal for this project is to pack media into new formats in ways that are convenient, but that are also non-destructive, allowing the user to still have the exact originals of the media and metadata. That is unfortunately impossible for video, so I've elected to only allow packaging it on an individual basis, ensuring no media is accidentally destroyed.
Bulk Extracting
mm-bulk-archive --extract [directory]
This will read every .cbz and .epub archive file in the given directory and extract the files within, replacing the archives. You will be prompted on whether to preserve the structure of the archive:
Remove archive structure? (Y/[N])
If you choose to remove the structure, only the original files used to create the archive will be extracted (images/text and .json metadata), and all other files and folders will be discarded. The original files will be extracted directly into the given directory with no containing folder.
If you choose NOT to remove the structure, all the files of the archive will be extracted, including metadata and structural files for the .cbz and .epub formats. The files for each archive will be contained in their own folders, extracted into the given directory.
.mkv files will also be "extracted", with MetadataMagic restoring the original .json metadata stored within to an external file, if present. However, as stated before, the exact original file cannot be recreated, so the .mkv files will remain and not be replaced by the original .mp4, .webm, etc.
mm-update
mm-update [path] [--cover]
The mm-update command allows you to update the metadata fields of .cbz and .epub archives. If you enter a directory as the path, every media archive in that directory and its subdirectories will be updated with the new metadata. If you instead enter the path of a specific .cbz or .epub file, only that single archive will be updated. You will be prompted to give metadata for several fields, each of which can either be filled in or left blank. The archive metadata for any field left blank will not be altered, while fields you responded to will be updated to match your response.
If the --cover option is added, any auto-generated cover images for the archive(s) will be regenerated. Manually created or existing cover images that weren't auto-generated by MetadataMagic will not be affected.
mm-series
mm-series [directory]
The mm-series command allows you to add metadata to .cbz, .epub, and .mkv archives indicating their place in a book/comic series. All the files in the given directory should be part of the same series and in alpha-numerical order. You will then be prompted to give a series title and edit the placement in the series for each of the archive files:
ENTRY    FILE
----------------------------------------
1.0      [01A] Example Vol 1.cbz
2.0      [01B] Example Special Issue.cbz
3.0      [02] Example Vol 2.epub
[E] Edit
[R] Reset
[W] Write Series Info
[Q] Quit Without Saving
By default, archives will be given series entry numbers starting at 1.0 and incrementing by 1 with each entry. If any of the archives already contain series information, that existing order takes priority, but you can reset the entries to alpha-numerical order.
You can edit the series number for any entry, including using decimals, which is useful for special issues and side-entries that don't fit the normal mainline numbering of a series. When you edit an entry, all subsequent entries will automatically be altered to come after the edited entry.
After altering the series numbering, it may look like this:
ENTRY    FILE
----------------------------------------
1.0      [01A] Example Vol 1.cbz
1.5      [01B] Example Special Issue.cbz
2.0      [02] Example Vol 2.epub
[E] Edit
[R] Reset
[W] Write Series Info
[Q] Quit Without Saving
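The numbering behaviour can be approximated with the short Python sketch below. The source only says that later entries are moved to come after the edited one; the specific rule used here (each later entry becomes the next whole number after its predecessor) is an assumption, and both function names are hypothetical.

```python
import math


def default_numbers(count: int) -> list[float]:
    """Default numbering: 1.0, 2.0, 3.0, ..."""
    return [float(i) for i in range(1, count + 1)]


def edit_entry(numbers: list[float], index: int, value: float) -> list[float]:
    """Set one entry's number and renumber every entry after it.

    Assumption: each later entry takes the next whole number above the
    entry before it, so the list stays in ascending order.
    """
    result = list(numbers)
    result[index] = float(value)
    for i in range(index + 1, len(result)):
        result[i] = float(math.floor(result[i - 1]) + 1)
    return result
```

Under this rule, changing the second of three default entries to 1.5 leaves the first at 1.0 and moves the third to 2.0.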
Single Series
Some comic and ebook readers don't play well with one-shot comics that contain no metadata about being part of a series. For these readers, it can be useful to mark standalone comics and books instead as being part of a series that only has one entry. For these standalone archives, you can use the --standalone option for the mm-series command:
mm-series --standalone [directory]
With this command the .cbz, .epub, and .mkv files in the given directory will be marked as being entry one of one in their own series, with their series titles being the same as the archive titles.
mm-rename
The mm-rename command allows you to easily bulk rename .cbz, .epub, and .mkv archives, as well as media files with associated .json metadata. For media files with a separate metadata file, the mm-rename command will rename both the media and .json file together so they keep the same filename and continue being associated with each other.
Sort Rename
mm-rename --sort-rename [directory]
The --sort-rename option allows you to rename all the files in a directory to share one template filename, with the only difference being an index number indicating the order in which the files should appear. The files should already be in alpha-numerical order before being renamed.
This is useful for preparing image files (with or without .json metadata) for being archived into a .cbz comic book archive. Files in a .cbz file are ordered alphabetically, and while Metadata-Magic can sort more irregularly formatted filenames alphanumerically, many comic book readers use extremely simple sorting that can order "1" after "10" or get confused if one file contains an extra space character.
When using --sort-rename you will first be prompted for a template:
Sort Rename Template (Default is "Example"):
This is the base template that files will be renamed to. By default, the index number will be appended to the end of the template, but you can also indicate where you want the index number to appear using "#" characters. Additionally, the index number for each file will be padded with zeros to match the number of "#" characters in the template (again, useful for simple .cbz sorting).
For example, a template of "Title! (Page ###)" will result in files "Title! (Page 001).jpg", "Title! (Page 002).jpg", "Title! (Page 003).png", etc.
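The template expansion can be sketched in Python as below. This is an illustrative guess at the behaviour rather than the tool's code: apply_template is a hypothetical name, only the first run of "#" characters is replaced, and the single space used when appending the index to a template without "#" is an assumption.

```python
import re


def apply_template(template: str, index: int) -> str:
    """Insert a zero-padded index into a sort-rename template."""
    match = re.search(r"#+", template)
    if match is None:
        # Assumption: with no "#" placeholder, the index is appended
        # after a single space.
        return f"{template} {index}"
    # Pad the index with zeros to the width of the "#" run.
    padded = str(index).zfill(len(match.group()))
    return template[: match.start()] + padded + template[match.end():]
```

The file extension is left out here; in practice each file would keep its own extension after the generated name.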
Next, you will be prompted for a starting index number:
Starting Index (Default is "1"):
This is just the starting index number that will be used in place of the index number in the template. You will almost always leave this at 1, but it can be useful if you want the file order to start with 0 for example. You CAN start with indexes less than zero, but that isn't recommended, as most programs won't sort negative numbers properly.
Finally, you will be prompted for a file pattern:
File pattern (Default is ".*"):
This asks for a regex pattern describing the files you want to rename. Any filename that contains a string matched by the pattern will be renamed, while any filename that does not will be left unchanged. The regex is applied to the filename only, not the full path, and matching is not case sensitive. By default, all files in the given directory will be selected and renamed.
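The matching rules above correspond closely to a case-insensitive re.search on the bare filename. A small Python sketch, with select_files as a hypothetical helper name:

```python
import re
from pathlib import Path


def select_files(paths, pattern=".*"):
    """Return the paths whose filename contains a match for the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)  # matches are not case sensitive
    # Only the final path component is tested, never the parent directories.
    return [p for p in paths if regex.search(Path(p).name)]
```

For instance, the pattern `\.png$` would select "Page1.PNG" but skip "notes.txt", and a directory named "png" would not cause its contents to match.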
Metadata Rename
mm-rename --metadata-rename [directory]
The --metadata-rename option allows you to bulk rename files in a given directory based on their metadata. This can be either embedded metadata as in .cbz, .epub, or .mkv archives, or associated metadata in a separate .json metadata file. Any files that do not have embedded or associated metadata will be ignored. When running this command, you will be prompted on what template to use:
Rename in the format "[options] title"
A - Artist
D - Date
Options or "C" for Custom:
You are given the option of using the default template which names files based on the "title" field in the metadata as well as the "artist" and/or "date" fields in the metadata. Inputting nothing will just name files based on the title, while inputting "AD" would include the artist and date as well, resulting in filenames like "[2020-01-01 - Person] Example.epub", "[2020-01-02 - Person] Title.cbz", etc.
Additionally, you can also input "C" to use a custom template, and you will be prompted to enter one:
Include Metadata Fields in format "{key}"
Template:
With this, you can format the filenames any way you wish, using any metadata fields you wish. Keys for metadata fields you wish to include are put between "{}", and all other characters will be used as is. For example, if you wanted to name files based on their title and publisher, you could use the template "<{publisher}>_{title}", resulting in filenames like "<BookCo>_Example.epub", "<ArtCo>_Title.cbz", etc.
If the metadata for a file does not contain metadata for one of the fields requested in the template, that file will be ignored and not renamed.
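A hedged Python sketch of this substitution follows. The name render_name is hypothetical, and treating an empty field the same as a missing one is an assumption; the skip-on-missing behaviour itself comes from the text above.

```python
import re
from typing import Mapping, Optional

FIELD = re.compile(r"\{(\w+)\}")  # matches placeholders such as {title}


def render_name(template: str, metadata: Mapping[str, str]) -> Optional[str]:
    """Fill a "{key}" template from metadata, or return None to skip the file."""
    keys = FIELD.findall(template)
    # Assumption: an empty value counts as missing, so the file is ignored.
    if any(not metadata.get(key) for key in keys):
        return None
    return FIELD.sub(lambda m: str(metadata[m.group(1)]), template)
```

With the template "<{publisher}>_{title}" and metadata containing publisher "BookCo" and title "Example", this yields "<BookCo>_Example"; the original file extension would then be added back.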
mm-error
The mm-error command allows you to search for errors and abnormalities with files and their metadata.
Long Description
mm-error [directory] --long-description [LENGTH]
The --long-description option allows you to search a given directory for archive and metadata files with excessively long descriptions. This is especially useful for finding files that contain entire stories in the description to accompany the associated image, as these are probably better formatted as an ebook with images.
You can optionally specify the cutoff point at which descriptions will be considered too long with the LENGTH option. LENGTH is determined by number of characters, and the default value is 1000 characters if no LENGTH option is given.
Missing Media
mm-error --missing-media [directory]
The --missing-media option allows you to search a given directory and its subdirectories for .json metadata files that don't match any associated media, and lists the orphaned .json files in the terminal. This is useful for finding cases where the media and metadata have been separated or the media file failed to download in the first place.
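The search can be illustrated with the Python sketch below. It assumes that a media file and its metadata share the same base filename in the same folder (for example "image.png" and "image.json"), which fits the pairing described for mm-rename but is not spelled out for this command; find_orphaned_json is a hypothetical name.

```python
from pathlib import Path


def find_orphaned_json(directory):
    """List .json files with no same-named media file beside them."""
    orphans = []
    for json_file in sorted(Path(directory).rglob("*.json")):
        # Assumption: media and metadata are paired by identical stems
        # within the same folder.
        has_media = any(
            p.is_file() and p != json_file and p.stem == json_file.stem
            for p in json_file.parent.iterdir()
        )
        if not has_media:
            orphans.append(json_file)
    return orphans
```

Reversing the check (media files with no matching .json) gives the general idea behind --missing-json, minus its special handling of directories that contain no .json files at all.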
Missing JSON
mm-error --missing-json [directory]
The --missing-json option does the opposite of the --missing-media option, searching for media files that do not have an associated .json metadata file. This will NOT list media files in directories that contain no .json files at all, as the command assumes such a directory contains files with embedded metadata or no metadata to begin with. Media files with missing .json metadata are only listed in directories where the other files DO have .json metadata.
Missing Fields
mm-error --missing-fields [directory]
The --missing-fields option allows you to search for .cbz, .epub, and .mkv archives that do not contain information for a specific metadata field. When you run this command, you will be prompted for which metadata field to search for:
[T] Missing title
[A] Missing artist
[D] Missing date
[S] Missing summary
[P] Missing publisher
[U] Missing URL
[R] Missing Age Rating
[G] Missing Grade/Score
[L] Missing Labels/Tags
[C] Missing Chain/Series
Which Missing Metadata Field?:
After your response, the command will list every media archive missing information for the particular field in question. This is useful for checking for fields you may have forgotten to add manually or checking which archives you haven't reviewed and scored yet, for example.