BO4E Migration Framework (bomf)

BOMF is the BO4E Migration Framework. This repository contains the code of the Python package bomf.

Rationale

bomf is a framework that allows its users to migrate data

  • from source systems (starting with the raw data extracts)
  • into an intermediate, common BO4E-based data layer,
  • from there to the individual target system data models (mapping),
  • and finally into the target systems by creating records there (aka "loading").
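
The four stages above can be sketched as plain Python functions. This is only an illustration under stated assumptions: every name below (`read_extract`, `to_intermediate`, `to_target`, `load`) and all record fields are hypothetical and not part of the bomf API, and a simple dict stands in for real BO4E objects.

```python
import csv
import io

# Hypothetical raw extract of a source system (assumption: a ";"-separated CSV).
RAW_EXTRACT = "customer_id;name\n17;Erika Mustermann\n18;Max Mustermann\n"


def read_extract(raw: str) -> list[dict[str, str]]:
    """Stage 1: parse the raw data extract of the source system."""
    return list(csv.DictReader(io.StringIO(raw), delimiter=";"))


def to_intermediate(source_record: dict[str, str]) -> dict[str, str]:
    """Stage 2: map a source record to the common intermediate layer."""
    return {"id": source_record["customer_id"], "name": source_record["name"]}


def to_target(intermediate: dict[str, str]) -> dict[str, str]:
    """Stage 3: map the intermediate record to a target data model."""
    return {
        "ExternalId": f"MIG-{intermediate['id']}",
        "DisplayName": intermediate["name"].upper(),
    }


def load(target_record: dict[str, str], target_system: list[dict[str, str]]) -> None:
    """Stage 4: create the record in the target system (here: a plain list)."""
    target_system.append(target_record)


target_system: list[dict[str, str]] = []
for source_record in read_extract(RAW_EXTRACT):
    load(to_target(to_intermediate(source_record)), target_system)
```

Because each stage only depends on the output of the previous one, a stage can be replaced (for example, a different target mapping) without touching the others.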

The framework

  • encourages users to build consistent data processing pipelines from any source to any target system,
  • requires users to follow structured and consistent patterns,
  • and by doing so makes maintainable and reusable code more likely.

Architecture / Overview

The overall setup for a migration from 1..n source systems (A, B, C, ...) to 1..m target systems (1, 2, 3, ...) might look like this:

graph TD
    A[Source System A] -->|System A DB Dump| A2[Source A Data Model: A JSON Extract]
    B[Source System B] -->|System B CSV Export| B2[Source B Data Model: B CSV Files]
    A2 -->|SourceAToBo4eDataSetMapper| C{Intermediate BO4E Layer aka DataSets}
    B2 -->|SourceBToBo4eDataSetMapper| C
    C -->|validations| C
    C -->|Bo4eDataSetToTarget1Mapper| D1[Target 1 Data Model]
    C -->|Bo4eDataSetToTarget2Mapper| D2[Target 2 Data Model]
    C -->|Bo4eDataSetToTarget3Mapper| D3[Target 3 Data Model]
    D1 -->L1[Target 1 Loader]
    D2 -->L2[Target 2 Loader]
    D3 -->L3[Target 3 Loader]
    L1 -->M1[Target System 1]
    L2 -->M2[Target System 2]
    L3 -->M3[Target System 3]

The Intermediate BO4E Layer (which consists of different so-called DataSets) acts as a contract between the code that maps from the source data model and the code that maps to the target data model.
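
A minimal sketch of that contract idea, assuming nothing about the real bomf classes: `CustomerDataSet`, the two `Protocol` classes, and the mappers below are hypothetical names chosen for illustration. The point is that source-side and target-side code share only the DataSet type.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CustomerDataSet:
    """Hypothetical DataSet: the only type both mapping sides know about."""

    customer_id: str
    name: str


class SourceToDataSetMapper(Protocol):
    def to_data_set(self, source_record: dict) -> CustomerDataSet: ...


class DataSetToTargetMapper(Protocol):
    def to_target(self, data_set: CustomerDataSet) -> dict: ...


class SourceAMapper:
    """Maps the (made-up) JSON-like model of source system A."""

    def to_data_set(self, source_record: dict) -> CustomerDataSet:
        return CustomerDataSet(
            customer_id=str(source_record["id"]), name=source_record["full_name"]
        )


class SourceBMapper:
    """Maps the (made-up) CSV-like model of source system B."""

    def to_data_set(self, source_record: dict) -> CustomerDataSet:
        full_name = f"{source_record['VORNAME']} {source_record['NACHNAME']}"
        return CustomerDataSet(customer_id=source_record["KUNDENNR"], name=full_name)


class Target1Mapper:
    """Knows the DataSet and the target model, but nothing about the sources."""

    def to_target(self, data_set: CustomerDataSet) -> dict:
        return {"ExternalId": data_set.customer_id, "DisplayName": data_set.name}


def migrate(
    record: dict,
    source_mapper: SourceToDataSetMapper,
    target_mapper: DataSetToTargetMapper,
) -> dict:
    """Chain both mapping sides; they meet only at the DataSet."""
    return target_mapper.to_target(source_mapper.to_data_set(record))


from_a = migrate({"id": 1, "full_name": "Erika Mustermann"}, SourceAMapper(), Target1Mapper())
from_b = migrate(
    {"KUNDENNR": "1", "VORNAME": "Erika", "NACHNAME": "Mustermann"},
    SourceBMapper(),
    Target1Mapper(),
)
```

In this sketch, adding a further source system means writing one new source mapper; the existing target mappers stay unchanged, and vice versa.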

Data Migration Flow

The migration of specific data from source to target always follows the same steps:

graph TD
    A1{Source Data 1} -->|Export| B1(All source data 1 extracts)
    B1 -->C1[Filter on source data 1 model aka Pre-Select 1]
    A2{Source Data 2} -->|Export| B2(All source data 2 extracts)
    B2 -->C2[Filter on source data 2 model aka Pre-Select 2]
    C1 -->|do not match filter predicate| Z{discarded data}
    C1 -->|match filter criteria| M(Custom Logic: SourceDataSetToBo4EDataSetMapper) 
    C2 -->|do not match filter predicate| Z
    C2 -->|match filter criteria| M
    M -->|mapping| E(BO4E Data Sets)
    E -->F[Validation]
    F -->|obeys a validation rule|E
    F -->|violates any validation rule|Z
    F -->|passes all validations| G[BO4E to Target Mapper]
    G -->|mapping| H(target data model)
    H -->I[Target Loader]
    I -->|load target model|L1[Loader: 1. load to target]
    L1 -->|first: load to|T{Target System}
    L1 -->|then|L2[Loader: 2 optionally poll until target has processed data]
    L2 -->|second: poll until|T
    L2 -->|then|L3[Loader: 3 optionally verify the data have been processed correctly]
    L3 -->|finally: verify|T
    L3 -->|verification failed|Z
    L1 -->|loading failed|Z
    L3 -->|verification successful|Y[The End.]
    Z-->Z1[Monitoring and Logging]
    Z1-->Z2[Human Analyst]
    Z2 -.->|manually checks| T
    Z2 -.->|feedback: heuristically define new rules for|F
    Z2 -.->|feedback: heuristically define new filters for|C
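
The flow above can be sketched in a few lines of Python. This is a simplified illustration, not the bomf implementation: `run_migration`, `MigrationResult`, `InMemoryLoader`, and the example filter and validation rule are all hypothetical, dicts stand in for BO4E DataSets, and the target system is an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MigrationResult:
    loaded: list[dict] = field(default_factory=list)
    # Each discarded entry keeps the reason, so it can be logged and analysed.
    discarded: list[tuple[dict, str]] = field(default_factory=list)


class InMemoryLoader:
    """Hypothetical loader with the three steps: load, poll, verify."""

    def __init__(self) -> None:
        self.target_system: dict[str, dict] = {}

    def load(self, target_record: dict) -> bool:
        self.target_system[target_record["ExternalId"]] = target_record
        return True

    def poll(self, target_record: dict) -> None:
        # A real target may process data asynchronously; here it is immediate.
        return None

    def verify(self, target_record: dict) -> bool:
        return self.target_system.get(target_record["ExternalId"]) == target_record


def run_migration(
    source_records: list[dict],
    pre_select: Callable[[dict], bool],
    to_data_set: Callable[[dict], dict],
    validations: list[Callable[[dict], bool]],
    to_target: Callable[[dict], dict],
    loader: InMemoryLoader,
) -> MigrationResult:
    result = MigrationResult()
    for record in source_records:
        if not pre_select(record):
            result.discarded.append((record, "did not match filter"))
            continue
        data_set = to_data_set(record)
        failed = [rule.__name__ for rule in validations if not rule(data_set)]
        if failed:
            result.discarded.append((record, "violated: " + ", ".join(failed)))
            continue
        target_record = to_target(data_set)
        if not loader.load(target_record):
            result.discarded.append((record, "loading failed"))
            continue
        loader.poll(target_record)
        if not loader.verify(target_record):
            result.discarded.append((record, "verification failed"))
            continue
        result.loaded.append(target_record)
    return result


def is_active(record: dict) -> bool:
    """Example pre-select filter on the source data model."""
    return record["status"] == "active"


def has_name(data_set: dict) -> bool:
    """Example validation rule on the intermediate layer."""
    return bool(data_set["name"].strip())


records = [
    {"id": "1", "name": "Erika Mustermann", "status": "active"},
    {"id": "2", "name": "Max Mustermann", "status": "inactive"},
    {"id": "3", "name": " ", "status": "active"},
]
result = run_migration(
    records,
    pre_select=is_active,
    to_data_set=lambda r: {"id": r["id"], "name": r["name"]},
    validations=[has_name],
    to_target=lambda d: {"ExternalId": d["id"], "DisplayName": d["name"]},
    loader=InMemoryLoader(),
)
```

Here the first record is loaded, the second is discarded by the filter, and the third is discarded by the validation. The reasons collected in `result.discarded` are the kind of input a human analyst could use to define new filters or validation rules.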

How to use this Repository on Your Machine (Development)

Please follow the instructions in our Python Template Repository. tl;dr: tox.

Contribute

You are very welcome to contribute to this repository by opening a pull request against the main branch.
