# FromConfig
A library to instantiate any Python object from configuration files.
Thanks to Python Fire, fromconfig acts as a generic command line interface from configuration files with absolutely no change to the code.
Install
pip install fromconfig
Quickstart
fromconfig can configure any Python object, without any change to the code.
As an example, let's consider a foo.py module
class Model:
    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        print(f"Training model with learning_rate {self.learning_rate}")
with the following config files
config.yaml
model:
  _attr_: foo.Model
  learning_rate: "@params.learning_rate"
params.yaml
params:
  learning_rate: 0.1
In a terminal, run
fromconfig config.yaml params.yaml - model - train
which prints
Training model with learning_rate 0.1
Here is a step-by-step breakdown of what is happening
- Load the yaml files into dictionaries
- Merge the dictionaries
- After parsing the resulting dictionary with a default parser (resolving references such as @params.learning_rate, etc.), it recursively instantiates sub-dictionaries, using the _attr_ key to resolve the Python class / function as an import string.
- Finally, the - model - train part of the command is a Python Fire syntax, which translates into "get the model key from the instantiated dictionary and execute the train method".
This example can be found in docs/examples/quickstart.
To learn more about FromConfig features, see the Usage Reference and Examples sections.
Cheat Sheet
fromconfig.fromconfig special keys
Key | Value Example | Use |
---|---|---|
"_attr_" | "foo.bar.MyClass" | Full import string of a class, function or method |
"_args_" | [1, 2] | Positional arguments |
fromconfig.parser.DefaultParser syntax
Key | Value | Use |
---|---|---|
"_singleton_" | "my_singleton_name" | Creates a singleton identified by name |
"_eval_" | "call", "import", "partial" | Evaluation modes |
(any key) | "@params.model" | Reference |
(any key) | "${params.url}:${params.port}" | Interpolation via OmegaConf |
Why FromConfig ?
fromconfig enables the instantiation of arbitrary trees of Python objects from config files.
It echoes the FromParams base class of AllenNLP.
It is particularly well suited for Machine Learning (see examples). Launching training jobs on remote clusters requires custom command lines, with arguments that need to be propagated through the call stack (e.g., setting parameters of a particular layer). The usual way is to write a custom command with a reduced set of arguments, combined by an assembler that creates the different objects. With fromconfig, the command line becomes generic, and all the specifics are kept in config files. As a result, this preserves the code from any backwards dependency issues and allows full reproducibility by saving config files as jobs' artifacts. It also makes it easier to merge different sets of arguments in a dynamic way through references and interpolation.
fromconfig is based off the config system developed as part of the deepr library, a collection of utilities to define and train TensorFlow models in a Hadoop environment.
Other relevant libraries are:
- fire: automatically generates command line interfaces (CLIs) from absolutely any Python object.
- omegaconf: YAML based hierarchical configuration system with support for merging configurations from multiple sources.
- hydra: a higher-level framework based off omegaconf to configure complex applications.
- gin: a lightweight configuration framework based on dependency injection.
- thinc: a lightweight functional deep learning library that comes with an integrated config system.
Usage Reference
The fromconfig library relies on two independent components.
- A lightweight syntax to instantiate any Python object from dictionaries (using special keys _attr_ and _args_).
- A composable, flexible, and customizable framework to parse configs before instantiation. This allows configs to remain short and readable with syntactic sugar to define singletons, references, etc.
Command Line
Usage: call fromconfig on any number of paths to config files.
fromconfig config.yaml params.yaml
Supported formats : YAML, JSON, and JSONNET.
The command line loads the different config files into Python dictionaries and merges them (it fails in case of any key conflict). It parses the resulting dictionary with the DefaultParser before calling fromconfig.fromconfig to instantiate the object.
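The merge step can be pictured with a minimal sketch. The helper name `merge_no_conflict` is hypothetical, and the conflict rule used here (nested dictionaries are merged recursively, any other repeated key raises) is an assumption for illustration rather than the library's documented behavior.

```python
from collections.abc import Mapping
from typing import Any, Dict


def merge_no_conflict(*configs: Mapping) -> Dict[str, Any]:
    """Merge dictionaries recursively, raising on conflicting keys.

    Hypothetical helper: the exact conflict rule of the fromconfig CLI
    is an assumption here.
    """
    merged: Dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if key not in merged:
                merged[key] = value
            elif isinstance(merged[key], Mapping) and isinstance(value, Mapping):
                # Both sides are dictionaries: merge them into a new dictionary
                merged[key] = merge_no_conflict(merged[key], value)
            else:
                raise ValueError(f"Conflicting values for key {key!r}")
    return merged


config = {"model": {"_attr_": "foo.Model", "learning_rate": "@params.learning_rate"}}
params = {"params": {"learning_rate": 0.1}}
merged = merge_no_conflict(config, params)
```

Failing on conflicts, rather than silently letting the last file win, keeps the result independent of the order in which the files are listed.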
As the fromconfig command is wrapped in a Python Fire call, you can manipulate the resulting instantiated dictionary via the command line by using the fire syntax.
For example fromconfig config.yaml - name instantiates the dictionary defined in config.yaml and gets the value associated with the key name.
Config syntax
The fromconfig.fromconfig function recursively instantiates objects from dictionaries.
It uses two special keys
- _attr_: (optional) full import string to any Python object.
- _args_: (optional) positional arguments.
For example
import fromconfig
config = {"_attr_": "str", "_args_": [1]}
fromconfig.fromconfig(config) # '1'
FromConfig resolves the builtin type str from the _attr_ key, and creates a new string with the positional arguments defined in _args_, in other words str(1), which returns '1'.
If the _attr_ key is not given, then the dictionary is left as a dictionary (the values of the dictionary may be recursively instantiated).
If other keys are available in the dictionary, they are treated as key-value arguments (kwargs).
For example
import fromconfig
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


config = {
    "_attr_": "Point",
    "x": 0,
    "y": 0
}
fromconfig.fromconfig(config)  # Point(0, 0)
Note that during instantiation, the config object is not modified. Also, any mapping-like container is supported (there is no special "config" class in fromconfig).
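To make the rules above concrete, here is a minimal sketch of recursive instantiation in plain Python. The functions `resolve` and `instantiate` are hypothetical and are not the library's implementation: this version only handles `dict` and `list` containers, and it resolves names either as builtins or as dotted `module.attribute` paths.

```python
import builtins
import importlib
from typing import Any


def resolve(import_string: str) -> Any:
    """Resolve a builtin name or a dotted 'module.attribute' path."""
    if "." not in import_string:
        return getattr(builtins, import_string)
    module_name, _, attr_name = import_string.rpartition(".")
    return getattr(importlib.import_module(module_name), attr_name)


def instantiate(config: Any) -> Any:
    """Recursively instantiate a config, leaving the input untouched."""
    if isinstance(config, dict):
        if "_attr_" not in config:
            # No _attr_: stay a dictionary, but instantiate the values
            return {key: instantiate(value) for key, value in config.items()}
        attr = resolve(config["_attr_"])
        args = [instantiate(arg) for arg in config.get("_args_", [])]
        kwargs = {
            key: instantiate(value)
            for key, value in config.items()
            if key not in ("_attr_", "_args_")
        }
        return attr(*args, **kwargs)
    if isinstance(config, list):
        return [instantiate(item) for item in config]
    return config


config = {
    "half": {"_attr_": "fractions.Fraction", "numerator": 1, "denominator": 2},
    "text": {"_attr_": "str", "_args_": [1]},
}
instance = instantiate(config)  # config itself is left unchanged
```

Building new containers at every level, instead of editing in place, is what lets the same config be instantiated several times.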
Parsing
Default
FromConfig comes with a default parser which sequentially applies
- OmegaConfParser: can be practical for interpolation
- ReferenceParser: resolves references
- EvaluateParser: syntactic sugar to configure functools.partial or simple imports
- SingletonParser: syntactic sugar to define singletons
For example, let's see how to create singletons, use references and interpolation
import fromconfig


class Model:
    def __init__(self, model_dir):
        self.model_dir = model_dir


class Trainer:
    def __init__(self, model):
        self.model = model


config = {
    "model": {
        "_attr_": "Model",
        "_singleton_": "my_model",  # singleton
        "model_dir": "${data.root}/${data.model}"  # interpolation
    },
    "data": {
        "root": "/path/to/root",
        "model": "subdir/for/model"
    },
    "trainer": {
        "_attr_": "Trainer",
        "model": "@model",  # reference
    }
}
parser = fromconfig.parser.DefaultParser()
parsed = parser(config)
instance = fromconfig.fromconfig(parsed)
id(instance["model"]) == id(instance["trainer"].model)  # True
instance["model"].model_dir == "/path/to/root/subdir/for/model"  # True
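Applying parsers sequentially amounts to function composition: each parser takes a dictionary and returns a new one. The sketch below illustrates the idea with two toy parsers. The names `chain`, `strip_strings` and `upper_strings` are hypothetical and unrelated to the parsers shipped with the library.

```python
from typing import Any, Callable, Dict

Parser = Callable[[Dict[str, Any]], Dict[str, Any]]


def chain(*parsers: Parser) -> Parser:
    """Return a parser that applies each given parser in order."""

    def _parse(config: Dict[str, Any]) -> Dict[str, Any]:
        for parser in parsers:
            config = parser(config)
        return config

    return _parse


def strip_strings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Toy parser: strip whitespace around top-level string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in config.items()}


def upper_strings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Toy parser: upper-case top-level string values."""
    return {k: v.upper() if isinstance(v, str) else v for k, v in config.items()}


parser = chain(strip_strings, upper_strings)
parsed = parser({"name": "  model  ", "dim": 10})
```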
OmegaConf
OmegaConf is a YAML based hierarchical configuration system with support for merging configurations from multiple sources. The OmegaConfParser wraps some of its functionality (for example, variable interpolation).
For example
import fromconfig
config = {
    "host": "localhost",
    "port": "8008",
    "url": "${host}:${port}"
}
parser = fromconfig.parser.OmegaConfParser()
parsed = parser(config)
parsed["url"] # 'localhost:8008'
Learn more on the OmegaConf documentation website.
References
To make it easy to compose different configuration files and avoid deeply nested config dictionaries, you can use the ReferenceParser.
For example,
import fromconfig
parser = fromconfig.parser.ReferenceParser()
config = {"params": {"x": 1}, "y": "@params.x"}
parsed = parser(config)
parsed["y"]  # 1
The ReferenceParser looks for values starting with a @, then splits the rest by . and navigates from the top-level dictionary.
In practice, it makes configuration files more readable (flat) and avoids duplicates.
It is also a convenient way to dynamically compose different configs.
For example
import fromconfig
param1 = {
    "params": {
        "x": 1
    }
}
param2 = {
    "params": {
        "x": 2
    }
}
config = {
    "model": {
        "x": "@params.x"
    }
}
parser = fromconfig.parser.ReferenceParser()
parsed1 = parser({**config, **param1})
parsed1["model"]["x"]  # 1
parsed2 = parser({**config, **param2})
parsed2["model"]["x"]  # 2
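The lookup rule described in this section (strip the leading @, split on ., walk down from the top-level dictionary) can be sketched in a few lines. The functions `lookup` and `resolve_references` are hypothetical illustrations, not the ReferenceParser source. This sketch does not guard against circular references, which would recurse forever.

```python
from typing import Any, Dict


def lookup(root: Dict[str, Any], reference: str) -> Any:
    """Follow a '@a.b.c' reference from the top-level dictionary."""
    node: Any = root
    for key in reference[1:].split("."):
        node = node[key]
    return node


def resolve_references(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with every '@...' string replaced by its target."""

    def _resolve(value: Any) -> Any:
        if isinstance(value, dict):
            return {k: _resolve(v) for k, v in value.items()}
        if isinstance(value, list):
            return [_resolve(v) for v in value]
        if isinstance(value, str) and value.startswith("@"):
            # The target may itself contain references, so resolve it too
            return _resolve(lookup(config, value))
        return value

    return _resolve(config)


config = {
    "params": {"x": 1},
    "model": {"x": "@params.x"},
    "trainer": {"model": "@model"},
}
resolved = resolve_references(config)
```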
Evaluate
The EvaluateParser makes it possible to simply import a class / function, or configure a constructor via a functools.partial call.
The parser uses a special key _eval_ with possible values
- call: standard behavior, results in attr(**kwargs).
- partial: delays the call, results in a functools.partial(attr, **kwargs).
- import: simply imports the attribute, results in attr.
call
import fromconfig
config = {"_attr_": "str", "_eval_": "call", "_args_": ["hello world"]}
parser = fromconfig.parser.EvaluateParser()
parsed = parser(config)
fromconfig.fromconfig(parsed) == "hello world" # True
partial
import functools
import fromconfig
config = {"_attr_": "str", "_eval_": "partial", "_args_": ["hello world"]}
parser = fromconfig.parser.EvaluateParser()
parsed = parser(config)
fn = fromconfig.fromconfig(parsed)
isinstance(fn, functools.partial) # True
fn() == "hello world" # True
import
import fromconfig
config = {"_attr_": "str", "_eval_": "import"}
parser = fromconfig.parser.EvaluateParser()
parsed = parser(config)
fromconfig.fromconfig(parsed) is str # True
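The three modes differ only in what is done with the resolved attribute and its arguments. As a rough sketch in plain Python (the `evaluate` function is a hypothetical illustration, not part of fromconfig):

```python
import functools
from typing import Any, Callable


def evaluate(attr: Callable, mode: str, *args: Any, **kwargs: Any) -> Any:
    """Apply one of the three evaluation modes to an already resolved attribute."""
    if mode == "call":
        return attr(*args, **kwargs)  # call right away
    if mode == "partial":
        return functools.partial(attr, *args, **kwargs)  # delay the call
    if mode == "import":
        return attr  # return the attribute itself, arguments are ignored
    raise ValueError(f"Unknown evaluation mode {mode!r}")


value = evaluate(int, "call", "ff", base=16)
parse_hex = evaluate(int, "partial", base=16)
int_type = evaluate(int, "import")
```

The partial mode is handy when the consumer of the config must supply the remaining arguments itself, for example a factory that is called once per worker.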
Singleton
To define singletons (typically an object used in multiple places), use the SingletonParser.
For example,
import fromconfig
config = {
    "x": {
        "_attr_": "dict",
        "_singleton_": "my_dict",
        "x": 1
    },
    "y": {
        "_attr_": "dict",
        "_singleton_": "my_dict",
        "x": 1
    }
}
parser = fromconfig.parser.SingletonParser()
parsed = parser(config)
instance = fromconfig.fromconfig(parsed)
id(instance["x"]) == id(instance["y"])  # True
Without the _singleton_ entry, two different dictionaries would have been created.
Note that using references is not a solution to create singletons, as the reference mechanism only copies missing parts of the configs.
The parser uses the special key _singleton_ whose value is the name associated with the instance to resolve singletons at instantiation time.
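Resolving singletons by name at instantiation time can be pictured as a registry keyed by that name. The sketch below is an assumption about the general technique, with a hypothetical `singleton` helper and module-level registry. It is not the library's code.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping singleton names to their unique instance
_INSTANCES: Dict[str, Any] = {}


def singleton(name: str, constructor: Callable[[], Any]) -> Any:
    """Build the instance on first use, then always return the same object."""
    if name not in _INSTANCES:
        _INSTANCES[name] = constructor()
    return _INSTANCES[name]


x = singleton("my_dict", lambda: dict(x=1))
y = singleton("my_dict", lambda: dict(x=1))  # constructor is not called again
z = dict(x=1)  # equal content, but a distinct object
```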
Examples
Manual
It is possible to manipulate configs directly in the code without using the fromconfig CLI.
For example,
"""Manual Example."""
import fromconfig


class Model:
    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        print(f"Training model with learning_rate {self.learning_rate}")


if __name__ == "__main__":
    # Create config dictionary
    config = {
        "model": {"_attr_": "Model", "learning_rate": "@params.learning_rate"},
        "params": {
            "learning_rate": 0.1
        }
    }
    # Parse config (replace "@params.learning_rate" by its value)
    parser = fromconfig.parser.DefaultParser()
    parsed = parser(config)
    # Instantiate model and call train()
    model = fromconfig.fromconfig(parsed["model"])
    model.train()
This example can be found in docs/examples/manual
Custom Parser
One of fromconfig's strengths is its flexibility when it comes to the config syntax.
To reduce the config boilerplate, it is possible to add a new Parser to support a new syntax.
Let's cover a dummy example: let's say we want to replace all empty strings with "lorem ipsum".
from typing import Dict

import fromconfig


class LoremIpsumParser(fromconfig.parser.Parser):
    """Custom Parser that replaces empty string by a default string."""

    def __init__(self, default: str = "lorem ipsum"):
        self.default = default

    def __call__(self, config: Dict):
        def _map_fn(value):
            if isinstance(value, str) and not value:
                return self.default
            return value

        # Utility to apply a function to all nodes of a nested dict
        # in a depth-first search
        return fromconfig.utils.depth_map(_map_fn, config)


cfg = {
    "x": "Hello World",
    "y": ""
}
parser = LoremIpsumParser()
parsed = parser(cfg)
print(parsed)  # {"x": "Hello World", "y": "lorem ipsum"}
This example can be found in docs/examples/custom_parser
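For readers curious about what a depth-first map over a nested config involves, here is a small stand-alone sketch. The function `map_nested` is hypothetical and only approximates what a utility like depth_map might do: it visits children before their parent and only descends into `dict` and `list` containers.

```python
from typing import Any, Callable


def map_nested(map_fn: Callable[[Any], Any], item: Any) -> Any:
    """Apply map_fn to every node of a nested structure, children first."""
    if isinstance(item, dict):
        item = {key: map_nested(map_fn, value) for key, value in item.items()}
    elif isinstance(item, list):
        item = [map_nested(map_fn, value) for value in item]
    # The (possibly rebuilt) node itself is mapped last
    return map_fn(item)


def fill_empty(value: Any) -> Any:
    """Replace empty strings by a default string, leave everything else as is."""
    if isinstance(value, str) and not value:
        return "lorem ipsum"
    return value


cfg = {"x": "Hello World", "y": "", "z": {"w": ["", 1]}}
filled = map_nested(fill_empty, cfg)
```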
Custom FromConfig
The logic to instantiate objects from config dictionaries is always the same.
It resolves the class, function or method attr from the _attr_ key, recursively calls fromconfig on all the other key-values to get a kwargs dictionary of objects, and calls attr(**kwargs).
It is possible to customize the behavior of fromconfig by inheriting the FromConfig class.
For example
import fromconfig


class MyClass(fromconfig.FromConfig):
    def __init__(self, x):
        self.x = x

    @classmethod
    def fromconfig(cls, config):
        if "x" not in config:
            return cls(0)
        else:
            return cls(**config)


config = {}
got = MyClass.fromconfig(config)
isinstance(got, MyClass)  # True
got.x  # 0
One custom FromConfig class is provided in fromconfig which makes it possible to stop the instantiation and keep config dictionaries as config dictionaries.
For example
import fromconfig
config = {
    "_attr_": "fromconfig.Config",
    "_config_": {
        "_attr_": "list"
    }
}
fromconfig.fromconfig(config)  # {'_attr_': 'list'}
Machine Learning
fromconfig is particularly well suited for Machine Learning as it is common to have a lot of different parameters, sometimes far down the call stack, and different configurations of these hyper-parameters.
Given a module ml.py defining model, optimizer and trainer classes
from dataclasses import dataclass


@dataclass
class Model:
    """Dummy Model class."""

    dim: int


@dataclass
class Optimizer:
    """Dummy Optimizer class."""

    learning_rate: float


class Trainer:
    """Dummy Trainer class."""

    def __init__(self, model, optimizer):
        self.model = model
        self.optimizer = optimizer

    def run(self):
        print(f"Training {self.model} with {self.optimizer}")
And the following config files
trainer.yaml: configures the training pipeline
trainer:
  _attr_: "ml.Trainer"
  model: "@model"
  optimizer: "@optimizer"
model.yaml: configures the model
model:
  _attr_: "ml.Model"
  dim: "@params.dim"
optimizer.yaml: configures the optimizer
optimizer:
  _attr_: "ml.Optimizer"
  learning_rate: "@params.learning_rate"
params/small.yaml: hyper-parameters for a small version of the model
params:
  dim: 10
  learning_rate: 0.01
params/big.yaml: hyper-parameters for a big version of the model
params:
  dim: 100
  learning_rate: 0.001
It is possible to launch two different trainings with different sets of hyper-parameters with
fromconfig trainer.yaml model.yaml optimizer.yaml params/small.yaml - trainer - run
fromconfig trainer.yaml model.yaml optimizer.yaml params/big.yaml - trainer - run
which should print
Training Model(dim=10) with Optimizer(learning_rate=0.01)
Training Model(dim=100) with Optimizer(learning_rate=0.001)
This example can be found in docs/examples/ml.
Note that it is encouraged to save these config files with the experiment's files to get full reproducibility. MLflow is an open-source platform that tracks your experiments by logging metrics and artifacts.
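As a minimal sketch of that advice, the merged config can be written next to the run's outputs before training starts. The helper `save_config`, the file name `config.json` and the run directory name are hypothetical. A tracking tool could then log the resulting file as an artifact.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_config(config: Dict[str, Any], run_dir: Path) -> Path:
    """Write the config as JSON in the run directory and return its path."""
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "config.json"
    path.write_text(json.dumps(config, indent=2, sort_keys=True))
    return path


config = {
    "model": {"_attr_": "ml.Model", "dim": "@params.dim"},
    "params": {"dim": 10, "learning_rate": 0.01},
}
# A temporary directory stands in for the experiment's output directory
with tempfile.TemporaryDirectory() as tmp:
    path = save_config(config, Path(tmp) / "run-001")
    restored = json.loads(path.read_text())
```

Saving the unparsed config (with its references intact) keeps the file readable and lets the same run be relaunched through the CLI.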
Hyper-Parameter Search
To launch a hyper-parameter search, generate config files on the fly if using the fromconfig CLI, or build config dictionaries directly in Python.
For example,
import fromconfig


if __name__ == "__main__":
    config = {
        "model": {
            "_attr_": "ml.Model",
            "dim": "@params.dim"
        },
        "optimizer": {
            "_attr_": "ml.Optimizer",
            "learning_rate": "@params.learning_rate"
        },
        "trainer": {
            "_attr_": "ml.Trainer",
            "model": "@model",
            "optimizer": "@optimizer"
        }
    }
    parser = fromconfig.parser.DefaultParser()
    for dim in [10, 100]:
        for learning_rate in [0.01, 0.1]:
            params = {
                "dim": dim,
                "learning_rate": learning_rate
            }
            parsed = parser({**config, "params": params})
            trainer = fromconfig.fromconfig(parsed)["trainer"]
            trainer.run()
which prints
Training Model(dim=10) with Optimizer(learning_rate=0.01)
Training Model(dim=10) with Optimizer(learning_rate=0.1)
Training Model(dim=100) with Optimizer(learning_rate=0.01)
Training Model(dim=100) with Optimizer(learning_rate=0.1)
This example can be found in docs/examples/ml (run python hp.py).
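When the search space grows beyond two parameters, nested loops become unwieldy. One possible alternative, sketched here with a hypothetical `grid` helper built on `itertools.product`, is to generate the `params` dictionaries from a single description of the search space.

```python
import itertools
from typing import Any, Dict, Iterator, List


def grid(space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one params dictionary per combination of values in the space."""
    keys = list(space)
    for values in itertools.product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


all_params = list(grid({"dim": [10, 100], "learning_rate": [0.01, 0.1]}))
for params in all_params:
    # Each dictionary can be passed as the "params" entry of the config
    print(params)
```

The combinations come out in the same order as the nested loops above, with the last key varying fastest.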
Development
To install the library from source in editable mode
git clone https://github.com/criteo/fromconfig
cd fromconfig
make install
To install development tools
make install-dev
To lint the code (mypy, pylint and black)
make lint
To format the code with black
make black