Skip to main content

Converts MS Excel formulas to Python and evaluates them.

Project description

Excel Calculator

https://travis-ci.org/bradbase/xlcalculator.png?branch=master https://coveralls.io/repos/github/bradbase/xlcalculator/badge.svg?branch=master https://img.shields.io/pypi/v/xlcalculator.svg https://img.shields.io/pypi/pyversions/xlcalculator.svg https://img.shields.io/pypi/status/xlcalculator.svg

In short xlcalculator is an attempted re-write of the koala2 library with intent to modernize the code, increase maintainability and keep compatibility with external software solutions (self-indulgently one is - FlyingKoala ).

Koala2’s heritage has been “spreadsheet replacement” in order to get performant calculation. In essence replacing the calculation engine in Excel with one which works faster. And, all credit to the civic hacker crew at we are ants, they absolutely succeeded. In various iterations of the project since it appears as though the koala2 has tried moving toward the idea of something like openpyxl with a “calculate” button.

This is not the direction I’ve taken.

Koala2 had a few features which enabled me to use koala2 quite differently. I admit these features may not have been core to the wider audience. The features I am specifically interested in are the “advanced” features of setting output cells on load and saving/loading the state of the Python representation of the worksheet. With setting the output cell on load it was possible to load only one formula, and/or the related parent formulas and the input cells involved with the network of formulas. In essence focusing the resulting “loaded Excel” to far less than the entire workbook. And, once in Python, it could be persisted (saved) and re-loaded. This created a mathematical model defined in the language of Excel formulas but executable (“evaluated”) and iterable in Python without the need for a person to translate it.

This use case provides the ability to view Excel workbooks as definitions of a calculation (a model or part thereof). Where the model is an abstraction that just so happens to be expressed in the form of an Excel spreadsheet. So, in that sense, koala2 wasn’t simply replacing the Excel calculation engine but was more a toolkit which uses Excel as an interface for defining parts or whole mathematical systems. I saw that as very useful.

These features I so covet have broken after some (much needed) code clean-up and I’ve struggled to re-implement them. With that, I have decided to use some parts of koala2 and implement a library which will have the features I’m interested in. I’m happy if this implementation (or something similar) becomes adopted by the koala2 project (I’d prefer to not split a universe) but, equally, I accept my goals may not overlap enough with that project.

Moving forward, this project, if used in one way, would achieve the purposes of replacing the Excel calculation engine or even be analogous to openpyxl with a “go” button. But would also offer significantly more for those who are inextricably bound to Excel but need more from their math. Which is what FlyingKoala is intended to facilitate.

We have some very basic functionality working. That said, the project is still evolving and has some way to go before it can claim to be feature compatible with koala2.

xlcalculator currently supports:

  • Loading an Excel file into a Python compatible state

  • Saving Python compatible state

  • Loading Python compatible state

  • Ignore worksheets

  • Extracting sub-portions of a model. “focussing” on provided cell addresses or defined names

  • Evaluating

    • Individual cells

    • Defined Names (a “named cell” or range)

    • Ranges

    • Shared formulas not an Array Formula * Operands (+, -, /, *, ==, <>, <=, >=) * on cells only

    • Set cell value

    • Get cell value

    • Parsing a dict into the Model object
      • Code is in examples\third_party_datastructure

    • Functions as implemented in xlfunctions <https://github.com/bradbase/xlfunctions>_.
      • ABS

      • AVERAGE

      • CHOOSE

      • CONCAT

      • COUNT

      • COUNTA

      • DATE

      • IRR

      • LN
        • Python Math.log() differs from Excel LN. Currently returning Math.log()

      • MAX

      • MID

      • MIN

      • MOD

      • NPV

      • PMT

      • POWER

      • RIGHT

      • ROUND

      • ROUNDDOWN

      • ROUNDUP

      • SLN

      • SQRT

      • SUM

      • SUMPRODUCT

      • TODAY

      • VLOOKUP - Exact match only

      • XNPV

      • YEARFRAC - Basis 1, Actual/actual, is only within 3 decimal places

Not currently supported:

Run tests

Setup your environment:

virtualenv -p 3.7 ve
ve/bin/pip install -e .[test]

From the root xlcalculator directory:

ve/bin/py.test -rw -s --tb=native

Or simply use tox:

tox

Run Example

From the examples/common_use_case directory:

python use_case_01.py

How to add Excel functions

Excel function support can be easily added.

Fundamental function support is supplied by [xlfunctions](https://github.com/bradbase/xlfunctions), so to add the “recipe for calculation” please submit a pull request to that project. There are instructions in that project. Please be conscientious with writing tests in that project as they are the tests for _how_ the calculation operates.

It is also best for your submission to have an evaluation test here in xlcalculator so we can ensure that the results of the xlfunction implementation are aligning with what we see in Excel.

Excel number precision

Excel number precision is a complex discussion. There is further detail on the README at [xlfunctions](https://github.com/bradbase/xlfunctions).

Unit testing Excel formulas directly from the workbook.

If you are interested in unit testing formulas in your workbook, you can use [FlyingKoala](https://github.com/bradbase/flyingkoala). An example on how can be found here.

TODO

  • Do not treat ranges as a granular AST node ut instead as an operation “:” of two cell references to create the range. That will make implementing features like A1:OFFSET(...) easy to implement.

  • Support for alternative range evaluation: by ref (pointer), by expr (lazy eval) and current eval mode.

    • Pointers would allow easy implementations of functions like OFFSET().

    • Lazy evals will allow efficient implementation of IF() since execution of true and false expressions can be delayed until it is decided which expression is needed.

  • Implement array functions. It is really not that hard once a proper RangeData class has been implemented on which one can easily act with scalar functions.

  • Set up a travis continuous integration service

  • Improve testing

  • Refactor model and evaluator to use pass-by-object-reference for values of cells which then get “used”/referenced by ranges, defined names and formulas

  • Handle multi-file addresses

  • Improve integration with pyopenxl for reading and writing files xample of problem space

CHANGES

0.1.0 (2020-05-25)

  • Refactored xlcalculator types to be more compact.

  • Reimplemented evaluation engine to not generate Python code anymore, but build a proper AST from the AST nodes. Each AST node supports an eval() function that knows how to compute a result.

    This removes a lot of complexities around trying to determine the evaluation context at code creation time and encoding the context as part of the generated code.

  • Removal of all special function handling.

  • Use of new xlfunctions implementation.

  • Use Openpyxl to load the Excel files. This provides shared formula support for free.

0.0.1b (2020-05-03)

  • Initial release.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xlcalculator-0.1.0.tar.gz (255.8 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page