(Karvy/Kfintech/CAMS) Consolidated Account Statement (CAS) PDF parser

CASParser

Parse Consolidated Account Statement (CAS) PDF files generated from CAMS/KFINTECH

casparser also includes a command-line tool with the following analysis tools:

  • summary - Print portfolio summary
  • (BETA) gains - Print capital gains report (summary and detailed)
    • with option to generate csv files for ITR in schedule 112A format

Installation

pip install -U casparser

To install with the faster PyMuPDF parser:

pip install -U 'casparser[fast]'

Note: Enabling this dependency could result in licensing changes. Check the License section for more details.

Usage

import casparser
data = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password")

# Get data in json format
json_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="json")

# Get transactions data in csv string format
csv_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="csv")

Data structure

{
    "statement_period": {
        "from": "YYYY-MMM-DD",
        "to": "YYYY-MMM-DD"
    },
    "file_type": "CAMS/KARVY/UNKNOWN",
    "cas_type": "DETAILED/SUMMARY",
    "investor_info": {
        "email": "string",
        "name": "string",
        "mobile": "string",
        "address": "string"
    },
    "folios": [
        {
            "folio": "string",
            "amc": "string",
            "PAN": "string",
            "KYC": "OK/NOT OK",
            "PANKYC": "OK/NOT OK",
            "schemes": [
                {
                    "scheme": "string",
                    "isin": "string",
                    "amfi": "string",
                    "advisor": "string",
                    "rta_code": "string",
                    "rta": "string",
                    "open": "number",
                    "close": "number",
                    "close_calculated": "number",
                    "valuation": {
                      "date": "date",
                      "nav": "number",
                      "value": "number"
                    },
                    "transactions": [
                        {
                            "date": "YYYY-MM-DD",
                            "description": "string",
                            "amount": "number",
                            "units": "number",
                            "nav": "number",
                            "balance": "number",
                            "type": "string",
                            "dividend_rate": "number"
                        }
                    ]
                }
            ]
        }
    ]
}
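
As a rough illustration of how this structure can be walked, the sketch below totals the valuation of every scheme in each folio. The `sample` dictionary is invented for illustration and fills in only the keys the code reads, and `folio_values` is a hypothetical helper, not part of casparser. It assumes numeric fields arrive either as numbers or as numeric strings.

```python
from decimal import Decimal


def folio_values(data):
    """Return {folio number: total valuation} for a parsed CAS dictionary."""
    totals = {}
    for folio in data.get("folios", []):
        total = Decimal("0")
        for scheme in folio.get("schemes", []):
            value = scheme.get("valuation", {}).get("value")
            if value is not None:
                # Go through str() so floats do not carry binary noise.
                total += Decimal(str(value))
        key = folio["folio"]
        # Accumulate in case the same folio number appears more than once.
        totals[key] = totals.get(key, Decimal("0")) + total
    return totals


# Made-up sample that follows the documented layout (not real data).
sample = {
    "folios": [
        {
            "folio": "12345/67",
            "amc": "Example AMC",
            "schemes": [
                {"scheme": "Example Fund A", "valuation": {"value": "1500.25"}},
                {"scheme": "Example Fund B", "valuation": {"value": 499.75}},
            ],
        }
    ]
}

print(folio_values(sample))
```

With real data, the dictionary returned by `casparser.read_cas_pdf(...)` would take the place of `sample`, provided it follows the layout shown above.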

Notes:

  • Transaction type can be any value from the following
    • PURCHASE
    • PURCHASE_SIP
    • REDEMPTION
    • SWITCH_IN
    • SWITCH_IN_MERGER
    • SWITCH_OUT
    • SWITCH_OUT_MERGER
    • DIVIDEND_PAYOUT
    • DIVIDEND_REINVESTMENT
    • SEGREGATION
    • STAMP_DUTY_TAX
    • TDS_TAX
    • STT_TAX
    • MISC
  • dividend_rate is applicable only for DIVIDEND_PAYOUT and DIVIDEND_REINVESTMENT transactions.
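
One way the `type` field might be used is to group transaction amounts by type, for example to separate SIP purchases from taxes and dividends. This is a minimal sketch: `amounts_by_type` is a hypothetical helper, the transactions are invented, and it assumes `amount` can be missing or null on some rows.

```python
from collections import defaultdict
from decimal import Decimal


def amounts_by_type(transactions):
    """Sum the 'amount' field of each transaction, keyed by its 'type'."""
    totals = defaultdict(Decimal)  # Decimal() starts each key at 0
    for txn in transactions:
        amount = txn.get("amount")
        if amount is None:
            continue  # skip rows that carry no amount
        totals[txn["type"]] += Decimal(str(amount))
    return dict(totals)


# Invented transactions that follow the documented field names.
sample_transactions = [
    {"date": "2021-01-05", "type": "PURCHASE_SIP", "amount": 1000},
    {"date": "2021-02-05", "type": "PURCHASE_SIP", "amount": 1000},
    {"date": "2021-02-05", "type": "STAMP_DUTY_TAX", "amount": "0.05"},
    {"date": "2021-03-20", "type": "DIVIDEND_PAYOUT", "amount": 120,
     "dividend_rate": 1.2},
    {"date": "2021-03-21", "type": "MISC", "amount": None},
]

print(amounts_by_type(sample_transactions))
```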

CLI

casparser also comes with a command-line interface that prints a summary of the parsed portfolio in a wide variety of formats.

Usage: casparser [-o output_file.json|output_file.csv] [-p password] [-s] [-a] CAS_PDF_FILE

  -o, --output FILE               Output file path. Saves the parsed data as json or csv
                                  depending on the file extension. For other extensions, the
                                  summary output is saved. [See note below]

  -s, --summary                   Print Summary of transactions parsed.
  -p PASSWORD                     CAS password
  -a, --include-all               Include schemes with zero valuation in the
                                  summary output
  -g, --gains                     Generate Capital Gains Report (BETA)
  --gains-112a ask|FY2020-21      Generate Capital Gains Report - 112A format for
                                  a given financial year - Use 'ask' for a prompt
                                  from available options (BETA)
  --force-pdfminer                Force PDFMiner parser even if MuPDF is
                                  detected

  --version                       Show the version and exit.
  -h, --help                      Show this message and exit.

CLI examples

# Print portfolio summary
casparser /path/to/cas.pdf -p password

# Print portfolio and capital gains summary
casparser /path/to/cas.pdf -p password -g

# Save parsed data as a json file
casparser /path/to/cas.pdf -p password -o pdf_parsed.json

# Save parsed data as a csv file
casparser /path/to/cas.pdf -p password -o pdf_parsed.csv

# Save capital gains transactions in csv files (pdf_parsed-gains-summary.csv and
# pdf_parsed-gains-detailed.csv)
casparser /path/to/cas.pdf -p password -g -o pdf_parsed.csv

Note: The casparser CLI picks the output format from the extension of the file passed to -o:

  1. json - The complete parsed data is exported in json format (including investor info).
  2. csv - Summary info is exported in csv format if the input file is a summary statement or if the summary flag (-s/--summary) is passed to the CLI. Otherwise, the full transaction history is included in the export. If the -g flag is present, two additional files, '{basename}-gains-summary.csv' and '{basename}-gains-detailed.csv', are created with the capital-gains data.
  3. any other extension - The summary table is saved in the file.
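
The extension rules in the note above can be sketched as a small function that reports which kind of output to expect and which files would be written. This is not the CLI's actual code: `describe_output` is a hypothetical helper, lower-casing the extension is an assumption, and so is the reading that the extra gains files are only produced for csv output.

```python
from pathlib import Path


def describe_output(output_path, gains=False):
    """Return (kind, file names) expected for a given -o value."""
    path = Path(output_path)
    ext = path.suffix.lower()  # assumption: extension match ignores case
    if ext == ".json":
        kind = "json"
    elif ext == ".csv":
        kind = "csv"
    else:
        kind = "summary"  # any other extension: summary table

    files = [path.name]
    if kind == "csv" and gains:
        # '{basename}-gains-summary.csv' and '{basename}-gains-detailed.csv'
        files.append(f"{path.stem}-gains-summary.csv")
        files.append(f"{path.stem}-gains-detailed.csv")
    return kind, files


print(describe_output("pdf_parsed.csv", gains=True))
print(describe_output("pdf_parsed.json"))
print(describe_output("report.txt"))
```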

ISIN & AMFI code support

Since v0.4.3, casparser can identify the ISIN and AMFI codes of parsed schemes via the helper module casparser-isin. If the parser fails to assign an ISIN or AMFI code to a scheme, try updating the local ISIN database by running:

casparser-isin --update

If it still fails, please raise an issue at casparser-isin with the failing scheme name(s).

License

CASParser is distributed under the MIT license by default. However, enabling the optional dependency mupdf/fast implies the use of PyMuPDF / MuPDF, in which case the GNU GPL v3 and GNU Affero GPL v3 licenses apply. Copies of all licenses are included in this repository. - IANAL

Resources

  1. CAS from CAMS
  2. CAS from Karvy/Kfintech
