
A powerful, easy-to-use library for accessing Edgar filings


edgartools


Table of Contents
  1. About The Project
  2. Installation
  3. Usage
  4. Contributing
  5. License
  6. Contact

About the project

edgartools is a library for working with Edgar filings in analytic workflows.

Demo

Get the Common Shares Issued amount from Snowflake's latest 10-Q filing

from edgar import Company

(Company.for_ticker("SNOW")
    .get_filings(form="10-Q")
    .latest()
    .xbrl()
    .to_duckdb().execute(
        """select fact, value, units, end_date from facts
           where fact = 'CommonStockSharesIssued'
           order by end_date desc limit 1
        """
    ).df()
)


This example shows what can be done with edgartools.

Under the hood, the code does the following:

  1. Use the ticker "SNOW" to get the company's cik from the Company Tickers JSON
  2. From the cik get the company's filings from the submissions endpoint https://data.sec.gov/submissions/CIK{cik:010}.json
  3. Select the latest 10-Q filing
  4. Download the XBRL file for that filing
  5. Convert the XBRL data into a pandas dataframe
  6. Register the dataframe as a DuckDB table
  7. Execute the SQL and convert to a dataframe
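Step 2 above comes down to filling in a URL template. The following is a minimal sketch of that idea, not edgartools' actual implementation: submissions_url is a hypothetical helper, and the only thing taken from the text is the URL pattern with the cik zero-padded to ten digits.

```python
def submissions_url(cik: int) -> str:
    """Build the submissions endpoint URL for a cik (hypothetical helper)."""
    # {cik:010} left-pads the number with zeroes to a width of 10
    return f"https://data.sec.gov/submissions/CIK{cik:010}.json"


url = submissions_url(1318605)
print(url)
```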

You might not want to chain the operations like this, and strictly speaking it might not be the most efficient, given how much work happens within those lines of code. This guide will show you step by step how to easily get SEC filing data and text into your analytic workflows.

Features

  • Download listings of Edgar filings by year and quarter, going back to 1994
  • Select an individual filing and download the html, XML or content of any attached file
  • View a filing XBRL as a dataframe and query it with SQL
  • Search for company by ticker or CIK
  • Get a company's filings
  • Get a dataset of a company's facts e.g. CommonSharesOutstanding
  • Query a company's facts as SQL using an in-memory DuckDB database

Installation

pip install edgartools

Usage

Set your Edgar user identity

Before you can access the SEC Edgar API you need to set the identity that you will use to access Edgar. This is usually your name and email, or a company name and email.

Sample Company Name AdminContact@<sample company domain>.com

The user identity is sent in the User-Agent string and the Edgar API will refuse to respond to your request without it.

EdgarTools will look for an environment variable called EDGAR_IDENTITY and use that in each request. So, you need to set this environment variable before using it.

export EDGAR_IDENTITY="Michael Mccallum mcalum@gmail.com"

Alternatively, you can call set_identity which does the same thing.

from edgar import set_identity
set_identity("Michael Mccallum mcalum@gmail.com")

For more detail see https://www.sec.gov/os/accessing-edgar-data
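As a rough sketch of what setting the identity amounts to, the snippet below sets EDGAR_IDENTITY from Python and builds a User-Agent header from it. identity_headers is a hypothetical helper written for this illustration and is not part of edgartools; the identity string is a placeholder.

```python
import os

# Placeholder identity; replace with your own name and email
os.environ["EDGAR_IDENTITY"] = "Sample Company Name admin@example.com"


def identity_headers() -> dict:
    """Return request headers carrying the identity (hypothetical helper)."""
    identity = os.environ.get("EDGAR_IDENTITY")
    if not identity:
        # Fail early instead of sending a request that Edgar would refuse
        raise RuntimeError("Set EDGAR_IDENTITY before making requests to Edgar")
    return {"User-Agent": identity}


headers = identity_headers()
```

Failing early when the variable is missing gives a clearer error than a refused request from Edgar.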

Using the Company API

With the company API you can find a company using the cik or ticker. From the company you can access all of its historical filings and a dataset of the company facts. The SEC's company API also supplies many more details about a company, including industry, the SEC filer type, the mailing and business addresses and much more.

Find a company using the cik

The cik is the id that uniquely identifies a company at the SEC. It is a number, but is sometimes shown in SEC Edgar resources as a string padded with leading zeroes. For the edgar client API, just use the number and omit the leading zeroes.

company = Company.for_cik(1318605)
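Since Edgar resources often show the cik zero-padded, it can help to normalize it before calling the API. A minimal sketch, assuming a hypothetical helper normalize_cik that is not part of edgartools:

```python
from typing import Union


def normalize_cik(cik: Union[int, str]) -> int:
    """Convert a cik given as an int or a zero-padded string to a plain int."""
    # int() ignores leading zeroes in a decimal string such as "0001318605"
    return int(str(cik).strip())


cik = normalize_cik("0001318605")
```

The resulting integer is the form that Company.for_cik expects in the example above.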


Find a company using ticker

You can get a company using a ticker e.g. SNOW. This will do a lookup for the company cik using the ticker, then load the company using the cik. This makes it two calls versus one for the cik company lookup, but is sometimes more convenient since tickers are easier to remember than ciks.

Note that some companies have multiple tickers, so you technically cannot get SEC filings for a ticker. You instead get the SEC filings for the company to which the ticker belongs.

The ticker is case-insensitive so you can use Company.for_ticker("snow") or Company.for_ticker("SNOW")

snow = Company.for_ticker("snow")


Company.for_cik(1832950)
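The ticker lookup described in this section can be pictured as follows. This is only a sketch: the records are made up, their shape (cik_str, ticker, title) is an assumption about the Company Tickers JSON, and cik_for_ticker is a hypothetical helper, not an edgartools function. It shows why the lookup is case-insensitive and why two tickers can resolve to the same company.

```python
# Made-up records for illustration only; not real SEC data
SAMPLE_TICKERS = {
    "0": {"cik_str": 1111111, "ticker": "ABC", "title": "Example Corp"},
    "1": {"cik_str": 1111111, "ticker": "ABC-B", "title": "Example Corp"},
    "2": {"cik_str": 2222222, "ticker": "XYZ", "title": "Another Example Inc"},
}


def cik_for_ticker(ticker: str, records: dict = SAMPLE_TICKERS) -> int:
    """Return the cik for a ticker, ignoring case (hypothetical helper)."""
    wanted = ticker.upper()
    for record in records.values():
        if record["ticker"].upper() == wanted:
            return record["cik_str"]
    raise KeyError(f"No company found for ticker {ticker!r}")


# Two tickers of the same company resolve to one cik
same_company = cik_for_ticker("abc") == cik_for_ticker("ABC-B")
```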

Get filings for a company

To get the company's filings use get_filings(). This gets all the company's filings that are available from the Edgar submissions endpoint.

company.get_filings()

Filtering filings

You can filter the company filings using a number of different parameters.

class CompanyFilings:
    
    ...
    
    def get_filings(self,
                    *,
                    form: str | List = None,
                    accession_number: str | List = None,
                    file_number: str | List = None,
                    is_xbrl: bool = None,
                    is_inline_xbrl: bool = None
                    ):
        """
        Get the company's filings and optionally filter by multiple criteria
        :param form: The form as a string e.g. '10-K' or List of strings ['10-Q', '10-K']
        :param accession_number: The accession number that uniquely identifies an SEC filing e.g. 0001640147-22-000100
        :param file_number: The file number e.g. 001-39504
        :param is_xbrl: Whether the filing is xbrl
        :param is_inline_xbrl: Whether the filing is inline_xbrl
        :return: The CompanyFilings instance with the filings that match the filters
        """

The CompanyFilings class

The result of get_filings() is a CompanyFilings instance. This class contains a pyarrow table with the filings and provides convenient functions for working with them. You can access the underlying pyarrow Table using the .data property.

import pyarrow as pa

filings = company.get_filings()

# Get the underlying Table
data: pa.Table = filings.data

Get a filing by index

To access a filing in the CompanyFilings use the bracket [] notation e.g. filings[2]

filings[2]

Get the latest filing

The CompanyFilings class has a latest function that will return the latest Filing. So, to get the latest 10-Q filing, you do the following

# Latest filing makes sense if you filter by form type e.g. 10-Q
snow_10Qs = snow.get_filings(form='10-Q')
latest_10Q = snow_10Qs.latest()

# Or chain the function calls
snow.get_filings(form='10-Q').latest()
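Filtering by form and then taking the latest filing can be pictured with plain Python. The sketch below is illustrative only: the rows are made up, and filter_by_form and latest are hypothetical helpers rather than edgartools functions (the library works on a pyarrow table). It shows two ideas from the text: form may be a single string or a list, and latest means the most recent filing date.

```python
from datetime import date
from typing import List, Union

# Made-up rows for illustration only; not real SEC data
SAMPLE_FILINGS = [
    {"form": "10-K", "filing_date": date(2022, 3, 30)},
    {"form": "10-Q", "filing_date": date(2022, 6, 8)},
    {"form": "8-K", "filing_date": date(2022, 8, 24)},
    {"form": "10-Q", "filing_date": date(2022, 9, 7)},
]


def filter_by_form(filings: List[dict], form: Union[str, List[str]]) -> List[dict]:
    """Keep filings whose form matches a string or any entry of a list."""
    forms = [form] if isinstance(form, str) else list(form)
    return [f for f in filings if f["form"] in forms]


def latest(filings: List[dict]) -> dict:
    """Return the filing with the most recent filing date."""
    return max(filings, key=lambda f: f["filing_date"])


latest_10q = latest(filter_by_form(SAMPLE_FILINGS, "10-Q"))
annual_and_quarterly = filter_by_form(SAMPLE_FILINGS, ["10-Q", "10-K"])
```

Calling latest on unfiltered rows would return whichever filing is most recent regardless of form, which is why filtering first is recommended.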

Get company facts

Facts are an interesting and important dataset about a company, accumulated from data the company provides to the SEC. Company facts are available from the Company Facts endpoint https://data.sec.gov/api/xbrl/companyfacts/CIK{cik:010}.json. It is a JSON endpoint, and edgartools parses the JSON into a structured dataset, a pyarrow.Table.

Getting facts for a company

To get company facts, first get the company, then call company.get_facts()

company = Company.for_ticker("SNOW")
company_facts = company.get_facts()

The result is a CompanyFacts object which wraps the underlying facts and provides convenient ways of working with the facts data. To get access to the underlying data use the facts property.

You can get the facts as a pandas dataframe by calling to_pandas

df = company_facts.to_pandas()

Facts differ among companies. To see what facts are available you can use the facts_meta property.

Getting the facts as a DuckDB table

You can convert the facts to a DuckDB database, which allows you to query the facts using SQL.

    company_facts: CompanyFacts = get_company_facts(1318605)
    db = company_facts.to_duckdb()
    df = db.execute("""
    select * from facts
    """).df()

Working with a Filing

Once you have a filing you can do many things with it, including getting the html text of the filing, getting the xbrl or xml, or listing all the files in the filing.

Getting the html text of a filing

To get the html text of the filing call filing.html()

html = filing.html()

Get the Homepage Url

filing.homepage_url returns the homepage url of the filing. This is the main index page which lists all the files attached in the filing

Get the filing homepage

To get access to all the documents on the filing you would call filing.get_homepage(). This gives you access to the FilingHomepage class that you can use to list all the documents and datafiles on the filing.

Working with XBRL filings

Some filings are in XBRL (eXtensible Business Reporting Language) format. These are mainly newer filings, as the SEC has started requiring the format for them.

If a filing is in XBRL format then it opens up a lot more ways to get structured data about that specific filing and also about the company referred to in that filing.

The Filing class has an xbrl function that will download, parse and structure the filing's XBRL document if one exists. If it does not exist, then filing.xbrl() will return None.

The function filing.xbrl() returns a FilingXbrl instance, which wraps the data, and provides convenient ways of working with the xbrl data.

filing_xbrl = filing.xbrl()
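Because filing.xbrl() can return None, code that depends on XBRL data should check for that case. A minimal sketch follows. require_xbrl is a hypothetical helper and StubFiling is a stand-in used only so the example runs without network access; neither is part of edgartools.

```python
def require_xbrl(filing):
    """Return filing.xbrl(), raising if the filing has no XBRL document."""
    filing_xbrl = filing.xbrl()
    if filing_xbrl is None:
        raise ValueError("This filing has no XBRL document")
    return filing_xbrl


class StubFiling:
    """Stand-in for a Filing, used only to exercise the helper offline."""

    def __init__(self, xbrl_data=None):
        self._xbrl_data = xbrl_data

    def xbrl(self):
        # Mirrors the documented behaviour: None when no XBRL exists
        return self._xbrl_data


with_xbrl = require_xbrl(StubFiling({"facts": 1}))
```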

Using the Filings API

The Filings API allows you to get the Edgar filing indexes published by the SEC. You would use it to get a bulk dataset of SEC filings for a given time period. With this dataset, you could filter by form type, by date or by company, though if you intend to filter by a single company, you should use the Company API.

The get_filings function

The main way to use the Filings API is through the get_filings function.

get_filings accepts the following parameters

  • year: a single year e.g. 2015, a List of years e.g. [2013, 2015], or a range e.g. range(2013, 2016)
  • quarter: a single quarter e.g. 2, a List of quarters e.g. [1, 2, 3], or a range e.g. range(1, 3)
  • index: the type of index. The default is "form". If you want only XBRL filings use "xbrl". You can also use "company", but this gives you the same dataset as "form", sorted by company instead of by form

Get filings for 2021

filings = get_filings(2021)

Get filings for 2021 quarter 4

filings = get_filings(2021, 4)

Get filings between 2010 and 2019

filings = get_filings(range(2010, 2020))

Get XBRL filings for 2022

filings = get_filings(2022, index="xbrl")
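The year and quarter arguments accept an int, a list or a range. One way to picture how such arguments expand into individual quarterly indexes is sketched below. expand_periods and index_url are hypothetical helpers, not edgartools functions, and the URL layout is an assumption about how the SEC publishes its full index rather than something stated in this guide. The sketch also assumes that omitting the quarter means all four quarters.

```python
from typing import Iterable, List, Optional, Tuple, Union

IntOrMany = Union[int, Iterable[int]]


def as_list(value: IntOrMany) -> List[int]:
    """Wrap a single int in a list; materialize lists and ranges."""
    return [value] if isinstance(value, int) else list(value)


def expand_periods(year: IntOrMany,
                   quarter: Optional[IntOrMany] = None) -> List[Tuple[int, int]]:
    """Expand year and quarter arguments into (year, quarter) pairs."""
    # Assumption: no quarter means every quarter of each year
    quarters = [1, 2, 3, 4] if quarter is None else as_list(quarter)
    return [(y, q) for y in as_list(year) for q in quarters]


def index_url(year: int, quarter: int, index: str = "form") -> str:
    """Build the URL of one quarterly index file (assumed layout)."""
    return (f"https://www.sec.gov/Archives/edgar/full-index/"
            f"{year}/QTR{quarter}/{index}.idx")


periods = expand_periods(range(2010, 2020))
```

Note that range(2010, 2020) excludes its end value, so it covers 2010 through 2019, matching the example above.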

The Filings class

get_filings returns a Filings instance, which wraps the returned data and provides convenient ways of working with filings.

Convert the filings to a pandas dataframe

The filings data is stored in the Filings class as a pyarrow.Table. You can get the data as a pandas dataframe using to_pandas

df = filings.to_pandas()

Use DuckDB to query the filings

A convenient way to query the filings data is to use DuckDB. If you call the to_duckdb function, you get an in-memory DuckDB database instance, with the filings registered as a table called filings. Then you can work directly with the DuckDB database, and run SQL against the filings data.

In this example, we filter filings for S-1 form types.

db = filings.to_duckdb()
# a duckdb.DuckDBPyConnection

# Query the filings for S-1 filings and return a dataframe
db.execute("""
select * from filings where Form == 'S-1'
""").df()

Contributing

License

edgartools is distributed under the terms of the MIT license.

Contact

LinkedIn

