tableQA

A tool for running natural-language queries on tabular data such as CSV files and Excel sheets.

Features

  • Supports detection from multiple CSVs.
  • Supports fuzzy string matching: incomplete CSV values in a query are detected and completed automatically.
  • Open-domain; no training required.
  • Accepts a manual schema for a customized experience.
  • Auto-generates a schema when none is provided.
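
The fuzzy-string feature can be pictured with a small standard-library sketch. This is not tableQA's implementation; it only illustrates the idea of completing a partial value against the values present in a column, using difflib. The function name fill_value, the cutoff of 0.6, and the sample values are assumptions made for illustration.

```python
import difflib


def fill_value(partial, known_values, cutoff=0.6):
    """Return the known value closest to `partial`, or None if nothing is close enough."""
    # Compare case-insensitively, then map back to the original spelling.
    lowered = {value.lower(): value for value in known_values}
    matches = difflib.get_close_matches(
        partial.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


# Hypothetical distinct values of a column.
cancer_sites = ["Stomach", "Lung", "Breast"]

print(fill_value("stomac", cancer_sites))  # Stomach
print(fill_value("xyz", cancer_sites))     # None
```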

Configuration:

Install via pip:

pip install tableqa

Install from source:

git clone https://github.com/abhijithneilabraham/tableQA

cd tableQA

python setup.py install

Quickstart

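Run a sample query

import pandas as pd
from tableqa.agent import Agent

df = pd.read_csv("your_file.csv")  # placeholder path: load your own data
agent = Agent(df)  # pass your dataframe
response = agent.query_db("Your question here")
print(response)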

Get an SQL query from the question

sql = agent.get_query("Your question here")
print(sql)  # prints the generated SQL query

Adding a manual schema

Schema Format:
{
    "name": DATABASE NAME,
    "keywords":[DATABASE KEYWORDS],
    "columns":
    [
        {
        "name": COLUMN 1 NAME,
        "mapping":{
            CATEGORY 1: [CATEGORY 1 KEYWORDS],
            CATEGORY 2: [CATEGORY 2 KEYWORDS]
        }

        },
        {
        "name": COLUMN 2 NAME,
        "keywords": [COLUMN 2 KEYWORDS]
        },
        {
        "name": COLUMN 3 NAME,
        "keywords": [COLUMN 3 KEYWORDS],
        "summable":"True"
        }
    ]
}

  • Mappings are for columns whose values fall into only a few distinct classes.
  • Include only the columns that need manual keywords or mappings. The rest will be autogenerated.
  • summable is included for numeric columns whose values are already counts, e.g. Death Count or Cases.
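
As a concrete illustration of the format above, here is a minimal sketch of a schema written as a Python dict, using the table and column names that appear in the SQL example in this README (cancer_death, Cancer_site, Year, Death_Count). The keyword lists and mapping entries are invented for illustration, and check_schema is a hypothetical helper that only compares a dict against the template layout; it is not part of tableQA and does not reproduce any validation the library may perform.

```python
import json

# Hypothetical schema; keywords and mappings are illustrative guesses.
schema = {
    "name": "cancer_death",
    "keywords": ["cancer", "death"],
    "columns": [
        {
            # Few distinct classes, so a mapping is used.
            "name": "Cancer_site",
            "mapping": {
                "Stomach": ["stomach", "gastric"],
                "Lung": ["lung"],
            },
        },
        {"name": "Year", "keywords": ["year"]},
        {
            # Values are already counts, so the column is marked summable.
            "name": "Death_Count",
            "keywords": ["died", "deaths"],
            "summable": "True",
        },
    ],
}


def check_schema(candidate):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    for key in ("name", "keywords", "columns"):
        if key not in candidate:
            problems.append(f"missing top-level key: {key}")
    for index, column in enumerate(candidate.get("columns", [])):
        if "name" not in column:
            problems.append(f"column {index} has no name")
        if "mapping" not in column and "keywords" not in column:
            problems.append(f"column {index} has neither mapping nor keywords")
    return problems


print(check_schema(schema))  # []
print(json.dumps(schema, indent=4))  # text suitable for saving as a .json file
```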

Example (with manual schema):

Database query:

from tableqa.agent import Agent
agent = Agent(df, schema)  # pass the dataframe and schema objects
response = agent.query_db("How many people died of stomach cancer in 2011")
print(response)
# Response: [(22,)]

SQL query:

sql = agent.get_query("How many people died of stomach cancer in 2011")
print(sql)
# SQL query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"
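
To see why a summable column leads to SUM rather than a row count, the sketch below runs a query of the same shape against an in-memory SQLite table. It does not use tableQA, and the rows are invented toy data, not the real dataset. The filter values are passed as bound parameters instead of the double-quoted literals shown above, since double-quoted strings are not accepted by every SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cancer_death (Cancer_site TEXT, Year INTEGER, Death_Count INTEGER)"
)

# Invented rows: each row already carries a count of deaths.
rows = [
    ("Stomach", 2011, 12),
    ("Stomach", 2011, 10),
    ("Stomach", 2012, 7),
    ("Lung", 2011, 30),
]
conn.executemany("INSERT INTO cancer_death VALUES (?, ?, ?)", rows)

where = "WHERE Cancer_site = ? AND Year = ?"
params = ("Stomach", 2011)

# Summing the count column answers "how many people died".
summed = conn.execute(
    f"SELECT SUM(Death_Count) FROM cancer_death {where}", params
).fetchall()

# Counting rows only says how many records matched.
counted = conn.execute(
    f"SELECT COUNT(*) FROM cancer_death {where}", params
).fetchall()

print(summed)   # [(22,)]
print(counted)  # [(2,)]
```

The result comes back as a list of tuples, the same shape as the response shown in the example above.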

Multiple CSVs

Pass the paths of the directories containing the CSVs and the schemas, respectively. Refer to the cleaned_data and schema directories for examples.

Example:

from tableqa.agent import Agent

csv_path = "/content/tableQA/tableqa/cleaned_data"
schema_path = "/content/tableQA/tableqa/schema"
agent = Agent(csv_path, schema_path)
