Semantic AI RAG System

These details have not been verified by PyPI

Project description

Semantic AI Lib

An open-source framework for Retrieval-Augmented Generation (RAG) systems. It uses semantic search to retrieve the expected results and generates human-readable conversational responses with the help of a Large Language Model (LLM).

Requirements

Python 3.10+ with asyncio

Installation

# Using pip
$ python -m pip install semantic-ai

# Manual install
$ python -m pip install .

Set the environment variables

Put all the credentials in a .env file.

# Default
FILE_DOWNLOAD_DIR_PATH= # default directory name 'download_file_dir'
EXTRACTED_DIR_PATH= # default directory name 'extracted_dir'

# Connector
CONNECTOR_TYPE="connector_name" # sharepoint
SHAREPOINT_CLIENT_ID="client_id"
SHAREPOINT_CLIENT_SECRET="client_secret"
SHAREPOINT_TENANT_ID="tenant_id"
SHAREPOINT_HOST_NAME='<tenant_name>.sharepoint.com'
SHAREPOINT_SCOPE='https://graph.microsoft.com/.default'
SHAREPOINT_SITE_ID="site_id"
SHAREPOINT_DRIVE_ID="drive_id"
SHAREPOINT_FOLDER_URL="folder_url" # /My_folder/child_folder/

# Indexer
INDEXER_TYPE="vector_db_name" # elasticsearch, qdrant
ELASTICSEARCH_URL="elasticsearch_url" # give valid url
ELASTICSEARCH_INDEX_NAME="index_name"
ELASTICSEARCH_SSL_VERIFY="ssl_verify" # True or False
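
As a minimal sketch of how the two default directories in the template above might be resolved, the snippet below reads the variables from a mapping and falls back to the documented defaults when a variable is unset or empty. The helper name `resolve_dirs` is hypothetical and not part of semantic_ai; the library's own `Settings` class may resolve these values differently.

```python
import os
from typing import Mapping

# Defaults taken from the comments in the .env template above.
DEFAULT_DOWNLOAD_DIR = "download_file_dir"
DEFAULT_EXTRACTED_DIR = "extracted_dir"


def resolve_dirs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Hypothetical helper: return both directory settings with defaults applied."""
    return {
        # `or` treats an empty string (e.g. `FILE_DOWNLOAD_DIR_PATH=`) as unset.
        "FILE_DOWNLOAD_DIR_PATH": env.get("FILE_DOWNLOAD_DIR_PATH") or DEFAULT_DOWNLOAD_DIR,
        "EXTRACTED_DIR_PATH": env.get("EXTRACTED_DIR_PATH") or DEFAULT_EXTRACTED_DIR,
    }
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the helper easy to exercise without touching the real environment.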

Method 1: Load the .env file, which should contain the credentials.

%load_ext dotenv
%dotenv
%dotenv relative/or/absolute/path/to/.env

(or)

dotenv -f .env run -- python

Method 2: Use the Settings class.

from semantic_ai.config import Settings
settings = Settings()

1. Import the module

import asyncio
import semantic_ai

2. Download the files from the given source, extract the content from the downloaded files, and index the extracted data in the given vector database.

await semantic_ai.download()
await semantic_ai.extract()
await semantic_ai.index()
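
The calls above use top-level `await`, which works in a notebook or an asyncio-aware REPL. In a plain script, the same three steps would need to be wrapped in a coroutine and started with `asyncio.run`. A minimal sketch of that pattern follows; the three stand-in coroutines are hypothetical placeholders for `semantic_ai.download()`, `semantic_ai.extract()`, and `semantic_ai.index()`, so the snippet runs without the library or any credentials.

```python
import asyncio


async def download() -> str:
    """Stand-in for `await semantic_ai.download()` (assumption, not the real call)."""
    return "download"


async def extract() -> str:
    """Stand-in for `await semantic_ai.extract()`."""
    return "extract"


async def index() -> str:
    """Stand-in for `await semantic_ai.index()`."""
    return "index"


async def run_pipeline() -> list[str]:
    # Each step is awaited before the next starts: extraction needs the
    # downloaded files, and indexing needs the extracted data.
    finished = []
    finished.append(await download())
    finished.append(await extract())
    finished.append(await index())
    return finished


if __name__ == "__main__":
    print(asyncio.run(run_pipeline()))
```

In a real script, the bodies of the stand-ins would be replaced by the corresponding `semantic_ai` calls.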

If the job runs for a long time, we can monitor the number of files processed and the number of files that failed. The names of the processed and failed files are stored in text files in the 'EXTRACTED_DIR_PATH/meta' directory.
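
One way to check progress from a separate process is to count the entries in those text files. The sketch below assumes one file name per line, which the description above does not confirm, and it takes the path as an argument because the exact file names inside the meta directory are not stated here. `count_names` is a hypothetical helper, not part of semantic_ai.

```python
from pathlib import Path


def count_names(list_file: Path) -> int:
    """Count non-empty lines in a text file, assuming one file name per line."""
    if not list_file.exists():
        # The file may not exist yet if the job has not written anything.
        return 0
    with list_file.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())
```

The helper would be called once with the path of the processed-files list and once with the path of the failed-files list, both located under `EXTRACTED_DIR_PATH/meta`.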

Example

Connect to the source and get the connection object. More examples are available in the examples folder. Example: SharePoint connector

from semantic_ai.connectors import Sharepoint

CLIENT_ID = '<client_id>'  # sharepoint client id
CLIENT_SECRET = '<client_secret>'  # sharepoint client secret
TENANT_ID = '<tenant_id>'  # sharepoint tenant id
SCOPE = 'https://graph.microsoft.com/.default'  # scope
HOST_NAME = "<tenant_name>.sharepoint.com"  # for example 'contoso.sharepoint.com'

# Sharepoint object creation
connection = Sharepoint(client_id=CLIENT_ID,
                        client_secret=CLIENT_SECRET,
                        tenant_id=TENANT_ID,
                        host_name=HOST_NAME,
                        scope=SCOPE)
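
Rather than hard-coding the credentials, the same keyword arguments could be collected from the `SHAREPOINT_*` variables defined in the .env file. The following is a minimal sketch under that assumption; `sharepoint_kwargs` is a hypothetical helper and not part of semantic_ai. It only builds the argument dictionary and does not import or call the library.

```python
import os
from typing import Mapping

# Maps each Sharepoint(...) keyword shown above to the matching .env variable.
ENV_BY_KWARG = {
    "client_id": "SHAREPOINT_CLIENT_ID",
    "client_secret": "SHAREPOINT_CLIENT_SECRET",
    "tenant_id": "SHAREPOINT_TENANT_ID",
    "host_name": "SHAREPOINT_HOST_NAME",
    "scope": "SHAREPOINT_SCOPE",
}


def sharepoint_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Hypothetical helper: collect connector arguments, failing fast on gaps."""
    missing = [name for name in ENV_BY_KWARG.values() if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {kwarg: env[name] for kwarg, name in ENV_BY_KWARG.items()}
```

With the variables set, the connection from the example above could then be created as `Sharepoint(**sharepoint_kwargs())`.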

Release history

0.0.5.1: Mar 13, 2024
0.0.5: Feb 15, 2024
0.0.4: Dec 20, 2023
0.0.3: Dec 20, 2023
0.0.2: Nov 9, 2023
0.0.1 (this version): Nov 3, 2023

Download files

Source Distribution

semantic_ai-0.0.1.tar.gz (17.3 kB)

Uploaded Nov 3, 2023 Source

Built Distribution

semantic_ai-0.0.1-py3-none-any.whl (21.3 kB)

Uploaded Nov 3, 2023 Python 3

Hashes for semantic_ai-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`50ee9f25e03af1fb678134a3d62fe666ddaf05863982c5a61d166be412cfa857`
MD5	`8417d26c1aeea51de85fbba7f8bc8eba`
BLAKE2b-256	`994d72d1bc4ecb3e2acff79a8da93fb777a48366057396c7530abc03679c2490`

Hashes for semantic_ai-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`64052dc30587e4d73eed89e3e96f0c998cdd79ebee918f7636823bd52fde1a51`
MD5	`2a3b79f49c740873c0ddfca58a4b007e`
BLAKE2b-256	`a2c36e3b1c68d177ed361115817d1f3724e3b187025783d6be80656eab4ac5f0`
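
The digests listed above can be used to check a downloaded file before installing it. The sketch below computes a file's SHA256 digest with Python's standard `hashlib` module and compares it with an expected value; the function names `sha256_of` and `matches` are illustrative and not part of any packaging tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches(Path("semantic_ai-0.0.1.tar.gz"), "<SHA256 from the table>")` would be expected to return `True` for an intact download of that file.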