Skip to main content

A package to make NLP easy, fast, and fun

Project description

Subtext 0.0.2

A package to make NLP fast and easy for beginners.

  • Efficient text prediction
  • Text pairing, equivalent to that of NLTK's n-gram.
  • Syllable Identification
  • Find frequencies of words in given text
  • Find matching words in two arrays

I still have a lot of plans for this package, for that reason, there would be a lot of frequent updates in the near future. The updates would include optimizations & more functions, so stay tuned.

Install

pip install subtext

Usage

First import the program using:

import subtext

Predict

A function that predicts the next x number of words based on the given string and phrase

Parameters

The function's parameters are:

subtext.predict(string, phrase, n=0, case_insensitive=False)
  • String: Main text
  • Phrase: The key phrase (prompt). The function would try to predict what would come after the given phrase.
  • n: The number of words it would return. It's automomatically set to 0, which would return all predictions regardless of their corresponding word counts.
  • case_insensitive: Set this to True if you want to.

Actual usage

So, let's try to use this.

string="I am a string. I am also a human being, but most importantly, I am a string."
print(predict(string, "I am", n=1))

This would output

{'a': 2, 'also': 1}

But, if you change the n value,

print(predict(string, "I am", n=2))

It would output

{'a string.': 2, 'also a': 1}

Pair

This function splits a string into pairs of strings.

Parameters

subtext.pair(string, n)
  • string is the string you're trying to split into pairs
  • n stands for the number of strings in each pair. (Equivalent to that of the n value in n-gram)

Usage

Let's set our string to:

string="Sometimes, I just go out and eat sand. I don't know why"

Don't ask. Let's turn this into pairs of 2:

print(pair(string, 2))

Which outputs

[['Sometimes,', 'I'], ['I', 'just'], ['just', 'go'], ['go', 'out'], ['out', 'and'], ['and', 'eat'], ['eat', 'sand.'], ['sand.', 'I'], ['I', "don't"], ["don't", 'know'], ['know', 'why']]

Identify Syllables

subtext.syllables("carbonmonoxide")

This outputs:

car-bon-mon-ox-ide

But take note that this only works with lowercase strings.

Countwords

Parameters

The function's parameters are:

subtext.countwords(string, case_insensitive=False)

Change that to True if you want it to be case-insensitive.

Actual usage

Get yourself a nice string

string = "Sometimes I wonder, 'Am I stupid?' then I realize, yeah. yeah, I am stupid."

Then put it in the function:

x = subtext.countwords(string)
print(x)

It should print:

{'I': 4, 'Sometimes': 1, 'wonder,': 1, "'Am": 1, "stupid?'": 1, 'then': 1, 'realize,': 1, 'yeah.': 1, 'yeah,': 1, 'am': 1, 'stupid.': 1}

Matchingwords

A function that finds & counts matching words in two strings

Actual usage

So in this case, our strings are:

string1, string2 = "God, I love drawing, drawing is my favourite thing to do", "God, I hate drawing, drawing is my least favourite thing to do"

If we run this through matchingwords, we would get:

{'God,': 1, 'I': 1, 'drawing,': 1, 'drawing': 1, 'is': 1, 'my': 1, 'favourite': 1, 'thing': 1, 'to': 1, 'do': 1}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Subtext-0.0.2.tar.gz (4.8 kB view hashes)

Uploaded Source

Built Distribution

Subtext-0.0.2-py3-none-any.whl (4.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page