
A collection of code I have either made or found that helps streamline data analysis.


===================================================================
----------------------- larkinlab 0.0.14 ------------------------
===================================================================


This library contains the functions I have created or come across that I find myself using often.

I will be adding functions as I need them, so be sure to update to the latest version.

Check the CHANGELOG for release info.


======== In The Future ========

- a new plot_ex() function for larkinlab.explore that returns various graphs for quick analysis. It will work by returning plots for every column in the dataframe at once.
- a set of colex (column explore) functions that do some of this work over individual columns rather than the entire dataframe
- a set of functions that run basic machine learning algorithms over dataframes and return evaluation metrics, also with a colex version to come later

========================================================================================
------------------------- Code Descriptions ------------------------------------------
========================================================================================


----- to install/update ------

pip3 install larkinlab
pip3 install --upgrade larkinlab

-------- to import -----------

import larkinlab as ll

--------- Subpackages --------

larkinlab.explore
larkinlab.machinelearning

--------------------------------

========================= ll.explore =============================

This is built for exploring data. Contains functions that help you get an understanding of the data at hand quickly.

Import
> from larkinlab import explore as llex
> import larkinlab.explore as llex

Dependencies
> pandas
> numpy
> matplotlib.pyplot
> seaborn

--------------------------------
-- functions --
--------------------------------

-------------------------------------
* llex.dframe_ex(df, head_val) *

The dframe_ex function takes a dataframe and reports:
- The number of rows, columns, and total data points
- The names of the columns, limited to the first 60 if more than 60 exist
- A preview of up to the first n rows of the dataframe via the df.head method, where n is set by the head_val parameter

Parameter Default Values
> df :: pandas DataFrame
> head_val =5 :: sets the number of rows to display in the dataframe preview. Works via the pandas .head method. Set to 'all' for all rows
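
As a rough illustration of the report described above, here is a minimal pandas-only sketch. It is not the library's source: describe_frame and max_names are hypothetical names, and the 60-name cap simply follows the description above (the changelog mentions a max_col parameter with a default of 50, so treat the exact limit as an assumption).

```python
import pandas as pd


def describe_frame(df, head_val=5, max_names=60):
    """Hypothetical stand-in for the kind of report dframe_ex is described as giving."""
    n_rows, n_cols = df.shape
    summary = {
        "rows": n_rows,
        "columns": n_cols,
        "total_data_points": df.size,  # rows * columns
        "column_names": list(df.columns[:max_names]),  # capped list of names
    }
    # 'all' keeps every row; anything else is handed to DataFrame.head
    preview = df if head_val == "all" else df.head(head_val)
    return summary, preview


df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
summary, preview = describe_frame(df, head_val=2)
print("Rows:", summary["rows"])
print("Columns:", summary["columns"])
print("Total Data Points:", summary["total_data_points"])
print(summary["column_names"])
print(preview)
```

With larkinlab installed, the documented call would be llex.dframe_ex(df, head_val=2).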

-------------------------------------
* llex.vcount_ex(df, print_count) *

The vcount_ex function returns the value counts and normalized value counts for all of the columns in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
> print_count =5 :: sets the number of value counts to print for each column. Set to 'all' for all of them, for example - (df, print_count='all')
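
To make "value counts and normalized value counts" concrete, the sketch below builds both for every column with plain pandas. value_count_table is a hypothetical helper written for this example, not the function shipped in larkinlab, and the 'count' and 'share' labels are assumptions.

```python
import pandas as pd


def value_count_table(df, print_count=5):
    """Hypothetical helper: counts and normalized counts for each column."""
    tables = {}
    for col in df.columns:
        counts = df[col].value_counts()
        # Dividing by the total matches value_counts(normalize=True)
        table = pd.DataFrame({"count": counts, "share": counts / counts.sum()})
        # 'all' keeps every distinct value; otherwise keep the top print_count
        tables[col] = table if print_count == "all" else table.head(print_count)
    return tables


df = pd.DataFrame({
    "color": ["red", "red", "blue", "green"],
    "size": ["S", "M", "M", "M"],
})
tables = value_count_table(df, print_count=2)
for col, table in tables.items():
    print(col)
    print(table)
```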

-------------------------------------
* llex.missing_ex(df) *

The missing_ex function prints the number of missing values in each column of the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
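
For reference, a per-column count of missing values can be produced in pandas as shown below. This is only a sketch of the idea; how missing_ex itself formats its printout is not shown here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, None],
    "c": [1, 2, 3],
})

# isna() marks NaN/None cells; summing the booleans counts them per column
missing = df.isna().sum()
print(missing)
```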

-------------------------------------
* llex.scat_ex(df) *

The scat_ex function returns a scatterplot of the value counts and their respective occurrences for each column in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
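
The points behind such a plot can be sketched without any plotting code. The example below assumes each distinct value goes on one axis and its number of occurrences on the other; value_count_points is a hypothetical helper, and the axis assignment is an assumption, not something the description above specifies.

```python
import pandas as pd


def value_count_points(series):
    """Hypothetical helper: distinct values and how often each one occurs."""
    counts = series.value_counts().sort_index()  # order points by value
    return counts.index.to_list(), counts.to_list()


df = pd.DataFrame({"rating": [5, 3, 5, 4, 5, 3]})
x, y = value_count_points(df["rating"])
print(x, y)
```

The two lists can then be passed to matplotlib.pyplot.scatter(x, y) to draw one plot per column.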

-------------------------------------
* llex.corr_ex(df, min_corr, min_count, fig_size, colors) *

The corr_ex function returns either a chart of Pearson correlation values together with a heatmap of those values, or only the heatmap, for all of the columns in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
> min_corr =0.2 :: minimum correlation value to appear on heatmap
> min_count =1 :: minimum number of observations required per pair of columns to have a valid result (passed as the min_periods argument of pandas DataFrame.corr)
> fig_size =(8, 10) :: heatmap size, 2 numbers
> colors ='Reds' :: color of the heatmap. The heatmap comes from seaborn, so it uses their color codes
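
The sketch below shows one way the correlation table behind such a heatmap could be prepared with pandas. filtered_corr is a hypothetical helper, and applying min_corr to the absolute value of each correlation is an assumption; the library may filter differently.

```python
import pandas as pd


def filtered_corr(df, min_corr=0.2, min_count=1):
    """Hypothetical helper: Pearson correlations with weak values blanked out."""
    numeric = df.select_dtypes("number")  # correlations need numeric columns
    corr = numeric.corr(method="pearson", min_periods=min_count)
    # Keep cells whose magnitude reaches min_corr; the rest become NaN
    return corr.where(corr.abs() >= min_corr)


df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # moves exactly with x
    "z": [1, -1, 1, -1, 1],  # unrelated to x
})
result = filtered_corr(df, min_corr=0.2, min_count=1)
print(result)
```

A table like this could then be drawn with seaborn.heatmap(result, cmap='Reds'), where NaN cells are left blank.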

-------------------------------------



========================= ll.machinelearning =============================

This package contains streamlined machine learning models and evaluation tools

Import
> from larkinlab import machinelearning as llml

Dependencies
> pandas
> numpy
> matplotlib.pyplot

--------------------------------
-- functions --
--------------------------------

-------------------------------------



=========================================================================================================================
-------------------------------------------------------------------------------------------------------------------------
=========================================================================================================================


Created By: Conor Larkin

email: conor.larkin16@gmail.com
GitHub: github.com/clarkin16
LinkedIn: linkedin.com/in/clarkin16

Thanks for checking this out!

_______________________________________________________________


====================================

----------- CHANGE LOG -----------

====================================
------ Latest Release: 0.0.14 -----
====================================


0.0.14 (11/2/2020)
----------------------
- changed long description content type to text/plain instead of text/markdown.
- fixed a code issue in the llex.vcount_ex() function
- readme style changes


0.0.13 (11/2/2020)
----------------------
- readme updates
- added print_count param to llex.vcount_ex() function
- added head_val and max_col params to .explore's dframe_ex function. Default max columns printed is now 50


0.0.12 (10/29/2020)
----------------------
- fixed an error in the code of .explore's missing_ex() function
- updated .explore corr_ex() function to include min_count arg
- changed .explore.corr_ex() arg hm_only to map_only, which takes True or False


0.0.11 (10/29/2020)
----------------------
- changed "install_requires" values in setup.py


0.0.10 (10/29/2020)
----------------------
- fixed an error in corr_ex() function's code


0.0.9 (10/29/2020)
----------------------
- readme improvements
- added function missing_ex() to .explore
- added function corr_ex() to .explore
- .explore added seaborn dependency
- description change


0.0.8 (10/29/2020)
----------------------
- readme improved
- changed description


0.0.7 (10/29/2020)
----------------------
- updated name to larkinlab from clarklib
- added 2 subpackages: explore, machinelearning
- changed explore.frame_ex to explore.dframe_ex
- deleted clarklib (v0.0.0 - v0.0.6) from pypi, v0.0.7 and onward will be known as larkinlab


0.0.6 (10/29/2020)
----------------------
- Changed README to larkinlab format, with subpackages.
- In The Future section
- commented out long_description in setup.py
- changed check_df() to frame_ex()
- changed vcount_examine() to vcount_ex()
- changed scat_examine() to scat_ex()


0.0.5 (10/28/2020)
----------------------
- Changed the changelog to be in descending chronological order
- Changed description in setup.py
- updated the readme to contain details on using the functions and contact info


0.0.4 (10/28/2020)
----------------------
- Changed check_df() function to only display up to 60 column names.
- Changed check_df() to print "Rows:", "Columns:", and "Total Data Points:" instead of just print(df.shape, df.size)


0.0.3 (10/28/2020)
----------------------
- Added the 'import' section to code in clarklib init file. Works now!


0.0.2 (10/27/2020)
----------------------
- Moved init file into folder


0.0.1 (10/27/2020)
----------------------
- First release
- Added 3 functions: check_df(), vcount_examine(), scat_examine()
