A collection of code assembled to help streamline data analysis.

===================================================================
----------------------- larkinlab 0.0.20 ------------------------
===================================================================


This library contains functions I have created or come across that I find myself using often.

I will be adding things as I see fit, so be sure to update to the latest version.

Check the CHANGELOG for release info.


======== In The Future ========

- v0.1 in the works

========================================================================================
------------------------- Code Descriptions ------------------------------------------
========================================================================================


----- to install/update ------

pip3 install larkinlab
pip3 install --upgrade larkinlab

--------- Subpackages --------

larkinlab.explore
larkinlab.machinelearning

--------------------------------

========================= ll.explore =============================

This subpackage is built for exploring data. It contains functions that help you quickly get an understanding of the data at hand.

Import
> from larkinlab import explore as llex
> import larkinlab.explore as llex

Dependencies
> pandas
> numpy
> matplotlib.pyplot
> seaborn

--------------------------------
-- functions --
--------------------------------

-------------------------------------
* llex.df_ex(df, head_val) *

The df_ex (dataframe explore) function takes a dataframe and returns a few basic things:
- The number of rows, columns, and total data points
- The names of the columns, limited to the first 60 if more than 60 exist
- A preview of the first n rows of the dataframe via the df.head method, with n set by the head_val parameter

Parameter Default Values
> df :: pandas DataFrame
> head_val =5 :: sets the number of rows to display in the dataframe preview. Works via the pandas .head method. Set to 'all' for all rows
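
A minimal sketch of the kind of summary described above, written with plain pandas. This is not larkinlab's own code: the function name df_overview and the max_cols parameter are hypothetical, and the printed labels are borrowed from the changelog.

```python
import pandas as pd


def df_overview(df, head_val=5, max_cols=60):
    """Hypothetical stand-in for the behaviour described above (not larkinlab's code)."""
    rows, cols = df.shape
    print("Rows:", rows)
    print("Columns:", cols)
    print("Total Data Points:", df.size)
    # Only the first max_cols column names are listed
    print("Column Names:", list(df.columns[:max_cols]))
    # 'all' keeps the whole frame, otherwise take the first head_val rows
    return df if head_val == "all" else df.head(head_val)


df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
preview = df_overview(df, head_val=2)
print(preview)
```

With the real package, the equivalent call would presumably be llex.df_ex(df, head_val=2).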

-------------------------------------
* llex.vcount_ex(df, print_count) *

The vcount_ex function returns the value counts and normalized value counts for all of the columns in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
> print_count =5 :: sets the number of value counts to print for each column. Set to 'all' to print all of them, for example llex.vcount_ex(df, print_count='all')
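
A rough sketch of what "value counts and normalized value counts" looks like in plain pandas. The helper name value_count_summary and the "count"/"share" column labels are assumptions made for illustration, not the library's implementation.

```python
import pandas as pd


def value_count_summary(df, print_count=5):
    """Hypothetical helper: counts and shares per column (not larkinlab's code)."""
    summaries = {}
    for col in df.columns:
        # Raw counts, most frequent value first
        table = df[col].value_counts().to_frame("count")
        # Normalized counts (fractions of non-missing rows), aligned on the same index
        table["share"] = df[col].value_counts(normalize=True)
        if print_count != "all":
            table = table.head(print_count)
        print(f"--- {col} ---")
        print(table)
        summaries[col] = table
    return summaries


df = pd.DataFrame({
    "color": ["red", "red", "blue", "green"],
    "size": ["S", "M", "M", "M"],
})
summaries = value_count_summary(df, print_count=2)
```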

-------------------------------------
* llex.missing_ex(df) *

The missing_ex function prints the number of missing values in each column of the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
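
For reference, the per-column missing count can be reproduced in plain pandas as below. This is only a sketch of the idea; how missing_ex computes or formats its output is not shown in this readme.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, None],
    "c": [1, 2, 3],
})

# isna() marks NaN/None cells as True; summing counts them per column
missing = df.isna().sum()
print(missing)
```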

-------------------------------------
* llex.scat_ex(df) *

The scat_ex function returns a scatterplot representing the value counts and their respective occurrences for each column in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
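
One possible reading of the description above, sketched with pandas and matplotlib: one panel per column, with each distinct value on the x axis and its number of occurrences on the y axis. The function name value_count_scatter and the panel layout are assumptions; the plot scat_ex actually draws may look different.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def value_count_scatter(df):
    """Hypothetical helper: one scatter panel per column (not larkinlab's code)."""
    n = len(df.columns)
    fig, axes = plt.subplots(1, n, figsize=(4 * n, 3), squeeze=False)
    for ax, col in zip(axes[0], df.columns):
        counts = df[col].value_counts()
        # Distinct values as string labels on x, their occurrences on y
        ax.scatter([str(v) for v in counts.index], counts.to_numpy())
        ax.set_title(col)
        ax.set_xlabel("value")
        ax.set_ylabel("occurrences")
    fig.tight_layout()
    return fig


df = pd.DataFrame({
    "color": ["red", "red", "blue", "green"],
    "size": ["S", "M", "M", "M"],
})
fig = value_count_scatter(df)
```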

-------------------------------------
* llex.corr_ex(df, min_corr, min_count, fig_size, colors) *

The corr_ex function returns either a table of Pearson correlation values together with a heatmap of those values, or only the heatmap, for all of the columns in the dataframe passed through it.

Parameter Default Values
> df :: pandas DataFrame
> min_corr =0.2 :: minimum correlation value to appear on heatmap
> min_count =1 :: minimum number of observations required per pair of columns to have a valid result (the min_periods argument of pandas DataFrame.corr)
> fig_size =(8, 10) :: heatmap size, 2 numbers
> colors ='Reds' :: color of the heatmap. The heatmap comes from seaborn, so it uses their color codes
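
A sketch of how min_corr and min_count could act on a correlation table, using plain pandas. The helper name filtered_corr is hypothetical, and filtering on the absolute value of the correlation is an assumption; this readme does not say how corr_ex treats negative correlations.

```python
import pandas as pd


def filtered_corr(df, min_corr=0.2, min_count=1):
    """Hypothetical helper: Pearson correlations with weak pairs blanked out."""
    numeric = df.select_dtypes("number")
    corr = numeric.corr(method="pearson", min_periods=min_count)
    # Assumption: keep cells whose absolute correlation reaches min_corr, NaN elsewhere
    return corr.where(corr.abs() >= min_corr)


df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # perfectly correlated with x
    "z": [1, -1, 1, -1, 1],  # uncorrelated with x
})
result = filtered_corr(df, min_corr=0.2)
print(result)
```

A table like this could then be passed to seaborn.heatmap with cmap='Reds' to draw the heatmap; cells left as NaN are masked automatically.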

-------------------------------------
* llex.help(desc=False) *

A function to list all of the functions in the subpackage, with their descriptions as an optional argument.

Parameter Default Values
> desc =False :: Description. A True value will list each function along with its description and parameters

-------------------------------------
* *


-------------------------------------



========================= ll.machinelearning =============================

This subpackage contains streamlined machine learning models and evaluation tools.

Import
> from larkinlab import machinelearning as llml
> import larkinlab.machinelearning as llml

Dependencies
> pandas
> numpy
> matplotlib.pyplot

--------------------------------
-- functions --
--------------------------------

-------------------------------------
* *


-------------------------------------
* *


-------------------------------------
* *


-------------------------------------



=========================================================================================================================
-------------------------------------------------------------------------------------------------------------------------
=========================================================================================================================


Created By: Conor E. Larkin

email: conor.larkin16@gmail.com
GitHub: github.com/clarkin16
LinkedIn: linkedin.com/in/clarkin16

Thanks for checking this out!

_______________________________________________________________


====================================

----------- CHANGE LOG -----------

====================================
------ Latest Release: 0.0.20 -----
====================================


( Current Version )

0.0.20 (9/30/2021)
----------------------
- fixed explore module issues


====================================
OLD RELEASES
====================================


0.0.19 (9/30/2021)
----------------------
- another quick fix to a file name causing errors


0.0.18 (9/30/2021)
----------------------
- quick fix to a file name causing errors


0.0.17 (9/30/2021)
----------------------
- changelog formatting
- function name changes: dframe_ex to df_ex, func_list to help
- readme changes
- added an "_help" variable to quickly print an individual function's readme section in a pinch, removed previous desc and params approach
- v0.1 coming


0.0.16 (11/2/2020)
----------------------
- fixed typo in explore's explore_info_list


0.0.15 (11/2/2020)
----------------------
- added func_list() to explore subpackage, with desc arg (defaulted to False)
- added function_desc and function_params lists to code, as well as a dictionary with all functions and descriptions


0.0.14 (11/2/2020)
----------------------
- changed long description content type to text/plain instead of text/markdown
- fixed code issue in llex.vcount_ex() function
- readme style changes


0.0.13 (11/2/2020)
----------------------
- readme updates
- added print_count param to llex.vcount_ex() function
- added head_val and max_col param to .explore's dframe_ex() function. Default max columns printed is now 50


0.0.12 (10/29/2020)
----------------------
- changed error in .explore's missing_ex() function's code
- updated .explore corr_ex() function to include min_count arg
- changed .explore.corr_ex() arg hm_only to map_only() with True or False keywords


0.0.11 (10/29/2020)
----------------------
- changed "install_required" values in setup.py


0.0.10 (10/29/2020)
----------------------
- fixed an error in corr_ex() function's code


0.0.9 (10/29/2020)
----------------------
- readme improvements
- added function missing_ex() to .explore
- added function corr_ex() to .explore
- .explore added seaborn dependency
- description change


0.0.8 (10/29/2020)
----------------------
- readme improved
- changed description


0.0.7 (10/29/2020)
----------------------
- updated name to larkinlab from clarklib
- added 2 subpackages: explore, machinelearning
- changed explore.frame_ex to explore.dframe_ex
- deleted clarklib (v0.0.0 - v0.0.6) from pypi, v0.0.7 and onward will be known as larkinlab


0.0.6 (10/29/2020)
----------------------
- Changed README to larkinlab format, with subpackages.
- In The Future section
- commented out long_description in setup.py
- changed check_df() to frame_ex()
- changed vcount_examine() to vcount_ex()
- changed scat_examine() to scat_ex()


0.0.5 (10/28/2020)
----------------------
- Changed the changelog to be in descending chronological order
- Changed description in setup.py
- updated the readme to contain details on using the functions and contact info


0.0.4 (10/28/2020)
----------------------
- Changed check_df() function to only display up to 60 column names.
- Changed check_df() to print "Rows:", "Columns:", and "Total Data Points:" instead of just print(df.shape, df.size)


0.0.3 (10/28/2020)
----------------------
- Added the 'import' section to code in clarklib init file. Works now!


0.0.2 (10/27/2020)
----------------------
- Moved init file into folder


0.0.1 (10/27/2020)
----------------------
- First release
- Added 3 functions: check_df(), vcount_examine(), scat_examine()
