Package to extract binary files into pandas dataframes
Project description
RPH extraction
Contains a tool to read a .rph file into an RphData structure.
Usage
A simple example is given below:
from AmiAutomation import RphData
data = RphData.rphToDf(path = "path_to_rph_file")
# Table data inside a dataframe
dataframe = data.dataframe
Binaries extraction
This package contains tools to extract binary data from the following PX3 files:
- Heat Log
- 2 Second Log
- Wave Log
- Composite
- Histogram
The extracted data is returned as a pandas dataframe for further processing.
Usage
Import the modules the same way as from any Python package:
from AmiAutomation import PX3_Bin, LogData
From there you can call a method with the module prefix:
dataFrame = PX3_Bin.file_to_df(path = "C:\\Binaries")
or
dataFrame = LogData.binFileToDF(path = "C:\\Binaries")
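The doubled backslashes in the paths above are Python string escapes for a single Windows path separator. As a small illustration using only the standard library (nothing here is part of AmiAutomation), a raw string or `pathlib.PureWindowsPath` yields the same path text:

```python
from pathlib import PureWindowsPath

escaped = "C:\\Binaries"   # escaped backslash, as used in the examples above
raw = r"C:\Binaries"       # raw string, no escaping needed

# Both literals describe the same path text
same_text = escaped == raw

# PureWindowsPath builds the path from parts and can be used on any operating system
built = str(PureWindowsPath("C:/") / "Binaries")

print(same_text, built)
```

A raw string cannot end in a single backslash, so a path with a trailing separator still needs the escaped form.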
LogData Methods
You can get binary log data as a LogData object, which contains useful information about the binary file, including its samples inside a pandas dataframe.
LogData.binFileToDF
Unpacks binary file into LogData
Parameters:
- path : str Complete file path
- extension : str, optional Explicitly enforce the file extension, e.g. 'bin'
- null_promoting : dict, optional A dictionary that maps a .NET source type (key) to one of the following values: default, object, float, Int64, string, error.
The possible dictionary keys are the .NET simple types:
- "SByte" : Signed Byte
- "Byte" : Unsigned Byte
- "Int16" : 16 bit integer
- "UInt16" : 16 bit unsigned integer
- "Int32" : 32 bit integer
- "UInt32" : 32 bit unsigned integer
- "Int64" : 64 bit integer
- "UInt64" : 64 bit unsigned integer
- "Char" : Character
- "Single" : Floating point single precision
- "Double" : Floating point double precision
- "Boolean" : bit
- "Decimal" : 16 byte decimal precision
- "DateTime" : Date time
The dictionary values determine how null values encountered during deserialization affect the resulting LogData dataframe column:
- "default" : use pandas automatic inference when dealing with null values on a column
- "object" : The returned type is the generic python object type
- "float" : The returned type is the python float type
- "Int64" : The returned type is the pandas Nullable Integer Int64 type
- "string" : Values are returned as strings
- "error" : Raises an exception when null values are encountered
Returns:
- LogData : Structure containing most file data
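To illustrate why the null_promoting values matter, the sketch below uses plain pandas (not AmiAutomation) to show how a single missing value changes the dtype of an integer column. The pairing of each option with the dtype shown is an assumption based on the descriptions above; the library's actual conversion may differ.

```python
import pandas as pd

values = [1, None, 3]  # an integer column with one missing sample

# "default"-like behavior: pandas inference promotes the column to float64
inferred = pd.Series(values)

# "object"-like behavior: values are kept as generic Python objects
as_object = pd.Series(values, dtype="object")

# "float"-like behavior: the missing value becomes NaN in a float64 column
as_float = pd.Series(values, dtype="float")

# "Int64"-like behavior: pandas nullable integer, the missing value becomes <NA>
as_int64 = pd.Series(values, dtype="Int64")

for name, series in [("inferred", inferred), ("object", as_object),
                     ("float", as_float), ("Int64", as_int64)]:
    print(name, series.dtype)
```

Only the nullable Int64 column keeps the values as integers while still marking the gap, which is why it can be a useful choice for integer source types.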
Examples
Simple file conversion
from AmiAutomation import LogData
#This returns the whole data
logData = LogData.binFileToDF("bin_file_path.bin")
#To access samples just access the dataframe inside the LogData object
dataFrame = logData.dataFrame
Conversion with null promoting
from AmiAutomation import LogData
#Adding null promoting to handle missing values in these types of data as object
logData = LogData.binFileToDF("bin_file_path.bin", null_promoting={"Int32":"object", "Int16":"object", "Int64":"object"})
#To access samples just access the dataframe inside the LogData object
dataFrame = logData.dataFrame
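When many source types should share the same promotion, the dictionary can be built programmatically. The following is a minimal sketch: `make_null_promoting` is a hypothetical helper, not part of AmiAutomation, and the allowed keys and values are simply copied from the lists in the parameter description.

```python
# Keys and values copied from the parameter description of LogData.binFileToDF
NET_TYPES = {
    "SByte", "Byte", "Int16", "UInt16", "Int32", "UInt32", "Int64", "UInt64",
    "Char", "Single", "Double", "Boolean", "Decimal", "DateTime",
}
PROMOTIONS = {"default", "object", "float", "Int64", "string", "error"}


def make_null_promoting(promotion, net_types):
    """Hypothetical helper: map every given .NET type to the same promotion."""
    unknown = set(net_types) - NET_TYPES
    if unknown:
        raise ValueError(f"Unknown .NET types: {sorted(unknown)}")
    if promotion not in PROMOTIONS:
        raise ValueError(f"Unknown promotion: {promotion!r}")
    return {net_type: promotion for net_type in net_types}


# Same dictionary as in the example above
mapping = make_null_promoting("object", ("Int32", "Int16", "Int64"))
print(mapping)
```

The resulting dictionary can then be passed as the null_promoting argument. Validating the names first catches a misspelled key before any file is read.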
This method can also be used to retrieve the data table from a ".cpst" or ".hist" file. Detection is automatic based on the file extension; if none is given, ".bin" is assumed.
PX3_Bin Methods
This method returns a single pandas dataframe containing extracted data from the provided file, path or path with constrained dates
file_to_df ( path, file, start_time, end_time, verbose = False )
- To process a single file you need to provide the absolute path in the file argument
dataFrame = PX3_Bin.file_to_df(file = "C:\\Binaries\\20240403T002821Z$-4038953271967.bin")
- To process several files just provide the directory path where the binaries are (binaries inside sub-directories are also included)
dataFrame = PX3_Bin.file_to_df(path = "C:\\Binaries\\")
- You can constrain which binaries inside a directory (and its sub-directories) are processed by providing either a start date, or both a start date and an end date, as Python datetime.datetime objects
import datetime
time = datetime.datetime(2020,2,15,13,30) # February 15th 2020, 1:30 PM
### This returns ALL the data available in the path from the given date up to the current time
dataFrame = PX3_Bin.file_to_df(path = "C:\\Binaries\\", start_time=time)
import datetime
time_start = datetime.datetime(2020,2,15,13,30) # February 15th 2020, 1:30 PM
time_end = datetime.datetime(2020,2,15,13,45) # February 15th 2020, 1:45 PM
### This returns all the data available in the path within the given 15-minute window
dataFrame = PX3_Bin.file_to_df(path = "C:\\Binaries\\", start_time=time_start, end_time=time_end )
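Before running a long extraction, it can help to preview which files a directory and time window might cover. The sketch below is not how PX3_Bin selects files internally. It assumes that each file name starts with a 'YYYYMMDDTHHMMSSZ' stamp, as in the single-file example above, and that this stamp can be compared directly with naive datetime values. The helper names and the demo file names are hypothetical.

```python
import datetime
import tempfile
from pathlib import Path

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # assumed layout of the file-name prefix
STAMP_LENGTH = 16                # len("20240403T002821Z")


def file_timestamp(path):
    """Parse the leading stamp of a file name; raises ValueError if it is absent."""
    return datetime.datetime.strptime(Path(path).name[:STAMP_LENGTH], STAMP_FORMAT)


def binaries_in_window(directory, start_time, end_time=None):
    """List .bin files under directory (recursively) stamped within the window."""
    if end_time is None:
        end_time = datetime.datetime.now()  # open-ended: up to the current time
    matches = []
    for path in sorted(Path(directory).rglob("*.bin")):
        if start_time <= file_timestamp(path) <= end_time:
            matches.append(path)
    return matches


# Demonstration with made-up, empty files in a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "sub").mkdir()
    (Path(root) / "20200215T133500Z$-1.bin").touch()
    (Path(root) / "sub" / "20200215T140000Z$-2.bin").touch()

    start = datetime.datetime(2020, 2, 15, 13, 30)
    end = datetime.datetime(2020, 2, 15, 13, 45)

    windowed = [p.name for p in binaries_in_window(root, start, end)]
    open_ended = [p.name for p in binaries_in_window(root, start)]

print(windowed)
print(open_ended)
```

The stamp in the file name ends in "Z", which suggests UTC, so a local-time window may need to be converted before comparing.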
Tested with package versions
- pythonnet 2.5.1
- pandas 1.1.0
Hashes for AmiAutomation-0.1.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ca48cf7c80525a401b195e6847d63145d13634ed535e4cdd5dd2b4b2f9e6dd14
MD5 | 419a0c2dba43bef2e6d0279ba1791d0f
BLAKE2b-256 | 05a030a4360c86d431fe84eab9c4ca121adb5c4c7176fa0e7dd417ac526d7a90