generate alpha factors

Project description

This program automatically generates alpha factors and uses back-testing to filter out the relatively good ones. The time-consuming parts are optimized with the numba package.

Dependencies

  • python >= 3.5

  • pandas >= 0.22.0

  • numpy >= 1.14.0

  • RNWS >= 0.2.1

  • numba >= 0.38.0

  • single_factor_model >= 0.3.0

  • IPython 5.1.0

  • empyrical

  • alphalens

Note: It is best to use the latest version of llvmlite so that numba works properly. Otherwise the kernel may die.

Example

load packages and read in data

from alpha_factory import generator_class,get_memory_use_pct,clean
from RNWS import read
import numpy as np
import pandas as pd
start=20180101
end=20180331
factor_path='.'
frame_path='.'

df=pd.read_csv(frame_path+'/frames.csv')

## read in data

re=read.read_df('./re',file_pattern='re',start=start,end=end)
cap=read.read_df('./cap',file_pattern='cap',header=0,dat_col='cap',start=start,end=end)
open_price,close,vwap,adj,high,low,volume,sus=read.read_df('./mkt_data',file_pattern='mkt',start=start,end=end,header=0,dat_col=['open','close','vwap','adjfactor','high','low','volume','sus'])
ind1,ind2,ind3=read.read_df('./ind',file_pattern='ind',start=start,end=end,header=0,dat_col=['level1','level2','level3'])
inx_weight=read.read_df('./ZZ800_weight','Stk_ZZ800',start=start,end=end,header=None,inx_col=1,dat_col=3)

Note: frames.csv contains the columns df_name, equation, dependency and type, where type is one of df, cap or group. In this example, frames.csv has the df_name values re, cap, open_price, close, vwap, high, low, volume, ind1, ind2, ind3.
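As a quick sanity check on the layout described in the note above, the sketch below builds a small frames table in memory and verifies that it has the four expected columns and only the three listed type values. The helper name check_frames, the type assigned to each row, and the empty equation and dependency fields for raw inputs are assumptions made for illustration; none of them come from alpha_factory itself.

```python
import pandas as pd

REQUIRED_COLUMNS = ['df_name', 'equation', 'dependency', 'type']
ALLOWED_TYPES = {'df', 'cap', 'group'}

def check_frames(frames):
    """Return a list of problems found in a frames table (empty list means OK)."""
    problems = []
    missing = [c for c in REQUIRED_COLUMNS if c not in frames.columns]
    if missing:
        problems.append('missing columns: %s' % missing)
        return problems
    bad_types = sorted(set(frames['type']) - ALLOWED_TYPES)
    if bad_types:
        problems.append('unknown types: %s' % bad_types)
    return problems

# Hypothetical rows: which type each input gets is an assumption.
frames = pd.DataFrame({
    'df_name': ['re', 'cap', 'close', 'ind1'],
    'equation': ['', '', '', ''],
    'dependency': ['', '', '', ''],
    'type': ['df', 'cap', 'df', 'group'],
})
print(check_frames(frames))
```

Running a check like this before passing the table to the generator can surface a misspelled column or type early.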

You can also read the data directly with pd.read_csv, depending on how your data is stored.
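For instance, if the market data sits in one long-format CSV, it can be pivoted into one wide DataFrame per field. This is only a sketch: the file layout and the column names date and ticker are hypothetical, and it assumes the generator works on frames with dates as the index and tickers as the columns, which the pct_change call in the example suggests but the source does not state.

```python
import io
import pandas as pd

# Hypothetical long-format file: one row per (date, ticker).
csv_text = """date,ticker,close,volume
20180102,000001,13.70,1000
20180102,000002,31.50,2000
20180103,000001,13.33,1500
20180103,000002,31.00,1800
"""

# Keep tickers as strings so leading zeros survive.
raw = pd.read_csv(io.StringIO(csv_text), dtype={'ticker': str})

# One wide DataFrame per field: dates down the rows, tickers across the columns.
close = raw.pivot(index='date', columns='ticker', values='close')
volume = raw.pivot(index='date', columns='ticker', values='volume')

# Same date-range filter as start and end in the example above.
start, end = 20180101, 20180331
close = close.loc[start:end]
volume = volume.loc[start:end]
print(close)
```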

start to generate

parms={'re':close.mul(adj).pct_change()
       ,'cap':cap
       ,'open_price':open_price
       ,'close':close
       ,'vwap':vwap
       ,'high':high
       ,'low':low
       ,'volume':volume
       ,'ind1':ind1
       ,'ind2':ind2
       ,'ind3':ind3}

with generator_class(df,factor_path,**parms) as gen:
    gen.generator(batch_size=3,name_start='a')
    gen.generator(batch_size=3,name_start='a')
    gen.output_df(path=frame_path+'/frames_new.csv')
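
The 're' entry in parms is the return of the adjusted close, close.mul(adj).pct_change(). The small worked example below, with made-up numbers, illustrates why the adjustment factor matters: on a split day the raw close drops sharply while the adjusted return stays modest. It assumes adj is a cumulative multiplicative adjustment factor, which the source does not spell out.

```python
import pandas as pd

dates = [20180102, 20180103, 20180104]
# Made-up data: stock 'A' has a 2-for-1 split effective on 20180103.
close = pd.DataFrame({'A': [10.0, 5.5, 5.5], 'B': [20.0, 21.0, 21.0]}, index=dates)
adj = pd.DataFrame({'A': [1.0, 2.0, 2.0], 'B': [1.0, 1.0, 1.0]}, index=dates)

raw_re = close.pct_change()        # ignores the split: 'A' appears to lose 45%
re = close.mul(adj).pct_change()   # same expression as in parms: 'A' gains about 10%

print(raw_re['A'].tolist())
print(re['A'].tolist())
```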

continue to generate with existing frames and factors

with generator_class(df,factor_path,**parms) as gen:
    gen.reload_df(path=frame_path+'/frames_new.csv')
    gen.reload_factors(align=True)
    clean()
    for i in range(5):
        gen.generator(batch_size=2,name_start='a')
        print('step %d memory usage:\t %.1f%% \n'%(i,get_memory_use_pct()))
        if get_memory_use_pct()>80:
            break
    gen.output_df(path=frame_path+'/frames_new2.csv')

Note: It is very important to align all factors and initial dataframes before generating.
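
A minimal sketch of what aligning can look like with plain pandas is shown below: every frame is reindexed to the dates and tickers that all frames share. The helper name align_all is hypothetical, and whether alpha_factory's align=True uses the intersection, the union, or something else is not stated in the source.

```python
from functools import reduce
import pandas as pd

def align_all(frames):
    """Reindex every DataFrame in a dict to the dates and tickers shared by all."""
    idx = reduce(lambda a, b: a.intersection(b),
                 (f.index for f in frames.values())).sort_values()
    cols = reduce(lambda a, b: a.intersection(b),
                  (f.columns for f in frames.values())).sort_values()
    return {name: f.reindex(index=idx, columns=cols) for name, f in frames.items()}

# Two made-up frames that only partly overlap in dates and tickers.
close = pd.DataFrame(1.0, index=[20180102, 20180103, 20180104], columns=['A', 'B', 'C'])
volume = pd.DataFrame(2.0, index=[20180103, 20180104, 20180105], columns=['B', 'C', 'D'])

aligned = align_all({'close': close, 'volume': volume})
print(aligned['close'].shape, aligned['volume'].shape)
```

After this step, element-wise operations between any two frames act on the same dates and tickers, so no NaN is introduced by mismatched labels.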

You can also choose how to store your factors by setting store_method.

back-testing with a stratified sampling approach and the IC-IR measure after generation

data_box_param={'ind':ind1
            ,'price':vwap*adj  # adj holds the 'adjfactor' column read above
            ,'sus':sus
            ,'ind_weight':inx_weight
            ,'path':'./databox'
            }

back_test_param={'sharpe_ratio_thresh':3
                 ,'n':5
                 ,'out_path':'.'
                 ,'back_end':'loky'
                 ,'n_jobs':6
                 ,'detail_root_path':None
                 ,'double_side_cost':0.003
                 ,'rf':0.03
                 }

icir_param={'ir_thresh':0.4
            ,'out_path':'.'
            ,'back_end':'loky'
            ,'n_jobs':6
            }

with generator_class(df,factor_path,**parms) as gen:
    for i in range(5):
        gen.generator(batch_size=2,name_start='a')
        gen.output_df(path=frame_path+'/frames_new.csv')
        gen.getOrCreate_databox(**data_box_param)
        gen.back_test(**back_test_param)
        gen.icir(**icir_param)
        clean()
        if get_memory_use_pct()>90:
            print('Memory exceeded')
            break
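
To make the IC-IR measure concrete, the sketch below computes a per-date rank IC (the cross-sectional Spearman correlation between a factor and the next-period return) and then the IR as the mean IC divided by its standard deviation. This is a generic textbook definition on synthetic data, not necessarily how gen.icir computes it; the function names are hypothetical, and ir_thresh in icir_param is presumably compared against a ratio of this kind.

```python
import numpy as np
import pandas as pd

def daily_rank_ic(factor, fwd_ret):
    """Per-date Spearman correlation between factor values and forward returns."""
    # Rank only the cells present in both frames so the ranks are comparable.
    mask = factor.notna() & fwd_ret.notna()
    f_rank = factor.where(mask).rank(axis=1)
    r_rank = fwd_ret.where(mask).rank(axis=1)
    return f_rank.corrwith(r_rank, axis=1)

def ic_ir(ic):
    """Mean IC divided by the standard deviation of IC."""
    ic = ic.dropna()
    return ic.mean() / ic.std()

# Synthetic data: 20 dates, 30 stocks, returns partly driven by the factor.
rng = np.random.default_rng(0)
dates = pd.RangeIndex(20)
tickers = ['s%02d' % i for i in range(30)]
factor = pd.DataFrame(rng.normal(size=(20, 30)), index=dates, columns=tickers)
noise = pd.DataFrame(rng.normal(size=(20, 30)), index=dates, columns=tickers)
fwd_ret = 0.5 * factor + noise

ic = daily_rank_ic(factor, fwd_ret)
print('mean IC %.3f, IC-IR %.3f' % (ic.mean(), ic_ir(ic)))
```

With real data, fwd_ret would typically be the return frame shifted back by one period so that each factor value is compared with the return that follows it.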

generate script of factors

from alpha_factory import write_file
import pandas as pd
df2=pd.read_csv(frame_path+'/frames_new.csv')
write_file(df2,'script.py')

locate a factor

from alpha_factory.utilise import get_factor_path
factor_name='a0'
path=get_factor_path(factor_path,factor_name)

Note: this only works when storage_method='byTime'.

use your own functions

To use your own functions, append your code to class functions in basic_functions.py in the source files, and add the corresponding names to functions.csv under data in the source files.

After that, you can set debug=True in the generator function to check whether any of those functions contains a bug. If one does, an embedded IPython session is started to help you find out what is going on in the loop.
