
generate alpha factors

Project description

This program automatically generates alpha factors and uses back-testing to select the relatively good ones.

Dependencies

  • python >= 3.5

  • pandas >= 0.22.0

  • numpy >= 1.14.0

  • RNWS >= 0.2.0

  • numba >= 0.38.0

  • single_factor_model>=0.3.0

  • empyrical

  • alphalens

Sample

load packages and read in data

from alpha_factory import generator_class,get_memory_use_pct,clean
from RNWS import read
import numpy as np
import pandas as pd
start=20180101
end=20180331
factor_path='.'
frame_path='.'

df=pd.read_csv(frame_path+'/frames.csv')

# read in data
re=read.read_df('./re',file_pattern='re',start=start,end=end)
cap=read.read_df('./cap',file_pattern='cap',header=0,dat_col='cap',start=start,end=end)
open_price,close,vwap,adj,high,low,volume,sus=read.read_df('./mkt_data',file_pattern='mkt',start=start,end=end,header=0,dat_col=['open','close','vwap','adjfactor','high','low','volume','sus'])
ind1,ind2,ind3=read.read_df('./ind',file_pattern='ind',start=start,end=end,header=0,dat_col=['level1','level2','level3'])
inx_weight=read.read_df('./ZZ800_weight','Stk_ZZ800',start=start,end=end,header=None,inx_col=1,dat_col=3)

Note: frames.csv contains the columns df_name, equation, dependency and type, where type is one of df, cap or group. In this example, frames.csv has the following df_name values: re, cap, open_price, close, vwap, high, low, volume, ind1, ind2, ind3.
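As an illustration of the layout described in the note, a starting frames.csv could be assembled with pandas as sketched below. The mapping of each df_name to a type (cap for cap, group for the industry frames, df for everything else) and the empty equation and dependency cells are assumptions made for this example; they are not confirmed by the alpha_factory documentation.

```python
import io
import pandas as pd

# Assumed df_name -> type mapping. Only the allowed type values
# (df, cap, group) and the df_name list come from the note above.
assumed_types = {
    "re": "df", "cap": "cap", "open_price": "df", "close": "df",
    "vwap": "df", "high": "df", "low": "df", "volume": "df",
    "ind1": "group", "ind2": "group", "ind3": "group",
}

frames = pd.DataFrame(
    {
        "df_name": list(assumed_types.keys()),
        "equation": "",    # assumption: raw inputs have no generating equation
        "dependency": "",  # assumption: raw inputs depend on nothing
        "type": list(assumed_types.values()),
    },
    columns=["df_name", "equation", "dependency", "type"],
)

# Write to an in-memory buffer; use frames.to_csv('frames.csv', index=False)
# to create the actual file.
buffer = io.StringIO()
frames.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```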

start to generate

parms={'re':close.mul(adj).pct_change()
       ,'cap':cap
       ,'open_price':open_price
       ,'close':close
       ,'vwap':vwap
       ,'high':high
       ,'low':low
       ,'volume':volume
       ,'ind1':ind1
       ,'ind2':ind2
       ,'ind3':ind3}

with generator_class(df,factor_path,**parms) as gc:
    gc.generator(batch_size=3,name_start='a')
    gc.generator(batch_size=3,name_start='a')
    gc.output_df(path=frame_path+'/frames_new.csv')
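The 're' entry in parms is built from close prices scaled by the adjustment factor. The toy example below shows why the adjustment matters, using made-up numbers for two hypothetical tickers: a 2-for-1 split halves the raw close of ticker A, while the adjusted return stays at a sensible 10%. The explicit shift-and-divide used here gives the same result as pct_change() when no values are missing.

```python
import pandas as pd

dates = pd.to_datetime(["2018-01-02", "2018-01-03"])

# Hypothetical data: ticker A splits 2-for-1 on the second day.
close = pd.DataFrame({"A": [10.0, 5.5], "B": [20.0, 20.0]}, index=dates)
adj = pd.DataFrame({"A": [1.0, 2.0], "B": [1.0, 1.0]}, index=dates)

# Adjusted close removes the artificial price jump caused by the split.
adj_close = close.mul(adj)

# Simple one-period return on the adjusted series.
re_adjusted = adj_close / adj_close.shift(1) - 1

# For comparison: the return on raw prices shows a spurious -45% for A.
re_raw = close / close.shift(1) - 1
```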

continue to generate with existing frames and factors

with generator_class(df,factor_path,**parms) as gc:
    gc.reload_df(path=frame_path+'/frames_new.csv')
    gc.reload_factors(align=True)
    clean()
    for i in range(5):
        gc.generator(batch_size=2,name_start='a')
        print('step %d memory usage:\t %.1f%% \n'%(i,get_memory_use_pct()))
        if get_memory_use_pct()>80:
            break
    gc.output_df(path=frame_path+'/frames_new2.csv')

Note: It is very important to align all factors and initial dataframes before generating.
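One way to align a set of dataframes by hand is to restrict them all to the dates and tickers they have in common. The helper below is a minimal sketch using plain pandas; align_frames is a hypothetical name, and it is not claimed to match what reload_factors(align=True) does internally.

```python
from functools import reduce
import pandas as pd

def align_frames(frames):
    """Reindex every DataFrame in `frames` (a dict) to the shared dates and tickers."""
    common_index = reduce(
        lambda a, b: a.intersection(b), (f.index for f in frames.values())
    ).sort_values()
    common_columns = reduce(
        lambda a, b: a.intersection(b), (f.columns for f in frames.values())
    ).sort_values()
    return {
        name: f.reindex(index=common_index, columns=common_columns)
        for name, f in frames.items()
    }

# Hypothetical inputs with partly overlapping dates and tickers.
a = pd.DataFrame(
    1.0,
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]),
    columns=["A", "B", "C"],
)
b = pd.DataFrame(
    2.0,
    index=pd.to_datetime(["2018-01-03", "2018-01-04", "2018-01-05"]),
    columns=["B", "C", "D"],
)

aligned = align_frames({"a": a, "b": b})
```

Taking the intersection drops any date or ticker that is missing from at least one input; taking the union and leaving NaN would be the alternative if no data should be discarded.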

You can also choose how factors are stored by setting store_method.

back-testing with a stratified sampling approach and the IC-IR measure after generation

data_box_param={'ind':ind1
            ,'price':vwap*adj
            ,'sus':sus
            ,'ind_weight':inx_weight
            ,'path':'./databox'
            }

back_test_param={'sharpe_ratio_thresh':3
                 ,'n':5
                 ,'out_path':'.'
                 ,'back_end':'loky'
                 ,'n_jobs':6
                 ,'detail_root_path':None
                 ,'double_side_cost':0.003
                 ,'rf':0.03
                 }

icir_param={'ir_thresh':1
            ,'out_path':'.'
            ,'back_end':'loky'
            ,'n_jobs':6
            }

with generator_class(df,factor_path,**parms) as gen:
    for i in range(5):
        gen.generator(batch_size=2,name_start='a')
        gen.output_df(path=frame_path+'/frames_new.csv')
        gen.getOrCreate_databox(**data_box_param)
        gen.back_test(**back_test_param)
        gen.icir(**icir_param)
        clean()
        if get_memory_use_pct()>90:
            print('Memory exceeded')
            break
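For readers unfamiliar with the IC-IR measure, the sketch below computes it in a common textbook form: the information coefficient (IC) is the per-date cross-sectional rank correlation between factor values and next-period returns, and the information ratio (IR) is the mean of the IC series divided by its standard deviation. The function names and the toy data are hypothetical, and the exact definition, scaling or annualisation used by gen.icir and its ir_thresh parameter may differ.

```python
import pandas as pd

def rank_ic(factor, fwd_ret):
    """Per-date Spearman correlation between factor values and forward returns."""
    # Pearson correlation of cross-sectional ranks equals Spearman correlation.
    factor_rank = factor.rank(axis=1)
    return_rank = fwd_ret.rank(axis=1)
    return factor_rank.corrwith(return_rank, axis=1)

def information_ratio(ic):
    """Mean IC divided by the sample standard deviation of IC."""
    return ic.mean() / ic.std()

# Hypothetical data: three dates, three tickers.
dates = pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"])
tickers = ["A", "B", "C"]
factor = pd.DataFrame([[1.0, 2.0, 3.0]] * 3, index=dates, columns=tickers)
fwd_ret = pd.DataFrame(
    [[0.01, 0.02, 0.03],   # same ordering as the factor
     [0.03, 0.02, 0.01],   # reversed ordering
     [0.01, 0.03, 0.02]],  # partly matching ordering
    index=dates,
    columns=tickers,
)

ic = rank_ic(factor, fwd_ret)
ir = information_ratio(ic)
```

Both inputs are expected to be aligned date-by-ticker frames, with fwd_ret already shifted so that each row holds the return realised after the factor was observed.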

generate script of factors

from alpha_factory import write_file
import pandas as pd
df2=pd.read_csv(frame_path+'/frames_new.csv')
write_file(df2,'script.py')

find a factor

from alpha_factory.utilise import get_factor_path
factor_name='a0'
path=get_factor_path(factor_path,factor_name)


Built Distribution


alpha_factory-0.3.0-py3-none-any.whl (27.1 kB)

Uploaded Python 3

File details

Details for the file alpha_factory-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: alpha_factory-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 27.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.14.2 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.23.4 CPython/3.5.5

File hashes

Hashes for alpha_factory-0.3.0-py3-none-any.whl
  • SHA256: 17346a69ee8876172d8071b24e9a802752c34c1a94bbefbee7fe2f7db65a7116
  • MD5: 46eb956cd4818f6c5e69dce37647b20b
  • BLAKE2b-256: eb4e17bae290ab628851aadb932ffd2afce5ff30e73cffffa508125ac177a44c
