
Unformatter tool for parsing and analysing codebases.


tucan

TUCAN (Tool to Unformat, Clean, and Analyse) is a code parser for scientific codebases. Its target languages are:

  • Very old FORTRAN
  • Recent FORTRAN
  • Python (Under development)
  • C/C++ (Early development)

Installation

You can install it from PyPI with:

pip install tucan

You can also install it from source, using one of our gitlab mirrors.

What does it do?

Remove coding archaisms

First, it is a code cleaner. Take for example this loop from `tranfit.f`, a piece of the CHEMKIN II package written in good old FORTRAN 77 (do not worry, recent Chemkin is probably not written that way):

(547)      DO 2000 K = 1, KK-1
(548)         DO 2000 J = K+1, KK
(549)            DO 2000 N = 1, NO
(550)               COFD(N,J,K) = COFD(N,K,J)
(551) 2000 CONTINUE

The command tucan clean tranfit.f translates it into:

(547-547)        do 2000 k  =  1,kk-1
(548-548)           do 2000 j  =  k+1,kk
(549-549)              do 2000 n  =  1,no
(550-550)                 cofd(n,j,k)  =  cofd(n,k,j)
(551-551)              end do ! 2000
(551-551)           end do ! 2000
(551-551)        end do ! 2000

The cleaned version is simpler code for later analysis passes, such as computing cyclomatic complexity or extracting structures.
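To make the transformation concrete, here is a minimal Python sketch of the idea: each loop that shares a labelled CONTINUE gets its own `end do`. It is not tucan's implementation. The function name `close_labelled_loops` and the two regular expressions are assumptions made for this illustration, and the sketch leaves case and spacing untouched, unlike the real command.

```python
import re

# Patterns for a labelled DO statement and for its shared terminating CONTINUE.
DO_LABEL = re.compile(r"^\s*do\s+(\d+)\s+", re.IGNORECASE)
LABEL_CONTINUE = re.compile(r"^\s*(\d+)\s+continue\s*$", re.IGNORECASE)


def close_labelled_loops(lines):
    """Replace a shared `<label> CONTINUE` by one `end do` per open loop."""
    open_loops = []  # stack of (label, indentation) for loops not yet closed
    out = []
    for line in lines:
        do_match = DO_LABEL.match(line)
        cont_match = LABEL_CONTINUE.match(line)
        if do_match:
            indent = len(line) - len(line.lstrip())
            open_loops.append((do_match.group(1), indent))
            out.append(line)
        elif cont_match and open_loops and open_loops[-1][0] == cont_match.group(1):
            label = cont_match.group(1)
            # Close every loop that ends on this label, innermost first.
            while open_loops and open_loops[-1][0] == label:
                _, indent = open_loops.pop()
                out.append(" " * indent + f"end do ! {label}")
        else:
            out.append(line)
    return out


source = [
    "      DO 2000 K = 1, KK-1",
    "         DO 2000 J = K+1, KK",
    "            DO 2000 N = 1, NO",
    "               COFD(N,J,K) = COFD(N,K,J)",
    " 2000 CONTINUE",
]
cleaned = close_labelled_loops(source)
print("\n".join(cleaned))
```

The stack keeps the indentation of each opening DO, so every generated `end do` lines up with the loop it closes.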

Extracting code structure

Here we start from a file of neko, an HPC code in recent Fortran, finalist for the Gordon Bell Prize in 2023.

tucan struct htable.f90 provides a nested dictionary of the code structure. Here is a part of the output:

(...)
type htable.h_tuple_t :
    At path ['htable', 'h_tuple_t'], name h_tuple_t, lines 47 -> 52
    6 statements over 6 lines
    Complexity 1
    Refers to 1 callables:
       - class
    Contains no inner structures
    Contains no annotations

type_public_abstract htable.htable_t :
    At path ['htable', 'htable_t'], name htable_t, lines 55 -> 64
    10 statements over 10 lines
    Complexity 1
    Refers to 2 callables:
       - pass
       - t
    Contains no inner structures
    Contains no annotations

function_pure htable.interface_abstract66.htable_hash :
    At path ['htable', 'interface_abstract66', 'htable_hash'], name htable_hash, lines 67 -> 72
    6 statements over 6 lines
    Complexity 1
    Refers to 2 callables:
       - class
       - htable_hash
    Contains no inner structures
    Contains no annotations

interface_abstract htable.interface_abstract66 :
    At path ['htable', 'interface_abstract66'], name interface_abstract66, lines 66 -> 73
    8 statements over 8 lines
    Complexity 1
    Contains no callables
    Contains 1 elements:
    - htable.interface_abstract66.htable_hash
    Contains no annotations
(...)

(This output will change as we update and improve tucan in the next versions!)

This information makes it possible to build and manipulate graphs that expose the structure of the code.

Interpreting conditional inclusions ("ifdefs")

Another use of tucan is the analysis of ifdefs in C or FORTRAN:
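As an illustration of the graph idea, the sketch below derives containment edges from a small nested dictionary. The dictionary is hand-written from the report above; its keys (`type`, `lines`, `contains`) are assumptions made for this example, and the real layout returned by tucan struct may differ.

```python
# Hypothetical excerpt, transcribed by hand from the report above.
# The actual `tucan struct` data layout may use different keys.
structure = {
    "htable.interface_abstract66": {
        "type": "interface_abstract",
        "lines": (66, 73),
        "contains": ["htable.interface_abstract66.htable_hash"],
    },
    "htable.interface_abstract66.htable_hash": {
        "type": "function_pure",
        "lines": (67, 72),
        "contains": [],
    },
    "htable.h_tuple_t": {
        "type": "type",
        "lines": (47, 52),
        "contains": [],
    },
}


def containment_edges(struct):
    """List (parent, child) pairs, one per inner structure."""
    return [
        (parent, child)
        for parent, info in struct.items()
        for child in info["contains"]
    ]


def top_level(struct):
    """Names that no other structure contains."""
    children = {child for _, child in containment_edges(struct)}
    return sorted(name for name in struct if name not in children)


edges = containment_edges(structure)
roots = top_level(structure)
print(edges)
print(roots)
```

An edge list like this one can be handed to any graph library for further traversal or drawing.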

#ifdef FRONT
        WRITE(*,*) " FRONT is enabled " ! partial front subroutine
        SUBROUTINE dummy_front(a,b,c)
        WRITE(*,*) " FRONT 1"     ! partial front subroutine
#else                
        SUBROUTINE dummy_front(a,d,e)
        WRITE(*,*) " FRONT 2"       ! partial front subroutine
#endif
        END SUBROUTINE

        SUBROUTINE dummy_back(a,b,c)
#ifdef BACK
        WRITE(*,*) " BACK is enabled " ! partial front subroutine
        WRITE(*,*) " BACK 1"    ! partial back subroutine
        END SUBROUTINE  
#else
        WRITE(*,*) " BACK 2"    ! partial back subroutine
        END SUBROUTINE  
#endif

Depending on whether the variables FRONT and BACK are predefined, this snippet can be read in four possible ways. Here are some usages:

tucan ifdef-clean templates_ifdef.f yields:

        SUBROUTINE dummy_front(a,d,e)
        WRITE(*,*) " FRONT 2"       ! partial front subroutine
        END SUBROUTINE

        SUBROUTINE dummy_back(a,b,c)


        WRITE(*,*) " BACK 2"    ! partial back subroutine
        END SUBROUTINE

tucan ifdef-clean templates_ifdef.f -v FRONT yields:

        WRITE(*,*) " FRONT is enabled " ! partial front subroutine
        SUBROUTINE dummy_front(a,b,c)
        WRITE(*,*) " FRONT 1"     ! partial front subroutine


        END SUBROUTINE

        SUBROUTINE dummy_back(a,b,c)


        WRITE(*,*) " BACK 2"    ! partial back subroutine
        END SUBROUTINE

tucan ifdef-clean templates_ifdef.f -v FRONT,BACK yields:

        WRITE(*,*) " FRONT is enabled " ! partial front subroutine
        SUBROUTINE dummy_front(a,b,c)
        WRITE(*,*) " FRONT 1"     ! partial front subroutine


        END SUBROUTINE

        SUBROUTINE dummy_back(a,b,c)
        WRITE(*,*) " BACK is enabled " ! partial front subroutine
        WRITE(*,*) " BACK 1"    ! partial back subroutine
        END SUBROUTINE
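The selection logic behind these runs can be approximated by the short Python sketch below. It is an illustration under simplifying assumptions, not tucan's code: the function name `resolve_ifdefs` is hypothetical, only `#ifdef`, `#ifndef`, `#else` and `#endif` are handled, and discarded lines are dropped rather than left as blank lines.

```python
def resolve_ifdefs(lines, defined=()):
    """Keep only the lines whose enclosing #ifdef branches are all active."""
    defined = set(defined)
    active = []  # one boolean per open conditional: is the current branch kept?
    out = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            active.append(stripped.split()[1] in defined)
        elif stripped.startswith("#ifndef"):
            active.append(stripped.split()[1] not in defined)
        elif stripped.startswith("#else"):
            active[-1] = not active[-1]  # switch to the other branch
        elif stripped.startswith("#endif"):
            active.pop()
        elif all(active):  # also true when no conditional is open
            out.append(line)
    return out


sample = [
    "#ifdef FRONT",
    '        WRITE(*,*) " FRONT is enabled "',
    "        SUBROUTINE dummy_front(a,b,c)",
    '        WRITE(*,*) " FRONT 1"',
    "#else",
    "        SUBROUTINE dummy_front(a,d,e)",
    '        WRITE(*,*) " FRONT 2"',
    "#endif",
    "        END SUBROUTINE",
]
without_front = resolve_ifdefs(sample)
with_front = resolve_ifdefs(sample, defined=["FRONT"])
print("\n".join(with_front))
```

Keeping one boolean per open conditional means nested blocks work without extra code: a line survives only if every enclosing branch is active.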

Scanning ifdef variables

A simpler use of tucan is to scan the ifdef variables currently in use. Still on neko, in the /src folder (an old version, though):

/neko/src >tucan ifdef-scan-pkge .
 - Recursive path gathering ...
 - Cleaning the paths ...
 - Analysis completed.
 - Global ifdef variables : HAVE_PARMETIS, __APPLE__
 - Local to device/opencl/check.c : CL_ERR_STR(err)
 - Local to math/bcknd/device/opencl/opr_opgrad.c : CASE(LX), STR(X)
 - Local to math/bcknd/device/opencl/opr_dudxyz.c : CASE(LX), STR(X)
 - Local to common/sighdl.c : SIGHDL_ALRM, SIGHDL_USR1, SIGHDL_USR2, SIGHDL_XCPU
 - Local to math/bcknd/device/opencl/opr_conv1.c : CASE(LX), STR(X)
 - Local to math/bcknd/device/opencl/opr_cfl.c : CASE(LX), STR(X)
 - Local to krylov/bcknd/device/opencl/pc_jacobi.c : CASE(LX), STR(X)
 - Local to math/bcknd/device/opencl/ax_helm.c : CASE(LX), STR(X)
 - Local to bc/bcknd/device/opencl/symmetry.c : MAX(a,
 - Local to gs/bcknd/device/opencl/gs.c : GS_OP_ADD, GS_OP_MAX, GS_OP_MIN, GS_OP_MUL
 - Local to sem/bcknd/device/opencl/coef.c : DXYZDRST_CASE(LX), GEO_CASE(LX), STR(X)
 - Local to math/bcknd/device/opencl/opr_cdtp.c : CASE(LX), STR(X)

This feature is useful to list every variable that can alter your codebase through conditional inclusions.
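A scan of this kind can be sketched in a few lines of Python. The rule used here is a guess based on the report above, not tucan's documented behaviour: a name is treated as local when a file defines it with `#define`, and as global when it is tested with `#ifdef` or `#ifndef` without being defined in that file. The function name `scan_ifdefs` and the in-memory `sources` mapping are assumptions made for this example.

```python
import re

# Names tested by #ifdef / #ifndef, and names introduced by #define.
TESTED = re.compile(r"^[ \t]*#[ \t]*(?:ifdef|ifndef)[ \t]+(\w+)", re.MULTILINE)
DEFINED = re.compile(r"^[ \t]*#[ \t]*define[ \t]+(\w+)", re.MULTILINE)


def scan_ifdefs(sources):
    """Split preprocessor names into global ones and per-file local ones.

    `sources` maps a file name to its text (assumed layout for this sketch).
    """
    global_vars = set()
    local_vars = {}
    for name, text in sources.items():
        defined = set(DEFINED.findall(text))
        tested = set(TESTED.findall(text))
        global_vars |= tested - defined  # tested here, defined elsewhere
        if defined:
            local_vars[name] = sorted(defined)
    return sorted(global_vars), local_vars


sources = {
    "a.c": "#ifdef HAVE_PARMETIS\n#define STR(X) #X\n#endif\n",
    "b.c": "#ifndef __APPLE__\nint x;\n#endif\n",
}
global_vars, local_vars = scan_ifdefs(sources)
print(global_vars)
print(local_vars)
```

Unlike the report above, this sketch records only the macro name and drops its argument list.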

More about tucan

Tucan is used by anubis, our open-source tool to explore the git repository of a code, and by marauder's map, our open-source tool to show code structures through in-depth visualization of callgraphs and circular packing of the code.
