Table of contents
Here we describe the Python coding standards, which the contributor is strictly committed to following.
Coding style
Always follow the PEP8 style guide.
For naming rules, follow Google's naming convention:
Type | Public | Internal |
---|---|---|
Packages | lower_with_under | |
Modules | lower_with_under | _lower_with_under |
Classes | CapWords | _CapWords |
Exceptions | CapWords | |
Functions | lower_with_under() | _lower_with_under() |
Global/Class Constants | CAPS_WITH_UNDER | _CAPS_WITH_UNDER |
Global/Class Variables | lower_with_under | _lower_with_under |
Instance Variables | lower_with_under | _lower_with_under (protected) or __lower_with_under (private) |
Method Names | lower_with_under() | _lower_with_under() (protected) or __lower_with_under() (private) |
Function/Method Parameters | lower_with_under | |
Local Variables | lower_with_under | |
In general, any variable, function or object name in the code must follow the name presented in the ARPM Lab. For example:
- x in the code, indexed by t in range(t_);
- fit_locdisp_mlfp_difflength in the code.

The titles of the scripts are in the format s_script_title. The script_title field should be interpretable and intuitive (e.g. not too short).
The titles of the functions are in the format function_title. The function_title field should be interpretable and intuitive (e.g. not too short).
For inline comments, please see here.
For docstrings (comments on modules, functions and classes), please see section Docstrings.
Scripts must not run other scripts, i.e. the command
from s_script_title1 import *
is not allowed. Rather, a script s_script_title2 should import a database saved by s_script_title1. Databases must be as parsimonious and aggregated as possible, so that the same, few, clean .csv files can be called in all the case studies. See more in section Variables dimension.
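The database handoff described above can be sketched as follows; the file name db_example.csv, the column name v, and the use of pandas are assumptions for this illustration, not part of the standard.

```python
import pandas as pd

# End of the (hypothetical) script s_script_title1: save the output database
db_out = pd.DataFrame({'v': [14.24, 48.61]})
db_out.to_csv('db_example.csv', index=False)

# Start of the (hypothetical) script s_script_title2: import the saved database
# instead of running `from s_script_title1 import *`
db_in = pd.read_csv('db_example.csv')
v = db_in['v'].values
```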
Scripts must be as modular as possible: any time there is a copy&paste, the contributor must evaluate the option of creating a function for those operations.
Scripts must be as simple as possible: any time there is a need for advanced optimizations/computations, the contributor must evaluate the option of creating a function for those operations. See more in section Code optimization.
Docstrings
The docstring and comments must strictly follow the template below. In particular, the docstring must only contain:

    # -*- coding: utf-8 -*-
    import numpy as np


    def single_output(x, y, z=None, *, option1='a', option2='c'):
        """For details, see here.

        Parameters
        ----------
        x : float
        y : array, shape (i_bar, )
        z : array, optional, shape (i_bar, j_bar)
        option1 : str, optional
        option2 : str, optional

        Returns
        -------
        g : bool
        """

        # Step 1: Do this
        w = np.sin(x)

        # Step 2: Do that
        g = w + 3

        return g
Scripts and their comments must strictly follow the template below:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    # ---
    # jupyter:
    #   jupytext:
    #     text_representation:
    #       extension: .py
    #       format_name: light
    #       format_version: '1.4'
    #     jupytext_version: 1.1.5
    #   kernelspec:
    #     display_name: Python 3
    #     language: python
    #     name: python3
    # ---

    # # s_script_name
    # For details, see here.

    # +
    import internal_packages
    import external_packages
    # -

    # ## Input parameters

    param1 = 1
    param2 = 2

    # ## Step 1: Compute x

    x = param1 + param2

    # ## Step 2: Compute y

    y = x - 1
Variables dimensions
The standards for the NumPy variables and CSV files are given in the table below.

Variable | Type | NumPy | DB (CSV) |
---|---|---|---|
Univariate realized process | Time series - past \(\bar{t}\) steps | (t_, ) | (t_, 1) |
Univariate random variable | \(\bar{\jmath}\) MC scenarios | (j_, ) | (j_, 1) |
Univariate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | (j_, u_) | (j_*u_, 1) |
\(\bar{n}\)-variate realized process | Time series - past \(\bar{t}\) steps | (t_, n_) | (t_, n_) |
\(\bar{n}\)-variate random variable | \(\bar{\jmath}\) MC scenarios | (j_, n_) | (j_, n_) |
\(\bar{n}\)-variate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | (j_, u_, n_) | (j_*u_, n_) |
\((\bar{n}\times\bar{k})\)-variate realized process | Time series - past \(\bar{t}\) steps | (t_, n_, k_) | (t_, n_*k_) |
\((\bar{n}\times\bar{k})\)-variate random variable | \(\bar{\jmath}\) MC scenarios | (j_, n_, k_) | (j_, n_*k_) |
\((\bar{n}\times\bar{k})\)-variate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | (j_, u_, n_, k_) | (j_*u_, n_*k_) |
To convert a NumPy array into a dataframe, the rule is: group all the dimensions in two buckets, first those you want to be indices, then those you want to be headers: (ind1*ind2*...*ind_i_, dim1*dim2*...*dim_n_).
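A sketch of this rule with assumed sizes j_ = 2, u_ = 3, n_ = 4 (the sizes and the use of pandas are illustrative): the scenario and step dimensions go into the index bucket, the variable dimension into the headers, giving a (j_*u_, n_) dataframe.

```python
import numpy as np
import pandas as pd

j_, u_, n_ = 2, 3, 4  # hypothetical sizes
x = np.arange(j_ * u_ * n_).reshape(j_, u_, n_)  # (j_, u_, n_) scenario-path array

# index bucket: (j_, u_); header bucket: (n_,)
index = pd.MultiIndex.from_product([range(j_), range(u_)], names=['j', 'u'])
df = pd.DataFrame(x.reshape(j_ * u_, n_), index=index)  # shape (j_*u_, n_)

# the NumPy standard is recovered by reshaping back
x_back = df.values.reshape(j_, u_, n_)
```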
A vector (\(\bar{n}\times 1\)) in the ARPM Lab is represented as a 1D NumPy array of shape (n_, ) in Python (see Variables dimension). For example \[
\boldsymbol{v}_{t_{\mathit{now}}} \equiv ( v_{1,t_{\mathit{now}}}, v_{2,t_{\mathit{now}}} )' = (\$14.24, \$48.61)'
\] must read in Python
v_tnow = np.array([14.24, 48.61])
NumPy handles 1D arrays flexibly. Namely,
v_tnow @ np.array([[1, 2], [3, 4]])
and
np.array([[1, 2], [3, 4]]) @ v_tnow
are both allowed, because the 1D array is treated both as a \(1\times 2\) row vector and as a \(2\times 1\) column vector.
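A minimal check of this behavior (the numerical values are chosen arbitrarily):

```python
import numpy as np

v_tnow = np.array([14.24, 48.61])
a = np.array([[1, 2], [3, 4]])

row = v_tnow @ a  # v_tnow acts as a 1x2 row vector; result has shape (2,)
col = a @ v_tnow  # v_tnow acts as a 2x1 column vector; result has shape (2,)
```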
This makes it possible to use exactly the same order of variables in the code as in the ARPM Lab. For example, if the formula in the ARPM Lab reads \[ \bar{\boldsymbol{x}}^{\mathit{Reg}} = \mathbb{E}\{\boldsymbol{X}\} + \mathbb{C}v\{\boldsymbol{X},\boldsymbol{Z}\}(\mathbb{C}v\{\boldsymbol{Z}\})^{-1}(\boldsymbol{z}-\mathbb{E}\{\boldsymbol{Z}\}) \] then in the code it must read
x_bar_reg = e_x + cv_x_z @ np.linalg.inv(cv_z) @ (z - e_z)
Hence, in the static case, the order of appearance of the variables in the code must exactly follow the order in the ARPM Lab.
However, depending on the situation, optimized techniques may be used such as
np.linalg.solve(cv_z, z - e_z)
to compute \((\mathbb{C}v\{\boldsymbol{Z}\})^{-1}(\boldsymbol{z}-\mathbb{E}\{\boldsymbol{Z}\})\) for a very large and ill-conditioned \(\mathbb{C}v\{\boldsymbol{Z}\}\). See Code optimization for details.
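For instance, with made-up moments (all numbers below are illustrative, not from any ARPM case study), the literal transcription and the solve-based variant agree:

```python
import numpy as np

# hypothetical inputs: 2-dimensional X, 3-dimensional Z
e_x = np.array([0.5, 1.0])
e_z = np.array([0.1, 0.2, 0.3])
z = np.array([0.2, 0.1, 0.4])
cv_x_z = np.array([[1.0, 0.2, 0.0],
                   [0.0, 0.3, 0.1]])
cv_z = np.array([[2.0, 0.5, 0.1],
                 [0.5, 1.5, 0.2],
                 [0.1, 0.2, 1.0]])

# literal transcription of the ARPM Lab formula
x_bar_reg = e_x + cv_x_z @ np.linalg.inv(cv_z) @ (z - e_z)

# optimized variant: solve the linear system instead of inverting
x_bar_reg_opt = e_x + cv_x_z @ np.linalg.solve(cv_z, z - e_z)
```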
Consider the dynamic case where you need to multiply an \(\bar{m}\times\bar{n}\) matrix \(\boldsymbol{b}\) with \(\bar{\jmath}\) scenarios of an \(\bar{n}\)-dimensional variable \(\boldsymbol{x}^{(j)}\), i.e. you want to modify the scenarios as \[
\{\bar{\boldsymbol{x}}^{(j)}\}_{j=1}^{\bar{\jmath}} \leftarrow \{\boldsymbol{b}\boldsymbol{x}^{(j)}\}_{j=1}^{\bar{\jmath}}
\] Then, according to the table in section Variables dimension, the variable x would be a (j_, n_) array and the variable b would be an (m_, n_) array. In such cases, the Python code should read
x_bar = x @ b.T
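The equivalence between the batched product and the scenario-by-scenario formula can be checked directly (the sizes below are arbitrary):

```python
import numpy as np

j_, n_, m_ = 5, 3, 2  # hypothetical sizes
rng = np.random.default_rng(0)
x = rng.standard_normal((j_, n_))  # j_ scenarios of an n_-dimensional variable
b = rng.standard_normal((m_, n_))  # m_ x n_ matrix

x_bar = x @ b.T  # shape (j_, m_): row j holds b @ x[j]
```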
A similar rationale applies to other variable types, such as time series or paths of multidimensional objects, see section Variables dimension.
Function overloading
If a function works with both multivariate and univariate variables, then in the univariate case it must be able to accept scalars as inputs. For example, all of the below should be valid calls of the simulate_normal function:
simulate_normal(0, 1, 100)
simulate_normal(np.array([0, 0]), np.array([[1, 0], [0, 1]]), 100)
The output of such functions can be divided into two classes: scenarios (or time series) and parameters.
Scenarios (or time series) must have shape (j_, n_) when n_ > 1 and (j_,) when n_ == 1, as discussed in section Variables dimension. The contributor must make sure that the output is of shape (j_,) no matter what the shape of the input is (the shape of the output often depends on the shape of the input). A special case is j_ == 1 and n_ == 1, where the output should be just a scalar, not an array.
When n_ == 1, the output parameters should be just scalars, not arrays that contain only one element. For instance, the outputs of the meancov_sp function
mu, sig2 = meancov_sp(x, p)
where x.shape = (j_, ), must be scalars, not NumPy arrays.
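A sketch of how such a function could enforce these output rules; this is not the actual ARPM implementation of meancov_sp, only an illustration of the overloading logic under the naming conventions above.

```python
import numpy as np

def meancov_sp_sketch(x, p=None):
    # scenario-probability mean and covariance; scalar outputs for 1D input
    x = np.asarray(x, dtype=float)
    j_ = x.shape[0]
    if p is None:
        p = np.full(j_, 1.0 / j_)  # equal scenario probabilities
    mu = p @ x                     # 0-d for x of shape (j_,), (n_,) otherwise
    dev = x - mu
    if x.ndim == 1:
        sig2 = p @ dev ** 2        # scalar variance, not a (1,) or (1, 1) array
    else:
        sig2 = dev.T @ (dev * p[:, None])  # (n_, n_) covariance matrix
    return mu, sig2
```

For x of shape (j_,), mu and sig2 come out as scalars; for x of shape (j_, n_), they come out as an (n_,) vector and an (n_, n_) matrix.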
Code optimization
Optimized techniques may be used in cases where there is a clear advantage in speed or accuracy.
Optimized techniques should not be used when the ratio between speed/accuracy gain and clarity is low. For instance, to compute the inverse of a well-conditioned 5×5 matrix, the code
np.linalg.solve(sigma2, np.eye(5))
brings little to no speed/accuracy gain, because sigma2 is a small matrix of a known size. In this case, the code must be
np.linalg.inv(sigma2)
On the other hand, to invert a large ill-conditioned matrix \(\boldsymbol{\sigma}^2\) and multiply it with a matrix (vector) \(\boldsymbol{v}\), i.e. to compute \((\boldsymbol{\sigma}^2)^{-1}\boldsymbol{v}\), the optimized technique
np.linalg.solve(sigma2, v)
should be used.
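A quick check of the two approaches on a randomly generated system; the sizes and the regularization term are illustrative (for genuinely ill-conditioned matrices the accuracy gap in favor of solve grows):

```python
import numpy as np

rng = np.random.default_rng(1)
n_ = 300
a = rng.standard_normal((n_, n_))
sigma2 = a @ a.T + np.eye(n_)  # symmetric positive definite matrix
v = rng.standard_normal(n_)

w_solve = np.linalg.solve(sigma2, v)  # preferred: solves sigma2 @ w = v directly
w_inv = np.linalg.inv(sigma2) @ v     # explicit inverse: slower and less accurate
```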
If there is a need for “too much” optimization, the contributor must evaluate whether the optimization in the code is suitable to be discussed in detail in the ARPM Lab and escalate the issue to ARPM.
Here we describe the content of the ARPM coding environments across languages.
All the code environments contain scripts, functions and usage example scripts.
In each coding environment users find two main directories: one for the code (scripts and functions) and one for the databases.
- the scripts directory, which in turn has two sub-directories:
  - sources, containing the actual scripts created from the Documentation;
  - notebooks, containing the Jupyter Notebook (Live Script for MATLAB) implementation of the scripts in the sources directory;
- the functions directory, which in turn has sub-directories for the various topics:
  - usage-examples, a sub-directory of the functions directory;
- the databases directory, which in turn has two sub-directories:
  - global-databases, containing static data that is used as input of scripts, common to all implementations;
  - temporary-databases, containing dynamic data that is the output of a script and the input of, at least, another script, specific to the implementation.
Here we describe the protocol by which a contributor creates and maintains ARPM’s implementation of the ARPM Lab Code (scripts and functions) for the ARPM Lab, deployed by ARPM on its website.
The protocol applies across all the coding languages implemented on the ARPM Lab.
In what follows, ARPM means the company and/or its employees, depending on the context.
Git environment
The code is hosted on GitLab in the private git repository /arpm-lab/arpm-python, which ARPM created, owns and regulates, following the Shared Repository Model.
In the /arpm-lab/arpm-python git repository, files are organized according to the following directory tree:
Repositories:(*)
(*) The name and structure of the functions folder might slightly differ among code languages.
The global-databases
directory is a git sub-module added from the /arpm-lab/arpm-global-databases repository. The content of the other directories is copied by ARPM into the Coding Environment available to the ARPM Lab users.
ARPM maintains and publishes a code-dashboard that tracks the status of the ARPM Lab Code Documentation. The list includes the scripts and functions which are documented in the ARPM Lab Code Documentation and implemented in Python.
Code creation
- The contributor creates the Python implementation;
- The contributor submits the Python implementation;
- ARPM merges the submitted code into the develop branch;
- ARPM deploys the develop branch to the ARPM beta website;
- ARPM merges the develop branch into the master branch;
- ARPM deploys the master branch to the ARPM production website.

ARPM deploys the code to the ARPM Lab for all the users.
Code maintenance
For the ongoing maintenance of the ARPM code:
Here we describe the content of the ARPM coding environments across languages.
All the code environments contain scripts, functions and usage example scripts.
In each coding environment users find two main directories: one for the code (scripts and functions) and one for the databases.
- the scripts directory, which in turn has two sub-directories:
  - sources, containing the actual scripts created from the Documentation;
  - notebooks, containing the Jupyter Notebook (Live Script for MATLAB) implementation of the scripts in the sources directory;
- the functions directory, which in turn has sub-directories for the various topics:
  - usage-examples, a sub-directory of the functions directory;
- the databases directory, which in turn has two sub-directories:
  - global-databases, containing static data that is used as input of scripts, common to all implementations;
  - temporary-databases, containing dynamic data that is the output of a script and the input of, at least, another script, specific to the implementation.
Here we describe the protocol by which a contributor creates and maintains ARPM’s implementation of the ARPM Lab Code (scripts and functions) for the ARPM Lab, deployed by ARPM on its website.
The protocol applies across all the coding languages implemented on the ARPM Lab.
In what follows, ARPM means the company and/or its employees, depending on the context.
Git environment
The code is hosted on GitLab in the private git repository /arpm-lab/arpm-matlab, which ARPM created, owns and regulates, following the Shared Repository Model.
In the /arpm-lab/arpm-matlab git repository, files are organized according to the following directory tree:
Repositories:(*)
(*) The name and structure of the functions folder might slightly differ among code languages.
The global-databases
directory is a git sub-module added from the /arpm-lab/arpm-global-databases repository. The content of the other directories is copied by ARPM into the Coding Environment available to the ARPM Lab users.
ARPM maintains and publishes a code-dashboard that tracks the status of the ARPM Lab Code Documentation. The list includes the scripts and functions which are documented in the ARPM Lab Code Documentation and implemented in MATLAB.
Code creation
- The contributor creates the MATLAB implementation;
- The contributor submits the MATLAB implementation;
- ARPM merges the submitted code into the develop branch;
- ARPM deploys the develop branch to the ARPM beta website;
- ARPM merges the develop branch into the master branch;
- ARPM deploys the master branch to the ARPM production website.

ARPM deploys the code to the ARPM Lab for all the users.
Code maintenance
For the ongoing maintenance of the ARPM code:
Here we describe the R coding standards, which the contributor is strictly committed to following.
Coding style
Always follow Google's R Style Guide, except for the naming rules. For the naming rules, follow Google's naming convention.
In general, any variable, function or object name in the code must follow the name presented in the ARPM Lab. For example:
- x in the code, indexed by t in 1:t_;
- fit_locdisp_mlfp_difflength in the code.

The titles of the scripts are in the format s_script_title. The script_title field should be interpretable and intuitive (e.g. not too short).
The titles of the functions are in the format function_title. The function_title field should be interpretable and intuitive (e.g. not too short).
For inline comments, please see here.
For docstrings (comments on modules, functions and classes), please see section Docstrings.
Scripts must not run other scripts, i.e. the command
source("../../../R/scripts/sources/s_script_title1.R")
is not allowed. Rather, a script s_script_title2 should import a database saved by s_script_title1. Databases must be as parsimonious and aggregated as possible, so that the same, few, clean .csv files can be called in all the case studies. See more in section Variables dimension.
Scripts must be as modular as possible: any time there is a copy&paste, the contributor must evaluate the option of creating a function for those operations.
Scripts must be as simple as possible: any time there is a need for advanced optimizations/computations, the contributor must evaluate the option of creating a function for those operations. See more in section Code optimization.
As the assignment operator, <- should be used instead of =.
Do not use attach(); avoiding it keeps the code clearer.
Plots should be done using packages from the base R library.
Docstrings
The docstring and comments must strictly follow the template below. In particular, the docstring must only contain:

    # -*- coding: utf-8 -*-

    single_output <- function(x,              # parameter1
                              y,              # parameter2
                              z = NULL,       # optional parameter1
                              option1 = 'a',  # optional parameter2
                              option2 = 'c')  # optional parameter3
    {
      # For details, see here.

      # Parameters
      # ----------
      # x : scalar
      # y : vector, dimensions (i_bar x 1)
      # z : matrix, optional, dimensions (i_bar x j_bar)
      # option1 : str, optional
      # option2 : str, optional
      #
      # Returns
      # -------
      # g : bool

      # Step 1: Do this
      w <- sin(x)

      # Step 2: Do that
      g <- w + 3

      return(g)
    }
Scripts and their comments must strictly follow the template below:

    # ---
    # jupyter:
    #   kernelspec:
    #     display_name: R
    #     language: R
    #     name: ir
    # ---

    # # s_script_name
    # For details, see here.

    # load function_name function
    source("../../../R/functions/function_file/function_name.R")

    # ## Step 1: Input parameters

    # +
    param1 <- 1
    param2 <- 2
    # -

    # ## Step 2: Compute x

    x <- param1 + param2

    # ## Step 3: Compute y

    y <- x - 1
Variables dimensions
Basic data structures in R can be organized by their dimensionality and by whether they are homogeneous or heterogeneous. The standard categorization is given in the table below. R has no 0-dimensional, or scalar, types: individual numbers or strings, which we consider as scalars, are vectors of length one.

 | Homogeneous | Heterogeneous |
---|---|---|
1d | Atomic vector | List |
2d | Matrix | Data frame |
nd | Array | |
The standards for the R variables and CSV files are given in the table below.

Variable | Type | Length/Dimension | DB (CSV) |
---|---|---|---|
Univariate realized process | Time series - past \(\bar{t}\) steps | t_ | t_ x 1 |
Univariate random variable | \(\bar{\jmath}\) MC scenarios | j_ | j_ x 1 |
Univariate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | j_ x u_ | j_*u_ x 1 |
\(\bar{n}\)-variate realized process | Time series - past \(\bar{t}\) steps | t_ x n_ | t_ x n_ |
\(\bar{n}\)-variate random variable | \(\bar{\jmath}\) MC scenarios | j_ x n_ | j_ x n_ |
\(\bar{n}\)-variate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | j_ x u_ x n_ | j_*u_ x n_ |
\((\bar{n}\times\bar{k})\)-variate realized process | Time series - past \(\bar{t}\) steps | t_ x n_ x k_ | t_ x n_*k_ |
\((\bar{n}\times\bar{k})\)-variate random variable | \(\bar{\jmath}\) MC scenarios | j_ x n_ x k_ | j_ x n_*k_ |
\((\bar{n}\times\bar{k})\)-variate random process | \(\bar{\jmath}\) MC scenarios - \(\bar{u}\) future steps | j_ x u_ x n_ x k_ | j_*u_ x n_*k_ |
To convert an array into a data frame, the rule is: group all the dimensions in two buckets, first those you want to be indices, then those you want to be headers: (ind1*ind2*...*ind_i_, dim1*dim2*...*dim_n_).
A vector (\(\bar{n}\times 1\)) in the ARPM Lab is represented as a basic structure type in R, a vector of length n_ (see Variables dimension). For example \[
\boldsymbol{v}_{t_{\mathit{now}}}
\equiv
\begin{pmatrix} v_{1,t_{\mathit{now}}} \\ v_{2,t_{\mathit{now}}} \end{pmatrix}
=
\begin{pmatrix} \$14.24 \\ \$48.61 \end{pmatrix}
\] should read in R
v_tnow <- c(14.24, 48.61)
The following commands in R
v_tnow %*% matrix(1:4, nrow=2, byrow=TRUE)
and
matrix(1:4, nrow=2, byrow=TRUE) %*% v_tnow
will not produce results of the same dimensions as in Python: %*% always returns a matrix (here 1 x 2 and 2 x 1, respectively), never a plain vector. Consequently, the contributor should take care that the order of variables in the code matches the order in the ARPM Lab.
Depending on the situation, optimized techniques may be used such as
solve(cv_z, z - e_z)
to compute \((\mathbb{C}v\{\boldsymbol{Z}\})^{-1}(\boldsymbol{z}-\mathbb{E}\{\boldsymbol{Z}\})\) for a very large and ill-conditioned \(\mathbb{C}v\{\boldsymbol{Z}\}\). See Code optimization for details.
Consider the dynamic case where you need to multiply an \(\bar{m}\times\bar{n}\) matrix \(\boldsymbol{b}\) with \(\bar{\jmath}\) scenarios of an \(\bar{n}\)-dimensional variable \(\boldsymbol{x}^{(j)}\), i.e. you want to modify the scenarios as \[
\{\bar{\boldsymbol{x}}^{(j)}\}_{j=1}^{\bar{\jmath}} \leftarrow \{\boldsymbol{b}\boldsymbol{x}^{(j)}\}_{j=1}^{\bar{\jmath}}
\] Then, according to the table in section Variables dimension, the variable x would be a j_ x n_ matrix and the variable b would be an m_ x n_ matrix. In such cases, the R code should read
x_bar <- x %*% t(b)
Function overloading
If a function works with both multivariate and univariate variables, then in the univariate case it must be able to accept scalars as inputs. For example, all of the below should be valid calls of the simulate_normal function:
simulate_normal(0, 1, 100)
simulate_normal(c(0, 0), diag(c(1,1)), 100)
The contributor must make sure that the output has the correct dimensions and type no matter what the shape of the input is.
Code optimization
Optimized techniques may be used in cases where there is a clear advantage in speed or accuracy.
Optimized techniques should not be used when the ratio between speed/accuracy gain and clarity is low. For instance, to compute the inverse of a well-conditioned 5×5 matrix, the code
solve(sigma_sq, diag(5))
brings little to no speed/accuracy gain, because sigma_sq is a small matrix of a known size. In this case, the code must be
solve(sigma_sq)
On the other hand, to invert a large ill-conditioned matrix \(\boldsymbol{\sigma}^2\) and multiply it with a matrix (vector) \(\boldsymbol{v}\), i.e. to compute \((\boldsymbol{\sigma}^2)^{-1}\boldsymbol{v}\), the optimized technique
solve(sigma_sq, v)
should be used.
If there is a need for “too much” optimization, the contributor must evaluate whether the optimization in the code is suitable to be discussed in detail in the ARPM Lab and escalate the issue to ARPM.
Here we describe the content of the ARPM coding environments across languages.
All the code environments contain scripts, functions and usage example scripts.
In each coding environment users find two main directories: one for the code (scripts and functions) and one for the databases.
- the scripts directory, which in turn has two sub-directories:
  - sources, containing the actual scripts created from the Documentation;
  - notebooks, containing the Jupyter Notebook (Live Script for MATLAB) implementation of the scripts in the sources directory;
- the functions directory, which in turn has sub-directories for the various topics:
  - usage-examples, a sub-directory of the functions directory;
- the databases directory, which in turn has two sub-directories:
  - global-databases, containing static data that is used as input of scripts, common to all implementations;
  - temporary-databases, containing dynamic data that is the output of a script and the input of, at least, another script, specific to the implementation.
Here we describe the protocol by which a contributor creates and maintains ARPM’s implementation of the ARPM Lab Code (scripts and functions) for the ARPM Lab, deployed by ARPM on its website.
The protocol applies across all the coding languages implemented on the ARPM Lab.
In what follows, ARPM means the company and/or its employees, depending on the context.
Git environment
The code is hosted on GitLab in the private git repository /arpm-lab/arpm-r, which ARPM created, owns and regulates, following the Shared Repository Model.
In the /arpm-lab/arpm-r git repository, files are organized according to the following directory tree:
Repositories:(*)
(*) The name and structure of the functions folder might slightly differ among code languages.
The global-databases
directory is a git sub-module added from the /arpm-lab/arpm-global-databases repository. The content of the other directories is copied by ARPM into the Coding Environment available to the ARPM Lab users.
ARPM maintains and publishes a code-dashboard that tracks the status of the ARPM Lab Code Documentation. The list includes the scripts and functions which are documented in the ARPM Lab Code Documentation and implemented in R.
Code creation
- The contributor creates the R implementation;
- The contributor submits the R implementation;
- ARPM merges the submitted code into the develop branch;
- ARPM deploys the develop branch to the ARPM beta website;
- ARPM merges the develop branch into the master branch;
- ARPM deploys the master branch to the ARPM production website.

ARPM deploys the code to the ARPM Lab for all the users.
Code maintenance
For the ongoing maintenance of the ARPM code: