\documentclass[a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[style=numeric, sorting=nyt]{biblatex}
\usepackage{caption}
\usepackage[acronym, nomain]{glossaries}
\usepackage{hyperref}
\usepackage{pgfgantt}
\usepackage{todonotes}
% Suppress notes exported from Mendeley in bibliography
\AtEveryBibitem{\clearfield{note}}
\addbibresource{references.bib}
\makeglossaries{}
\newacronym{cis}{CiS}{Cells in Silico}
\newacronym{cpu}{CPU}{Central Processing Unit}
\newacronym{cpm}{CPM}{Cellular Potts Model}
\newacronym{ecm}{ECM}{Extracellular Matrix}
\newacronym{fem}{FEM}{Finite Element Method}
\newacronym{gpu}{GPU}{Graphics Processing Unit}
\newacronym{lbm}{LBM}{Lattice Boltzmann Method}
\newacronym{mcs}{MCS}{Monte-Carlo Step}
\newacronym{mpi}{MPI}{Message Passing Interface}
\newacronym{nastja}{NAStJA}{Neoteric Autonomous Stencil code for Jolly Algorithms}
\newacronym{sls}{SLS}{Standard Linear Solid}
\begin{document}
\title{Exposé}
\author{Paul Brinkmeier}
\date{July 2023}
\maketitle
\section{Introduction}
Computational models of cell behavior can be useful to simulate and reproduce experiments.
In addition, they show us how well our understanding models reality.
A popular approach is the \acrfull{cpm}, where each cell is modeled as a set of connected pixels or voxels on a two- or three-dimensional lattice.
To simulate biological processes involving thousands of cells, large lattices are needed.
Due to the local nature of the computations involved, the \acrshort{cpm} lends itself well to distributed programming.
I will base my work on \acrfull{cis}, which is a distributed implementation of the \acrshort{cpm} based on the \acrfull{nastja} framework.
In order to be true to \emph{in vivo}/\emph{in vitro} findings, such \emph{in silico} models must take into account a multitude of factors influencing cell behavior.
One such factor is the interaction with the \acrfull{ecm}, the structural scaffold which cells are embedded in.
In this work I will focus on the viscoelasticity of the collagen networks in the \acrshort{ecm}.
I will explore models of viscoelasticity that, similar to the \acrshort{cpm} itself, employ local interactions to model global effects.
This is required to fit the implementation into the \acrshort{nastja} framework so that it can be seamlessly integrated with \acrshort{cis}.
Additionally, I will investigate the performance of my model using different implementations on both \acrshortpl{cpu} and \acrshortpl{gpu}.
\section{Research}
\subsection{The \acrfull{cpm}}
The \acrshort{cpm}~\cite{graner1992} models cells as sets of connected sites on a square lattice.
Each lattice site is assigned the integer cell ID of the cell it belongs to.
The behavior of the cells is regulated by the Hamiltonian $H$, which represents the energy of a particular lattice.
The Hamiltonian contains at least a cell-cell adhesion term and a cell volume term, but is usually extended by further terms such as cell surface or alignment.
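As an illustration, a minimal Hamiltonian with only these two terms, in the spirit of~\cite{graner1992}, can be sketched as follows.
The symbols are notation assumed only for this sketch: $c_i$ is the cell ID at lattice site $i$, $J$ an adhesion energy per contact, $\delta$ the Kronecker delta, $V(c)$ and $V_0(c)$ the current and target volume of cell $c$, and $\lambda$ the strength of the volume constraint.
\[
  H = \sum_{\langle i, j \rangle} J(c_i, c_j) \, \bigl(1 - \delta_{c_i, c_j}\bigr)
    + \lambda \sum_{c} \bigl(V(c) - V_0(c)\bigr)^2
\]
The first sum runs over pairs of neighboring sites and only counts contacts between different cells, while the second sum penalizes deviations from the target volume.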
To advance the \acrshort{cpm}, a \acrfull{mcs} is performed:
The cell ID of a random lattice site is changed to the cell ID of one of its neighbors and the difference in energy $\Delta H$ is calculated.
The update is always accepted if the energy decreases.
If the energy does not decrease, the update is accepted probabilistically (e.g.\ by the Metropolis criterion), where greater increases are less probable.
Repeated \acrshortpl{mcs} minimize $H$.
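The acceptance step described above can be made concrete with the Metropolis criterion, sketched here with a temperature-like parameter $T$ that is assumed for illustration:
\[
  P(\mathrm{accept}) = \min\bigl(1, e^{-\Delta H / T}\bigr)
\]
For example, with $T = 1$ an update with $\Delta H = 2$ would be accepted with probability $e^{-2} \approx 0.14$, while any update with $\Delta H \le 0$ is always accepted.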
From an implementor's perspective, the \acrshort{cpm} has a great advantage over other approaches:
Since updates happen on a square lattice and changes in energy can be calculated locally, it lends itself well to distributed programming.
\acrfull{cis}~\cite{berghoff2020} is a parallel implementation of the \acrshort{cpm} based on the \acrfull{nastja} framework~\cite{berghoff2018}.
\acrshort{nastja} offers an abstraction layer for implementing stencil codes using \acrfull{mpi}, making it possible to leverage large-scale parallelism.
A stencil defines an update function for each lattice site which only takes the site's neighbors as inputs.
\acrshort{nastja} divides the simulation domain into blocks.
After the stencil is computed for each block, the \emph{halo}, i.e.\ the boundary region between blocks, is exchanged so that each block has the data necessary to compute the stencil again.
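Schematically, and in notation assumed only for this sketch, a stencil update of a lattice field $u$ at site $x$ with neighborhood $N(x)$ has the form
\[
  u^{t+1}(x) = f\bigl(u^{t}(x), \{ u^{t}(y) : y \in N(x) \}\bigr)
\]
Because $f$ reads only a bounded neighborhood, a block can be updated independently once its halo holds the current values of the adjacent blocks.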
\subsection{The \acrfull{ecm}}
The \acrshort{ecm} is the part of a tissue that surrounds the cells.
It provides their physical and biochemical environment, thereby influencing cell behavior~\cite{frantz2010}.
While the \acrshort{ecm} consists of a variety of components, I focus on a single essential component:
Fibrous collagen networks and their viscoelasticity.
\acrshort{ecm} viscoelasticity has been established as an important factor in cell behavior~\cite{chaudhuri2020}.
For example, the \acrshort{ecm} confines cells and restricts processes such as migration, spreading, growth and mitosis.
These processes also affect the \acrshort{ecm} and can lead to permanent deformation.
In turn, this deformation can have an influence on cell behavior, resulting in a strong coupling between the behavior of the \acrshort{ecm} and the behavior of the cells.
\subsection{Models of the \acrshort{ecm} in the \acrshort{cpm}}
In this section I list current approaches to modeling the \acrshort{ecm} in \acrshort{cpm} simulations.
I present approaches that explicitly model the plasticity of \acrshort{ecm} collagens.
\paragraph{Static Cell}
A starting point is to model the \acrshort{ecm} as a static cell.
In this model, a cell ID is chosen to represent the solid parts of the \acrshort{ecm}.
Cell-matrix interactions are regulated by the Hamiltonian just like cell-cell interactions.
\acrshort{ecm} lattice sites do not copy their neighbors and cannot be copied by their neighbors during a \acrshort{mcs}.
Instead, simulations using this approach usually allow cells to degrade adjacent matrix sites over time.
This approach is used, for example, in~\cite{bauer2007}, where the \acrshort{ecm} is initialized by randomly placing fiber bundles across the domain, and in~\cite{scianna2013}, which investigates cell behavior in \acrshortpl{ecm} with regular patterns.
\paragraph{Hybrid \acrshort{cpm}-\acrshort{fem}}
An approach using a \acrfull{fem} is presented in~\cite{vanoers2014} and expanded upon in~\cite{rens2019, rens2017}.
Each lattice site is assigned a local directional strain on the \acrshort{ecm}.
Cells exert traction forces on the \acrshort{ecm}, from which the lattice strains are calculated using the \acrshort{fem}.
The Hamiltonian of the \acrshort{cpm} is modified such that cells respond to the strain.
\paragraph{Hybrid \acrshort{cpm} and Molecular Dynamics Methods}
Another approach is presented in~\cite{tsingos2022}.
This work simulates matrix fibers using a bead-and-chain model.
Similar to the previous approach, the \acrshort{ecm} model is coupled with the \acrshort{cpm}.
However, in this work, cells interact with the \acrshort{ecm} only through a sparse subset of lattice sites.
\subsection{Lattice Models of Viscoelastic Materials}
The strain response of the collagen networks in the \acrshort{ecm} is not fully elastic.
It exhibits both elastic (spring-like) and viscous (damper-like) behavior.
The behavior of such viscoelastic materials is modeled by serial or parallel configurations of springs and dampers~\cite{mierke2021, sengul2021}.
The most common configurations for describing viscoelastic solids are
\begin{itemize}
\item the Maxwell model, consisting of a spring and a damper in series,
\item the Kelvin-Voigt model, consisting of a spring and a damper in parallel,
\item and the Zener or \acrfull{sls} model, which extends either the Maxwell or the Kelvin-Voigt model by another spring.
\end{itemize}
Depending on the specific viscoelastic characteristics that are to be predicted, a particular model can be chosen.
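For reference, the standard one-dimensional constitutive equations of these three configurations can be written as follows.
The symbols are chosen here for illustration: $\sigma$ denotes stress, $\varepsilon$ strain, a dot the time derivative, $E$, $E_1$ and $E_2$ spring stiffnesses, and $\eta$ the damper viscosity.
For the Maxwell model,
\[
  \dot{\varepsilon} = \frac{\dot{\sigma}}{E} + \frac{\sigma}{\eta}
\]
for the Kelvin-Voigt model,
\[
  \sigma = E \varepsilon + \eta \dot{\varepsilon}
\]
and for the \acrshort{sls} model in its Maxwell representation, i.e.\ a spring $E_1$ in parallel with a Maxwell arm consisting of a spring $E_2$ and a damper $\eta$,
\[
  \sigma + \frac{\eta}{E_2} \dot{\sigma} = E_1 \varepsilon + \frac{\eta \, (E_1 + E_2)}{E_2} \dot{\varepsilon}
\]
Under a constant stress, for instance, the Maxwell model creeps without bound, whereas the Kelvin-Voigt and \acrshort{sls} models approach a finite strain.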
In order to align the viscoelastic \acrshort{ecm} model with the \acrshort{cpm}, I consider approaches that model viscoelastic materials on square lattices.
In particular, the following approaches might be relevant.
\paragraph{Discrete Particle Method}
A model for viscoelastic solids is presented in~\cite{obrien2008}.
This work extends the discrete particle method for elastic solids presented in~\cite{toomey2000}.
It is based on a two- or three-dimensional square lattice of particles.
Each particle is connected to all of its cardinal and diagonal neighbors.
The model for the force acting between two particles can be elastic or viscoelastic.
Various models are explored in~\cite{obrien2008, obrien2014, obrien2021, obrien2009}.
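As a generic sketch of such a pairwise force law, and not necessarily the exact formulation used in the cited works, a Kelvin-Voigt-type bond between particles $i$ and $j$ could exert the following force on particle $i$.
All symbols are assumptions made for illustration: $\mathbf{r}_{ij} = \mathbf{x}_j - \mathbf{x}_i$ is the bond vector, $\hat{\mathbf{r}}_{ij}$ its unit vector, $l_0$ the rest length, $k$ the spring stiffness, $\gamma$ a damping coefficient, and $\mathbf{v}_i$, $\mathbf{v}_j$ the particle velocities.
\[
  \mathbf{F}_{ij} = \Bigl( k \, \bigl( \| \mathbf{r}_{ij} \| - l_0 \bigr)
    + \gamma \, (\mathbf{v}_j - \mathbf{v}_i) \cdot \hat{\mathbf{r}}_{ij} \Bigr) \, \hat{\mathbf{r}}_{ij}
\]
Setting $\gamma = 0$ recovers a purely elastic bond, and because the force depends only on the two connected particles, the resulting update is local.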
\paragraph{\acrfull{lbm}}
The \acrshort{lbm} is an established approach for modeling the dynamics of fluids~\cite{krueger2017}.
Also based on a square lattice, this model discretizes the velocities of the particles at a particular lattice site into the cardinal and diagonal directions.
Research suggests that the \acrshort{lbm} can be used for modeling both solids~\cite{maquart2022} and viscoelastic fluids~\cite{malaspinas2010}.
For this particular use case, an \acrshort{lbm} could therefore possibly be configured to model the \acrshort{ecm}.
\section{Contribution}
In this work I will explore lattice-based viscoelastic simulations of the \acrshort{ecm} in the \acrshort{cpm}.
\subsection{Method}
In order to model cell-matrix interactions, I will develop a method that allows cells to influence the \acrshort{ecm} simulation.
To model matrix-cell interactions, I will expand the Hamiltonian of the \acrshort{cpm} to include a term dependent on the local configuration of the \acrshort{ecm}.
This should make it possible for my model to simulate the strong coupling of cells and \acrshort{ecm}.
I will explore which of the models listed above is the most promising and compare them to existing approaches.
For the \acrshort{cpm} I will use the distributed implementation \acrshort{cis}.
\acrshort{cis} is based on the \acrshort{nastja} framework implemented using \acrshort{mpi}, which I will use to implement my model of the \acrshort{ecm}.
In order to reduce simulation times I will employ implementation techniques such as \acrshort{gpu} programming.
As the performance of my implementation will depend on several interconnected factors such as cache efficiency, network characteristics and \acrshort{gpu} communication cost, I will need to benchmark multiple implementations on a common test setup.
\subsection{Challenges}
My preliminary experiments have produced some questions and likely challenges that my work will need to address.
\paragraph{Spatial Scale}
While I could simply use the same lattice for the \acrshort{ecm} model as for the \acrshort{cpm}, it is not clear that this will deliver the best results.
It could be useful to use a scaled lattice, e.g.\ where the lattice spacing of the \acrshort{ecm} model is twice as long.
\paragraph{Temporal Scale}
Compared to cells, the waves in a viscoelastic material move quickly.
It is likely that my model of the \acrshort{ecm} will have to go through multiple time steps between the \acrshortpl{mcs} of the \acrshort{cpm}.
In the context of \acrshort{nastja}, this means an increased number of halo exchanges between ranks per \acrshort{mcs}.
In order to reduce the number of halo exchanges, one could increase the width of the halo, which allows the \acrshort{ecm} simulation to run for multiple time steps between halo exchanges.
As this approach necessarily leads to diminishing returns as the halo data gets bigger, an efficient configuration needs to be investigated.
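This trade-off can be quantified with a simple count, under the assumptions (made here for illustration) that the \acrshort{ecm} stencil reads neighbors up to a distance of $r$ sites, that the halo is $w$ sites wide, and that $n$ \acrshort{ecm} time steps are needed per \acrshort{mcs}.
Each time step invalidates a further $r$ layers of the halo, so at most
\[
  k = \left\lfloor \frac{w}{r} \right\rfloor
\]
time steps fit between two halo exchanges, which results in
\[
  \left\lceil \frac{n}{k} \right\rceil
\]
exchanges per \acrshort{mcs}.
For example, $r = 1$, $w = 4$ and $n = 10$ give $k = 4$ and therefore three exchanges per \acrshort{mcs} instead of ten, at the price of larger messages and redundant computation inside the halo.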
\paragraph{Integration with \acrshort{nastja}}
Since \acrshort{cis} is implemented in \acrshort{nastja}, my implementation of the \acrshort{ecm} model will be implemented in \acrshort{nastja} as well.
This implies the requirement that the model time steps can be represented as a stencil, i.e.\ that the update function for each lattice site only depends on its neighbors.
In short, my model will have to exhibit the characteristics necessary to model the \acrshort{ecm} collagen networks while remaining representable as a stencil.
Both the discrete particle method and \acrshort{lbm} listed above fulfill this requirement.
\paragraph{Implementation Performance}
As \acrshort{cis} is designed for large and therefore compute-heavy simulations, it is worthwhile to measure and optimize the computational resources needed by my implementation.
Since the discrete particle method is a dense approach, it should be possible to leverage common parallelization techniques such as vectorization and \acrshort{gpu} programming to improve performance.
In particular, it might prove useful to run the \acrshort{cpm} on \acrshortpl{cpu} and the \acrshort{ecm} model on \acrshortpl{gpu}.
I will experiment with these techniques and evaluate the possible improvements.
\paragraph{Test Setup for Benchmarking}
In order to benchmark my implementation I will need to develop a common test setup as a base for comparisons.
At first, I will investigate the efficiency of the \acrshort{ecm} model implementation without the \acrshort{cpm} coupling.
Then, I will develop a test setup involving a \acrshort{cpm} coupled with my own model, as this introduces additional factors influencing the results.
\subsection{Research Questions}
My work can be structured into three parts:
\paragraph{How can \acrshort{ecm} viscoelasticity be modeled in \acrshort{cis}?}
\acrshort{ecm} viscoelasticity plays an important role in both individual and collective cell behavior.
I will develop a theoretical model based on a square lattice such that it can be coupled with the \acrshort{cpm}.
The model must be able to simulate viscoelastic behavior and must be representable as a stencil.
\paragraph{How is the model implemented in \acrshort{nastja}?}
Given the stencil representation of my model, I will develop a distributed implementation using the \acrshort{nastja} framework.
For this implementation I will employ techniques such as vectorization and \acrshort{gpu} programming to achieve greater performance.
I will report implementation efforts and difficulties.
This process ties back in with the development of the \acrshort{ecm} model as implementing it might reveal inconsistencies.
\paragraph{How do implementation techniques benchmark against each other?}
I will compare different implementations of my model with each other by benchmarking them on a common test setup.
By varying parameters, e.g.\ block size and number of ranks, I can find which configuration works best for which implementation.
I will observe network and rank scaling behavior to further improve my implementation.
\section{Timeline}
The timeline for this work is sketched in \autoref{fig:timeline}.
Using the findings from my research, I will develop a suitable \acrshort{ecm} model.
Starting the implementation early, I will use an explorative programming approach to confirm that the developed model can in fact be implemented in \acrshort{nastja}.
As soon as the model is developed, I will prepare a test setup for benchmarking.
This makes it possible to evaluate and quickly iterate on different approaches.
During this time I can also get started documenting the theoretical model and the implementation approaches.
\begin{figure}[h]
\begin{ganttchart}[
time slot format=isodate,
% Experimentally confirmed
y unit title=0.5cm,
y unit chart=0.8cm,
title height=1,
expand chart=\textwidth
]{2023-06-01}{2023-12-31}
\gantttitlecalendar{year, month} \\
\ganttbar{Research}{2023-06-01}{2023-07-31} \\
\ganttbar{Modeling}{2023-07-01}{2023-08-14} \\
\ganttbar{Implementation}{2023-07-15}{2023-10-31} \\
\ganttbar{Evaluation}{2023-08-15}{2023-11-30} \\
\ganttbar{Writing}{2023-08-15}{2023-12-31}
\end{ganttchart}
\caption{Gantt chart of the project timeline}%
\label{fig:timeline}
\end{figure}
\newpage
\begin{refcontext}[sorting=nyt]
\printglossary[type=\acronymtype, nogroupskip]
\end{refcontext}
\printbibliography{}
\end{document}