From 34e41cbbfc0b1e83141f93514c1ef5efbce2e0dd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?St=C3=A9phane=20Del=20Pino?= <stephane.delpino44@gmail.com> Date: Tue, 3 May 2022 19:21:18 +0200 Subject: [PATCH] Improve introduction --- doc/userdoc.org | 207 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 169 insertions(+), 38 deletions(-) diff --git a/doc/userdoc.org b/doc/userdoc.org index f7be8d606..52eca06e1 100644 --- a/doc/userdoc.org +++ b/doc/userdoc.org @@ -4,13 +4,13 @@ #+PROPERTY: header-args :comments no #+OPTIONS: timestamp:nil #+OPTIONS: h:3 num:t toc:3 -#+TITLE: Pugs User Manual +#+TITLE: The pugs user manual #+OPTIONS: author:nil date:nil #+OPTIONS: tex:t #+LANGUAGE: en #+ATTR_LATEX: :width 4cm -#+HTML_HEAD_EXTRA: <style> pre.src-pugs:before { content: 'Pugs'; } </style> +#+HTML_HEAD_EXTRA: <style> pre.src-pugs:before { content: 'pugs'; } </style> #+LATEX_CLASS_OPTIONS: [10pt] #+LATEX_HEADER: \usepackage[hmargin=2.5cm,vmargin=1.5cm]{geometry} @@ -141,61 +141,192 @@ already be discussed. ** Concepts and design -*** TODO A C++ toolbox driven by a user friendly language - -- Why? divide and conquer - - the language is used to assemble the provided C++ tools - - small independent C++ methods are easy to test/validate. - - new numerical method brings a new C++ code: not a patched code. - A previously validated method is unchanged! - - much more difficult to introduce bugs in existing methods - - existing methods performances are a unchanged by new - developments -- Why not python or any other scripting language? - - provide a language close to the application: a DSL is made for - that - - do not deal with the difficulty of parallelism within the - scripting language -- Following the success of ~FreeFEM~ +As already stated, ~pugs~ can be viewed as a collection of +numerical methods or utilities that can be assembled together, using a +user friendly language, to build simulation scenarios. 
+ +Utilities are tools that are often used in numerical +simulations. Examples of such utilities are mesh manipulations, +definition of initial conditions, post-processing, error +calculations,... + +*** A C++ toolbox driven by a user friendly language + +Numerical simulation packages are software of a particular +kind. Generally, in order to run a calculation, one has to define a +set of data and parameters. This can simply be the definition of a +discretization parameter such as the mesh size. One can also specify +boundary conditions, equations of state, source terms for a specific +model. Choosing a numerical method or, even more, setting the model +itself is common in large codes. + +In ~pugs~, all these "parameters" are set through a +DSL[fn:DSL-def]. Thus, when ~pugs~ is launched, it actually executes the +provided script. A ~C++~ function is associated with each instruction of +the script. The ~C++~ components of ~pugs~ are completely unaware of +one another. The ~pugs~ interpreter is responsible for the data flow between the +components: it manages the data transfer between those ~C++~ components +and ensures that the workflow is properly defined. + +**** Why? + +In this section we motivate the choice of a language rather than a more +standard approach. + +***** Data files are evil + +There are lots of reasons not to use data files. By data file, we +refer to a set of options that describe physical models, numerical +methods or their settings. + +- Data files are not flexible. This implies, on the one hand, that + application scenarios must be known fairly precisely to reflect + possible option combinations and, on the other hand, that even defining + specific initial data may require the creation of a new option and + the associated code (in ~C++~ for instance). \\ + It is quite common to fix the last point by adding a local + interpreter to evaluate user functions for instance. +- Data files may contain irrelevant information. 
Actually, it is quite + common to allow options to be defined that are valid but irrelevant to a + given scenario. This policy can be changed but it is generally not + an easy task and requires more work from the user (which can be a + good thing). +- It is quite common that data files become obsolete. An option was + not the right one, or its type depended on a simpler context... This + puts pressure on the user. +- Even worse, the meaning of an option can depend on other + options. Unfortunately, it happens quite easily. For instance, a + global option can implicitly change the treatment associated with + another one. This is quite dangerous since writing or reading the + data file requires significant knowledge of the code internals. +- Another major difficulty when dealing with data files is to check + the compatibility of provided options. + +***** Embedded "data files" are not a solution + +Directly using the general purpose language (~C~, ~C++~, ~Fortran~,...) used +to write the code can be tempting. It has the advantage that no +particular treatment is necessary to build a parser (to read data +files or a script), but it presents several drawbacks. + +- The first one is probably that it allows too much freedom. While + defining the model and numerical options, the user generally has + access to the whole code and can change almost everything, even + things that should not be changed. +- Again, one can easily access irrelevant options and it + requires extensive knowledge of the code to find relevant ones. +- In that regard, defining a simulation properly can be a difficult + task. For instance, in the early developments of ~pugs~ (when it was + just a raw ~C++~ code) it was tricky to change boundary conditions for + coupled physics. +- Another difficulty is related to the fact that the code's internal API + is likely to change constantly in a research code. Thus valid + constructions or settings may rapidly become obsolete. 
In other + words, keeping an embedded "data file" up to date might be difficult. +- Finally it requires recompilation of pieces of code (which can be + large in some cases) even if one is just changing a simple + parameter. + +***** Benefits of a DSL + +Actually, an honest analysis cannot conclude that a DSL is the +solution to all problems. However, it offers some advantages. + +- First, it allows fine control over what the user can or cannot + perform. In some sense it offers a chosen level of flexibility. +- It helps structure the code in the sense that new developments + have to be designed not only focusing on the result but also on the + way it should be used (its interactions with the scripting language). +- In the same spirit, it provides a framework that should ease the + definition of "do simple things and do it well". +- There are no hidden dependencies between numerical options: the code + is easier to read and is less likely to become obsolete (this is + not that true in early development since the language itself and + some concepts are still likely to change). +- The simulation scenario is *defined* by the script: in some sense, it is the + responsibility of the user, and not of + the code, to check its meaning. + + +**** ~pugs~ language purpose + +The ~pugs~ language is used to assemble ~C++~ tools which are well +tested. These tools should ideally be small pieces of ~C++~ code that +do one single thing and do it well. + +Another purpose of the language is to allow high-level +calculations to be performed. In other words, the language defines a data flow and +checks that each ~C++~ piece of code is used correctly. Since each piece +of code acts as a pure function (arguments are unchanged by calls), +the context is quite easily checked. + +Finally, it aims at simplifying the definition of new methods since +common utilities are available directly in simple scripts. 
+ +**** TODO The framework (divide and conquer) + +- small independent C++ methods + - easy to test/validate + - do simple things the right way +- a new numerical method brings new C++ code: not patched code. + A previously validated method is unchanged! + - much more difficult to introduce bugs in existing methods + - the performance of existing methods is likely to be unchanged by new + developments +- utilities are not developed again and again: + - safer code + - important code is not polluted by environment instructions (data + initialization, error calculation, post-processing,...) + +**** TODO Why not python or any other scripting language? + +- provide a language close to the application: a DSL is made for that. +- general purpose languages offer too much freedom: it is not easy to + protect data +- do not deal with the difficulty of parallelism within the + scripting language +- python is ugly *** TODO A high-level language Defining a suitable language -**** TODO Keep it simple +**** Keep it simple -**** TODO Performances and parallelism +**** Performances and parallelism -The language *does not allow* low-level manipulations of high-level type -variables. This means that for instance, one cannot access or modify -specific cell or node value. A consequence of such a choice is that -this *constrained framework* allows to write algorithms (within the pugs -language) that can be executed in parallel naturally. To achieve that, -there should *never* be parallel instructions or instructions that -require *explicit parallel* actions in the ~pugs~ language. +- The language *does not allow* low-level manipulations of high-level + type variables. This means that, for instance, one cannot access or + modify a specific cell or node value. A consequence of such a choice + is that this *constrained framework* allows one to write algorithms + (within the ~pugs~ language) that can be executed in parallel + naturally. 
To achieve that, there should *never* be parallel + instructions or instructions that require *explicit parallel* actions + in the ~pugs~ language. +- Not providing low-level instructions simplifies the code and reduces + the risk of errors * TODO Language -** TODO Variables +** Variables -*** TODO Types<<variable-types>> +*** Types<<variable-types>> -**** TODO Basic types +**** Basic types -**** TODO High-level types +**** High-level types -*** TODO Life time +**** Tuples -** TODO Statements +*** Lifetime -** TODO Functions<<functions>> - -*** TODO User-defined functions - -*** TODO Builtin functions +** Statements +** Functions<<functions>> +*** User-defined functions +*** Builtin functions [fn:pugs-def] ~pugs~: Parallel Unstructured Grid Solvers [fn:MPI-def] ~MPI~: Message Passing Interface -- GitLab