Computational Science Internships, ETH Zurich, Switzerland

Founded in 1991, the Swiss National Supercomputing Centre (CSCS) develops and provides the key supercomputing capabilities required to solve important problems for science and society. The centre is operated by ETH Zurich and is located in Lugano, with additional offices in Zurich.

Project background

The goal of the CSCS Internship Program is to give Swiss students, and students enrolled in Swiss universities, the opportunity to participate in a professional work environment and to document their experiences. The students should build on previous knowledge and experience by being fully immersed in a professional setting.

Job description

We are offering three internship topics for 2023:

  • GT4Julia – GridTools for Julia
  • Unified Memory Management tools for HPC projects
  • Distributed in-situ data analysis and visualization in Julia

GT4Julia – GridTools for Julia

In our group we develop and maintain a set of libraries for weather and climate applications (https://github.com/GridTools). One of these tools, the GT4Py toolchain, is a Python framework with an embedded domain-specific language that lets weather and climate scientists express their algorithms at a high level; the toolchain then compiles them into optimized versions for different hardware architectures by generating efficient C++, CUDA or other code. Python is currently the language of choice for domain scientists. However, Julia is gaining traction in the scientific community and could be the next language for scientific computing. In this internship we would like to explore what a Julia frontend for our toolchain could look like. Additionally, we would like to study whether our framework is flexible enough to easily allow frontends in different languages.

Goals

  • Design a domain-specific language (DSL) in Julia.
  • Integrate the Julia frontend with our existing Python infrastructure.

Milestones

  • Design the syntax of the DSL, by implementing a meaningful subset of the existing features, and transform it into non-optimized Julia code.
  • Transform the Julia representation of the DSL into one of the intermediate representations of our toolchain using contextual dispatch (e.g. Cassette.jl or related technologies).
  • Add the capability to compile the generated C++ code on the fly and link it back to Julia. Implement infrastructure to call generated CUDA code with GPU arrays (e.g. CuArray from CUDA.jl).
  • Optional: Implement a transpiler from FOAST or the internal representation of GT4Py (ITIR) to the Julia DSL to enable an easy transition from the Python DSL to the Julia DSL.

The candidate should have prior knowledge of Julia and be keen to deepen their knowledge of the language. Basic knowledge of Python is helpful. The project duration can be 4 to 6 months, depending on the candidate's availability.

  • Expected duration: 4-6 months
  • Mentors: Till Ehrengruber, Hannes Vogt

Unified Memory Management tools for HPC projects

Numerous projects at CSCS make use of specialised allocators to improve performance. The need for them arises from:

  • Repeated/frequent allocation and release of memory in multithreaded code, where data pointers may be passed between threads and freed from a thread different from the allocating thread.
  • Use of memory buffers that require special handling for RMA access. Such memory must usually be pinned by the operating system prior to use in network or host/device transfers, which adds overhead.
  • Use of fixed-size memory buffers in small, tight code regions that might benefit from a dedicated allocator rather than a general-purpose one.

Allocators themselves fall into two primary categories:

  • General-purpose allocators that are used throughout a program at a ‘system’ level to replace the global malloc/free or new/delete functions/operators. Allocators such as jemalloc, tcmalloc and mimalloc have all been used by projects and usually provide much better performance than the default (system) allocator provided by the C++ runtime libraries.
  • Custom allocators that are used in localised portions of code for a specialised purpose. Examples are umpire (GPU memory buffer pools), hwmalloc (network/message buffers) and MAMBA (GPU/NUMA/memory hierarchies). These allocators may themselves be wrappers over other allocators, providing buffering/caching that avoids extra low-level OS/kernel calls by allocating large pages and amortising overheads.

Each allocator generally has some feature that makes it desirable and optimal for a particular purpose, such as user-defined hooks to customise behaviour or user-defined arenas to improve cache use. However, these features are not generally portable from one allocator to another: APIs are mismatched, implementation details require adaptation for each use case, and even when a feature is available it may break when combined with user hooks or other requirements.

The aim of this project is to conduct a literature review of different allocators, perform benchmarking, and assess their strengths and weaknesses. Ultimately, the project should produce a library of allocator tools that wraps one or more of the currently available allocators, so that ideally a single API (cf. C++ std:: allocators) can be exposed, and reused across projects, for features such as caching, memory pinning and static pools. Where possible, the communities developing existing allocators should be approached and feature requests/improvements submitted, enhancing the external libraries and keeping in-house code maintenance as small as possible.

  • Duration: 3-6 months
  • Skills required: C++, CMake
  • Mentors: John Biddiscombe, Fabian Bösch

Distributed in-situ data analysis and visualization in Julia

Julia is a programming language that was designed to solve the “two-language problem”: prototypes written in an interactive high-level language like MATLAB, R or Python often need to be partly or fully rewritten in lower-level languages like C, C++ or Fortran when a high-performance production code is required. Julia, which has its origins at MIT, can reach the performance of C, C++ or Fortran despite being high-level and interactive, thanks to its just-ahead-of-time compilation. Julia has been shown to be suitable for scientific GPU supercomputing at large scale, enabling nearly ideal scaling on thousands of GPUs on Piz Daint.

Supercomputing at large scale requires well-designed workflows for data analysis and visualization. The ability to perform data analysis and visualization in a distributed, parallel fashion is key to most of these workflows. In addition, performing these tasks in situ can be a great plus, given the massive amounts of data that scientific supercomputing simulations produce. Workflows for data analysis and visualization at large scale are under active development in the Julia community.

The objective of this internship is to evaluate work that has been done for distributed data analysis and visualization within the Julia community and to design new workflows by combining the usage of existing packages and tools.

Concretely, the tasks of this internship are the following:

  • Do an inventory of existing packages and workflows for (in-situ) data analysis and visualization in Julia.
  • Design new workflows by combining the usage of existing packages and tools (including tools outside of the Julia ecosystem).
  • Evaluate the new workflows.
  • Develop a small Julia package that simplifies the workflow(s) evaluated best.
  • Create some visually attractive distributed (in-situ) visualization examples using the selected workflow(s) (e.g. visualization of glacier flow or a tsunami simulation; applications can be provided by the mentors or, depending on the intern's skills and interests, they can create their own multi-GPU application).
  • Duration: 4-6 months
  • Mentors: Dr. Samuel Omlin, Dr. Jean Favre
  • Requirements: Strong programming skills and a solid understanding of distributed computing (MPI etc.) and parallel computing in general; experience with Julia or with distributed visualization and parallel I/O (in particular ADIOS2) is a plus.

You can find more details on the CSCS webpage:

https://www.cscs.ch/about/working-at-cscs/internships/

Your profile

The requirements for the internships are the following:

  • You are a Master's student in Computational Science and Engineering
  • You are enrolled in a Swiss university, and
  • You have a valid residence permit (non-Swiss applicants)

We offer

Internships in 2023 with a duration of 2–6 months and a salary of 2’500.00 CHF/month. During this period the intern will be mentored by, and collaborate with, HPC experts at the centre.

Working, teaching and research at ETH Zurich

We value diversity

In line with our values, ETH Zurich encourages an inclusive culture. We promote equality of opportunity, value diversity and nurture a working and learning environment in which the rights and dignity of all our staff and students are respected. Visit our Equal Opportunities and Diversity website to find out how we ensure a fair and open environment that allows everyone to grow and flourish.

Curious? So are we.

We look forward to receiving your online application with the following documents:

  • CV
  • Cover letter (mentioning which of the topic(s) you are interested in)
  • Scan of your Swiss residence permit (for non-Swiss applicants)

Please note that we exclusively accept applications submitted through our online application portal. Applications via email or postal services will not be considered.

Further information about the CSCS Internship Program 2023 can be found on our website https://www.cscs.ch/about/working-at-cscs/internships/.

Questions regarding the position should be directed to Dr. Guilherme Peretti-Pezzi by email <guilherme.peretti-pezzi@cscs.ch> (no applications).

For recruitment services the GTC of ETH Zurich apply.