HIGH PERFORMANCE COMPUTING AND SIMULATIONS (Fall 2025)
Course Number: CSCI 653
Section: 30072
Instructor:
Aiichiro Nakano;
office: VHE 610; phone: (213) 821-2657; email: anakano@usc.edu
Lecture: 2:00-3:50pm M W, THH 108
Office Hours: 4:00-5:00pm W, VHE 610
Prerequisites:
(1) CSCI 596 (Scientific Computing and Visualization); or
(2) basic knowledge of numerical methods (CSCI 501, PHYS 516 or equivalent) +
parallel computing (EE 451 or equivalent) +
3D graphics (CSCI 580 or equivalent).
Textbooks:
D. Frenkel and B. Smit,
Understanding Molecular Simulation: From Algorithms to Applications, 2nd Ed.
(Academic Press, 2001)
A. Grama, A. Gupta, G. Karypis, and V. Kumar,
Introduction to Parallel Computing, 2nd Ed.
(Addison-Wesley, 2003)
W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery,
Numerical Recipes, 3rd Ed. (Cambridge Univ. Press, 2007)
Course Description
This course provides students with advanced techniques common to high-performance computer simulations
in science and engineering. Scalable algorithms for both deterministic and stochastic simulations
of particles and continua will be implemented on massively parallel and
distributed computing platforms, and the simulation datasets will be visualized and analyzed
in immersive and interactive virtual environments. For details, please see
the course information sheet.

Visualization of a divide-conquer-recombine-based simulation of
photoexcited electron-hole pairs in organic solar cells.
Announcements
- 8/25 (M): Class begins.
- 8/29 (F): If you have not used a CARC (Center for Advanced Research Computing)
cluster, please attend the CARC workshop on Introduction to Scientific Computing
(10 am-12 pm); register
here.
- 8/29 (F): CARC computing accounts
will be requested for students registered in the first week.
- 9/1 (M): Labor Day; no class.
- 9/3 (W): CARC accounts have been requested for all students.
- 9/5 (F): Assignment 1 due at 11:59 pm.
- 9/8 (M): Seminar by Dr. Danica Adams (Harvard) on
Photochemistry at Mars, Venus and exoplanets through time
at 4:15 pm in SSL 202.
- 9/10 (W): Office hour to discuss assignment 2 at 4 pm in VHE 610.
- 9/15 (M): Assignment 2 due at 11:59 pm.
- 9/24 (W): Office hour to discuss assignment 3 at 4 pm in VHE 610.
- 9/29 (M): Assignment 3 due at 11:59 pm.
- 10/1 (W): A paper on low-rank approximation for the fine-tuning of AI
foundation models has been posted in the class schedule below, along with
lecture notes on singular value decomposition and Krylov subspace methods.
- 10/1 (W): See
Alphabet/Google Modeling Talk Series.
- 10/1 (W): See Periodic Labs, a new AI startup for materials science
co-founded by our collaborator, Dr. Dogus Cubuk;
see also the job openings.
- 10/6 (M): See
Platform for Advanced Scientific Computing (PASC) Conference
(June 29-July 1, 2026, Bern, Switzerland); there will be an ACM student poster competition.
- 10/7 (T): See the
2025 Nobel Prize in Physics for the discovery of macroscopic quantum tunnelling,
which laid the foundation for superconducting quantum computers.
- 10/8 (W): Please see the
Viterbi memo on the in-person class attendance requirement for international students.
- 10/8 (W): Office hour to discuss assignment 4 at 4 pm in VHE 610.
- 10/13 (M): Seminar by
Prof. Matilde Marcolli (Caltech) on
The algebraic structure of human language resembles the physics of renormalization
at 4:15 pm in SSL 202;
see also her talk on Generative grammar and large language models.
- 10/15 (W): Assignment 4 due at 11:59 pm.
- 10/22 (W): Office hour to discuss assignment 5 at 4 pm in VHE 610.
- 10/27 (M): Assignment 5 due.
- 11/3 (M): See the
announcement
of the new Equinox supercomputer at Argonne National Laboratory.
- 11/5 (W): Office hour to discuss assignment 6 at 4 pm in VHE 610.
- 11/11 (T): See
Alphabet Modeling Talk Series at 9 am on Google Meet.
- 11/12 (W): Assignment 6 due.
- 11/12 (W): Special office hour time at 5:30 pm in VHE 610.
- 11/19 (W): No office hour due to SC25 conference.
- 11/26 (W): Thanksgiving Holiday; no class.
- 12/12 (F): Final project report (GitHub repository) due.
Class Schedule
- 8/25 (M): Course information;
high-performance computing and simulations (HPCS) courses
- 8/27 (W):
Introduction; assignment 1 discussion
- 9/3 (W): Survey of molecular dynamics (MD):
notes and
slides;
Fast multipole method (FMM)--multiresolution in space:
slides
- 9/8 (M): Assignment 2 discussion--complexity analysis & floating-point performance of FMM;
Arithmetic implementation of sqrt() & floating-point performance: slides;
Performance Application Programming Interface (PAPI);
Math quiz
- 9/10 (W): Fast multipole method (FMM) details:
minimal complex analysis;
electrostatic potential around a charged line;
multipole expansion of 2d Coulomb potential;
fast multipole method algorithm in 2d;
Multiple time stepping--multiresolution in time: slides
- 9/15 (M): Message passing interface (MPI): notes and
slides;
Parallel MD algorithms: notes and
slides;
scalability analysis of parallel
MD and FMM algorithms
- 9/17 (W): Divide-and-conquer (DC) parallelism:
notes and
slides
- 9/22 (M): Hypercube quicksort & assignment 3 discussion
- 9/24 (W): Hybrid MPI+OpenMP programming;
Precision & Flop/s performance;
Final-project and paper discussions
- 9/29 (M): Survey of quantum dynamics (QD): QD basics;
spectral QD and fast Fourier transform (FFT);
parallelizing QD (slides)
- 10/1 (W): Multiresolution numerical methods: slides;
wavelets (Numerical Recipes, Sec. 13.10);
multiresolution analysis using wavelets;
Low-rank approximation for fine-tuning AI foundation models:
E. Hu et al.,
LoRA: low-rank adaptation of large language models (ICLR21);
A. Aghajanyan et al.,
Intrinsic dimensionality explains the effectiveness of language model fine-tuning (ACL21);
Fast Hadamard transform
(cf. smooth-detail decomposition in wavelets);
singular value decomposition (SVD);
Krylov subspace & Lanczos methods
- 10/6 (M): Assignment 4 discussion: parallel (asynchronous) MPI+OpenMP
programming of wavelet image compression
- 10/8 (W): Graphics processing unit (GPU) programming using CUDA;
Quantum computational science
- 10/13 (M): Hybrid MPI+OpenMP+CUDA programming;
parallel QD programming using MPI
- 10/15 (W): Assignment 5 discussion: triple-decker MPI+OpenMP+CUDA QD programming
- 10/20 (M): Multiscale simulation methods;
Shaw's neutral-territory MD algorithm
- 10/22 (W): Load balancing: slides;
Lanczos method for eigensystems (slides and
supplementary notes) used in spectral bisection
load balancer;
Paper discussion --
paper 1 (James),
paper 2 (Nan)
- 10/27 (M): OpenMP target offload
for heterogeneous data parallel computing;
Paper discussion --
paper 1 (Li),
paper 2 (Tian)
- 10/29 (W): SYCL
for unified heterogeneous parallel computing;
Paper discussion --
paper 1 (Pouya),
paper 2 (Ao),
paper 3 (Jonny & Yuxiao)
- 11/3 (M): Assignment 6 discussion: open GPU programming;
Paper discussion --
paper 1 (Changmook),
paper 2 (Dahye),
paper 3 (Jerry)
- 11/5 (W): Optimizing parallel MD: slides;
lecture note by Prof. Jim Demmel (UC Berkeley) on
memory hierarchies and matrix multiplication;
Order-invariant summation;
Paper discussion --
paper (David Chu)
- 11/10 (M): Final-project discussion: slides;
Paper discussion --
paper 1 (David Cha),
paper 2 (Letao),
paper 3 (Pranav)
- 11/12 (W): Quantum computational science: slides;
qubits and quantum circuits;
quantum dynamics simulation;
Paper discussion --
paper (Shrey)
- 11/17 (M): Lecture on massive dataset visualization;
Paper discussion --
paper 1 (Tuan),
paper 2 (Binhao),
paper 3 (Hao),
paper 4 (Shriya)
- 11/19 (W): Lecture on scientific data mining and machine learning;
Paper discussion --
paper 1 (Jon),
paper 2 (Yoomin)
- 11/24 (M): Lecture on Monte Carlo (MC) simulations: MC basics;
parallel kinetic MC simulation;
long time dynamics and global optimization;
Paper discussion --
paper 1 (Zhangyu),
paper 2 (Huanyu),
paper 3 (Zhiyuan),
paper 4 (Yash)
- 12/1 (M): Final-project presentations (1):
see
final all-star lineup
- 12/3 (W): Final-project presentations (2)