COM4521/COM6521: Parallel Computing with Graphical Processing Units (GPUs)

This page refers to a previous academic year and has now been deprecated; links may no longer work. The live module page has now been moved to Blackboard per university policy. Contact for further information.

Course Information

Welcome to the 2022/2023 module page for COM4521/COM6521 Parallel Computing with GPUs.

Accelerator architectures are discrete processing units which supplement a base processor with the objective of providing advanced performance at lower energy cost. Performance is gained by a design which favours a high number of parallel compute cores, at the expense of imposing significant software challenges. This module looks at accelerated computing from multi-core CPUs to GPU accelerators with many TFLOPS of theoretical performance. The module will give insight into how to write high-performance code, with specific emphasis on programming NVIDIA GPUs with CUDA. A key aspect of the module will be understanding the implications of program code for the underlying hardware so that the code can be optimised.

The module’s aims, objectives and assessment details are available on the module’s public teaching page.

The module was first developed by Professor Paul Richmond. His previous webpage for the module can be found here.

Software for the Module

The module’s programming exercises are designed to be completed on PCs in the Diamond compute labs. All Diamond compute lab machines have Visual Studio 2022 and CUDA 11.7.1. If you intend to use your own machine for programming exercises (on the CUDA part of the module) then you must install the latest Community version of Visual Studio 2022 before you install the CUDA toolkit. Lab and assignment code should work with other recent versions of CUDA, but you may need to change the targeted version of CUDA when opening the Visual Studio project.

If you want to complete the exercises in Linux then example Makefiles will be provided with the lab (and assignment) starting code and solutions. It is not possible to build Linux CUDA programs on PCs in the Diamond compute labs.

Computers & Labs Available

As the module requires access to a machine with a GPU, the following have been made available to you.

  • All Diamond compute labs (other than the High Spec Lab) - All Diamond ‘all in one’ machines have an NVIDIA GTX 1050 (Pascal generation) GPU. Dedicated lab classes, which reserve these machines, are available most weeks. You can find machine availability outside of lab times by using Find a PC.
  • Diamond High Spec Lab - These are higher spec machines with NVIDIA Quadro 5200 (Pascal generation) GPUs.
  • Other University Machines - As of 2021 the following buildings have computer labs containing machines with GPUs: Heartspace, IC, Hicks, Firth Court, Elmfield. It may be necessary to install Visual Studio and the CUDA toolkit on these machines on first use.
  • Your own Windows/Linux machine - Follow the instructions under “Software for the Module”.
  • University HPC - For more information refer to the dedicated guide.

Course Attendance Monitoring

A register may be taken during some lab classes for attendance monitoring. Additionally, a register will be collected (based on UCards) at each Blackboard quiz.

Important Note: It is not possible to properly understand the course material without completing the labs and reviewing the solutions. If you do not complete the labs then you will find the assignment difficult. The first lecture will provide some insight into how course engagement affects assessment performance.


It is recommended that all students attend the first lecture in person. Subsequent lectures will be delivered in person and also made available using the flip classroom approach. Attending the lectures has the benefit that there will often be time for questions to be answered. The flip classroom lecture content has been pre-recorded in bite-sized chunks of ~10-15 minutes each by the previous lecturer (Prof. Paul Richmond). If you choose not to attend the in-person lectures, you are expected to watch each week’s flip classroom lectures in advance of the corresponding lab.

In Person Lab Classes

The lab sessions occur weekly in the Diamond’s Computer Room 1 from 09:00-11:00 on Tuesdays. These lab sessions are attended by me and a number of graduate teaching assistants (GTAs), who are able to answer questions and provide support in understanding the module’s content.

The lab classes have been designed to reinforce the material which you will observe in the lectures by applying the techniques and approaches to specific problems. You should aim to attempt the lab class exercises prior to attending the lab class (i.e. the week before) and use the labs to obtain help in understanding and applying the taught content. The lab class solutions are commented to provide insight. The solutions are available in advance of the lab, so if you are stuck on a particular exercise then review these to move on, and seek help in understanding the problem and solution in the lab class. Within the labs, pair programming or work within small groups is encouraged but left to personal preference. Discussion is encouraged.

During the lab class there will be an opportunity to discuss and review lecture content, lecture examples and lab solutions. Guided walkthroughs of certain parts of the lab solutions may be provided.

Although the labs are structured around the lecture material each week, you can (and should) ask for help regarding any of the labs during the scheduled lab time. The labs are also used for assignment help. You should start the assignment early.

DDP students & Staff Candidates

PhD students and staff are able to take the module subject to capacity limitations (taught students have priority).

PhD students and research/academic staff are not required to undertake assessment, but DDP students are expected to attend labs as evidence of participation in the module. You should enrol for the course via DDP to ensure that you have access to Blackboard.

If you are a member of staff wishing to attend the module, please contact the Computer Science teaching admins (com-teaching(at) so that they can process your request.

Discussion, Announcements and Requests for Help

A Google group has been created for announcements, help and discussion. Any important announcements relating to the module will be made via this group. All students enrolled on the module on 2nd February 2023 have been added to this group already. Likewise, staff and PhD students who expressed an interest in the course via the Google form have been added. If you have transferred via Add/Drop then you will need to join the group manually. The group is monitored by staff (including lab assistants) who can provide help. The purpose of the mailing list is to ask for general support and guidance with the course material (e.g. with concepts and ideas) rather than posting your own code. You should not post your assignment code on this forum. If you require personal assistance with your assignment code then you should request this during the lab hours. Any lab class can be used for assignment help in addition to the lab exercises which are set each week.

Course Material

In-person lectures will be delivered from 10:00-12:00 on Mondays in Broad Lane Lecture Theatre 2 (accessed via either Mappin or Pam Liversidge). Due to the national bank holidays, normal lectures in weeks 10 and 11 are not possible. Therefore, an additional lecture has been added at 09:00-11:00 on Friday of week 10 in Pam Liversidge Lecture Theatre 1.

Additionally, the week 12 lecture has been replaced with an additional assignment help lab class in Diamond Computer Room 2.

Pre-recorded lectures from the previous lecturer are available on the COM4521 Parallel Computing with Graphical Processing Units Kaltura channel or as downloadable PDFs on Google Drive. These cover the same content as the in-person lectures, which is required for the Blackboard quizzes and assignment.

Each week’s practical activities (the labs) follow the ideas presented in the lectures so it is important that you follow the lecture and lab timetable below.

Week 1

Lecture 1

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 1 - Introduction to Visual Studio & C Programming

Tuesday 9:00 Diamond CR1

Week 2

Lecture 2

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 2 - Memory & Performance

Tuesday 9:00 Diamond CR1

Week 3

Lecture 3

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 3 - OpenMP

Tuesday 9:00 Diamond CR1

Week 4

Lecture 4

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Assignment Handout

The assignment will be handed out via Blackboard after Monday’s lecture.

Lab 4 - Introduction to CUDA

Tuesday 9:00 Diamond CR1

Week 5

No Lecture

Blackboard Quiz 1

Tuesday 9:00 Diamond CR1

This quiz, held during the normal lab timeslot, covers the content from lectures 1-3 and must be attended in person, as it will be held under exam conditions with invigilation.

The quiz consists of 25 multiple-choice questions and must be completed within 45 minutes.

Week 6

Lecture 5

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 5 - CUDA Memory

Tuesday 9:00 Diamond CR1

Week 7

Lecture 6

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 6 - Shared Memory

Tuesday 9:00 Diamond CR1

Week 8

Lecture 7

Monday 10:00 Broad Lane LT2

Flip Classroom Pre-recorded Lectures

Lab 7 - Atomics & Primitives

Tuesday 9:00 Diamond CR1


Week 9

Lecture 8

Monday 10:00 Broad Lane LT2

Nsight Systems & Nsight Compute

If you wish to profile on your personal computer (or HPC) with a GPU newer than the Pascal architecture, you will need to use the newer profiling tools, Nsight Systems and Nsight Compute, instead. We do not currently provide a lecture for these, as they are not supported by the teaching hardware (managed desktops). However, NVIDIA’s documentation and a couple of talks by NVIDIA staff may be of value.

Flip Classroom Pre-recorded Lectures

  • Performance Profiling - Guest Lecture by Dr Robert Chisholm (pdf, recording)

Lab 8 - CUDA Profiling

Tuesday 9:00 Diamond CR1

  • Profile Lecture Example Code - There is no lab sheet for this lab. Examine the source code and try changing the STEP macro to compile different iterations of the code to run through the profiler.
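For readers unfamiliar with selecting code variants through a preprocessor macro, the idea can be sketched in plain C as follows. This is a minimal illustration only, not the lecture example code: the function names sum_values and step_name, and the meaning given to each STEP value, are hypothetical and chosen for this sketch.

```c
#include <stddef.h>

/* STEP selects which variant of the code is compiled.
 * It defaults to 0 when not defined on the compiler command line.
 * The values 0 and 1 are assumptions made for this sketch only. */
#ifndef STEP
#define STEP 0
#endif

/* Sum n floats. The loop body that gets compiled depends on STEP. */
float sum_values(const float *data, size_t n)
{
    float total = 0.0f;
#if STEP == 0
    /* Variant 0: straightforward loop, used as the baseline. */
    for (size_t i = 0; i < n; ++i) {
        total += data[i];
    }
#elif STEP == 1
    /* Variant 1: manually unrolled by four with separate accumulators. */
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    /* Handle the remaining elements when n is not a multiple of four. */
    for (; i < n; ++i) {
        total += data[i];
    }
    total += (a0 + a1) + (a2 + a3);
#else
#error "Unsupported STEP value (this sketch only defines 0 and 1)"
#endif
    return total;
}

/* Report which variant was compiled, e.g. for labelling profiler runs. */
const char *step_name(void)
{
#if STEP == 0
    return "baseline loop";
#else
    return "unrolled loop";
#endif
}
```

To try the sketch, call sum_values from a small main function and build once per variant. The macro can be changed by editing the #define in the source, by passing -DSTEP=1 to gcc or nvcc, or by adding STEP=1 to the Preprocessor Definitions in the Visual Studio project properties. Each build can then be profiled separately and the results compared.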

Please note: currently, to open the Visual Profiler on managed desktops you need to run the command nvvp -vm "C:\Program Files\Java\jdk-1.8\jre\bin\java" in a console window.

Week 10

Lecture 9 (Different Time/Location)

Friday 9:00 Pam Liversidge LT1

Flip Classroom Pre-recorded Lectures

Week 11

No Lecture (Bank Holiday)

TellUS - Module Feedback

Please complete the module feedback survey for all modules you have taken this semester.

Your feedback is crucial to guiding the development of modules, by highlighting what you felt worked within the module and how you think that your experience and understanding could be improved. It also enables you to highlight the achievements or failures of teaching staff.

If you have more urgent feedback, you can get in contact with your student staff liaison committee representative, or me directly (at a lab class or via email r.chisholm(at)

TellUS can be accessed here, or via the left-hand menu within Blackboard.

Lab 9 - Libraries & Streams

Tuesday 9:00 Diamond CR1

Previous On Demand Invited Lectures (Optional)

Please find below a list of previous invited lectures which may be of interest.

Week 12

Lab 10 - Assignment Help 1

Monday 10:00 Diamond CR2

This additional lab is in the lecture slot for week 12. Please note that it is held in a different computer room within the Diamond from the normal labs.

Lab 11 - Assignment Help 2

Tuesday 9:00 Diamond CR1

Recommended Reading

The following are useful resources but not required reading.

  • Jason Sanders, Edward Kandrot, “CUDA by Example: An Introduction to General-Purpose GPU Programming”, Addison-Wesley 2010.
  • Brian Kernighan, Dennis Ritchie, “The C Programming Language (2nd Edition)”, Prentice Hall 1988.
  • NVIDIA, CUDA C Programming Guide

Tertiary Blogs etc.

C is a programming language with many quirks. If you find them interesting, you might enjoy these web pages.
