How to do Udacity CS344 Problem Sets using Open Software!

Jun 22, 2016


I learned how to program GPUs from Udacity CS344 Intro to Parallel Programming. At the time, I did not have an NVidia GPU. So, I needed to buy an NVidia GPU to do my homework on my home computer. Since then, I upgraded my computer to an AMD Kaveri APU (A10-7850K) and sold the NVidia GPU. I was no longer able to run the homework on my main PC…

That is untill now! AMD now has HIP that allows one to write portable GPGPU code that can be compiled to run on NVidia GPU and AMD GPU. HIP uses thin warpper to provide source code portability between NVidia’s CUDA and AMD’s HCC. You can read more about it here.

In this blog, I’ll show you how I ported CS344 problem sets to run on my AMD Kaveri APU (A10-7850K).

Setting up the environment

In order to use HIP on AMD APUs, you need to install ROCm package. Just follow the instruction and reboot when you are done. Also, CS344 problem sets use OpenCV 2.4 library. So, make sure that you have OpenCV library installed:

@Kaveri-Ubuntu:~/git$ sudo apt-get install libopencv-dev

Also, at the moment, HIP is mainly developed on AMD Fiji discrete GPUs. If you have AMD APU like I do, you need to see this to make it work.

Hipify your CUDA code

Once you have ROCm and OpenCV set up, you need to clone CS344 repository to your machine:

@Kaveri-Ubuntu:~/git$ git clone https://github.com/udacity/cs344

Then change into the directory where the first problem set files are:

@Kaveri-Ubuntu:~/git$ cd cs344/Problem\ Sets/Problem\ Set\ 1

HIP has a script file “hipconvertinplace.sh” that will convert all the files in the current directory. The script will print out some information about the files that it converted:

@Kaveri-Ubuntu:~/git/cs344/Problem Sets/Problem Set 1$ hipconvertinplace.sh 
info: converted 3 CUDA->HIP refs( dev:1 mem:0 kern:1 coord_func:0 math_func:0 special_func:0 stream:0 event:0 err:1 def:0 tex:0 other:0 ) warn:0 LOC:66 in './student_func.cu'
info: converted 8 CUDA->HIP refs( dev:0 mem:8 kern:0 coord_func:0 math_func:0 special_func:0 stream:0 event:0 err:0 def:0 tex:0 other:0 ) warn:0 LOC:84 in './HW1.cpp'
info: converted 2 CUDA->HIP refs( dev:0 mem:0 kern:0 coord_func:0 math_func:0 special_func:0 stream:0 event:0 err:2 def:0 tex:0 other:0 ) warn:0 LOC:88 in './utils.h'
info: converted 4 CUDA->HIP refs( dev:1 mem:2 kern:0 coord_func:0 math_func:0 special_func:0 stream:0 event:0 err:1 def:0 tex:0 other:0 ) warn:0 LOC:93 in './main.cpp'
info: converted 10 CUDA->HIP refs( dev:0 mem:0 kern:0 coord_func:0 math_func:0 special_func:0 stream:0 event:10 err:0 def:0 tex:0 other:0 ) warn:0 LOC:42 in './timer.h'

info: TOTAL-converted 27 CUDA->HIP refs( dev:2 mem:10 kern:1 coord_func:0 math_func:0 special_func:0 stream:0 event:10 err:4 def:0 tex:0 other:0 ) warn:0 LOC:447
  kernels (1 total) :   rgba_to_greyscale(1)

Once the conversion is done, you still need to tweak a few things.

First, the script will backup the original files with .prehip extension. You do not need those files any more. So, you can delete them:

@Kaveri-Ubuntu:~/git/cs344/Problem Sets/Problem Set 1$ rm *.prehip

Second, the converted CUDA file is no longer CUDA file, so its extension should be changed to .cpp:

@Kaveri-Ubuntu:~/git/cs344/Problem Sets/Problem Set 1$ mv student_func.cu student_func.cpp

Third, fix up the code to make them compile. There are three things you need to do:

1. Comment out "#include <cuda.h>" from util.h line 6 and HW1.cpp line 5.
2. Also comment out "checkCudaErrors(hipFree(0));" from HW1.cpp line 27.
3. Add "grid_launch_parm lp" as the first parameter of GPU kernel to student_func.cpp.

Last, Makefile needs to be adjusted to use HIPCC instead of NVCC:

HIPCC=hipcc

###################################
# These are the default install   #
# locations on most linux distros #
###################################

OPENCV_LIBPATH=/usr/lib
OPENCV_INCLUDEPATH=/usr/include

###################################################
# On Macs the default install locations are below #
###################################################

#OPENCV_LIBPATH=/usr/local/lib
#OPENCV_INCLUDEPATH=/usr/local/include

# or if using MacPorts

#OPENCV_LIBPATH=/opt/local/lib
#OPENCV_INCLUDEPATH=/opt/local/include

OPENCV_LIBS=-lopencv_core -lopencv_imgproc -lopencv_highgui

HIP_INCLUDEPATH=/opt/rocm/hcc/include

######################################################
# On Macs the default install locations are below    #
# ####################################################

#CUDA_INCLUDEPATH=/usr/local/cuda/include
#CUDA_LIBPATH=/usr/local/cuda/lib

HIPCC_OPTS=-O2 -hc -std=c++amp

student: main.o student_func.o compare.o reference_calc.o Makefile
	$(HIPCC) -o HW1 main.o student_func.o compare.o reference_calc.o -L $(OPENCV_LIBPATH) $(OPENCV_LIBS) $(HIPCC_OPTS)

main.o: main.cpp timer.h utils.h reference_calc.cpp compare.cpp HW1.cpp
	$(HIPCC) -c main.cpp $(HIPCC_OPTS) -I $(HIP_INCLUDEPATH) -I $(OPENCV_INCLUDEPATH)

student_func.o: student_func.cpp utils.h
	$(HIPCC) -c student_func.cpp $(HIPCC_OPTS)

compare.o: compare.cpp compare.h
	$(HIPCC) -c compare.cpp -I $(OPENCV_INCLUDEPATH) $(HIPCC_OPTS)

reference_calc.o: reference_calc.cpp reference_calc.h
	$(HIPCC) -c reference_calc.cpp -I $(OPENCV_INCLUDEPATH) $(HIPCC_OPTS)

clean:
	rm -f *.o *.png hw

That’s it! Now you should be able to compile and run the program. I have hipified problem set 1 and put them on my Github account here. Spoiler-Alert Note that I have added the solution to the problem set 1. Hope this helps.