How do I request a large amount of memory to run a multi-threaded (single-node) program?

Use -pe (number of cores) together with -l h_data (memory per core) to specify the total amount of memory you want: the total is (number of cores)*(h_data). This product must be smaller than the total memory of a compute node; otherwise your job will never start.
For example, to request 8 cores with 32G total memory (4G per core, shared by all 8 cores):
qsub -l h_data=4G -pe shared 8 ... 
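The arithmetic behind this request can be checked before submitting. The following is only a sketch: the variable names are made up for illustration, and node_mem_gb=64 is an assumed node size, not a value to rely on for any particular node type. It prints the qsub line rather than running it.

```shell
#!/bin/bash
# Hypothetical values for illustration; adjust to your job and node type.
cores=8            # value passed to -pe shared
h_data_gb=4        # per-core memory passed to -l h_data, in GB
node_mem_gb=64     # ASSUMED total memory of the target node, in GB

# Total memory requested is (number of cores) * (h_data).
total_gb=$((cores * h_data_gb))
echo "Total memory requested: ${total_gb}G"

# The product must be smaller than the node's total memory.
if [ "$total_gb" -lt "$node_mem_gb" ]; then
    echo "qsub -l h_data=${h_data_gb}G -pe shared ${cores} ..."
else
    echo "Request does not fit on a ${node_mem_gb}G node" >&2
fi
```

With the values above, the total is 8 * 4G = 32G, which fits under the assumed 64G node.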
If your multi-threaded program will automatically use all CPUs available on the node, add the -l exclusive option, e.g.
qsub -l h_data=4G,exclusive -pe shared 8 ... 
You can also put -pe shared 8 -l h_data=4G in your job script file. Note that h_data is still the per-core value, so 8 cores at 4G each give 32G total.
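As a rough sketch of what such a job script might look like, the options can be written as Grid Engine directive lines starting with #$. This assumes an OpenMP-style program that reads OMP_NUM_THREADS; my_program is a hypothetical name, and your program may need a different way of setting its thread count.

```shell
#!/bin/bash
#$ -pe shared 8
#$ -l h_data=4G

# Grid Engine sets NSLOTS to the number of slots granted to the job.
# Fall back to 1 when the script is run outside the scheduler.
export OMP_NUM_THREADS="${NSLOTS:-1}"
echo "Running with ${OMP_NUM_THREADS} thread(s)"

# Replace the line below with your own program (hypothetical name):
# ./my_program
```

Matching the thread count to NSLOTS keeps the program from using more cores than the job requested.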
If you are requesting more than 64GB total memory, please contact us.