Thursday, April 16, 2015

Of huge pages and huge performance hits, are we alone?

We do a fair amount of sequence analysis here. One thing we do a lot of is trimming sequence data. The files are somewhat large. I'm not allowed to call this "big data" :-) There's a neat trimming code called "trimmomatic" (awesome name eh?). It's a simple enough piece of java, but it interacts poorly with our machines, and it turns out it's not the only code that does.

We have a huge page table issue.

A very big one.

So it turns out khugepaged is the kernel thread that collapses regular 4K pages into 2MB huge pages in the background as part of transparent huge pages (THP). When you have a large code such as this one that pulls in a pair of 7GB data files, modifies them, and then tries to get them back out to disk as fast as it can, you can see how that background page shuffling will stress any machine. We used local storage for this and kept it simple.
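A quick way to see whether THP is in play at all is to poke at sysfs and /proc (the paths below are what we have on our boxes, your kernel may put things elsewhere):

[root@regal01 dist]# cat /sys/kernel/mm/transparent_hugepage/enabled
[root@regal01 dist]# grep AnonHugePages /proc/meminfo

The first shows whether THP is set to "always", "madvise" or "never" (the active value is in square brackets), the second shows how much anonymous memory is currently sitting on huge pages.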
[jcuff@regal01 dist]# time java -jar ./jar/trimmomatic-0.33.jar PE -phred33 ./FR4_P_pilosa_CTTGTA.R1.fastq.gz ./FR4_P_pilosa_CTTGTA.R2.fastq.gz outr1.dat outr2.dat outr1un.dat outr2un.dat ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

TrimmomaticPE: Started with arguments: -phred33 ./FR4_P_pilosa_CTTGTA.R1.fastq.gz ./FR4_P_pilosa_CTTGTA.R2.fastq.gz outr1.dat outr2.dat outr1un.dat outr2un.dat ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Multiple cores found: Using 16 threads

Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

Input Read Pairs: 25776503 Both Surviving: 23790480 (92.30%) Forward Only Surviving: 1901563 (7.38%) Reverse Only Surviving: 49159 (0.19%) Dropped: 35301 (0.14%)

TrimmomaticPE: Completed successfully

So how did we do?

real 57m47.317s
user 44m24.784s
sys 435m41.152s

Yeah that's pretty slow.

While the job was running we saw khugepaged at 100% in top, and then this in "perf":

25022 root      39  19    0    0    0 R 100.0  0.0  6:23.13 khugepaged

[root@regal01 dist]# perf top

Samples: 191K of event 'cycles', Event count (approx.): 104146992710
 75.44%  [kernel]             [k] _spin_lock_irqsave
  4.21%  [kernel]             [k] _spin_lock_irq
  1.02%  libR.so              [.] logicalSubscript
  0.83%  [kernel]             [k] ____pagevec_lru_add

Never good to be spending 75% of your cycles in _spin_lock_irqsave.
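If you want to see what khugepaged and friends are actually up to while a job like this runs, the THP and compaction counters in /proc/vmstat are worth a look. Counter names vary a bit between kernel versions, so treat this as a sketch:

[root@regal01 dist]# egrep 'thp|compact' /proc/vmstat

Watching things like thp_collapse_alloc and the compact_* counters tick up while khugepaged sits at 100% is a decent hint that the box is busy collapsing pages and compacting memory rather than doing your work.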

Now let's turn off THP (transparent huge pages):

[root@regal01 dist]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
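To double check the setting took, cat the same file back, the active value shows up in square brackets:

[root@regal01 dist]# cat /sys/kernel/mm/transparent_hugepage/enabled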

How did we do?
[jcuff@regal01 dist]# time java -jar ./jar/trimmomatic-0.33.jar PE -phred33 ./FR4_P_pilosa_CTTGTA.R1.fastq.gz ./FR4_P_pilosa_CTTGTA.R2.fastq.gz outr1.dat outr2.dat outr1un.dat outr2un.dat ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

TrimmomaticPE: Started with arguments: -phred33 ./FR4_P_pilosa_CTTGTA.R1.fastq.gz ./FR4_P_pilosa_CTTGTA.R2.fastq.gz outr1.dat outr2.dat outr1un.dat outr2un.dat ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Multiple cores found: Using 16 threads

Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

Input Read Pairs: 25776503 Both Surviving: 23790480 (92.30%) Forward Only Surviving: 1901563 (7.38%) Reverse Only Surviving: 49159 (0.19%) Dropped: 35301 (0.14%)

TrimmomaticPE: Completed successfully

Drum roll please!

real 3m23.022s
user 6m22.666s
sys 0m38.528s

Wow.

That's a pretty big deal. I'll always take a 17x speed up if I can get it.

Turns out we are not alone:

https://blogs.oracle.com/linuxkernel/entry/performance_impact_of_transparent_huge

http://www.open-mpi.org/community/lists/users/2012/08/20057.php

http://unix.stackexchange.com/questions/161858/arch-linux-becomes-unresponsive-from-khugepaged

http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/

There's also an ongoing discussion on LKML about the current state of the art:

https://lkml.org/lkml/2015/2/23/205

Meantime, we are going to disable our THP.
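If you want the change to stick across reboots, there are a couple of options, and since boot loaders and init setups differ by distro, take this as a sketch rather than gospel: drop the echo into /etc/rc.local, or pass transparent_hugepage=never on the kernel command line in your grub config. The rc.local route looks something like this (the defrag knob lives alongside the enabled one on our kernels):

[root@regal01 dist]# cat >> /etc/rc.local << 'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
EOF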

Have others seen this? Happy to hear about it in the comments.


[any opinions here are all mine, and have absolutely nothing to do with my employer]
(c) 2011 James Cuff