On Tue, 2004-05-25 at 23:31, Will Cohen wrote:
After running the little script and installing the debuginfo rpms I was
able to get some profiles. It looks like this particular machine has a
reasonable video card (NVIDIA Quadro 4). Most of the samples are not for
drawing stuff on the screen.
One drawback is that the rpm containing /usr/X11R6/bin/Xorg does not
have an associated debuginfo rpm. I assume this is where the rendering
happens and where people see the performance hit in gnome-terminal.
The first opreport shows the overall view of which applications had
samples and the shared libraries associated with them. The second
opreport lists the function-by-function breakdown. Why is as much as 25%
of the time spent in memchr and real_tolower?
The memchr() hit comes from this function:
static char *
_vte_iso2022_find_nextctl(const char *p, size_t length)
{
	char *ret;
	if (length == 0) {
		return NULL;
	}
	ret = memchr(p, '\033', length);
	ret = _vte_iso2022_better(ret, memchr(p, '\n', length));
	ret = _vte_iso2022_better(ret, memchr(p, '\r', length));
	ret = _vte_iso2022_better(ret, memchr(p, '\016', length));
	ret = _vte_iso2022_better(ret, memchr(p, '\017', length));
#ifdef VTE_ISO2022_8_BIT_CONTROLS
	/* This breaks UTF-8 and other encodings which use the high
	   bits. */
	ret = _vte_iso2022_better(ret, memchr(p, 0x8e, length));
	ret = _vte_iso2022_better(ret, memchr(p, 0x8f, length));
#endif
	return ret;
}
Since the _vte_iso2022_better() function basically returns the minimum
of the two pointers, a speedup is possible by
- doing only one pass over the string instead of seven
- stopping as soon as a control character is found.
The attached patch makes the elapsed time drop from 14.1 seconds to 12.4
seconds on cat jarg22.txt.
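
For illustration, the single-pass idea could look roughly like the
sketch below. This is not the code from the attached patch; the name
find_nextctl_single_pass is hypothetical, and the sketch only assumes
the same set of control characters as the function quoted above.

```c
#include <stddef.h>

/*
 * Hypothetical single-pass variant (a sketch, not the attached patch).
 * Walks the buffer once and returns a pointer to the first control
 * character, or NULL if none is found within `length` bytes.
 */
static const char *
find_nextctl_single_pass(const char *p, size_t length)
{
	size_t i;

	for (i = 0; i < length; i++) {
		/* Compare as unsigned so 0x8e/0x8f match on any platform. */
		switch ((unsigned char) p[i]) {
		case '\033':	/* ESC */
		case '\n':
		case '\r':
		case '\016':	/* SO */
		case '\017':	/* SI */
#ifdef VTE_ISO2022_8_BIT_CONTROLS
		case 0x8e:
		case 0x8f:
#endif
			/* Stop at the first match; no further scanning. */
			return p + i;
		default:
			break;
		}
	}
	return NULL;
}
```

A zero length falls out of the loop naturally, so no separate check is
needed. Whether a byte-at-a-time loop like this beats several calls to
an optimized memchr() depends on how early the first control character
tends to appear, so it would need to be measured.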
Soeren