Project

General

Profile

Bug #184

charon: ENC thread receives SIGSEGV if debug level is equal to 3 or above

Added by Gowrishankar Muthukrishnan over 8 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Category:
libstrongswan
Target version:
Start date:
21.03.2012
Due date:
Estimated time:
Affected version:
4.6.2
Resolution:
Fixed

Description

In strongswan of version 4.5 (atleast) and above, if the debugging level
for ENC thread is set to 3 or above, ENC thread (and hence charon)
receives SIGSEGV, while parsing IKE_SA_INIT Response. This problem was
observed on ppc64 architecture (but surprisingly not seen in x86/x86_64
platforms).

To reproduce, add below entries in /etc/strongswan.conf

As per RFC 5996 Page 78:
(http://www.ietf.org/rfc/rfc5996.txt)

o  SPI (variable) - The sending entity's SPI.  Even if the SPI Size
is not a multiple of 4 octets, there is no padding applied to the
payload. When the SPI Size field is zero, this field is not
present in the Security Association payload.

Hence, in IKE_SA_INIT response, SA payload does not carry SPI entry
when SPI size is zero. If debugging level is set to 3 or above, below
piece of ENC code creates problem.

#14 0x00000080b729ec58 in parse_chunk (this=0xfff58000e20,
rule_number=<value optimized out>, output_pos=0xfff5c000ad0,
length=<value optimized out>) at encoding/parser.c:360

parser.c at #360:

356         if (output_pos)
357 {
358 *output_pos = chunk_alloc(length);
359 memcpy(output_pos->ptr, this->byte_pos, length);
360 DBG3(DBG_ENC, " => %b", output_pos->ptr, length); <<<===
361 }

DBG3 tries to unpack the bytes at NULL address (as SPI is zero).
This can be traced upto libstrongswan/utils.c. I am proposing below
changes to solve the problem.

Signed-off-by: Gowri Shankar <gowrishankar.m@in.ibm.com>
Tested-by: Gowri Shankar <gowrishankar.m@in.ibm.com>

Index: strongswan-4.5.2/src/libstrongswan/utils.c
===================================================================
--- strongswan-4.5.2.orig/src/libstrongswan/utils.c  2012-03-10 09:26:58.377688224 +0530
+++ strongswan-4.5.2/src/libstrongswan/utils.c  2012-03-10 17:27:30.639640749 +0530
@@ -389,7 +389,7 @@

        written += print_in_hook(dst, dstlen, "=> %d bytes @ %p", len, bytes);

-       while (bytes_pos < bytes_roof)
+       while (bytes_pos != NULL && bytes_pos < bytes_roof)
        {
                *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF];
                *buffer_pos++ = hexdig_upper[ *bytes_pos       & 0xF];

History

#1 Updated by Gowrishankar Muthukrishnan over 8 years ago

To reproduce the problem, add/update below entries in /etc/strongswan.conf.


charon {
  filelog {
    /var/log/charon.log {
      default = 4
    }
    stdout {
      default = 4
    }
  }
  syslog {
    auth {
      default = 4
    }
  }
}

#2 Updated by Tobias Brunner over 8 years ago

  • Status changed from New to Feedback
  • Assignee changed from Martin Willi to Tobias Brunner

Hm, I dont see how this results in a segmentation fault.

The code is like this

char *bytes_pos = bytes;
char *bytes_roof = bytes + len;
...
while (bytes_pos < bytes_roof)
{

So if bytes is NULL and len is 0 we get bytes_pos==bytes_roof==NULL, that is, the while loop is not entered. So how exactly does this result in a SIGSEGV.

Are you able to debug this with GDB? Could you provide a complete backtrace for the SIGSEGV?

I remember you asked about some problem with IKE_AUTH on the mailing list a while ago. There you also had quite some strange log output. For instance,

10[CFG] stroke message => -2036037751 bytes @ 0xfff80ede300
Could be that all these issues are due to some fundamental problems of strongSwan on the PPC64 architecture. Since I don't have a PPC64 machine available it's hard to verify that. I will try to setup something in QEMU, but no idea if that works.

#3 Updated by Gowrishankar Muthukrishnan over 8 years ago

Tobias,

bytes_pos==bytes_roof==NULL is not so true if we examine with GDB in
ppc64. There is a type cast problem (from size_t to int). Read below.

Lets start from charon.log while it crashes.

Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing PROPOSAL_SUBSTRUCTURE payload, 249 bytes left
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing payload from => -1911927705 bytes @ 0xfff64000c70
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 0 U_INT_8
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 0
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 1 RESERVED_BYTE
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 0
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 2 PAYLOAD_LENGTH
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 40
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 3 U_INT_8
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 1
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 4 U_INT_8
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 1
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 5 SPI_SIZE
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 0
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 6 U_INT_8
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>    => 4
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1>   parsing rule 7 SPI
Mar 22 03:29:16 10[DMN] <tahi_ikev2_test|1> thread 10 received 11
Mar 22 03:29:16 10[DMN] <tahi_ikev2_test|1> killing ourself, received critical signal
Mar 22 03:29:21 00[DMN] Starting IKEv2 charon daemon (strongSwan 4.5.2)

Stopping before SIGSEGV with GDB as below, at utils.c:394

Breakpoint 1, mem_printf_hook (dst=<value optimized out>, 
    dstlen=<value optimized out>, spec=<value optimized out>, 
    args=<value optimized out>) at utils.c:394
394            *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF];
(gdb) bt
#0  mem_printf_hook (dst=<value optimized out>, dstlen=<value optimized out>, 
    spec=<value optimized out>, args=<value optimized out>) at utils.c:394
#1  0x00000fff8e10dda4 in custom_print (stream=0xfff83ffb9a0, 
    info=<value optimized out>, args=<value optimized out>) at printf_hook.c:98
#2  0x00000080cf79c25c in ._IO_vfprintf () from /lib64/libc.so.6
#3  0x00000080cf865998 in .__vsnprintf_chk () from /lib64/libc.so.6
#4  0x00000fff8e0506e0 in vsnprintf (this=0x24bb8b90, 
    group=<value optimized out>, level=<value optimized out>, 
    thread=<value optimized out>, ike_sa=0xfff580009b0, 
    format=<value optimized out>, args=<value optimized out>)
    at /usr/include/bits/stdio2.h:78
#5  log_ (this=0x24bb8b90, group=<value optimized out>, 
    level=<value optimized out>, thread=<value optimized out>, 
    ike_sa=0xfff580009b0, format=<value optimized out>, 
    args=<value optimized out>) at bus/listeners/sys_logger.c:61
#6  0x00000fff8e04fb60 in log_cb (entry=0x24bb8c70, data=<value optimized out>)
    at bus/bus.c:223
#7  0x00000fff8e131fec in remove_ (this=0x24bb7890, item=0xfff83ffde18, 
    compare=@0xfff8e0c5738: 0xfff8e04faf0 <log_cb>) at utils/linked_list.c:524
#8  0x00000fff8e04fa48 in vlog (this=0x24bb77d0, group=<value optimized out>, 
    level=<value optimized out>, format=0xfff8e0a4610 "   => %b", 
    args=0xfff83ffdf30 "") at bus/bus.c:260
#9  0x00000fff8e04fad4 in log_ (this=<value optimized out>, 
---Type <return> to continue, or q <return> to quit---
    group=<value optimized out>, level=<value optimized out>, 
    format=<value optimized out>) at bus/bus.c:272
#10 0x00000fff8e05ec58 in parse_chunk (this=0xfff64000e40, 
    rule_number=<value optimized out>, output_pos=0xfff54000fe0, 
    length=<value optimized out>) at encoding/parser.c:360
#11 0x00000fff8e05f318 in parse_payload (this=0xfff64000e40, 
    payload_type=<value optimized out>, payload=0xfff83ffe108)
    at encoding/parser.c:487
#12 0x00000fff8e05f854 in parse_list (this=0xfff64000e40, 
    rule_number=<value optimized out>, output_pos=0xfff54001858, 
    payload_type=<value optimized out>, length=<value optimized out>)
    at encoding/parser.c:323
#13 0x00000fff8e05f5a8 in parse_payload (this=0xfff64000e40, 
    payload_type=<value optimized out>, payload=0xfff83ffe330)
    at encoding/parser.c:497
#14 0x00000fff8e05cd40 in parse_body (this=0xfff64000f20, aead=0x0)
    at encoding/message.c:1449
#15 0x00000fff8e07d584 in process_message (this=0xfff580009b0, 
    message=0xfff64000f20) at sa/ike_sa.c:1193
#16 0x00000fff8e06cad4 in execute (this=0xfff64000940)
    at processing/jobs/process_message_job.c:74
#17 0x00000fff8e1280e4 in process_jobs (this=0x24bb68e0)
    at processing/processor.c:136
---Type <return> to continue, or q <return> to quit---
#18 0x00000fff8e12b9a4 in thread_main (this=0x24bd7a50)
    at threading/thread.c:291
#19 0x00000080cfa4c2bc in .start_thread () from /lib64/libpthread.so.0
#20 0x00000080cf8487ac in .__clone () from /lib64/libc.so.6

(gdb) p len
$1 = <value optimized out>
(gdb) p *bytes_roof
Cannot access memory at address 0xffffffff8e0a4618
(gdb) p *bytes_pos
Cannot access memory at address 0x0
(gdb) p *bytes
value has been optimized out
(gdb) p bytes
$2 = <value optimized out>
(gdb) p bytes_pos
$3 = 0x0
(gdb) 

And the code is

    375 int mem_printf_hook(char *dst, size_t dstlen,
    376                                         printf_hook_spec_t *spec, const void *const *args)
    377 {
    378         char *bytes = *((void**)(args[0]));
    379         int len = *((size_t*)(args[1]));
    380 
    381         char buffer[BYTES_PER_LINE * 3];
    382         char ascii_buffer[BYTES_PER_LINE + 1];
    383         char *buffer_pos = buffer;
    384         char *bytes_pos  = bytes;
    385         char *bytes_roof = bytes + len;
    386         int line_start = 0;
    387         int i = 0;
    388         int written = 0;
    389 
    390         written += print_in_hook(dst, dstlen, "=> %d bytes @ %p", len, bytes);
    391 
    392         while (bytes_pos < bytes_roof)
    393         {
    394                 *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF];
    395                 *buffer_pos++ = hexdig_upper[ *bytes_pos       & 0xF];

Note that, integer type "len" is assigned a value through size_t pointer.
size_t and int are of different lengths in ppc64. If I correct it as

"int len = *((int*)(args[1]));" 

it solves this problem and I verified all IKEv2 testcases from TAHI.
but that can just help to avoid SEGV only (also not good solution too).

Actually, there is a big mesh fundamentally to handle the verbose logging.
If you see in the call trace, parse_chunk() prepares bytes length as
chunk_t type, using chunk_alloc().


/**
 * General purpose pointer/length abstraction.
 */
struct chunk_t {
        /** Pointer to start of data */
        u_char *ptr;
        /** Length of data in bytes */
        size_t len;
};

Note that len is size_t here.

Just before this, parse_payload() checks the SPI rule through
parse_chunk(), but the spi_size it sends, is of "int" type.
parse_chunk prototype also uses int type length. This is where
the real problem starts and reaches upto SEGV in mem_printf_hook().

Ideally, parse_payload should have passed size_t spi_size to parse_chunk
function. Also, parse_chunk() prototype should change as:

static bool parse_chunk(private_parser_t *this, int rule_number,
                                    chunk_t *output_pos, size_t length)

Am I right here ?

Behavior is same in 4.6.1 version we tested.
I am checking a patch for above problem and post soon.

#4 Updated by Tobias Brunner over 8 years ago

Note that, integer type "len" is assigned a value through size_t pointer.
size_t and int are of different lengths in ppc64.

Well, that's probably true for most 64-bit platforms. So it's interesting that it isn't an issue on x86_64 where size_t is double the size of int too. It turns out the reason for this is the difference between little and big endian.

The problem is that while

int len = *(size_t*)int_ptr;
is the same as
int len = *int_ptr;
on little endian, it's not on big endian. There it's more like
int len = *(int_ptr + 1); /* assuming size_t is double the size of int */
which means that the wrong value is read.
 

Anyway, the cast here

int len = *((size_t*)(args[1]));
is actually wrong in the first place. Due to how the custom printf functions are registered, there can't be any arguments of type size_t, as the only integer type that can currently be registered is int. See source:src/libstrongswan/printf_hook.h and the registration of mem_printf_hook in source:src/libstrongswan/library.c#L158. So even if the original argument in the printf statement were of type size_t it wouldn't be treated as such by printf.

So the best fix for now is probably to just use the cast you proposed:

int len = *((int*)(args[1]));
This works even if the original argument is smaller than int as, due to the use of a variable-length argument list in printf, shorter integer types get promoted to int (see C-FAQ 15.2).

But it will break, of course, in a similar fashion as before, if the argument is larger than int (e.g. size_t). So we have to check all usages of %b and cast the argument to int beforehand if required (e.g. DBG3("%b", ptr, (int)size_t_var)) - which should be fine as the length of these buffers is most likely not that large. If in doubt we can always use %B which takes a pointer to a chunk_t struct as argument.

#5 Updated by Tobias Brunner over 8 years ago

I tried to implement this as proposed in my last comment (see attached patch). Using %B instead of %b to avoid the downcast to int would actually not work because the %B handler (chunk_printf_hook) uses mem_printf_hook internally.

Because I expect the compiler to replace them with literals, I didn't cast any direct usages of sizeof() as argument to %b, even though the return value of sizeof() is size_t by definition. These literals, as well as preprocessor constants, will automatically get promoted to int arguments.

#6 Updated by Tobias Brunner over 8 years ago

  • Target version set to 4.6.3

#7 Updated by Tobias Brunner over 8 years ago

  • Status changed from Feedback to Closed
  • Resolution set to Fixed

Also available in: Atom PDF