Bug #184
charon: ENC thread receives SIGSEGV if debug level is equal to 3 or above
Description
In strongswan of version 4.5 (atleast) and above, if the debugging level
for ENC thread is set to 3 or above, ENC thread (and hence charon)
receives SIGSEGV, while parsing IKE_SA_INIT Response. This problem was
observed on ppc64 architecture (but surprisingly not seen in x86/x86_64
platforms).
To reproduce, add below entries in /etc/strongswan.conf
As per RFC 5996 Page 78:
(http://www.ietf.org/rfc/rfc5996.txt)
o SPI (variable) - The sending entity's SPI. Even if the SPI Size
is not a multiple of 4 octets, there is no padding applied to the
payload. When the SPI Size field is zero, this field is not
present in the Security Association payload.
Hence, in IKE_SA_INIT response, SA payload does not carry SPI entry
when SPI size is zero. If debugging level is set to 3 or above, below
piece of ENC code creates problem.
#14 0x00000080b729ec58 in parse_chunk (this=0xfff58000e20,
rule_number=<value optimized out>, output_pos=0xfff5c000ad0,
length=<value optimized out>) at encoding/parser.c:360
parser.c at #360:
356 if (output_pos)
357 {
358 *output_pos = chunk_alloc(length);
359 memcpy(output_pos->ptr, this->byte_pos, length);
360 DBG3(DBG_ENC, " => %b", output_pos->ptr, length); <<<===
361 }
DBG3 tries to unpack the bytes at NULL address (as SPI is zero).
This can be traced upto libstrongswan/utils.c. I am proposing below
changes to solve the problem.
Signed-off-by: Gowri Shankar <gowrishankar.m@in.ibm.com> Tested-by: Gowri Shankar <gowrishankar.m@in.ibm.com> Index: strongswan-4.5.2/src/libstrongswan/utils.c =================================================================== --- strongswan-4.5.2.orig/src/libstrongswan/utils.c 2012-03-10 09:26:58.377688224 +0530 +++ strongswan-4.5.2/src/libstrongswan/utils.c 2012-03-10 17:27:30.639640749 +0530 @@ -389,7 +389,7 @@ written += print_in_hook(dst, dstlen, "=> %d bytes @ %p", len, bytes); - while (bytes_pos < bytes_roof) + while (bytes_pos != NULL && bytes_pos < bytes_roof) { *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF]; *buffer_pos++ = hexdig_upper[ *bytes_pos & 0xF];
History
#1 Updated by Gowrishankar Muthukrishnan over 13 years ago
To reproduce the problem, add/update below entries in /etc/strongswan.conf.
charon { filelog { /var/log/charon.log { default = 4 } stdout { default = 4 } } syslog { auth { default = 4 } } }
#2 Updated by Tobias Brunner over 13 years ago
- Status changed from New to Feedback
- Assignee changed from Martin Willi to Tobias Brunner
Hm, I dont see how this results in a segmentation fault.
The code is like this
char *bytes_pos = bytes;
char *bytes_roof = bytes + len;
...
while (bytes_pos < bytes_roof)
{
So if bytes
is NULL
and len
is 0 we get bytes_pos==bytes_roof==NULL
, that is, the while loop is not entered. So how exactly does this result in a SIGSEGV.
Are you able to debug this with GDB? Could you provide a complete backtrace for the SIGSEGV?
I remember you asked about some problem with IKE_AUTH on the mailing list a while ago. There you also had quite some strange log output. For instance,
10[CFG] stroke message => -2036037751 bytes @ 0xfff80ede300Could be that all these issues are due to some fundamental problems of strongSwan on the PPC64 architecture. Since I don't have a PPC64 machine available it's hard to verify that. I will try to setup something in QEMU, but no idea if that works.
#3 Updated by Gowrishankar Muthukrishnan over 13 years ago
Tobias,
bytes_pos==bytes_roof==NULL is not so true if we examine with GDB in
ppc64. There is a type cast problem (from size_t to int). Read below.
Lets start from charon.log while it crashes.
Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing PROPOSAL_SUBSTRUCTURE payload, 249 bytes left Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing payload from => -1911927705 bytes @ 0xfff64000c70 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 0 U_INT_8 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 0 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 1 RESERVED_BYTE Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 0 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 2 PAYLOAD_LENGTH Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 40 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 3 U_INT_8 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 1 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 4 U_INT_8 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 1 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 5 SPI_SIZE Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 0 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 6 U_INT_8 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> => 4 Mar 22 03:21:54 10[ENC] <tahi_ikev2_test|1> parsing rule 7 SPI Mar 22 03:29:16 10[DMN] <tahi_ikev2_test|1> thread 10 received 11 Mar 22 03:29:16 10[DMN] <tahi_ikev2_test|1> killing ourself, received critical signal Mar 22 03:29:21 00[DMN] Starting IKEv2 charon daemon (strongSwan 4.5.2)
Stopping before SIGSEGV with GDB as below, at utils.c:394
Breakpoint 1, mem_printf_hook (dst=<value optimized out>, dstlen=<value optimized out>, spec=<value optimized out>, args=<value optimized out>) at utils.c:394 394 *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF]; (gdb) bt #0 mem_printf_hook (dst=<value optimized out>, dstlen=<value optimized out>, spec=<value optimized out>, args=<value optimized out>) at utils.c:394 #1 0x00000fff8e10dda4 in custom_print (stream=0xfff83ffb9a0, info=<value optimized out>, args=<value optimized out>) at printf_hook.c:98 #2 0x00000080cf79c25c in ._IO_vfprintf () from /lib64/libc.so.6 #3 0x00000080cf865998 in .__vsnprintf_chk () from /lib64/libc.so.6 #4 0x00000fff8e0506e0 in vsnprintf (this=0x24bb8b90, group=<value optimized out>, level=<value optimized out>, thread=<value optimized out>, ike_sa=0xfff580009b0, format=<value optimized out>, args=<value optimized out>) at /usr/include/bits/stdio2.h:78 #5 log_ (this=0x24bb8b90, group=<value optimized out>, level=<value optimized out>, thread=<value optimized out>, ike_sa=0xfff580009b0, format=<value optimized out>, args=<value optimized out>) at bus/listeners/sys_logger.c:61 #6 0x00000fff8e04fb60 in log_cb (entry=0x24bb8c70, data=<value optimized out>) at bus/bus.c:223 #7 0x00000fff8e131fec in remove_ (this=0x24bb7890, item=0xfff83ffde18, compare=@0xfff8e0c5738: 0xfff8e04faf0 <log_cb>) at utils/linked_list.c:524 #8 0x00000fff8e04fa48 in vlog (this=0x24bb77d0, group=<value optimized out>, level=<value optimized out>, format=0xfff8e0a4610 " => %b", args=0xfff83ffdf30 "") at bus/bus.c:260 #9 0x00000fff8e04fad4 in log_ (this=<value optimized out>, ---Type <return> to continue, or q <return> to quit--- group=<value optimized out>, level=<value optimized out>, format=<value optimized out>) at bus/bus.c:272 #10 0x00000fff8e05ec58 in parse_chunk (this=0xfff64000e40, rule_number=<value optimized out>, output_pos=0xfff54000fe0, length=<value optimized out>) at encoding/parser.c:360 #11 0x00000fff8e05f318 in parse_payload (this=0xfff64000e40, payload_type=<value optimized out>, payload=0xfff83ffe108) at encoding/parser.c:487 #12 0x00000fff8e05f854 in parse_list (this=0xfff64000e40, rule_number=<value optimized out>, output_pos=0xfff54001858, payload_type=<value optimized out>, length=<value optimized out>) at encoding/parser.c:323 #13 0x00000fff8e05f5a8 in parse_payload (this=0xfff64000e40, payload_type=<value optimized out>, payload=0xfff83ffe330) at encoding/parser.c:497 #14 0x00000fff8e05cd40 in parse_body (this=0xfff64000f20, aead=0x0) at encoding/message.c:1449 #15 0x00000fff8e07d584 in process_message (this=0xfff580009b0, message=0xfff64000f20) at sa/ike_sa.c:1193 #16 0x00000fff8e06cad4 in execute (this=0xfff64000940) at processing/jobs/process_message_job.c:74 #17 0x00000fff8e1280e4 in process_jobs (this=0x24bb68e0) at processing/processor.c:136 ---Type <return> to continue, or q <return> to quit--- #18 0x00000fff8e12b9a4 in thread_main (this=0x24bd7a50) at threading/thread.c:291 #19 0x00000080cfa4c2bc in .start_thread () from /lib64/libpthread.so.0 #20 0x00000080cf8487ac in .__clone () from /lib64/libc.so.6 (gdb) p len $1 = <value optimized out> (gdb) p *bytes_roof Cannot access memory at address 0xffffffff8e0a4618 (gdb) p *bytes_pos Cannot access memory at address 0x0 (gdb) p *bytes value has been optimized out (gdb) p bytes $2 = <value optimized out> (gdb) p bytes_pos $3 = 0x0 (gdb) And the code is 375 int mem_printf_hook(char *dst, size_t dstlen, 376 printf_hook_spec_t *spec, const void *const *args) 377 { 378 char *bytes = *((void**)(args[0])); 379 int len = *((size_t*)(args[1])); 380 381 char buffer[BYTES_PER_LINE * 3]; 382 char ascii_buffer[BYTES_PER_LINE + 1]; 383 char *buffer_pos = buffer; 384 char *bytes_pos = bytes; 385 char *bytes_roof = bytes + len; 386 int line_start = 0; 387 int i = 0; 388 int written = 0; 389 390 written += print_in_hook(dst, dstlen, "=> %d bytes @ %p", len, bytes); 391 392 while (bytes_pos < bytes_roof) 393 { 394 *buffer_pos++ = hexdig_upper[(*bytes_pos >> 4) & 0xF]; 395 *buffer_pos++ = hexdig_upper[ *bytes_pos & 0xF];
Note that, integer type "len" is assigned a value through size_t pointer.
size_t and int are of different lengths in ppc64. If I correct it as
"int len = *((int*)(args[1]));"
it solves this problem and I verified all IKEv2 testcases from TAHI.
but that can just help to avoid SEGV only (also not good solution too).
Actually, there is a big mesh fundamentally to handle the verbose logging.
If you see in the call trace, parse_chunk() prepares bytes length as
chunk_t type, using chunk_alloc().
/** * General purpose pointer/length abstraction. */ struct chunk_t { /** Pointer to start of data */ u_char *ptr; /** Length of data in bytes */ size_t len; };
Note that len is size_t here.
Just before this, parse_payload() checks the SPI rule through
parse_chunk(), but the spi_size it sends, is of "int" type.
parse_chunk prototype also uses int type length. This is where
the real problem starts and reaches upto SEGV in mem_printf_hook().
Ideally, parse_payload should have passed size_t spi_size to parse_chunk
function. Also, parse_chunk() prototype should change as:
static bool parse_chunk(private_parser_t *this, int rule_number, chunk_t *output_pos, size_t length)
Am I right here ?
Behavior is same in 4.6.1 version we tested.
I am checking a patch for above problem and post soon.
#4 Updated by Tobias Brunner over 13 years ago
Note that, integer type "len" is assigned a value through size_t pointer.
size_t and int are of different lengths in ppc64.
Well, that's probably true for most 64-bit platforms. So it's interesting that it isn't an issue on x86_64 where size_t
is double the size of int
too. It turns out the reason for this is the difference between little and big endian.
The problem is that while
int len = *(size_t*)int_ptr;
is the same asint len = *int_ptr;
on little endian, it's not on big endian. There it's more likeint len = *(int_ptr + 1); /* assuming size_t is double the size of int */
which means that the wrong value is read.Anyway, the cast here
int len = *((size_t*)(args[1]));
is actually wrong in the first place. Due to how the custom printf functions are registered, there can't be any arguments of type size_t
, as the only integer type that can currently be registered is int
. See source:src/libstrongswan/printf_hook.h and the registration of mem_printf_hook
in source:src/libstrongswan/library.c#L158. So even if the original argument in the printf
statement were of type size_t
it wouldn't be treated as such by printf
.
So the best fix for now is probably to just use the cast you proposed:
int len = *((int*)(args[1]));
This works even if the original argument is smaller than int
as, due to the use of a variable-length argument list in printf
, shorter integer types get promoted to int
(see C-FAQ 15.2).
But it will break, of course, in a similar fashion as before, if the argument is larger than int
(e.g. size_t
). So we have to check all usages of %b
and cast the argument to int
beforehand if required (e.g. DBG3("%b", ptr, (int)size_t_var)
) - which should be fine as the length of these buffers is most likely not that large. If in doubt we can always use %B
which takes a pointer to a chunk_t
struct as argument.
#5 Updated by Tobias Brunner over 13 years ago
- File 0001-Don-t-cast-second-argument-of-mem_printf_hook-b-to-s.patch 0001-Don-t-cast-second-argument-of-mem_printf_hook-b-to-s.patch added
I tried to implement this as proposed in my last comment (see attached patch). Using %B
instead of %b
to avoid the downcast to int
would actually not work because the %B
handler (chunk_printf_hook
) uses mem_printf_hook
internally.
Because I expect the compiler to replace them with literals, I didn't cast any direct usages of sizeof()
as argument to %b
, even though the return value of sizeof()
is size_t
by definition. These literals, as well as preprocessor constants, will automatically get promoted to int
arguments.
#6 Updated by Tobias Brunner over 13 years ago
- Target version set to 4.6.3
#7 Updated by Tobias Brunner over 13 years ago
- Status changed from Feedback to Closed
- Resolution set to Fixed