All times are UTC [ DST ]




Post subject: panic in pppol2tp_xmit
Posted: Fri Oct 30, 2009 5:50 pm

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
I'm getting an occasional page-fault panic while using L2TP.
Unfortunately I'm not able to record the oops output, but the fault was in pppol2tp_xmit.
Has anyone else reported this fault? The last output from the kernel into syslog was:

Oct 30 10:58:15 lns01 pppd[15849]: local IP address xxx.yyy.59.254
Oct 30 10:58:15 lns01 pppd[15849]: remote IP address xxx.yyy.59.5
Oct 30 10:58:15 lns01 openl2tpd[18554]: API: PPP_UPDOWN_IND: tunl 32353/364: unit=1 up=1 ifname='ppp1' user='user@domain.com'
Oct 30 10:58:15 lns01 openl2tpd[18554]: FUNC: tunl 32353/364: using interface ppp1
Oct 30 10:58:15 lns01 openl2tpd[18554]: FUNC: tunl 32353/364: user is user@domain.com
Oct 30 10:58:33 lns01 kernel: E9 BA FF 03 00 21 45 00 04 89 71 19 00 00 7F 06 A2 6D 57 EC ...
Oct 30 10:58:35 lns01 kernel: 03 00 21 45 00 05 63 FA 25 00 00 7F 06 18 7B 57 EC ...
Oct 30 10:58:51 lns01 kernel: 45 00 00 6A 8E 91 00 00 7F 06 89 14 57 EC ...
Oct 30 10:59:06 lns01 kernel: FF 03 00 21 45 00 00 28 96 5A 00 00 7F 06 81 8D 57 EC ...
Oct 30 10:59:07 lns01 openl2tpd[18554]: PROTO: tunl 32353: HELLO received from peer 55032
Oct 30 10:59:11 lns01 kernel: 00 00 02 72 90 E9 BA FF 03 00 21 45 00 00 36 FC 90 00 00 7F 06 1B 3D 57 EC ...
Oct 30 10:59:11 lns01 openl2tpd[18554]: PROTO: tunl 23526: HELLO received from peer 29328

Not much else is happening on this system other than the L2TP server.


Post subject: Re: panic in pppol2tp_xmit
Posted: Mon Nov 30, 2009 10:18 pm
Site Admin

Joined: Sun Jul 27, 2008 1:39 pm
Posts: 122
neilf wrote:
I'm getting an occasional page-fault panic while using L2TP.
Unfortunately I'm not able to record the oops output, but the fault was in pppol2tp_xmit.
Has anyone else reported this fault?

Hi, there's not much we can do without more details of the oops, I'm afraid. The oops text should be in your syslog files.

How many sessions does your LNS serve?

What kernel version are you using? What openl2tp version?


Post subject: Re: panic in pppol2tp_xmit
Posted: Tue Dec 01, 2009 10:48 am

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
Hi,
The oops is never captured by syslog, and as the system is in a remote cabinet I can't capture console output. The system is serving live customers, so it is set to reboot immediately after a panic. I have tried to install kdump, but this would not work with bzImage kernels. If you have any ideas for capturing the oops I would be interested.
The system is serving only 6 L2TP sessions, usually through 3 tunnels. Traffic is bursty; however, the panics don't seem to be related to traffic, as some occur in the middle of the night when nothing is happening. I also have another system which has panicked with only one session and no traffic. The system is dedicated to this service; the other info is:
[root@lns01 log]# rpm -q openl2tp
openl2tp-1.6-1.fc7
[root@lns01 log]# modinfo pppol2tp
filename: /lib/modules/2.6.30.4/kernel/drivers/net/pppol2tp.ko
version: V1.0
license: GPL
description: PPP over L2TP over UDP
author: Martijn van Oosterhout <kleptog@svana.org>, James Chapman <jchapman@katalix.com>
srcversion: CE2D3DFD374613A2456558C
depends: pppox,ppp_generic
vermagic: 2.6.30.4 SMP mod_unload modversions PENTIUMIII 4KSTACKS
[root@lns01 log]# uname -a
Linux lns01.itio.com 2.6.30.4 #1 SMP Sun Aug 2 22:09:45 BST 2009 i686 i686 i386 GNU/Linux


Post subject: Re: panic in pppol2tp_xmit
Posted: Mon Jan 18, 2010 9:39 am
Site Admin

Joined: Sun Jul 27, 2008 1:39 pm
Posts: 122
neilf wrote:
hi,
The Oops is never captured by syslog and as the system is in a remote cabinet I can't capture console output.

The panic should be logged in your syslog files (usually /var/log/messages), which can be accessed on next boot.


Post subject: Re: panic in pppol2tp_xmit
Posted: Mon Jan 18, 2010 11:30 am

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
As stated, nothing has ever been captured by syslog (which would put it in /var/log/messages); the oops only goes to the console. That output has been lost after the automatic reboots (this is a live 24/7 service). Traffic is not an issue, as the majority of failures occur in the middle of the night when there is little or no traffic on the sessions. The number of sessions is also not an issue, as the panic is as likely to occur on an LNS with 1 tunnel/session as on another that has many tunnels/sessions. It is not a hardware fault, as we have swapped systems. If anyone else is having a similar problem, or has had one like this that is now fixed, please post your input.


Post subject: Re: panic in pppol2tp_xmit
Posted: Mon Jan 18, 2010 1:35 pm
Site Admin

Joined: Sun Jul 27, 2008 1:39 pm
Posts: 122
Can the system be configured such that kernel oops messages are written to a syslog file?

Failing that, can the system be booted next time with the console on a serial port, which is captured using hyperterm or minicom?


Post subject: Re: panic in pppol2tp_xmit
Posted: Mon Jan 18, 2010 4:21 pm

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
I am now trying to use kdump to capture the crash; once I have done this and have more info I will post it. I have no idea why the panic is not saved via syslog, as I have settings for all kernel messages and errors to go there. It is possible that the panic happens in interrupt context, which would not allow access to the filesystem at the time of the panic.


Post subject: Re: panic in pppol2tp_xmit
Posted: Fri Feb 05, 2010 6:36 pm

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
More information about the panic, after the latest crash:

SYSTEM MAP: /boot/System.map
DEBUG KERNEL: /home/janitor/build/kernel/vmlinux (2.6.30.4)
DUMPFILE: /var/crash/2010-02-05-01:13/vmcore
CPUS: 2
DATE: Fri Feb 5 01:12:10 2010
UPTIME: 14:50:25
LOAD AVERAGE: 0.00, 0.00, 0.00
TASKS: 85
NODENAME: lns02.itio.com
RELEASE: 2.6.30.4
VERSION: #2 SMP Fri Jan 22 13:49:31 GMT 2010
MACHINE: i686 (1394 Mhz)
MEMORY: 1 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
PID: 5485
COMMAND: "pppd"
TASK: f1f551c0 [THREAD_INFO: f1483000]
CPU: 1
STATE: TASK_RUNNING (PANIC)

crash>ps
PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 0 0 0 c0448300 RU 0.0 0 0 [swapper]
0 0 1 f7032380 RU 0.0 0 0 [swapper]
1 0 1 f7030000 IN 0.1 2056 624 init
.
.
.
3222 2886 1 f73779b0 IN 0.1 2904 1204 pppd
3896 2886 1 f1f54470 IN 0.1 2904 1204 pppd
> 5485 2886 1 f1f551c0 RU 0.1 2904 1212 pppd
5533 5485 1 f73ab9b0 IN 0.1 2432 1004 ip-down
5535 5533 1 f1f55aa0 RU 0.1 4640 1332 ifdown-post
5543 5535 0 f1f55f10 ?? 0.0 0 0 ifdown-post
5544 5535 1 f1f55630 RU 0.0 1724 180 grep
crash>
crash> set
PID: 5485
COMMAND: "pppd"
TASK: f1f551c0 [THREAD_INFO: f1483000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
crash>log
.
.
.
BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<f9a91985>] pppol2tp_xmit+0x358/0x507 [pppol2tp]
*pde = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/class/net/ppp0/ifindex
Modules linked in: pppol2tp pppox ppp_generic slhc autofs4 sunrpc ipv6 xt_tcpudp iptable_filter ip_tables x_tables dm_mirror dm_region_hash dm_log dm_mod video output sbs sbshc battery ac parport_pc lp parport sg ide_cd_mod cdrom button serio_raw rtc_cmos rtc_core rtc_lib floppy e1000 e100 i2c_piix4 mii pcspkr i2c_core aic7xxx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode]

Pid: 5485, comm: pppd Not tainted (2.6.30.4 #2) eserver xSeries 330 -[867413X]-
EIP: 0060:[<f9a91985>] EFLAGS: 00010297 CPU: 1
EIP is at pppol2tp_xmit+0x358/0x507 [pppol2tp]
EAX: 00000000 EBX: f73846c0 ECX: 00000000 EDX: 00000000
ESI: f1c39e00 EDI: 00000021 EBP: f15a9e00 ESP: f1483f10
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process pppd (pid: 5485, ti=f1483000 task=f1f551c0 task.ti=f1483000)
Stack:
00000011 00000021 00000006 000000c0 f1c04900 f27f7180 f15a8600 00000006
00000000 f15a8a00 f15a8a36 00000021 00000216 f1426480 f1426484 f73846c0
f14264dc f9a121d0 f73846c0 f1426480 00000011 b80375c2 f9a13197 f1ff9180
Call Trace:
[<f9a121d0>] ? ppp_channel_push+0x2b/0x81 [ppp_generic]
[<f9a13197>] ? ppp_write+0x8e/0x97 [ppp_generic]
[<f9a13109>] ? ppp_write+0x0/0x97 [ppp_generic]
[<c0180a97>] ? vfs_write+0x83/0xf6
[<c0180f70>] ? sys_write+0x3c/0x63
[<c0102804>] ? sysenter_do_call+0x12/0x22
Code: 43 18 f0 ff 46 18 89 73 08 c7 43 68 17 16 a9 f9 8a 46 24 83 e0 0c 3c 04 75 09 80 63 64 f3 e9 c0 00 00 00 8b 43 18 0f b7 7c 24 2c <8b> 40 0c f6 40 44 0e 8a 43 64 75 58 6a 00 8b 4c 24 08 83 e0 f3
EIP: [<f9a91985>] pppol2tp_xmit+0x358/0x507 [pppol2tp] SS:ESP 0068:f1483f10
CR2: 000000000000000c
crash>
crash> bt -f
PID: 5485 TASK: f1f551c0 CPU: 1 COMMAND: "pppd"
#0 [f1483e44] crash_kexec at c014573d
[RA: c0350529 SP: f1483e44 FP: f1483e94 SIZE: 84]
f1483e44: f73846c0 00000000 00000000 f1c39e00
f1483e54: 00000021 f15a9e00 00000000 0000007b
f1483e64: f15a007b f73800d8 c0351379 ffffffff
f1483e74: f9a91985 00000060 00010297 f1483f10
f1483e84: 00000068 f1483ed4 00000246 00000009
f1483e94: c0350529
#1 [f1483e94] oops_end at c0350524
[RA: c0117de8 SP: f1483e98 FP: f1483ea4 SIZE: 16]
f1483e98: 00000009 0000000c 00000000 c0117de8
#2 [f1483ea4] no_context at c0117de3
[RA: c0117fd6 SP: f1483ea8 FP: f1483ec8 SIZE: 36]
f1483ea8: c03e0afe 0000000c 00000246 00030001
f1483eb8: f73846c0 f1c39e00 c0351379 f15a9e00
f1483ec8: c0117fd6
#3 [f1483ec8] bad_area_nosemaphore at c0117fd1
[RA: c034fcf5 SP: f1483ecc FP: f1483ed0 SIZE: 8]
f1483ecc: 00030001 c034fcf5
#4 [f1483ed0] error_code at c034fcf3
EAX: 00000000 EBX: f73846c0 ECX: 00000000 EDX: 00000000 EBP: f15a9e00
DS: 007b ESI: f1c39e00 ES: 007b EDI: 00000021 GS: 1379
CS: 0060 EIP: f9a91985 ERR: ffffffff EFLAGS: 00010297
[RA: f9a91985 SP: f1483ed4 FP: f1483f04 SIZE: 52]
f1483ed4: f73846c0 00000000 00000000 f1c39e00
f1483ee4: 00000021 f15a9e00 00000000 0000007b
f1483ef4: f15a007b f73800d8 c0351379 ffffffff
f1483f04: f9a91985
#5 [f1483f04] pppol2tp_xmit at f9a91985
[RA: f9a121d0 SP: f1483f08 FP: f1483f54 SIZE: 80]
f1483f08: 00000060 00010297 00000011 00000021
f1483f18: 00000006 000000c0 f1c04900 f27f7180
f1483f28: f15a8600 00000006 00000000 f15a8a00
f1483f38: f15a8a36 00000021 00000216 f1426480
f1483f48: f1426484 f73846c0 f14264dc f9a121d0
#6 [f1483f54] ppp_channel_push at f9a121ce
[RA: f9a13197 SP: f1483f58 FP: f1483f68 SIZE: 20]
f1483f58: f73846c0 f1426480 00000011 b80375c2
f1483f68: f9a13197
#7 [f1483f68] ppp_write at f9a13192
[RA: c0180a97 SP: f1483f6c FP: f1483f7c SIZE: 20]
f1483f6c: f1ff9180 b80375c2 f9a13109 00000011
f1483f7c: c0180a97
#8 [f1483f7c] vfs_write at c0180a95
[RA: c0180f70 SP: f1483f80 FP: f1483f94 SIZE: 24]
f1483f80: f1483f9c f1ff9180 fffffff7 00000011
f1483f90: f1483000 c0180f70
#9 [f1483f94] sys_write at c0180f6b
[RA: c0102804 SP: f1483f98 FP: f1483fb0 SIZE: 28]
f1483f98: f1483f9c 00000000 00000000 00000000
f1483fa8: 0000000b b80375c2 c0102804
#10 [f1483fb0] ia32_sysenter_target at c01027fd
EAX: 00000004 EBX: 0000000b ECX: b80375c2 EDX: 00000011
DS: 007b ESI: b80375c2 ES: 007b EDI: 00000011
SS: 007b ESP: bfed40f8 EBP: bfed4138 GS: 0000
CS: 0073 EIP: b7fc7424 ERR: 00000004 EFLAGS: 00000246
[RA: 246 SP: f1483fb4 FP: f1483ffc SIZE: 76]
f1483fb4: 0000000b b80375c2 00000011 b80375c2
f1483fc4: 00000011 bfed4138 00000004 0000007b
f1483fd4: 0000007b 00000000 00000000 00000004
f1483fe4: b7fc7424 00000073 00000246 bfed40f8
f1483ff4: 0000007b 00000000 00000000
crash>

So, looking at this, the problem occurs when the PPP session is shutting down.
The where, up and down commands don't work, so the stack is corrupted, I guess.


Post subject: Re: panic in pppol2tp_xmit
Posted: Tue Feb 09, 2010 7:13 pm

Joined: Thu Oct 22, 2009 12:18 pm
Posts: 12
More info on the panic.
The assembler of the pppol2tp_xmit function where it goes wrong:


0xf9a91946 <pppol2tp_xmit+793>: call 0xc02e19e9 <dst_release> < dst_release(skb->dst);

0xf9a9194b <pppol2tp_xmit+798>: mov 0x4c(%esi),%eax < skb->dst = dst_clone(__sk_dst_get(sk_tun));
0xf9a9194e <pppol2tp_xmit+801>: test %eax,%eax <
0xf9a91950 <pppol2tp_xmit+803>: je 0xf9a91959 <
0xf9a91952 <pppol2tp_xmit+805>: lock incl 0x80(%eax) <
0xf9a91959 <pppol2tp_xmit+812>: mov %eax,0x18(%ebx) <
0xf9a9195c <pppol2tp_xmit+815>: lock incl 0x18(%esi) <
0xf9a91960 <pppol2tp_xmit+819>: mov %esi,0x8(%ebx) <

0xf9a91963 <pppol2tp_xmit+822>: movl $0xf9a91617,0x68(%ebx) < pppol2tp_skb_set_owner_w(skb, sk_tun);

0xf9a9196a <pppol2tp_xmit+829>: mov 0x24(%esi),%al < if (sk_tun->sk_no_check == UDP_CSUM_NOXMIT)
0xf9a9196d <pppol2tp_xmit+832>: and $0xc,%eax <
0xf9a91970 <pppol2tp_xmit+835>: cmp $0x4,%al <
0xf9a91972 <pppol2tp_xmit+837>: jne 0xf9a9197d <
0xf9a91974 <pppol2tp_xmit+839>: andb $0xf3,0x64(%ebx) < skb->ip_summed = CHECKSUM_NONE;
0xf9a91978 <pppol2tp_xmit+843>: jmp 0xf9a91a3d <

0xf9a9197d <pppol2tp_xmit+848>: mov 0x18(%ebx),%eax < else if (!(skb->dst->dev->features & NETIF_F_V4_CSUM)) {
0xf9a91980 <pppol2tp_xmit+851>: movzwl 0x2c(%esp),%edi <
0xf9a91985 <pppol2tp_xmit+856>: mov 0xc(%eax),%eax < PANIC EAX is NULL

and the source

        dst_release(skb->dst);
        skb->dst = dst_clone(__sk_dst_get(sk_tun));
        pppol2tp_skb_set_owner_w(skb, sk_tun);

        /* Calculate UDP checksum if configured to do so */
        if (sk_tun->sk_no_check == UDP_CSUM_NOXMIT)
                skb->ip_summed = CHECKSUM_NONE;
        else if (!(skb->dst->dev->features & NETIF_F_V4_CSUM)) {
                skb->ip_summed = CHECKSUM_COMPLETE;

the registers and stack

#4 [f1483ed0] error_code at c034fcf3
EAX: 00000000 EBX: f73846c0 ECX: 00000000 EDX: 00000000 EBP: f15a9e00
DS: 007b ESI: f1c39e00 ES: 007b EDI: 00000021 GS: 1379
CS: 0060 EIP: f9a91985 ERR: ffffffff EFLAGS: 00010297
[RA: f9a91985 SP: f1483ed4 FP: f1483f04 SIZE: 52]
f1483ed4: f73846c0 00000000 00000000 f1c39e00
f1483ee4: 00000021 f15a9e00 00000000 0000007b
f1483ef4: f15a007b f73800d8 c0351379 ffffffff
f1483f04: f9a91985
#5 [f1483f04] pppol2tp_xmit at f9a91985
[RA: f9a121d0 SP: f1483f08 FP: f1483f54 SIZE: 80]
f1483f08: 00000060 00010297 00000011 00000021
f1483f18: 00000006 000000c0 f1c04900 f27f7180
f1483f28: f15a8600 00000006 00000000 f15a8a00
f1483f38: f15a8a36 00000021 00000216 f1426480
f1483f48: f1426484 f73846c0 f14264dc f9a121d0

The pointers to skb and sk_tun, from the registers:
skb=EBX=f73846c0
sk_tun=ESI=f1c39e00


Print the sk_tun (tunnel UDP) socket:

crash> struct sock f1c39e00
struct sock {
__sk_common = {
skc_family = 2,
skc_state = 1 '\001',
skc_reuse = 1 '\001',
skc_bound_dev_if = 0,
{
skc_node = {
next = 0x21,
pprev = 0xc05591a0
},
skc_nulls_node = {
next = 0x21,
pprev = 0xc05591a0
}
},
skc_bind_node = {
next = 0x0,
pprev = 0x0
},
skc_refcnt = {
counter = 1309
},
skc_hash = 33808,
skc_prot = 0xc046e4c0
},
sk_shutdown = 0 '\000',
sk_no_check = 0 '\000',
sk_userlocks = 0 '\000',
sk_protocol = 17 '\021',
sk_type = 2,
sk_rcvbuf = 111616,
sk_lock = {
slock = {
raw_lock = {
slock = 10280
}
},
owned = 0,
wq = {
lock = {
raw_lock = {
slock = 0
}
},
task_list = {
next = 0xf1c39e38,
prev = 0xf1c39e38
}
}
},
sk_backlog = {
head = 0x0,
tail = 0x0
},
sk_sleep = 0xe9c4e4f0,
sk_dst_cache = 0xf1cec700,
sk_policy = {0x0, 0x0},
sk_dst_lock = {
raw_lock = {
lock = 16777216
}
},
sk_rmem_alloc = {
counter = 0
},
sk_wmem_alloc = {
counter = 276
},
sk_omem_alloc = {
counter = 0
},
sk_sndbuf = 111616,
sk_receive_queue = {
next = 0xf1c39e6c,
prev = 0xf1c39e6c,
qlen = 0,
lock = {
raw_lock = {
slock = 31868
}
}
},
sk_write_queue = {
next = 0xf1c39e7c,
prev = 0xf1c39e7c,
qlen = 0,
lock = {
raw_lock = {
slock = 0
}
}
},
sk_wmem_queued = 0,
sk_forward_alloc = 4096,
sk_allocation = 32,
sk_route_caps = 0,
sk_gso_type = 0,
sk_gso_max_size = 0,
sk_rcvlowat = 1,
sk_flags = 256,
sk_lingertime = 0,
sk_error_queue = {
next = 0xf1c39eb0,
prev = 0xf1c39eb0,
qlen = 0,
lock = {
raw_lock = {
slock = 0
}
}
},
sk_prot_creator = 0xc046e4c0,
sk_callback_lock = {
raw_lock = {
lock = 16777216
}
},
sk_err = 0,
sk_err_soft = 0,
sk_drops = {
counter = 0
},
sk_ack_backlog = 0,
sk_max_ack_backlog = 0,
sk_priority = 0,
sk_peercred = {
pid = 0,
uid = 4294967295,
gid = 4294967295
},
sk_rcvtimeo = 2147483647,
sk_sndtimeo = 2147483647,
sk_filter = 0x0,
sk_protinfo = 0x0,
sk_timer = {
entry = {
next = 0x0,
prev = 0x0
},
expires = 0,
function = 0,
data = 0,
base = 0xf7058000
},
sk_stamp = {
tv64 = 0
},
sk_socket = 0xe9c4e4e0,
sk_user_data = 0xf27f7180,
sk_sndmsg_page = 0x0,
sk_send_head = 0x0,
sk_sndmsg_off = 0,
sk_write_pending = 0,
sk_security = 0xf146d7c0,
sk_mark = 0,
sk_state_change = 0xc02d54bb <sock_def_wakeup>,
sk_data_ready = 0xc02d6597 <sock_def_readable>,
sk_write_space = 0xc02d65f5 <sock_def_write_space>,
sk_error_report = 0xc02d653f <sock_def_error_report>,
sk_backlog_rcv = 0xc031421f <__udp_queue_rcv_skb>,
sk_destruct = 0xf9a92947
}

crash> struct -o dst_entry
struct dst_entry {
[0] struct rcu_head rcu_head;
[8] struct dst_entry *child;
[12] struct net_device *dev;
[16] short int error;
[18] short int obsolete;
[20] int flags;
[24] long unsigned int expires;
[28] short unsigned int header_len;
[30] short unsigned int trailer_len;
[32] unsigned int rate_tokens;
[36] long unsigned int rate_last;
[40] struct dst_entry *path;
[44] struct neighbour *neighbour;
[48] struct hh_cache *hh;
[52] struct xfrm_state *xfrm;
[56] int (*input)(struct sk_buff *);
[60] int (*output)(struct sk_buff *);
[64] struct dst_ops *ops;
[68] u32 metrics[13];
[120] __u32 tclassid;
[124] long int __pad_to_align_refcnt[1];
[128] atomic_t __refcnt;
crash>
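
As a cross-check on the fault address: the layout above puts dev at offset 12 (0xc), and the faulting instruction is mov 0xc(%eax),%eax with EAX = 0, so the load address is 0 + 0xc, which matches CR2 and the "NULL pointer dereference at 0000000c" line in the oops. A minimal user-space sketch of that arithmetic follows. struct dst_entry_model32 and dst_dev_load_address are made-up names for illustration, and pointers are modelled as uint32_t so the offsets match the 32-bit kernel regardless of the host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the first fields of struct dst_entry as laid out
 * on the 32-bit kernel in this thread (see "struct -o dst_entry"). */
struct dst_entry_model32 {
    uint32_t rcu_head[2]; /* [0]  struct rcu_head (two 32-bit pointers) */
    uint32_t child;       /* [8]  struct dst_entry *child */
    uint32_t dev;         /* [12] struct net_device *dev */
};

/* The model must agree with the offset reported by crash. */
static_assert(offsetof(struct dst_entry_model32, dev) == 12,
              "dev should sit at offset 12 (0xc)");

/* Address read by "mov 0xc(%eax),%eax" for a given EAX (the dst pointer). */
uint32_t dst_dev_load_address(uint32_t eax)
{
    return eax + (uint32_t)offsetof(struct dst_entry_model32, dev);
}
```

Calling dst_dev_load_address(0) gives 0xc, the address in CR2. Had skb->dst held the dumped sk_dst_cache value 0xf1cec700, the load would have gone to 0xf1cec70c instead.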

Let's look at the functions that set up skb->dst:

[root@lns02 linux]# less net/core/dst.c
/*
* net/core/dst.c Protocol independent destination cache.
*
* Authors: Alexey Kuznetsov, <kuznet@ms2.inr.ac.ru>
*
*/

#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/string.h>
#include <linux/types.h>
#include <net/net_namespace.h>

#include <net/dst.h>

/*
...skipping...
void dst_release(struct dst_entry *dst)
{
if (dst) {
int newrefcnt;

smp_mb__before_atomic_dec();
newrefcnt = atomic_dec_return(&dst->__refcnt);
WARN_ON(newrefcnt < 0);
}
}
EXPORT_SYMBOL(dst_release);

/* Dirty hack. We did it in 2.2 (in __dst_free),
* we have _very_ good reasons not to repeat
* this mistake in 2.3, but we have no choice
* now. _It_ _is_ _explicit_ _deliberate_
* _race_ _condition_.
*
* Commented and originally written by Alexey.
*/
static inline void dst_ifdown(struct dst_entry *dst, struct net_device *dev,
int unregister)
{
[root@lns02 linux]# grep -ri dst_clone include/*
include/net/dst.h:struct dst_entry * dst_clone(struct dst_entry * dst)
include/net/dst.h: struct dst_entry *child = dst_clone(dst->child);
[root@lns02 linux]# less include/net/dst.h
/*
* net/dst.h Protocol independent destination cache definitions.
*
* Authors: Alexey Kuznetsov, <kuznet@ms2.inr.ac.ru>
*
*/

#ifndef _NET_DST_H
#define _NET_DST_H

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>
#include <linux/rcupdate.h>
#include <linux/jiffies.h>
#include <net/neighbour.h>
#include <asm/processor.h>

/*
* 0 - no debugging messages
* 1 - rare events and bugs (default)
* 2 - trace mode.
*/
#define RT_CACHE_DEBUG 0
...skipping...
struct dst_entry * dst_clone(struct dst_entry * dst)
{
if (dst)
atomic_inc(&dst->__refcnt);
return dst;
}

extern void dst_release(struct dst_entry *dst);

/* Children define the path of the packet through the
* Linux networking. Thus, destinations are stackable.
*/

static inline struct dst_entry *dst_pop(struct dst_entry *dst)
{
struct dst_entry *child = dst_clone(dst->child);

dst_release(dst);
return child;
}

extern int dst_discard(struct sk_buff *skb);
extern void * dst_alloc(struct dst_ops * ops);
[root@lns02 linux]# grep -ri __sk_dst_get include/*
include/net/sock.h:__sk_dst_get(struct sock *sk)
include/net/tcp.h: struct dst_entry *dst = __sk_dst_get(sk);
[root@lns02 linux]# less include/net/sock.h
/*
* INET An implementation of the TCP/IP protocol suite for the LINUX
* operating system. INET is implemented using the BSD Socket
* interface as the means of communication with the user level.
*
* Definitions for the AF_INET socket handler.
*
* Version: @(#)sock.h 1.0.4 05/13/93
*
* Authors: Ross Biro
* Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
* Corey Minyard <wf-rch!minyard@relay.EU.net>
* Florian La Roche <flla@stud.uni-sb.de>
*
* Fixes:
* Alan Cox : Volatiles in skbuff pointers. See
* skbuff comments. May be overdone,
* better to prove they can be removed
* than the reverse.
* Alan Cox : Added a zapped field for tcp to note
* a socket is reset and must stay shut up
* Alan Cox : New fields for options
* Pauline Middelink : identd support
...skipping...
__sk_dst_get(struct sock *sk)
{
return sk->sk_dst_cache;
}

static inline struct dst_entry *
sk_dst_get(struct sock *sk)
{
struct dst_entry *dst;

read_lock(&sk->sk_dst_lock);
dst = sk->sk_dst_cache;
if (dst)
dst_hold(dst);
read_unlock(&sk->sk_dst_lock);
return dst;
}
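
Putting the three helpers together: __sk_dst_get() returns sk->sk_dst_cache without taking sk_dst_lock, and dst_clone() passes a NULL pointer straight through, so skb->dst ends up NULL whenever sk_dst_cache happens to be NULL at the moment pppol2tp_xmit reads it. The sketch below models just that behaviour in user space. The model_ names are invented for illustration, a plain int stands in for atomic_t, and it makes no claim about why the cache would be NULL at that moment:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct model_dst_entry { int refcnt; };
struct model_sock { struct model_dst_entry *sk_dst_cache; };
struct model_sk_buff { struct model_dst_entry *dst; };

/* Like __sk_dst_get(): a plain read of the cached route, no locking. */
static struct model_dst_entry *model_sk_dst_get_unlocked(struct model_sock *sk)
{
    return sk->sk_dst_cache;
}

/* Like dst_clone(): take a reference if non-NULL, return the pointer as is. */
static struct model_dst_entry *model_dst_clone(struct model_dst_entry *dst)
{
    if (dst)
        dst->refcnt++;
    return dst;
}

/* Models "skb->dst = dst_clone(__sk_dst_get(sk_tun));". */
void model_attach_dst(struct model_sk_buff *skb, struct model_sock *sk_tun)
{
    assert(skb != NULL && sk_tun != NULL);
    skb->dst = model_dst_clone(model_sk_dst_get_unlocked(sk_tun));
}
```

With a cached route, model_attach_dst() stores the pointer and bumps the reference count; with sk_dst_cache set to NULL it stores NULL and touches no counter, which is the state the later skb->dst->dev dereference cannot survive.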

Print the skb:

crash> struct sk_buff f73846c0
struct sk_buff {
next = 0x0,
prev = 0x0,
sk = 0xf1c39e00,
tstamp = {
tv64 = 0
},
dev = 0x0,
{
dst = 0x0,
rtable = 0x0
},
sp = 0x0,
cb = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000
\000\000\000\000\000\000\000\000\000",
len = 33,
data_len = 0,
mac_len = 0,
hdr_len = 0,
{
csum = 64,
{
csum_start = 64,
csum_offset = 0
}
},
priority = 0,
local_df = 0 '\000',
cloned = 0 '\000',
ip_summed = 0 '\000',
nohdr = 0 '\000',
nfctinfo = 0 '\000',
pkt_type = 0 '\000',
fclone = 0 '\000',
ipvs_property = 0 '\000',
peeked = 0 '\000',
nf_trace = 0 '\000',
protocol = 0,
destructor = 0xf9a91617,
nfct = 0x0,
nfct_reasm = 0x0,
nf_bridge = 0x0,
iif = 0,
queue_mapping = 0,
tc_index = 0,
tc_verd = 0,
ndisc_nodetype = 0 '\000',
secmark = 0,
mark = 0,
vlan_tci = 0,
transport_header = 0xf15a8a36 "\204\020\006\245",
network_header = 0x440 <Address 0x440 out of bounds>,
mac_header = 0x440 <Address 0x440 out of bounds>,
tail = 0xf15a8a57 "",
end = 0xf15a8a60 "\001",
head = 0xf15a8a00 "",
data = 0xf15a8a36 "\204\020\006\245",
truesize = 276,
users = {
counter = 1
}
}

sk->sk_dst_cache is 0xf1cec700, yet skb->dst is 0, so the pointer was not copied into the skb. Why?
The assembler has no way to skip the store or corrupt the data!
At the very least there should be a check for a valid dst pointer, and some kind of recovery if it is not valid.
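
A rough idea of what such a check could look like, as a user-space model rather than a kernel patch. The model_ names and the MODEL_F_V4_CSUM value are made up for illustration, and returning an error so the caller can drop the packet is only one possible recovery (an assumption, not something taken from the driver):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
#define MODEL_F_V4_CSUM 0x2UL
enum { MODEL_CHECKSUM_NONE = 0, MODEL_CHECKSUM_COMPLETE = 1 };

struct model_net_device { unsigned long features; };
struct model_dst_entry { struct model_net_device *dev; };
struct model_sk_buff { struct model_dst_entry *dst; int ip_summed; };

/* Returns 0 on success, -1 if the skb has no usable route and should be
 * dropped by the caller instead of dereferencing a NULL pointer. */
int model_select_csum(struct model_sk_buff *skb, int csum_noxmit)
{
    assert(skb != NULL);

    if (csum_noxmit) {
        skb->ip_summed = MODEL_CHECKSUM_NONE;
        return 0;
    }
    /* The check the original code lacks: validate dst and dst->dev first. */
    if (skb->dst == NULL || skb->dst->dev == NULL)
        return -1;
    if (!(skb->dst->dev->features & MODEL_F_V4_CSUM))
        skb->ip_summed = MODEL_CHECKSUM_COMPLETE;
    /* The hardware-checksum branch is not in the excerpt, so it is omitted. */
    return 0;
}
```

The order mirrors the source excerpt: the checksum-disabled case never needs the route, so only the second branch is guarded.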


Post subject: Re: panic in pppol2tp_xmit
Posted: Sun Feb 14, 2010 10:56 pm
Site Admin

Joined: Sun Jul 27, 2008 1:39 pm
Posts: 122
Thanks for the kdump analysis, Neil!
neilf wrote:
sk->sk_dst_cache is 0xf1cec700, yet skb->dst is 0, so the pointer was not copied into the skb. Why?
The assembler has no way to skip the store or corrupt the data!
At the very least there should be a check for a valid dst pointer, and some kind of recovery if it is not valid.

This code changed in 2.6.31 - a new skb_dst_drop() call was introduced, which pppol2tp now uses instead of a direct dst_release() call.
Code:
static inline void skb_dst_drop(struct sk_buff *skb)
{
        if (skb->_skb_dst)
                dst_release(skb_dst(skb));
        skb->_skb_dst = 0UL;
}

The skb->dst field was also renamed, to force access via defined inline functions and so catch abuse. Perhaps your 2.6.30.4 kernel dates from when this transition was happening? Would it be possible to retry with kernel 2.6.31 or later?

Thanks again for providing quality crash info.

/james

